Agentic-SDLC Knowledge Base
The flagship blueprint: a whole-SDLC system of record for an AI development team — architecture decisions, the controls that govern the code, the conventions and gotchas an agent needs at a cold start, the runbooks, the post-mortems — modeled as nine typed schemas, linked into a cross-surface knowledge graph, and recalled by hybrid search + grounded RAG. It's the blueprint we run our own engineering org on.
What it is — your team's engineering memory, recalled by meaning
The most valuable non-code asset a software team has is the reasoning around the code: why a decision was made, the rule to follow before you touch the key boundary, the runbook for a wedged deploy, the gotcha that bit you last quarter. That knowledge usually lives as fragile, un-indexed local markdown: one bad disk away from gone, and invisible to the agents doing the work.
The agentic-sdlc blueprint puts it on Vectros instead, split by a single principle: content lives as documents, structure lives as records. ADRs, designs, references, runbooks, and post-mortems are documents — the prose is the artifact, so you read it and ask it questions. Controls, conventions, gotchas, and the glossary are records — the typed fields are the artifact, so you query and enumerate them. They link into one cross-surface knowledge graph, and the whole thing is recalled by meaning: "why is it shaped this way?" returns the actual decision, cited.
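As a rough illustration of that split, the nine types can be pictured as a lookup from type name to surface. The type names are the blueprint's own; the `SURFACE` table and the `surface_for` helper are hypothetical names used only for this sketch and are not part of any Vectros API.

```python
# Hypothetical routing table: which surface each of the nine types lives on.
SURFACE = {
    "decision": "document",
    "design": "document",
    "reference": "document",
    "runbook": "document",
    "postmortem": "document",
    "control": "record",
    "convention": "record",
    "gotcha": "record",
    "term": "record",
}


def surface_for(type_name: str) -> str:
    """Return 'document' or 'record' for a known type; raise on anything else."""
    try:
        return SURFACE[type_name]
    except KeyError:
        raise ValueError(f"unknown type: {type_name}") from None


documents = sorted(t for t, s in SURFACE.items() if s == "document")
records = sorted(t for t, s in SURFACE.items() if s == "record")
print(documents)
print(records)
```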
This is the blueprint we dogfood internally: our own engineering org runs on it. That makes it both the most credible demonstration of the platform and the most complete, because it exercises typed records, deterministic lookup, a typed reference graph, hybrid search, grounded RAG, unified records and documents, the dual human/agent surface, and governance in one runnable use case.
What bootstrap provisions
One command stands up everything below — no application code.
- Nine schemas, split content vs structure:
  - Documents (the markdown body is the artifact, searched + answered over): decision (ADRs), design, reference, runbook, postmortem. Each carries typed metadata (summary, status, area, tags, date) and a stable externalId, with a range/sort index on its date.
  - Records (typed fields, exact-queryable): control (a governance boundary that records its own evidence), convention (distinct rule/why/howToApply fields), gotcha (symptom → cause → fix), and term (a glossary entry with a unique exact-lookup).
- A cross-surface knowledge graph: typed reference edges where records point at documents (control.verifiedBy → the runbook that proves it; convention.establishedBy / term.relatedDecision → the decision behind them) and documents point at documents (decision.supersedes, design.relatedDecision, runbook.bornFrom → a postmortem). Provenance is navigable, not just searchable.
- A least-privilege access profile: records:r/c/u, search:r, schemas:r, inference:r, documents:r/c, folders:r/c. Note the deliberate absence of delete: knowledge is superseded or retired via a status flip, so the audit trail of how the team's thinking changed stays intact.
- No bundled seeds. The blueprint ships seedless: it provisions the nine schemas and a scoped key, and you fill it from your own corpus (next section). So the context starts clean; there's nothing synthetic to remove.
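To make "navigable provenance" concrete, here is a minimal in-memory sketch of walking the reference edges from a control to the runbook that proves it and on to the post-mortem behind that runbook. The edge field names come from the list above; the items, their externalIds, and the `provenance` helper are hypothetical, and a real traversal would go through the platform rather than a local dict.

```python
# Hypothetical items keyed by externalId. Edge field names follow the blueprint.
ITEMS = {
    "ctl-key-boundary": {"type": "control", "verifiedBy": "rb-key-rotation"},
    "rb-key-rotation": {"type": "runbook", "bornFrom": "pm-key-leak"},
    "pm-key-leak": {"type": "postmortem"},
    "adr-0002": {"type": "decision", "supersedes": "adr-0001"},
    "adr-0001": {"type": "decision"},
}

EDGE_FIELDS = ("verifiedBy", "establishedBy", "relatedDecision", "supersedes", "bornFrom")


def provenance(start: str, items: dict) -> list:
    """Follow reference edges breadth-first; return (source, edge, target) hops."""
    hops = []
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop(0)
        for edge in EDGE_FIELDS:
            target = items.get(current, {}).get(edge)
            if target is None:
                continue
            hops.append((current, edge, target))
            if target not in seen:  # guard against reference cycles
                seen.add(target)
                frontier.append(target)
    return hops


for hop in provenance("ctl-key-boundary", ITEMS):
    print(" -> ".join(hop))
```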
The apply is idempotent: re-running converges rather than duplicating, because every item is keyed by its externalId — so the knowledge base is rebuildable from source at any time.
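The convergence property can be sketched with a toy store that upserts by externalId. This is only an in-memory model of the behavior described above, not the platform's implementation; the `Store` class and the sample ADR entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for one schema's items, keyed by externalId."""

    items: dict = field(default_factory=dict)

    def upsert(self, external_id: str, payload: dict) -> str:
        """Create or update in place, and report which of the two happened."""
        existing = self.items.get(external_id)
        if existing == payload:
            return "unchanged"
        self.items[external_id] = dict(payload)
        return "created" if existing is None else "updated"


store = Store()
source = [
    ("adr-0001", {"summary": "Use typed schemas", "status": "accepted"}),
    ("adr-0002", {"summary": "Split documents and records", "status": "accepted"}),
]

# Applying the same source twice leaves exactly one item per externalId.
first_run = [store.upsert(eid, payload) for eid, payload in source]
second_run = [store.upsert(eid, payload) for eid, payload in source]
print(first_run, second_run, len(store.items))
```

Because the key is the externalId rather than a server-generated id, the second pass has nothing to create, which is why a rebuild from source is safe to repeat.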
Before you start
This is an invite-only 0.x preview, so the prerequisites are worth stating plainly. You need an early-access invite, and you need to mint a short-lived bridge token from the dev portal: this is the human step that authenticates the CLI before it can provision anything. You also need Node (the CLI and MCP server run via npx).
1. Bootstrap the blueprint (seedless)
npm i -g @vectros-ai/cli # or prefix the commands below with: npx -y @vectros-ai/cli
vectros login # one-time, browser sign-in
vectros bootstrap --blueprint agentic-sdlc --no-seed --yes
vectros whoami # confirm tenant + scoped key
This provisions the context + the nine schemas + a least-privilege ssk_* (written once to ~/.vectros/agentic-sdlc.key.json) and safe-merges the Vectros MCP server into your Claude Desktop config — use --client code for Claude Code, or --print to emit the snippet for any other MCP client. Add --tenant test to provision into the test tenant first for a dry run.
2. Ingest your corpus (agent-driven, idempotent)
There are two ingest paths, by surface — both driven by an ingest agent pointed at your source files with the bundled orientation prompt (an LLM maps your semi-structured docs to the right type far better than a brittle parser, and it's idempotent by externalId).
- Documents: the prose artifacts keep their markdown body as-is; the agent fills the typed metadata and calls document_ingest against the matching schema (decision/design/reference/runbook/postmortem), with payload carrying summary, status, area, tags, date, and any references.
- Records: the structured artifacts are typed fields, not prose; the agent extracts them and calls record_create per item (a convention's rule/why/howToApply, a control's evidence + verifiedBy runbook, a term's unique key + definition).
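For orientation, the sketch below shows the kind of check an ingest agent might run before calling document_ingest. The metadata field names come from the text above; everything else is an assumption for illustration, including the argument keys (`typeName`, `externalId`, `body`, `payload`), the idea that all five metadata fields are required, and the sample values. Check the tool schema exposed by the MCP server for the real shape.

```python
DOCUMENT_TYPES = {"decision", "design", "reference", "runbook", "postmortem"}
# Assumption: all five metadata fields are treated as required in this sketch.
REQUIRED_METADATA = ("summary", "status", "area", "tags", "date")


def build_document_args(type_name: str, external_id: str, body: str, metadata: dict) -> dict:
    """Validate metadata and assemble hypothetical arguments for a document ingest."""
    if type_name not in DOCUMENT_TYPES:
        raise ValueError(f"{type_name!r} is not a document type")
    missing = [key for key in REQUIRED_METADATA if key not in metadata]
    if missing:
        raise ValueError(f"missing metadata: {', '.join(missing)}")
    return {
        "typeName": type_name,      # hypothetical key name
        "externalId": external_id,  # stable id that makes re-ingest converge
        "body": body,               # markdown kept as-is
        "payload": dict(metadata),
    }


args = build_document_args(
    "decision",
    "adr-0002",
    "# Split documents and records\n\nWe keep prose as documents...",
    {
        "summary": "Split documents and records",
        "status": "accepted",
        "area": "platform",
        "tags": ["architecture"],
        "date": "2025-01-15",  # illustrative value only
    },
)
print(args["typeName"], args["externalId"], sorted(args["payload"]))
```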
Cross-surface edges resolve by the target's externalId, so ingest the referenced documents before the records that point at them. A one-shot backfill is that agent looped over your docs/, ADRs, and memory; an ongoing sync re-runs it on change — the same externalIds converge.
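One way to get that ordering without hand-sorting the corpus is a topological sort over the reference edges, so every target is ingested before anything that points at it. This is a sketch using Python's standard `graphlib`; the corpus and the `ingest_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical corpus: externalId -> the externalIds it references.
references = {
    "ctl-key-boundary": {"rb-key-rotation"},  # control.verifiedBy
    "rb-key-rotation": {"pm-key-leak"},       # runbook.bornFrom
    "pm-key-leak": set(),
    "conv-no-raw-keys": {"adr-0002"},         # convention.establishedBy
    "adr-0002": {"adr-0001"},                 # decision.supersedes
    "adr-0001": set(),
}


def ingest_order(refs: dict) -> list:
    """Return externalIds ordered so that referenced items come first."""
    return list(TopologicalSorter(refs).static_order())


order = ingest_order(references)
position = {external_id: index for index, external_id in enumerate(order)}
print(order)
```

A reference cycle makes `static_order()` raise `graphlib.CycleError`, which is a useful signal that two items point at each other and one edge has to be added in a second pass.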
A bulk backfill is exactly the workload that trips the API's per-minute rate limit (per tenant, counting writes + searches, shared across all of a tenant's keys). Pace the ingest — for the free tier, roughly one record every couple of seconds is safe — and on an HTTP 429, honor the Retry-After header. Because ingest is idempotent, a backfill that pauses or restarts simply converges; it never double-writes.
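The pacing advice can be sketched as a small loop that spaces out writes and backs off on a 429. The `send` callable is a hypothetical stand-in for whatever performs one write and returns `(status, headers)`; the two-second interval mirrors the free-tier guidance above, and the sketch assumes Retry-After arrives as a number of seconds rather than an HTTP date.

```python
import time


def paced_ingest(items, send, min_interval=2.0, max_attempts=5, sleep=time.sleep):
    """Write items one at a time, pacing writes and honoring Retry-After on 429.

    `send(item)` is a hypothetical callable returning (status_code, headers).
    Returns the number of items written.
    """
    written = 0
    for item in items:
        for _ in range(max_attempts):
            status, headers = send(item)
            if status == 429:
                # Assumption: Retry-After is given in seconds, not as a date.
                sleep(float(headers.get("Retry-After", min_interval)))
                continue
            if status >= 400:
                raise RuntimeError(f"ingest failed with HTTP {status}")
            written += 1
            break
        else:
            raise RuntimeError("gave up after repeated 429 responses")
        sleep(min_interval)
    return written


# Demo with a fake sender that rate-limits the first attempt at item "b".
calls, waits = [], []


def fake_send(item):
    calls.append(item)
    if item == "b" and calls.count("b") == 1:
        return 429, {"Retry-After": "7"}
    return 200, {}


written = paced_ingest(["a", "b", "c"], fake_send, sleep=waits.append)
print(written, calls, waits)
```

Retrying the same item is safe here only because the write is idempotent by externalId; with a non-idempotent write the retry would need a deduplication key.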
3. Query it — the recall payoff
| You want… | Call |
|---|---|
| "Why did we decide X?" (grounded, cited) | rag_ask "why did we choose X?" — answers over document bodies |
| "Which critical controls are active, and how is each proven?" | record_query control { criticality:"critical", status:"active" } → follow verifiedBy to the runbook |
| "What's the active rule for area X?" | record_query convention { area:"<area>", status:"active" } |
| "Have we hit this failure before?" | hybrid_search "<symptom>" contentTypes:["documents"], typeName:"postmortem"; plus record_query gotcha { area:"deploy", status:"active" } |
| "Define X" | record_query term { term:"X" } (unique lookup) |
| "Latest decisions / search the designs" | hybrid_search "<topic>" contentTypes:["documents"], typeName:"decision" (or "design") |
Recall by meaning (hybrid_search / rag_ask) and deterministic enumeration (record_query by lookup field) are both first-class — a knowledge base needs both: search to surface the relevant decision you'd never remember by filename, lookup to enumerate a known slice ("every active control").
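The deterministic half of that pair behaves like an equality filter over lookup fields. The sketch below models the "active critical controls" row of the table against an in-memory list; `query_records` and the sample records are hypothetical and only imitate what record_query does server-side.

```python
# Hypothetical records. Field names follow the table above.
RECORDS = [
    {"type": "control", "externalId": "ctl-key-boundary", "criticality": "critical",
     "status": "active", "verifiedBy": "rb-key-rotation"},
    {"type": "control", "externalId": "ctl-log-redaction", "criticality": "high",
     "status": "active", "verifiedBy": "rb-log-audit"},
    {"type": "control", "externalId": "ctl-old-vpn", "criticality": "critical",
     "status": "retired", "verifiedBy": "rb-vpn"},
    {"type": "term", "externalId": "term-external-id", "term": "externalId",
     "status": "active"},
]


def query_records(records, type_name, **equals):
    """Return records of one type whose fields equal every given value."""
    return [
        record
        for record in records
        if record["type"] == type_name
        and all(record.get(key) == value for key, value in equals.items())
    ]


active_critical = query_records(RECORDS, "control", criticality="critical", status="active")
# Follow verifiedBy to find the runbook that proves each control.
proofs = {record["externalId"]: record["verifiedBy"] for record in active_critical}
print(proofs)
```

An enumeration like this is exhaustive by construction, which is what search alone cannot promise: a ranked result list never guarantees that it returned every active control.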
4. The dual surface — agents capture, your team browses
The same typed context is reachable two ways. Agents capture and recall over MCP — the bundled orientation prompt wires the recall-before-acting / capture-after loop, so a cold-start session reads the conventions before it writes code and records the new decision after. Your team browses the exact same records and documents in the data-plane app — no separate export, no second copy. One governed store, two surfaces.
5. Bridge your issue tracker — don't mirror it
Your tracker (GitLab, Jira, Linear) and this knowledge base are two planes with different jobs: the tracker owns live status (open/closed, assignee — volatile); the knowledge base owns durable recall (why/how/lessons — stable). Mirroring issues in creates a stale shadow copy that buries your decisions under issue churn. Instead, promote by reference: when you close out work that carries durable knowledge, distill it into the right type, tag it issue:<id>, and note the externalId back in the tracker. Be selective — most issues promote nothing; that selectivity is what keeps recall high-signal.
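A promotion step might look like the sketch below: it tags the distilled item with the issue id and produces the note to paste back into the tracker. The `promote` helper, the item shape, and the sample gotcha are all hypothetical; only the `issue:<id>` tag convention and the externalId back-reference come from the text above.

```python
def promote(issue_id: str, type_name: str, external_id: str, payload: dict):
    """Build a knowledge item tagged with its issue, plus a note for the tracker."""
    tags = list(payload.get("tags", []))
    issue_tag = f"issue:{issue_id}"
    if issue_tag not in tags:  # keep re-promotion idempotent
        tags.append(issue_tag)
    item = {
        "typeName": type_name,
        "externalId": external_id,
        "payload": {**payload, "tags": tags},
    }
    tracker_note = f"Durable knowledge promoted as {external_id}"
    return item, tracker_note


item, note = promote(
    "1234",  # illustrative issue id
    "gotcha",
    "gotcha-wedged-deploy",
    {
        "symptom": "Deploy hangs at the migration step",
        "cause": "Stale advisory lock from a cancelled run",
        "fix": "Release the lock, then re-run the deploy",
        "area": "deploy",
        "status": "active",
        "tags": ["deploy"],
    },
)
print(item["payload"]["tags"])
print(note)
```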
Keep it healthy
- Record the why: rationale is the most-recalled field; a statement without it is a log entry, not knowledge.
- Supersede, don't delete: flip status so the evolution trail survives (the access profile has no delete by design).
- Re-ingest is idempotent: keyed on externalId, so a backfill converges and the whole knowledge base can be rebuilt from source at any time.
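The "supersede, don't delete" rule can be illustrated by parsing the access profile and retiring an item with a status flip. The profile string is the one the blueprint provisions; the `d` verb letter for delete, the `superseded` status value, and the helper names are assumptions made for this sketch.

```python
PROFILE = "records:r/c/u, search:r, schemas:r, inference:r, documents:r/c, folders:r/c"


def parse_profile(text: str) -> dict:
    """Turn 'resource:verb/verb, ...' into {resource: {verbs}}."""
    scopes = {}
    for entry in text.split(","):
        resource, _, verbs = entry.strip().partition(":")
        scopes[resource] = set(verbs.split("/"))
    return scopes


def allowed(scopes: dict, resource: str, verb: str) -> bool:
    """True if the profile grants this verb on this resource."""
    return verb in scopes.get(resource, set())


def supersede(old: dict, new: dict):
    """Retire the old item by status flip and link the new one back to it."""
    retired = {**old, "status": "superseded"}  # assumed enum value
    replacement = {**new, "supersedes": old["externalId"]}
    return retired, replacement


scopes = parse_profile(PROFILE)
print(allowed(scopes, "records", "u"), allowed(scopes, "records", "d"))

old_adr = {"externalId": "adr-0001", "status": "accepted"}
new_adr = {"externalId": "adr-0002", "status": "accepted"}
retired, replacement = supersede(old_adr, new_adr)
print(retired, replacement)
```

Both versions remain in the store after the flip, so a later question about why the earlier decision was dropped still has something to cite.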
Customize
This is a starting point — fork it for your org. Swap the area vocabulary for your subsystems; adjust the status / severity enums to your lifecycle; add or remove schemas (content-heavy types belong on the document surface, structure-heavy types are records — add a separate type when the shape differs or a first-class type strengthens references, not for a near-identical clone). Note that lookups are migration-locked — the equality-vs-range choice is fixed once a schema is live, so choose deliberately.