# SDK

## Explanation — what & why

The Vectros SDK is a typed client library for the platform's REST API. Rather
than hand-rolling HTTP requests, signing headers, and decoding paginated
envelopes, you construct one client and call typed methods grouped by resource:
`client.records.createRecord(...)`, `client.search.content(...)`,
`client.inference.ragInference(...)`, and so on.

The SDK is **generated from the API specification**, not written by hand. The
spec is the single source of truth: the same definition drives the API gateway,
the rendered API reference, and the SDK. That is why the method names, request
shapes, and response types in your editor match the API exactly — they are the
same artifact, expressed in TypeScript.

The SDK does the mechanical work you would otherwise repeat in every project:

- **Typed requests and responses** — your editor autocompletes fields and flags
  mistakes before you run anything.
- **Streaming as async iteration** — chat, RAG, and document-ask endpoints are
  server-sent-event streams; the SDK exposes them as async iterables you can
  `for await` over.
- **A uniform list envelope** — list, lookup, and version endpoints return a
  cursor-paginated envelope (see below) so draining a full result set is one
  loop, the same every time.

### Language coverage

The SDK is generated for **Node, Java, and Python**, and **all three are
exercised end-to-end against staging at parity**. Each ships a smoke suite
(`smoke-tests/{,sdk-python/,sdk-java/}`) that covers the same MVP surface
against the live API: auth, the list envelope, the error contract, chat/RAG
streaming, and cross-context isolation. The call shapes that back this
documentation are therefore verified in every language. The snippets here use
Node idioms; the Java and Python equivalents follow the same resource grouping
and are validated against the same wire contract. See the generated API
reference for each language's exact idioms.

The three clients are **generated per language, not hand-aligned** — so the
client class and its constructor differ, and methods follow each language's
casing. Don't assume the Node shape transfers verbatim:

| Language | Install | Construct |
|---|---|---|
| Node | `npm install @vectros-ai/sdk` | `new VectrosClient({ token, environment })` |
| Python | `pip install vectros` | `VectrosApi(base_url=..., token=...)` |
| Java | `ai.vectros:vectros-sdk` (Maven Central) | `VectrosApiClient.builder().token(...).url(...).build()` |

Method names mirror the API operations in each language's idiom — Node and Java
camelCase (`createRecord`), Python snake_case (`create_record`). Runnable
Python and Java equivalents of the Node snippets below are in
[Python and Java](#python-and-java).

### The list envelope (cursor pagination)

List, lookup, and version endpoints return a uniform envelope:

```ts
{ data: T[], nextCursor: string | null }
```

To drain every page, feed `nextCursor` back as `startFrom` until it comes back
`null`:

```ts
const ids: string[] = [];
let cursor: string | null | undefined;
do {
  const page = await client.schemas.listSchemas(
    cursor ? { startFrom: cursor, limit: 100 } : { limit: 100 },
  );
  ids.push(...(page.data ?? []).map((s) => s.id!));
  cursor = page.nextCursor;
} while (cursor);
```

Two endpoints are **not** enveloped and do not paginate this way:

- **`client.search.content(...)`** returns a results object directly (its hits
  are on `results`), with `limit`/`offset` for paging.
- **`getUsage`** returns its usage payload directly.
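
Because `client.search.content(...)` pages by `limit`/`offset` rather than by cursor, draining it looks different from the envelope loop. The following is a minimal sketch, not SDK code: it assumes the search call accepts `limit` and `offset` alongside the query and that a page shorter than `limit` marks the end of the results. `SearchPage`, `SearchFn`, and `drainSearch` are hypothetical names introduced for illustration.

```ts
// Hypothetical shapes for illustration; the real types come from the SDK.
interface SearchPage<H> {
  results?: H[];
}
type SearchFn<H> = (args: { limit: number; offset: number }) => Promise<SearchPage<H>>;

// Walk an offset-paged search until a short (or empty) page is returned.
// Assumption: a page with fewer than `limit` hits means there are no more.
async function drainSearch<H>(search: SearchFn<H>, limit = 50, maxHits = 1000): Promise<H[]> {
  const hits: H[] = [];
  let offset = 0;
  while (hits.length < maxHits) {
    const page = await search({ limit, offset });
    const batch = page.results ?? [];
    hits.push(...batch);
    if (batch.length < limit) break; // short page: nothing further to fetch
    offset += batch.length;
  }
  return hits.slice(0, maxHits); // never hand back more than the caller asked for
}
```

With a real client this would be called along the lines of `drainSearch((p) => client.search.content({ query: 'hypertension', mode: 'HYBRID', ...p }))`. The `maxHits` cap is a design choice: an unbounded offset walk over a broad query can be expensive, so the sketch makes the caller opt in to a ceiling.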

## How-to

### Install

```bash
npm install @vectros-ai/sdk
```

### Construct a client

The client takes a token and a target environment (the API base URL):

```ts
import { VectrosClient } from '@vectros-ai/sdk';

const client = new VectrosClient({
  token: process.env.VECTROS_API_KEY!,            // sk_*, ssk_*, or st_*
  environment: 'https://api.vectros.ai',          // staging: https://api.staging.vectros.ai
});
```

`token` accepts any of the three credential types — a root key (`sk_*`), a
scoped permanent key (`ssk_*`), or a short-lived scoped token (`st_*`). The
client makes no distinction; the platform enforces each credential's reach.
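
Since the client treats all three credential types alike, it can help to record which kind a process was started with, without ever logging the secret itself. A small sketch based only on the prefixes named above; `credentialKind` is a hypothetical helper, not part of the SDK.

```ts
type CredentialKind = 'root key' | 'scoped permanent key' | 'scoped token' | 'unknown';

// Classify a credential by its documented prefix. None of the three prefixes
// is a prefix of another, so the order of the checks does not matter.
function credentialKind(token: string): CredentialKind {
  if (token.startsWith('sk_')) return 'root key';
  if (token.startsWith('ssk_')) return 'scoped permanent key';
  if (token.startsWith('st_')) return 'scoped token';
  return 'unknown';
}
```

Logging `credentialKind(token)` at startup is one way to catch a root key being used where a scoped credential was intended.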

Sub-clients are grouped by resource tag:

| Sub-client | Covers |
|---|---|
| `client.records` | structured records |
| `client.schemas` | record/surface schemas |
| `client.documents` | documents (ingest, upload, retrieve) |
| `client.folders` | folder hierarchy |
| `client.search` | unified search |
| `client.inference` | chat, RAG, document-ask, model catalog |
| `client.identity` | users, orgs, clients |
| `client.auth` | ping, scoped-token mint, and related auth calls |

### First call — define a schema, write a record, read it back

```ts
// 1. Define a record type. HYBRID indexing = keyword + semantic.
const schema = await client.schemas.createSchema({
  typeName: 'patient',
  displayName: 'Patient Record',
  indexMode: 'HYBRID',
  allowedSurfaces: ['record'],
  fields: [
    { fieldId: 'name', fieldType: 'string', required: true, searchable: true },
    { fieldId: 'notes', fieldType: 'string', searchable: true },
    { fieldId: 'department', fieldType: 'string', filterable: true },
    { fieldId: 'email', fieldType: 'string', required: true },
  ],
  lookupFields: [{ fieldName: 'email', unique: true }],
  capabilities: { auditHistory: true },
});

// 2. Write a record. (Synthetic, clearly-fictional data only.)
const record = await client.records.createRecord({
  typeName: 'patient',
  schemaId: schema.id!,
  payload: {
    name: 'Jane Doe',
    notes: 'presents with hypertension and complains of chest pain',
    department: 'cardiology',
    email: 'jane.doe@example.test',
  },
});

// 3. Read it back.
const loaded = await client.records.getRecord({ id: record.id! });
```

**Expected result:** `createRecord` returns immediately with
`indexStatus: 'PENDING_INDEX'`; indexing is asynchronous. `getRecord` returns
the stored record, and once indexing completes `indexStatus` becomes
`'INDEXED'`.
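
Because indexing is asynchronous, code that searches right after a write usually needs to wait for `INDEXED` first. The sketch below is one way to poll, under stated assumptions: `waitForIndexed` is a hypothetical helper, the status getter is injected so nothing here depends on a specific SDK call, and the attempt count and delay are arbitrary defaults rather than platform guidance.

```ts
// Poll `getStatus` until it reports 'INDEXED' or the attempt budget runs out.
// `getStatus` is injected so the helper does not depend on a specific SDK call.
async function waitForIndexed(
  getStatus: () => Promise<string | undefined>,
  { attempts = 20, delayMs = 500 }: { attempts?: number; delayMs?: number } = {},
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if ((await getStatus()) === 'INDEXED') return true;
    // Sleep between polls, but not after the final attempt.
    if (i < attempts - 1) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // still not indexed; the caller decides whether that is an error
}
```

With the record from the example above, the getter could be `async () => (await client.records.getRecord({ id: record.id! })).indexStatus`. Returning `false` instead of throwing leaves the timeout policy to the caller.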

### Search across records and documents

Search is unified: `client.search.content(...)` returns both record and document
hits unless you narrow with `contentTypes`. The response is not enveloped; read
the hits from `results`.

```ts
const results = await client.search.content({
  query: 'hypertension chest pain',
  mode: 'HYBRID',          // TEXT | SEMANTIC | HYBRID
  limit: 50,
});
for (const hit of results.results ?? []) {
  console.log(hit.documentId, hit.semanticScore, hit.textScore);
}
```

### Stream a RAG answer

Streaming endpoints are async iterables. Each yielded event has an `event`
discriminator (`search_results`, `content_delta`, `done`, and so on):

```ts
const stream = await client.inference.ragInference({
  query: 'What treatment is recommended for stage 1 hypertension?',
  search: { mode: 'HYBRID', limit: 5 },
  maxTokens: 256,
});

let answer = '';
for await (const ev of stream) {
  if (ev.event === 'content_delta') answer += ev.delta ?? '';
  if (ev.event === 'done') console.log('charged:', ev.inferenceBalanceCentsCharged);
}
```

Plain chat (`client.inference.chatInference`) and document-scoped ask
(`client.inference.documentAsk`) follow the same async-iterable pattern.
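
Since all three streaming calls share the async-iterable pattern, the accumulation step can be factored into one helper and unit-tested against a fake stream. A sketch only: `StreamEvent` is a hypothetical minimal shape (the SDK's generated event union carries more fields), and `collectAnswer` is not an SDK function.

```ts
// Hypothetical minimal event shape; the SDK's generated union has more fields.
interface StreamEvent {
  event: string;
  delta?: string;
}

// Concatenate the text of every `content_delta` event, ignoring other events.
async function collectAnswer(stream: AsyncIterable<StreamEvent>): Promise<string> {
  let answer = '';
  for await (const ev of stream) {
    if (ev.event === 'content_delta') answer += ev.delta ?? '';
  }
  return answer;
}
```

Typing the parameter as `AsyncIterable` means a plain async generator can stand in for the network stream in tests.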

### Drain a paginated list

See the envelope loop in the Explanation section above — that is the canonical
drain pattern for any `list*`, `lookup*`, or `get*Versions` call.
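
If the same loop appears in several places, it can be wrapped once as an async generator. This is a sketch under the envelope shape described above; `Page` and `drain` are hypothetical names, and the page fetcher is injected so the helper works with any enveloped call.

```ts
// The envelope shape described above.
interface Page<T> {
  data?: T[];
  nextCursor?: string | null;
}

// Yield every item from a cursor-paginated call, one page at a time.
// `fetchPage` receives the cursor to pass as `startFrom` (undefined on the first call).
async function* drain<T>(
  fetchPage: (startFrom: string | undefined) => Promise<Page<T>>,
): AsyncGenerator<T> {
  let cursor: string | undefined;
  do {
    const page = await fetchPage(cursor);
    yield* (page.data ?? []);
    cursor = page.nextCursor ?? undefined; // null or missing cursor ends the loop
  } while (cursor);
}
```

A caller would consume it with `for await (const s of drain((c) => client.schemas.listSchemas(c ? { startFrom: c, limit: 100 } : { limit: 100 })))`. Because items are yielded as each page arrives, the caller can stop early without fetching the remaining pages.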

### Update strategies — full replace vs. partial patch (SDK 0.26+)

The default update is a **full replace** (PUT-style): you send the complete
desired body.

```ts
await client.records.updateRecord({
  id: record.id!,
  body: {
    typeName: 'patient',
    schemaId: schema.id!,
    payload: { name: 'Jane Doe', notes: 'updated note', department: 'cardiology', email: 'jane.doe@example.test' },
  },
});
```

A **partial patch** (RFC 7386 merge-patch) is available on records, documents,
and folders **with SDK 0.26+**: send only the fields you want to change; omitted
fields are left untouched, and an explicit `null` deletes a field.

```ts
// (SDK 0.26+) — patch only the notes field.
await client.records.patchRecord({
  id: record.id!,
  body: { payload: { notes: 'patched note' } },
});
```
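
To preview what a patch body will do before sending it, the RFC 7386 merge rules can be applied locally. The function below is an illustration of the RFC's algorithm only, not the server's implementation; `mergePatch` and the `Json` type are hypothetical names.

```ts
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

const isObject = (v: Json | undefined): v is { [key: string]: Json } =>
  typeof v === 'object' && v !== null && !Array.isArray(v);

// Apply an RFC 7386 merge patch to `target` and return the result (non-mutating).
function mergePatch(target: Json | undefined, patch: Json): Json {
  if (!isObject(patch)) return patch; // scalars and arrays replace wholesale
  const result: { [key: string]: Json } = isObject(target) ? { ...target } : {};
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) delete result[key]; // explicit null deletes the field
    else result[key] = mergePatch(result[key], value); // recurse into nested objects
  }
  return result;
}
```

Two consequences of the RFC are worth keeping in mind: arrays are replaced as a whole rather than merged element by element, and a field cannot be set to `null` through a merge patch because `null` always means "delete".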

### Create a record by type name or by schema id (SDK 0.26+)

With **SDK 0.26+**, `createRecord` accepts **either** `typeName` **or**
`schemaId` (you no longer need both) — supply whichever you have on hand.
Earlier SDK pins expect both; the first-call example above passes both, which
works on every version.

### Python and Java

The snippets above are Node. The Python (`vectros` on PyPI) and Java
(`ai.vectros:vectros-sdk` on Maven Central) SDKs are generated from the same
spec and cover the same surface; the install, client construction, and method
casing differ per language. Each block below is the cross-language twin of the
*Construct a client* + *first call* (define a schema, write a record, read it
back) above.

#### Python

```bash
pip install vectros
```

```python
import os
import vectros
from vectros import VectrosApi

# Note: the keyword is `base_url` in Python — NOT Node's `environment`.
client = VectrosApi(
    base_url="https://api.vectros.ai",       # staging: https://api.staging.vectros.ai
    token=os.environ["VECTROS_API_KEY"],     # sk_*, ssk_*, or st_*
)

# 1. Define a record type. HYBRID indexing = keyword + semantic.
schema = client.schemas.create_schema(
    type_name="patient",
    display_name="Patient Record",
    index_mode="HYBRID",                     # HYBRID | SEMANTIC | TEXT
    allowed_surfaces=["record"],
    fields=[
        vectros.FieldDef(field_id="name", field_type="string", required=True, searchable=True),
        vectros.FieldDef(field_id="email", field_type="string", required=True),
    ],
    lookup_fields=[vectros.LookupDef(field_name="email", unique=True)],
)

# 2. Write a record. (Synthetic, clearly-fictional data only.)
record = client.records.create_record(
    type_name="patient",
    schema_id=schema.id,
    payload={"name": "Jane Doe", "email": "jane.doe@example.test"},
)

# 3. Read it back.
loaded = client.records.get_record(record.id)
```

Streaming inference is an iterator of discriminated events (`event` =
`content_delta`, `done`, ...), the same shape as Node:

```python
stream = client.inference.chat_inference(
    messages=[vectros.ChatMessage(role="user", content="Say hello.")],
    max_tokens=64,
)
answer = "".join(ev.delta or "" for ev in stream if ev.event == "content_delta")
```

Sub-clients (`client.records`, `client.schemas`, `client.documents`,
`client.folders`, `client.search`, `client.inference`, `client.identity`,
`client.auth`) and the list envelope (`page.data` / `page.next_cursor`) match
the Node grouping. API errors raise `vectros.core.api_error.ApiError`
(`err.status_code` carries the HTTP status).

#### Java

Maven:

```xml
<dependency>
  <groupId>ai.vectros</groupId>
  <artifactId>vectros-sdk</artifactId>
  <version>0.29.9</version>
</dependency>
```

Gradle:

```groovy
implementation 'ai.vectros:vectros-sdk:0.29.9'
```

(The version above is the current published release; confirm the latest on
Maven Central.)

```java
import ai.vectros.VectrosApiClient;
import ai.vectros.types.FieldDef;
import ai.vectros.types.FieldDefFieldType;
import ai.vectros.types.LookupDef;
import ai.vectros.types.RecordRequest;
import ai.vectros.types.SchemaRequest;
import ai.vectros.types.SchemaRequestAllowedSurfacesItem;
import ai.vectros.types.SchemaRequestIndexMode;
import java.util.List;
import java.util.Map;

VectrosApiClient client = VectrosApiClient.builder()
    .token(System.getenv("VECTROS_API_KEY"))   // sk_*, ssk_*, or st_*
    .url("https://api.vectros.ai")             // staging: https://api.staging.vectros.ai
    .build();

// 1. Define a record type. HYBRID indexing = keyword + semantic.
var schema = client.schemas().createSchema(SchemaRequest.builder()
    .typeName("patient")
    .displayName("Patient Record")
    .indexMode(SchemaRequestIndexMode.HYBRID)   // HYBRID | SEMANTIC | TEXT
    .allowedSurfaces(List.of(SchemaRequestAllowedSurfacesItem.RECORD))
    .fields(List.of(
        FieldDef.builder().fieldId("name").fieldType(FieldDefFieldType.STRING).required(true).searchable(true).build(),
        FieldDef.builder().fieldId("email").fieldType(FieldDefFieldType.STRING).required(true).build()))
    .lookupFields(List.of(LookupDef.builder().fieldName("email").unique(true).build()))
    .build());

// 2. Write a record. (Synthetic, clearly-fictional data only.)
var record = client.records().createRecord(RecordRequest.builder()
    .typeName("patient")
    .schemaId(schema.getId().orElseThrow())
    .payload(Map.of("name", "Jane Doe", "email", "jane.doe@example.test"))
    .build());

// 3. Read it back.
var loaded = client.records().getRecord(record.getId().orElseThrow());
```

Java request DTOs live under `ai.vectros.types.*`; per-endpoint request wrappers
(e.g. `ChatRequest`, `ListRecordsRequest`) under
`ai.vectros.resources.<area>.requests.*`. Sub-clients are accessor methods
(`client.records()`, `client.inference()`, ...), responses use `Optional`
getters (`record.getId().orElseThrow()`), and streaming is an `Iterable` of
discriminated events (`event.isContentDelta()` / `event.getDone()`). API errors
throw `ai.vectros.core.VectrosApiApiException` (`ex.statusCode()`).

## Reference

### Client construction

| Option | Type | Notes |
|---|---|---|
| `token` | string | `sk_*`, `ssk_*`, or `st_*`. Required. |
| `environment` | string | API base URL, e.g. `https://api.vectros.ai` (production) or `https://api.staging.vectros.ai` (staging). Required. |

### Sub-clients (by tag)

`records`, `schemas`, `documents`, `folders`, `search`, `inference`,
`identity`, `auth`. Method names mirror the API operations
(`createRecord`, `getRecord`, `updateRecord`, `patchRecord`, `deleteRecord`,
`listRecords`, `lookupRecords`, `getRecordVersions`, `getRecordTombstone`,
`createSchema`, `ingestDocument`, `uploadDocument`, `getDocument`,
`createFolder`, `chatInference`, `ragInference`, `documentAsk`,
`listInferenceModels`, `createUser`/`createOrg`/`createClient`, `ping`, and so
on). The complete list of methods, parameters, and field types is in the
generated API reference, which this page does not duplicate.

### Envelope shape

| Field | Type | Meaning |
|---|---|---|
| `data` | `T[]` | The page of items. |
| `nextCursor` | `string \| null` | Pass back as `startFrom` for the next page; `null` means no more pages. |

Applies to `list*`, `lookup*`, and `get*Versions`. **Does not** apply to
`search.content` or `getUsage`.

### Index modes

`HYBRID` (keyword + semantic, the general-purpose default), `SEMANTIC`
(meaning-based only), `TEXT` (keyword only). Declared per schema (records) or
per document at ingest.

### Concurrency

Record and document updates accept an optional `expectedVersion` in the body for
optimistic concurrency; a mismatch is rejected with HTTP 409. Omit it for
last-writer-wins.
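
A common way to use `expectedVersion` is a read-modify-write loop that re-reads and retries after a 409. The sketch below shows the control flow only and rests on two assumptions that are not confirmed here: that the item exposes a numeric `version` field, and that the thrown error carries the HTTP status as `statusCode`. `updateWithRetry`, `Versioned`, and `HttpError` are hypothetical names; the read and write steps are injected.

```ts
// Hypothetical shapes: a versioned item, and an error carrying an HTTP status.
interface Versioned {
  version: number; // assumption: the field name may differ in the real response
}
interface HttpError {
  statusCode?: number; // assumption: check the Node SDK's actual error type
}

// Read-modify-write with `expectedVersion`; on HTTP 409, re-read and retry.
async function updateWithRetry<T extends Versioned>(
  read: () => Promise<T>,
  write: (next: T, expectedVersion: number) => Promise<T>,
  change: (current: T) => T,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    const current = await read();
    try {
      return await write(change(current), current.version);
    } catch (err) {
      const conflict = (err as HttpError)?.statusCode === 409;
      if (!conflict || attempt >= maxAttempts) throw err;
      // Someone else wrote first: loop to re-read the latest version.
    }
  }
}
```

Re-applying `change` to the freshly read item, rather than resending the stale body, is what keeps the concurrent writer's update from being overwritten.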

### Version reality (read before pinning)

- The API spec / SDK is at version **0.29.9** (this figure is sourced from
  the live release at build time, so it always reflects the current version). The
  generated Node SDK is a continuous-integration artifact built from that spec — it
  is **not committed** to this repository, so you install it from the package
  registry.
- Among the in-repo consumers, the **CLI** and **MCP server** bundle the current
  **`0.29.x` staging** SDK build, as do the **admin-app** and **app-vectros-ai** SPAs.
  (The React toolkit — an auth/UI library, not a data client — accepts any SDK `>=0.9.0`
  via its peer dependency and is dev-tested against the same staging pin.) If you depend
  on a consumer's bundled SDK, check its pin rather than assuming the latest. The
  consumer pins are kept in lockstep with the SDK version on each release.
- **0.26+ surface** — these require SDK 0.26 or newer and are absent on older pins:
  - `patchRecord` / `patchDocument` / `patchFolder` (RFC 7386 merge-patch).
  - `createRecord` with **`typeName` *or* `schemaId`** (the either-or form).
    Passing both works on all versions.
  - Because the **reference web apps** still pin a 0.23 build, these calls are
    reachable from the SDK, CLI, and MCP server but **not** through those apps'
    own UI. If you fork a reference app, plan to wire these calls in yourself.
- **0.27 added** an optional lower-cost global-region path for inference
  (`allowGlobalRegion` on chat / RAG / document-ask) for tenants entitled to it;
  see [Search and RAG reference](../search-rag/reference.md).

### Notes & limits — what the SDK does NOT do

- **Schemas have no patch.** Schema updates are full-replace (PUT) only; there is
  no `patchSchema`.
- **Folders cannot be moved or reparented** after creation. There is no move
  operation.
- **There is no identity patch.** Users, orgs, and clients update by full
  replace.
- **`search.content` and `getUsage` are not paginated by the cursor envelope** —
  do not look for `nextCursor` on them.
- **Indexing is asynchronous.** A freshly created/ingested item is
  `PENDING_INDEX` and not immediately searchable; poll its status (or its
  appearance in search) before asserting it is queryable.
- **All three SDKs (Node, Java, Python) are smoke-tested end-to-end against
  staging at parity** — see `smoke-tests/{,sdk-python/,sdk-java/}`. The staging
  pipeline smokes all three on every change, so a cross-language wire-contract
  drift surfaces on staging before a release is cut.

## Where to go next

- [cli.md](cli.md) — provision schemas, contexts, and scoped keys from the
  terminal instead of writing this by hand.
- [mcp.md](mcp.md) — expose these same operations to an agent as tools.
- [blueprints.md](blueprints.md) — declare a whole schema set + access profile in
  one file.
- The generated API reference — the exhaustive, per-endpoint contract the SDK is
  generated from.