# Pricing & usage

## Explanation — what & why

Vectros is **credit-based, pay-as-you-go**: you are metered on what you actually
do, and a single usage report shows where you stand. There is no seat licence to
buy before you can make a call, and reads are not metered — so exploring and
serving your own data is not what drives cost.

Usage is metered on **two independent axes**, kept separate on purpose because
data-plane work and model inference have very different cost structures:

- **A monthly credit allowance** covers **data-plane work** — record writes,
  document ingests (including the indexing they trigger), and searches. It
  **resets each calendar month**, and the usage report tells you how much of the
  period's allowance you have consumed.
- **A pre-paid inference balance** covers **AI calls** — chat, RAG, and
  document-ask. It is a ledger, denominated in cents, that **draws down** as you
  make inference calls and is **topped up out of band**. It does **not** reset
  monthly the way the credit allowance does.
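
The independence of the two axes can be made concrete with a small model. This is a sketch for reasoning about your own accounting, not Vectros code: the `UsageState` type, its field names, and the helper functions are all hypothetical, and the amounts passed in are placeholders rather than real rates.

```ts
// Hypothetical model of the two metering axes; none of these names come from the Vectros SDK.
interface UsageState {
  creditsUsedMilli: number;      // data-plane usage this calendar month, in milli-credits
  inferenceBalanceCents: number; // pre-paid inference ledger, in cents
}

// Data-plane work (writes, ingests, searches) only touches the monthly allowance.
function recordDataPlaneWork(state: UsageState, costMilli: number): UsageState {
  return { ...state, creditsUsedMilli: state.creditsUsedMilli + costMilli };
}

// Inference calls (chat, RAG, document-ask) only draw down the pre-paid balance.
function recordInference(state: UsageState, costCents: number): UsageState {
  return { ...state, inferenceBalanceCents: state.inferenceBalanceCents - costCents };
}

// A new calendar month resets credit usage; the inference balance carries over untouched.
function startNewMonth(state: UsageState): UsageState {
  return { ...state, creditsUsedMilli: 0 };
}

// A top-up happens out of band and only affects the inference balance.
function topUp(state: UsageState, cents: number): UsageState {
  return { ...state, inferenceBalanceCents: state.inferenceBalanceCents + cents };
}
```

Each function changes exactly one field, which is the property the two bullets describe: no operation on one axis moves the other.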

A property worth designing around: **reads are free** — they do not draw down the
monthly credit allowance, and they are exempt from the per-tenant business rate
limit. Only **writes and searches** count against the allowance; only
**inference** draws down the balance. So a read-heavy application is cheap to run
on the data plane, and the cost you should reason about is the cost of *writing*
and *asking*, not the cost of *reading back*.
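
When estimating what a workload will cost, it can help to sort its operations by the axis they land on. The sketch below encodes the rules from this section; the `Operation` and `MeteringAxis` names are invented for illustration and are not SDK types.

```ts
// Hypothetical operation kinds and axis labels, used only to illustrate the metering rules.
type Operation = "read" | "write" | "ingest" | "search" | "chat" | "rag" | "document-ask";
type MeteringAxis = "free" | "monthly-credits" | "inference-balance";

function meteringAxis(op: Operation): MeteringAxis {
  switch (op) {
    case "read":
      return "free"; // reads draw down nothing
    case "write":
    case "ingest":
    case "search":
      return "monthly-credits"; // data-plane work counts against the allowance
    case "chat":
    case "rag":
    case "document-ask":
      return "inference-balance"; // AI calls draw down the pre-paid ledger
  }
}

// A read-heavy workload: only the write and the chat call are metered.
const workload: Operation[] = ["read", "read", "read", "write", "chat"];
const meteredOps = workload.filter((op) => meteringAxis(op) !== "free");
```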

Usage is also **broken down per environment**. Your account totals are the sum of
your **live** tenant and your **test** tenant, each reported separately, so you
can see what production traffic costs versus what your test traffic costs — and
the two reconcile up to the account total. Test traffic is metered like
production, so a noisy test loop shows up in the report rather than hiding.
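
The reconciliation described above is a plain sum, which makes it easy to check in your own tooling. The following is a minimal sketch with a hypothetical `EnvUsage` shape; the real per-environment field names are in the reference, so treat these as stand-ins.

```ts
// Hypothetical per-environment figure; the real report shape may differ.
interface EnvUsage {
  creditsUsedMilli: number;
}

// Sum the two environments. Working in integer milli-credits avoids floating-point drift.
function accountTotalMilli(live: EnvUsage, test: EnvUsage): number {
  return live.creditsUsedMilli + test.creditsUsedMilli;
}

// Share of the account's usage that came from test traffic, as a fraction between 0 and 1.
function testShare(live: EnvUsage, test: EnvUsage): number {
  const total = accountTotalMilli(live, test);
  return total === 0 ? 0 : test.creditsUsedMilli / total;
}
```

A rising test share is one way to spot the noisy test loop mentioned above before it consumes a meaningful part of the allowance.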

## How-to — read what you are using

Pull the current period's consumption — the monthly credit allowance and the
pre-paid inference balance — in one call:

```ts
const usage = await client.auth.getUsage();
console.log(usage.credits.used);          // whole credits used this period
console.log(usage.credits.usedMilli);     // exact figure in milli-credits (1 credit = 1000)
```
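
If you want to display the exact figure as credits rather than milli-credits, the conversion is a division by 1000. The helper below is a hypothetical convenience, not part of the SDK, and it assumes `usedMilli` is a non-negative integer.

```ts
const MILLI_PER_CREDIT = 1000; // 1 credit = 1000 milli-credits, as noted above

// Render a milli-credit count as a decimal credit string with three fractional digits.
// Assumes a non-negative integer input.
function formatCredits(usedMilli: number): string {
  return (usedMilli / MILLI_PER_CREDIT).toFixed(3);
}
```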

`getUsage` returns the report object directly: it is **not** wrapped in the list
envelope and does not paginate. Pass `{ year, month }` to read a prior period.
The full report shape (per-environment breakdown, the inference balance, and the
limits) is in the [Operations & trust reference](operations-trust/reference.md);
the step-by-step is in the [Operations & trust how-to](operations-trust/how-to.md).
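
For a prior period, a small helper can compute last month's `{ year, month }` from the current date. This is a sketch under two assumptions that should be checked against the reference: that `month` is 1-based (1 = January), and that periods follow UTC calendar months. The `UsagePeriod` type and `previousPeriod` function are hypothetical names.

```ts
// Hypothetical type: the field names come from the text, the 1-based month is an assumption.
interface UsagePeriod {
  year: number;
  month: number;
}

// Return the calendar month before `now`, rolling over the year boundary in January.
function previousPeriod(now: Date): UsagePeriod {
  const year = now.getUTCFullYear();
  const month = now.getUTCMonth() + 1; // Date months are 0-based; convert to 1-based
  return month === 1 ? { year: year - 1, month: 12 } : { year, month: month - 1 };
}

// Usage, assuming `client` is configured as in the example above:
// const lastMonth = await client.auth.getUsage(previousPeriod(new Date()));
```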

Access to the report is itself scoped: a token needs the **`billing:r`**
permission to read it. That lets you hand a read-only billing view to an internal
dashboard without granting it any data access — a token scoped to, say,
`records:r` cannot read your billing figures.
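
Before wiring a token into a dashboard, you may want to confirm it carries billing read access and nothing more. The sketch below assumes you already have the token's permissions as a list of strings; how you obtain that list is not covered here, and both function names are hypothetical.

```ts
const BILLING_READ = "billing:r"; // the permission named above

// Hypothetical check: true when the permission list includes billing read access.
function canReadUsage(permissions: string[]): boolean {
  return permissions.indexOf(BILLING_READ) !== -1;
}

// True when the token carries billing read access and no other permission.
function isBillingOnly(permissions: string[]): boolean {
  return permissions.length === 1 && canReadUsage(permissions);
}
```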

## Notes & limits

- **Qualitative here by design.** This page describes the *model* — what is
  metered, on which axis, and how to read it. Current rates, the monthly credit
  allowance, and inference pricing are quoted at sign-up / in your account, not
  pinned in the docs, so a stale number never ships here.
- **Reads are free; writes, searches, and inference are not.** Architect for that
  asymmetry: you do not need a cache to avoid read charges, but batch writes
  where you can.
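- **The two axes never net against each other.** Running out of monthly data-plane
  credits does not consume your inference balance, and vice versa. Each is
  reported and replenished on its own terms: the credit allowance resets each
  calendar month, and the inference balance is topped up out of band.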

## Where to go next

- [Operations & trust — how-to](operations-trust/how-to.md) — read the usage
  report and hand a billing-only view to a dashboard.
- [Operations & trust — concept](operations-trust/explanation.md) — the usage and
  billing model in full, alongside the rest of the trust posture.
- [Quickstart](getting-started/quickstart.md) — make your first metered call.
