Pricing & usage

Explanation — what & why

Vectros is credit-based, pay-as-you-go: you are metered on what you actually do, and a single usage report shows where you stand. There is no seat licence to buy before you can make a call, and reads are not metered — so exploring and serving your own data is not what drives cost.

Usage is metered on two independent axes, kept separate on purpose because data-plane work and model inference have very different cost structures:

  • A monthly credit allowance covers data-plane work — record writes, document ingests (including the indexing they trigger), and searches. It resets each calendar month, and the usage report tells you how much of the period's allowance you have consumed.
  • A pre-paid inference balance covers AI calls — chat, RAG, and document-ask. It is a ledger, denominated in cents, that draws down as you make inference calls and is topped up out of band. It does not reset monthly the way the credit allowance does.

A property worth designing around: reads are free. They do not draw down the monthly credit allowance, and they are exempt from the per-tenant business rate limit. Only record writes, document ingests, and searches count against the allowance; only inference draws down the balance. So a read-heavy application is cheap to run on the data plane, and the cost to reason about is the cost of writing, searching, and asking, not the cost of reading back.
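
As a way to reason about a workload before you run it, the split above can be sketched as a small classifier. The operation names here are hypothetical labels chosen for illustration, not identifiers from the Vectros API; only the mapping of each kind of work to an axis comes from this page.

```typescript
// Hypothetical operation labels, for illustration only.
type Operation =
  | "record.read"
  | "record.write"
  | "document.ingest"
  | "search"
  | "chat"
  | "rag"
  | "document.ask";

type Axis = "free" | "monthly-credits" | "inference-balance";

// Map each kind of work to the axis it is metered on, as described above.
function meteringAxis(op: Operation): Axis {
  switch (op) {
    case "record.read":
      return "free"; // reads do not draw down the allowance
    case "record.write":
    case "document.ingest":
    case "search":
      return "monthly-credits"; // data-plane work
    case "chat":
    case "rag":
    case "document.ask":
      return "inference-balance"; // AI calls
  }
}

// Count how many operations in a planned workload land on each axis.
function countByAxis(ops: Operation[]): Record<Axis, number> {
  const counts: Record<Axis, number> = {
    "free": 0,
    "monthly-credits": 0,
    "inference-balance": 0,
  };
  for (const op of ops) {
    counts[meteringAxis(op)] += 1;
  }
  return counts;
}
```

This counts operations, not credits: actual rates are quoted in your account, so the sketch only shows which axis a given call lands on.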

Usage is also broken down per environment. Your account totals are the sum of your live tenant and your test tenant, each reported separately, so you can see what production traffic costs versus what your test traffic costs — and the two reconcile up to the account total. Test traffic is metered like production, so a noisy test loop shows up in the report rather than hiding.
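
The reconciliation described above can be checked mechanically. The sketch below assumes a simplified, hypothetical breakdown shape (`live`, `test`, and `total`, each with a `usedMilli` field); the real field names are in the Operations & trust reference and may differ.

```typescript
// HYPOTHETICAL shape for illustration; not the documented report shape.
interface EnvUsage {
  usedMilli: number; // usage in milli-credits (1 credit = 1000)
}

interface UsageBreakdown {
  live: EnvUsage;
  test: EnvUsage;
  total: EnvUsage;
}

// True when the live and test tenants sum exactly to the account total.
// Milli-credits are integers, so exact comparison is safe here.
function reconciles(b: UsageBreakdown): boolean {
  return b.live.usedMilli + b.test.usedMilli === b.total.usedMilli;
}

// Fraction of the account's usage that came from test traffic (0 when idle).
function testShare(b: UsageBreakdown): number {
  return b.total.usedMilli === 0 ? 0 : b.test.usedMilli / b.total.usedMilli;
}
```

A high test share is the signal that a noisy test loop, rather than production traffic, is consuming the allowance.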

How-to — read what you are using

Pull the current period's consumption — the monthly credit allowance and the pre-paid inference balance — in one call:

const usage = await client.auth.getUsage();
console.log(usage.credits.used);          // whole credits used this period
console.log(usage.credits.usedMilli);     // exact figure in milli-credits (1 credit = 1000)

getUsage returns the report object directly: it is not wrapped in the list envelope and does not paginate. Pass a { year, month } object to read a prior period. The full report shape (per-environment breakdown, the inference balance, and the limits) is in the Operations & trust reference; the step-by-step is in the Operations & trust how-to.
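
Two small helpers are often useful around this call. The sketch below computes the { year, month } for the previous period and renders a milli-credit figure as a decimal credit string. It assumes that month is 1-based (1 = January) and that periods follow UTC calendar months; neither is stated on this page, so check the reference before relying on it. Both helper names are hypothetical.

```typescript
interface Period {
  year: number;
  month: number; // ASSUMPTION: 1-based, 1 = January
}

// Previous calendar month relative to `now`, handling the January rollover.
// ASSUMPTION: periods are aligned to UTC calendar months.
function previousPeriod(now: Date): Period {
  const year = now.getUTCFullYear();
  const month = now.getUTCMonth() + 1; // getUTCMonth() is 0-based
  return month === 1
    ? { year: year - 1, month: 12 }
    : { year, month: month - 1 };
}

// Render an integer milli-credit count as credits with three decimals,
// using integer arithmetic to avoid floating-point rounding.
function formatMilliCredits(usedMilli: number): string {
  if (!Number.isInteger(usedMilli) || usedMilli < 0) {
    throw new RangeError("usedMilli must be a non-negative integer");
  }
  const whole = Math.floor(usedMilli / 1000);
  const frac = String(usedMilli % 1000).padStart(3, "0");
  return `${whole}.${frac}`;
}

// Usage with the client from the snippet above (not executed here):
//   const last = await client.auth.getUsage(previousPeriod(new Date()));
//   console.log(formatMilliCredits(last.credits.usedMilli));
```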

Access to the report is itself scoped: a token needs the billing:r permission to read it. That lets you hand a read-only billing view to an internal dashboard without granting it any data access — a token scoped to, say, records:r cannot read your billing figures.
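
If a dashboard knows which permissions its token carries, it can skip the call when billing:r is missing instead of handling a rejection. This is a minimal client-side sketch: the function name is hypothetical, it assumes permissions are available as a list of plain strings, and it assumes an exact match on billing:r (whether any broader permission also grants read access is not covered here). The service remains the authority on access.

```typescript
// Permission string named on this page.
const BILLING_READ = "billing:r";

// Hypothetical pre-check: does this token's permission list include billing:r?
// ASSUMPTION: exact string match; the server makes the real decision.
function canReadUsage(tokenPermissions: readonly string[]): boolean {
  return tokenPermissions.includes(BILLING_READ);
}
```

For example, canReadUsage(["records:r"]) is false, matching the case above where a data-scoped token cannot read billing figures.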

Notes & limits

  • Qualitative by design. This page describes the model: what is metered, on which axis, and how to read it. Current rates, the size of the monthly credit allowance, and inference pricing are quoted at sign-up and in your account, not pinned in the docs, so a stale number never ships here.
  • Reads are free; writes, searches, and inference are not. Architect for that asymmetry: there is no cost reason to cache data you can simply re-read, but batch writes where you can.
  • The two axes never net against each other. Running out of monthly data-plane credits does not consume your inference balance, and vice versa. Each is reported separately and replenished on its own terms: the allowance resets each calendar month, and the balance is topped up out of band.

Where to go next