Inference

Simple pricing. No surprises.

Pay per token. No minimums on Standard. No hidden fees. One blended rate for input and output.

Standard

For teams getting started

$4.00

per 1M tokens

No commitment
No minimum volume
1M token context
All reasoning modes
Best-effort SLA
Email support

Get Started

Best Value

Volume

For production workloads

$3.00

per 1M tokens

5B+ tokens/month
3-month minimum term
1M token context
All reasoning modes
99.5% uptime SLA
Priority email support

Contact Sales

Enterprise

For regulated industries

Custom

dedicated capacity

Dedicated GPU allocation
12-month term
1M token context
All + custom fine-tuned models
99.9% uptime SLA
Dedicated account manager

Contact Sales

How much could you save?

Most teams save 60–80% by switching from Anthropic or OpenAI.

Savings calculator

Enter your current monthly spend on Anthropic or OpenAI.

Current monthly spend$10,000

$1,000$100,000

Current (Anthropic/OpenAI)$10,000/mo

Continuum$2,000/mo

$96,000

saved per year

$8,000/mo saving80% reduction

Estimate assumes equivalent token volume migrated from Claude Opus 4.7 (blended AUD ~$20/1M tokens) to Continuum Standard at $4/1M. Actual savings depend on your model mix and input/output ratio.

Price comparison

Same API format. Same capability tier. Dramatically different price.

Provider	Model	Blended / 1M tok	Data Location	Sovereign
Continuum	V4-Flash	$4.00	Australia
Anthropic	Claude Opus 4.7	~$20	US	—
Anthropic	Claude Sonnet 4.6	~$12	US	—
OpenAI	GPT-5.5	~$22	US	—
OpenAI	GPT-4o	~$8	US	—

Included on every tier

OpenAI-compatible API

NVIDIA H200 GPUs

1M token context window

Three reasoning modes

Function calling (128 tools)

JSON structured output

Streaming (SSE)

Zero data retention

Australian sovereign infra

Billing details

Blended pricing

Single rate per 1M tokens covering both input and output. No separate input/output billing. Simplifies budgeting and removes the need to optimise input/output ratios.

Thinking tokens

Billed at the same per-token rate. Think High and Think Max generate additional reasoning tokens that count toward your usage. Monitor reasoning token consumption via your dashboard if cost control is critical.

Zero data retention

Prompts and completions are processed and immediately discarded. Nothing is stored, logged, or used for training. Your data is yours.

Billing cycle

Monthly in arrears. Usage tracked in real-time via your dashboard. Invoiced on the first business day of the following month.

Volume tier qualification

5B+ tokens in any calendar month qualifies you for Volume pricing for that month and the following two months. Sustained volume over 3 months moves you to the Volume rate permanently.

Start in five minutes

Get your API key and start saving immediately.

Get API Key Read the Docs