Inference

Simple pricing. No surprises.

Pay per token. No minimums on Standard. No hidden fees. One blended rate for input and output.

Standard

For teams getting started

$4.00
per 1M tokens
  • No commitment
  • No minimum volume
  • 1M token context
  • All reasoning modes
  • Best-effort SLA
  • Email support
Get Started

Volume (Best Value)

For production workloads

$3.00
per 1M tokens
  • 5B+ tokens/month
  • 3-month minimum term
  • 1M token context
  • All reasoning modes
  • 99.5% uptime SLA
  • Priority email support
Contact Sales

Enterprise

For regulated industries

Custom
dedicated capacity
  • Dedicated GPU allocation
  • 12-month term
  • 1M token context
  • All reasoning modes + custom fine-tuned models
  • 99.9% uptime SLA
  • Dedicated account manager
Contact Sales

How much could you save?

Most teams save 60–80% by switching from Anthropic or OpenAI.

Savings calculator

Enter your current monthly spend on Anthropic or OpenAI.

Example: $10,000/mo current spend (slider range $1,000 to $100,000)
  • Current (Anthropic/OpenAI): $10,000/mo
  • Continuum: $2,000/mo
  • Saving: $8,000/mo (80% reduction)
  • Saved per year: $96,000

Estimate assumes equivalent token volume migrated from Claude Opus 4.7 (blended AUD ~$20/1M tokens) to Continuum Standard at $4/1M. Actual savings depend on your model mix and input/output ratio.
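The calculator's arithmetic can be reproduced in a few lines. This is a minimal sketch that assumes the baseline stated above (~$20 per 1M tokens) and the Standard rate of $4 per 1M; the function name `estimate_savings` is illustrative and not part of any Continuum SDK.

```python
def estimate_savings(current_monthly_spend, baseline_rate=20.0, continuum_rate=4.0):
    """Estimate savings when the same token volume moves to a lower blended rate.

    Rates are per 1M tokens. The defaults mirror the assumptions stated in the
    estimate note (~$20/1M baseline, $4/1M Standard); they are not live prices.
    """
    # Token volume implied by the current spend, in millions of tokens.
    volume_millions = current_monthly_spend / baseline_rate
    new_monthly_spend = volume_millions * continuum_rate
    monthly_saving = current_monthly_spend - new_monthly_spend
    return {
        "new_monthly_spend": new_monthly_spend,
        "monthly_saving": monthly_saving,
        "annual_saving": monthly_saving * 12,
        "reduction_pct": 100 * monthly_saving / current_monthly_spend,
    }


result = estimate_savings(10_000)
print(result)
```

Passing a different `baseline_rate` (for example the blended rate of the model you actually use) shows how strongly the result depends on your model mix.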

Price comparison

Same API format. Same capability tier. Dramatically different price.

Provider    Model              Blended / 1M tokens   Data location   Sovereign
Continuum   V4-Flash           $4.00                 Australia       Yes
Anthropic   Claude Opus 4.7    ~$20                  US              No
Anthropic   Claude Sonnet 4.6  ~$12                  US              No
OpenAI      GPT-5.5            ~$22                  US              No
OpenAI      GPT-4o             ~$8                   US              No

Included on every tier

OpenAI-compatible API
NVIDIA H200 GPUs
1M token context window
Three reasoning modes
Function calling (128 tools)
JSON structured output
Streaming (SSE)
Zero data retention
Australian sovereign infra

Billing details

Blended pricing

Single rate per 1M tokens covering both input and output. No separate input/output billing. Simplifies budgeting and removes the need to optimise input/output ratios.
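To illustrate why the input/output ratio stops mattering under a blended rate, here is a small sketch. It assumes the Standard rate of $4.00 per 1M tokens; `blended_cost` is a hypothetical helper, not a Continuum API.

```python
from decimal import Decimal

# Standard tier rate from this page, per 1M tokens.
BLENDED_RATE_PER_MILLION = Decimal("4.00")


def blended_cost(input_tokens, output_tokens, rate=BLENDED_RATE_PER_MILLION):
    """Cost under blended pricing: input and output tokens share one rate."""
    total_tokens = input_tokens + output_tokens
    return rate * total_tokens / Decimal(1_000_000)


# Same total volume, opposite input/output ratios.
prompt_heavy = blended_cost(900_000, 100_000)
output_heavy = blended_cost(100_000, 900_000)
print(prompt_heavy, output_heavy)
```

Both calls cover 1M tokens in total, so they cost the same regardless of how the tokens split between prompt and completion.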

Thinking tokens

Billed at the same per-token rate. Think High and Think Max generate additional reasoning tokens that count toward your usage. Monitor reasoning token consumption via your dashboard if cost control is critical.
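As a rough sketch of how reasoning tokens affect a single request's cost: the token counts below are made-up illustrative figures, the rate is the Standard $4 per 1M, and `request_cost` is a hypothetical helper rather than anything the platform exposes.

```python
def request_cost(input_tokens, output_tokens, reasoning_tokens=0, rate_per_million=4.00):
    """Cost of one request when reasoning tokens are billed like any other token."""
    billable = input_tokens + output_tokens + reasoning_tokens
    return billable * rate_per_million / 1_000_000


# Illustrative figures only: the same request without and with extra reasoning tokens.
base = request_cost(2_000, 500)
with_reasoning = request_cost(2_000, 500, reasoning_tokens=7_500)
print(f"base=${base:.4f} with_reasoning=${with_reasoning:.4f}")
```

In this example the reasoning tokens quadruple the billable volume, which is why tracking them matters when budgets are tight.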

Zero data retention

Prompts and completions are processed and immediately discarded. Nothing is stored, logged, or used for training. Your data is yours.

Billing cycle

Monthly in arrears. Usage tracked in real-time via your dashboard. Invoiced on the first business day of the following month.
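If you need to predict invoice dates for cash-flow planning, the rule above can be sketched as follows. The helper `invoice_date` is hypothetical, and it assumes that weekends are the only non-business days; public holidays are not handled.

```python
from datetime import date, timedelta


def invoice_date(usage_year, usage_month):
    """First business day (Mon-Fri) of the month after the usage month.

    Assumption: weekends are the only non-business days; public holidays
    are ignored.
    """
    # Roll over to January of the next year after December.
    year = usage_year + (1 if usage_month == 12 else 0)
    month = usage_month % 12 + 1
    day = date(year, month, 1)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return day


print(invoice_date(2025, 1))  # usage in January 2025
```

A production version would also consult a holiday calendar for the billing jurisdiction.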

Volume tier qualification

Reaching 5B+ tokens in any calendar month qualifies you for Volume pricing for that month and the following two months. Sustaining that volume for 3 months moves you to the Volume rate permanently.
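The qualification window can be sketched as a simple pass over monthly totals. This is an illustration under stated assumptions: the rates are the Standard ($4.00) and Volume ($3.00) prices from this page, `monthly_rates` is a hypothetical helper, and the permanent move to the Volume rate is deliberately left out because its exact conditions are not spelled out here.

```python
VOLUME_THRESHOLD = 5_000_000_000  # 5B tokens in a calendar month
STANDARD_RATE = 4.00  # per 1M tokens
VOLUME_RATE = 3.00    # per 1M tokens


def monthly_rates(monthly_tokens):
    """Return the per-1M rate applied to each month of a chronological usage list."""
    rates = []
    volume_months_left = 0
    for tokens in monthly_tokens:
        if tokens >= VOLUME_THRESHOLD:
            # A qualifying month covers itself plus the following two months.
            volume_months_left = 3
        if volume_months_left > 0:
            rates.append(VOLUME_RATE)
            volume_months_left -= 1
        else:
            rates.append(STANDARD_RATE)
    return rates


usage = [2_000_000_000, 6_000_000_000, 1_000_000_000, 1_000_000_000, 1_000_000_000]
print(monthly_rates(usage))
```

Because billing is monthly in arrears, the qualifying month itself can be invoiced at the Volume rate once its total is known.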

Start in five minutes

Get your API key and start saving immediately.