💎 Transparent Pricing

Pay only for what you use.
Route smarter, spend less.

No hidden fees. No vendor lock-in. NexToken routes your requests to the optimal provider — automatically.

Monthly
Annual Save 20%
Developer
$0
Free forever. Start building immediately.
Start Free
$5 one-time credit · no card needed
100 RPM rate limit
3 API keys
Standard routing
REST API access
Custom routing rules
SLA guarantee
Priority support
Pro
$29/mo
For indie developers and growing projects.
Start Pro
$10 monthly credit included
200 RPM rate limit
20 API keys
Smart routing + nex-auto
Prompt-cache savings auto-applied
Streaming + batch + vision
Usage analytics
SLA guarantee
Enterprise
Custom
Negotiated pricing for large-scale deployments.
Contact Sales
Unlimited credits (prepaid)
Custom RPM limits
Dedicated routing cluster
Volume discounts negotiable
99.99% uptime SLA
Dedicated Slack channel
Custom invoicing / PO
Reserved throughput (custom RPM floor)
Fine-tune endpoint + DPA
On-premise option
Platform fee + pass-through model costs. Subscription fees cover your NexToken platform access, rate limits, and features. Model usage is billed separately at cost-plus-markup rates shown in the table below. All prices in USD. Singapore users are charged 9% GST on platform fees per IRAS regulations.
🚀 NEW · MAY 2026

23 upgrades shipped — same API, lower bills.

Every chat completion now passes upstream prompt-cache savings through automatically. Near-duplicate prompts bill at 5% of normal retail via our self-hosted semantic cache. Switch model: "nex-auto" and let the gateway pick the right tier per prompt. No client code changes required.

Cost
Prompt-cache pass-through · semantic cache 95% off · nex-auto smart router · batch endpoint 30% off · tokenize + estimate-cost
Compliance
Content moderation · PII redaction (PDPA / GDPR) · prompt-injection defence · context-window pre-flight
Reliability
Redis-backed circuit breaker · in-provider retry + backoff · region-aware routing · multi-key load balance · live P95 scoring
Reach
DALL-E 3 · Whisper · TTS · vision token math · Cohere · Perplexity · xAI Grok · prompt templates · fine-tune API
All additive. Zero breaking changes. View API docs →

Model Token Pricing

NexToken charges provider cost + a small routing markup. Prices per 1,000,000 tokens (1M tokens). Input / Output priced separately.

NexToken Native Models Proprietary
Built for Asia-Pacific developers. NexToken Native models are our proprietary offerings, optimised for cost, latency, and regulatory compliance. Internal architecture is confidential — you get a single stable API regardless of upstream changes.
ModelInput ($/1M)Output ($/1M)Best ForCapabilities
nex-pro32k ★ Default $0.10 $0.40 The default Nex model. Chat, code, content, summarisation. Self-hosted Singapore GPU. Strong Chinese + English. ~95% cheaper than GPT-4o stream tools
nex-autosmart $0.30 $1.20 Don't want to pick a model? Network picks per-request between nex-pro / nex-reasoning. Actual target surfaced in nex.smart_router. stream tools
nex-reasoning128k $1.20 $4.80 Multi-step math, logic, structured analysis. No tool calling. ~90% cheaper than o1 stream
nex-embed-zh512 $0.01 Chinese-strong embeddings, 1024-dim. Self-hosted, marginal cost ~0. ~50% cheaper than text-embedding-3-small /v1/embeddings
Which one?
💬 Chat, code, anything — nex-pro
🧮 Math / multi-step logic — nex-reasoning
🤖 Let the gateway decide — nex-auto
🇨🇳 Chinese embeddings — nex-embed-zh
Legacy IDs nex-smart and nex-coder still work — they are transparent aliases for nex-pro. No code changes needed.

Why NexToken Native? Single stable API, no vendor lock-in, optimised cost-performance. Powered by NexToken's intelligent routing infrastructure. Underlying inference architecture details are proprietary.

OpenAI
ModelInput ($/1M)Output ($/1M)MarkupCapabilities
gpt-4o128k$2.60$10.40+8%stream vision tools
gpt-4o-mini128k$0.16$0.65+8%stream vision tools
o1200k$16.20$64.80+8%tools
o3-mini200k$1.13$4.40+8%stream tools
Anthropic
ModelInput ($/1M)Output ($/1M)MarkupCapabilities
claude-opus-4200k$15.00$75.00+7%stream vision tools
claude-sonnet-4200k$3.00$15.00+7%stream vision tools
claude-haiku-4200k$0.80$4.00+7%stream vision tools
Google DeepMind
ModelInput ($/1M)Output ($/1M)MarkupCapabilities
gemini-2.5-pro1M$1.25$10.00+8%stream vision tools
gemini-2.5-flash1M$0.075$0.30+8%stream vision tools
Meta / DeepSeek / Mistral
ModelInput ($/1M)Output ($/1M)MarkupCapabilities
llama-3.3-70b128k$0.59$0.79+10%stream tools
deepseek-v364k$0.27$1.10+10%stream tools
mistral-large-2128k$2.00$6.00+10%stream tools

Volume Billing Tiers

The higher your monthly token spend, the lower your effective markup. Tiers reset on the 1st of each month.

TierMonthly Token SpendMarkup RateEffective SavingUnlocks
Developer$0 – $500Standard3 keys, 20 RPM
Pro$500 – $5,000−1%Up to $50/mo200 RPM, analytics
Business$5,000 – $50,000−2.5%Up to $1,250/moCustom routing, SLA
Enterprise$50,000+NegotiatedUp to 15%+Dedicated cluster, custom terms

* Billing tiers are independent from Loyalty tiers. Billing tiers reflect monthly spend volume; Loyalty tiers reflect cumulative top-up history.

Loyalty Tiers

Cumulative top-up milestones unlock permanent wallet bonuses. Tiers do not reset — they track your total lifetime top-up.

🥉
Bronze
≥ $500 cumulative top-up
Bonus credits on every top-up
+3%
🥈
Silver
≥ $2,000 cumulative top-up
Bonus credits + priority routing queue
+5%
🥇
Gold
≥ $10,000 cumulative top-up
Bonus credits + dedicated routing + SLA
+8%
💎
Platinum
≥ $50,000 cumulative top-up
Maximum bonus + enterprise SLA + custom terms
+12%

Loyalty bonus credits are applied to your wallet at time of top-up. Example: Gold user tops up $1,000 → receives $1,080 wallet balance (+8% bonus). Bonuses do not stack with promotional codes.

Add-Ons

Optional extras available on Pro and Business plans.

🔒
Extended Audit Logs
$19 / month
Retain full request/response logs for 365 days. Export to S3 or GCS. Required for SOC 2 audits.
📊
Advanced Analytics
$29 / month
Cost attribution by team, project, or custom labels. CSV export, Grafana-compatible metrics endpoint.
🚨
Spend Alerts & PagerDuty
$9 / month
Multi-channel alerts (Slack, email, SMS, PagerDuty) with configurable thresholds and escalation policies.
🌐
Dedicated IP Egress
$49 / month
Route all traffic through a static IP pool for firewall whitelisting. Required for some financial and healthcare orgs.
🤝
Shared Slack Support
$99 / month
Join a shared Slack Connect channel with the Nex engineering team. <4h response time, Mon–Fri 9–6 SGT.
Higher Rate Limits
From $49 / month
Burst capacity packs: 5k, 20k, or 50k additional RPM. Stacks on top of your base plan limit.

Full Feature Comparison

Detailed breakdown of what's included in each plan.

FeatureDeveloperPro BusinessEnterprise
Limits & Access
Monthly free credits$5 one-time$10 incl.Custom
Requests per minute (RPM)20200Custom
API keys320Unlimited
Sub-keys per key3Unlimited
Context window supportUp to 128kUp to 200kUp to 1M+
Routing & Intelligence
Smart auto-routing
Provider fallback
Custom routing rules
Dedicated routing cluster
Streaming (SSE)
Budget & Controls
Per-key budget limits
Auto top-up
Spend alertsEmail only
Multi-wallet (teams)
Observability
Request logs retention7 days30 days365 days
Usage analytics dashboardBasicStandardAdvanced + export
Cost attribution labels
SLA & Support
Uptime SLA99.99%
Support channelCommunityEmailDedicated Slack
Response timeBest effort<48h<2h
Custom invoicing / PO
🇸🇬

Singapore GST Notice (9%)

Cete Ventures Pte. Ltd. (UEN: 202421160G) is GST-registered in Singapore. Platform subscription fees are subject to 9% GST for Singapore-based customers. Model usage credits for SG GST-registered businesses with a valid GST number may qualify for input tax claim — contact us to provide your registration number. Non-SG customers are invoiced without GST. A valid GST invoice is issued for every transaction.

Frequently Asked Questions

Do I pay provider API costs separately?
No. NexToken handles all provider API relationships on your behalf. You top up your NexToken wallet and we pay the providers. Your wallet is debited at our cost-plus-markup rates shown in the pricing table above. You never need to sign up with OpenAI, Anthropic, or Google directly.
What happens if my wallet balance hits zero?
API calls will return HTTP 402 (Payment Required) immediately. No debt is accumulated — NexToken operates on a strict prepaid model. You can enable auto top-up on Pro and Business plans to avoid interruptions. Existing in-flight streaming requests will complete (up to 120 seconds) before being terminated.
How does smart routing decide which provider to use?
NexToken's routing engine evaluates real-time provider health scores, current latency, your requested model, and your configured routing preferences. On Business plans, you can pin specific providers per API key or set custom fallback chains. The router runs sub-5ms decisions before proxying your request.
Can I get a refund on unused wallet credits?
Yes. Unused top-up credits (excluding free promotional credits) are refundable within 30 days of the top-up transaction. Refunds are processed back to your original payment method via Stripe within 5–10 business days. Loyalty bonus credits are non-refundable.
What's the difference between Billing Tiers and Loyalty Tiers?
Billing Tiers reflect your monthly token spend volume and reduce your per-token markup — they reset on the 1st of each month. Loyalty Tiers reflect your cumulative total top-up history and add bonus credits to your wallet when you top up — they never reset. The two systems are completely independent.
Is there a free trial for paid plans?
Every new account gets $5 in one-time free credits — no credit card required. Simply sign up and start making API calls immediately. Pro and Business plans offer a 14-day money-back guarantee on the subscription fee. Enterprise plans are negotiated individually.
Do prices include streaming responses?
Yes. Streaming (SSE) is supported at no additional cost. Token pricing is identical for streaming and non-streaming requests. Token counts are calculated using the provider's reported usage field where available; otherwise NexToken uses tiktoken-based estimation.

Start routing smarter today

Join hundreds of developers and teams using NexToken to reduce LLM costs and improve reliability.