Token-based pricing, small markup
Table prices are upstream list price × 1.25, quoted per 1K tokens. Pay with any credit card via Stripe.
🇨🇳 Chinese value models
Production-grade, 1/10 – 1/100 the cost of GPT-4o
| Model | Notes | Input ($/1K tokens) | Output ($/1K tokens) |
|---|---|---|---|
| deepseek-v3.1 | ≈ gpt-4o quality at 1/18 the price | $0.00014 | $0.00028 |
| deepseek-r1 | reasoning, ≈ o1 quality | $0.00055 | $0.00220 |
| qwen-plus | Alibaba Qwen, strong generalist | $0.00040 | $0.00120 |
| glm-4.6 | Zhipu AI, great at code | $0.00060 | $0.00220 |
| glm-5 | Zhipu 5 series | $0.00100 | $0.00300 |
| kimi-k2 | Moonshot, 200K context | $0.00060 | $0.00250 |
🌎 Western frontier
Full access, no cross-border hassle
| Model | Notes | Input ($/1K tokens) | Output ($/1K tokens) |
|---|---|---|---|
| gpt-5 | OpenAI flagship | $0.01250 | $0.05000 |
| gpt-4.1 | OpenAI strong | $0.00200 | $0.00800 |
| gpt-4o | OpenAI balanced | $0.00250 | $0.01000 |
| claude-opus-4-7 | Anthropic flagship | $0.01500 | $0.07500 |
| claude-sonnet-4-6 | Anthropic general | $0.00300 | $0.01500 |
| claude-haiku-4-5-20251001 | Anthropic budget | $0.00100 | $0.00500 |
| gemini-3-pro-preview | Google flagship | $0.00500 | $0.02000 |
| gemini-2.5-flash | Google fast/cheap | $0.00015 | $0.00060 |
| grok-4 | real-time search | $0.00500 | $0.01500 |
No per-seat. No minimums.
Pay for forwarded traffic. Balance never expires.
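The billing model above can be sketched in a few lines: cost per request is input and output token counts, each divided by 1,000 and multiplied by the table rate. A minimal illustration (`request_cost` and the `RATES` dict are illustrative, not part of any SDK; rates are the table prices, markup already included):

```python
# Table prices per 1K tokens, as (input_rate, output_rate) in USD.
RATES = {
    "deepseek-v3.1": (0.00014, 0.00028),
    "gpt-4o": (0.00250, 0.01000),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one forwarded request at the listed per-1K rates."""
    inp, out = RATES[model]
    return input_tokens / 1000 * inp + output_tokens / 1000 * out

# A 2,000-token prompt with a 500-token reply:
deepseek = request_cost("deepseek-v3.1", 2000, 500)  # $0.00042
gpt4o = request_cost("gpt-4o", 2000, 500)            # $0.01000
```

The same traffic costs well under a tenth as much on the value tier, which is the whole pitch of the tables above.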
FAQ
Are these the real, production Chinese models?
Yes. DeepSeek v3.1, Qwen Plus, GLM 4.6, and Kimi K2 are the same versions the labs deploy in their own consumer apps. No distillations, no fine-tunes.
How reliable is this?
We run redundant contracts with multiple upstream providers. If a model goes down, requests fail over transparently to an equivalent model (opt-in per tenant).
Do I need a Chinese entity, phone number, or visa?
No. Sign up with any email. We handle provider-side compliance. Pay via Stripe in USD.
Can I still use GPT-4o / Claude / Gemini if I need them?
Yes, through the same API, charged at standard upstream rates × 1.25. Use the `model` parameter to pick one, or set `model="auto"` and let our router choose the cheapest.
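Model selection is just the `model` field of a standard OpenAI-style chat body. A minimal stdlib sketch, assuming the conventional `/v1/chat/completions` path (`build_chat_request` and the bearer-key placeholder are illustrative, not part of any SDK):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request; `model` picks the upstream,
    and "auto" lets the router choose the cheapest."""
    body = json.dumps({
        "model": model,  # e.g. "gpt-4o", "deepseek-v3.1", or "auto"
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "https://promptoll.com/v1/chat/completions",
        data=body,
        headers={
            "Authorization": "Bearer YOUR_API_KEY",  # placeholder
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("auto", "Summarize this ticket.")
# urllib.request.urlopen(req) would send it; omitted here.
```

Any OpenAI-compatible client library builds the same body for you; this only shows where the routing decision lives.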
Data retention?
Request bodies are forwarded, not persisted. Only anonymized metadata (token counts, model used, latency) is stored for billing and analytics.
How do I migrate from OpenAI?
Change `base_url` in your OpenAI SDK to `https://promptoll.com/v1`. That's it. Your existing code keeps working. Switch `model` to a cheaper one when you're ready.