Token-based pricing, small markup
Table prices are upstream list price × 1.25, quoted per 1K tokens. Pay with any credit card via Stripe.
🇨🇳 Chinese value models
Production-grade, 1/10 – 1/100 the cost of GPT-4o
| Model | Notes | Input ($/1K tokens) | Output ($/1K tokens) |
|---|---|---|---|
| deepseek-v3.1 | ≈ gpt-4o quality at 1/18 the price | $0.00014 | $0.00028 |
| deepseek-r1 | reasoning, ≈ o1 quality | $0.00055 | $0.00220 |
| qwen-plus | Alibaba Qwen, strong generalist | $0.00040 | $0.00120 |
| glm-4.6 | Zhipu AI, great at code | $0.00060 | $0.00220 |
| glm-5 | Zhipu 5 series | $0.00100 | $0.00300 |
| kimi-k2 | Moonshot, 200K context | $0.00060 | $0.00250 |
🌎 Western frontier
Full access, no cross-border hassle
| Model | Notes | Input ($/1K tokens) | Output ($/1K tokens) |
|---|---|---|---|
| gpt-5 | OpenAI flagship | $0.01250 | $0.05000 |
| gpt-4.1 | OpenAI strong | $0.00200 | $0.00800 |
| gpt-4o | OpenAI balanced | $0.00250 | $0.01000 |
| claude-opus-4-7 | Anthropic flagship | $0.01500 | $0.07500 |
| claude-sonnet-4-6 | Anthropic general | $0.00300 | $0.01500 |
| claude-haiku-4-5-20251001 | Anthropic budget | $0.00100 | $0.00500 |
| gemini-3-pro-preview | Google flagship | $0.00500 | $0.02000 |
| gemini-2.5-flash | Google fast/cheap | $0.00015 | $0.00060 |
| grok-4 | real-time search | $0.00500 | $0.01500 |
No per-seat. No minimums.
Pay for forwarded traffic. Balance never expires.
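The billing model above can be sketched in a few lines: cost per request is input and output token counts, each divided by 1,000 and multiplied by the table rate. A minimal illustration (`request_cost` and the `RATES` dict are illustrative, not part of any SDK; rates are the table prices, markup already included):

```python
# Table prices per 1K tokens, as (input_rate, output_rate) in USD.
RATES = {
    "deepseek-v3.1": (0.00014, 0.00028),
    "gpt-4o": (0.00250, 0.01000),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one forwarded request at the listed per-1K rates."""
    inp, out = RATES[model]
    return input_tokens / 1000 * inp + output_tokens / 1000 * out

# A 2,000-token prompt with a 500-token reply:
deepseek = request_cost("deepseek-v3.1", 2000, 500)  # $0.00042
gpt4o = request_cost("gpt-4o", 2000, 500)            # $0.01000
```

The same traffic costs well under a tenth as much on the value tier, which is the whole pitch of the tables above.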
FAQ
Are these the real, production Chinese models?
Yes. DeepSeek v3.1, Qwen Plus, GLM 4.6, and Kimi K2 are the same versions the labs deploy in their own consumer apps. No distillations, no fine-tunes.
How reliable is this?
We run redundant contracts with multiple upstream providers. If a model goes down, requests fail over transparently to an equivalent model (opt-in per tenant).
Do I need a Chinese entity, phone number, or visa?
No. Sign up with any email. We handle provider-side compliance. Pay via Stripe in USD.
Can I still use GPT-4o / Claude / Gemini if I need them?
Yes, through the same API, charged at standard upstream rates × 1.25. Use the `model` parameter to pick one, or set `model="auto"` and let our router choose the cheapest.
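Model selection is just the `model` field of a standard OpenAI-style chat body. A minimal stdlib sketch, assuming the conventional `/v1/chat/completions` path (`build_chat_request` and the bearer-key placeholder are illustrative, not part of any SDK):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request; `model` picks the upstream,
    and "auto" lets the router choose the cheapest."""
    body = json.dumps({
        "model": model,  # e.g. "gpt-4o", "deepseek-v3.1", or "auto"
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "https://promptoll.com/v1/chat/completions",
        data=body,
        headers={
            "Authorization": "Bearer YOUR_API_KEY",  # placeholder
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("auto", "Summarize this ticket.")
# urllib.request.urlopen(req) would send it; omitted here.
```

Any OpenAI-compatible client library builds the same body for you; this only shows where the routing decision lives.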
Data retention?
Request bodies are forwarded, not persisted. Only anonymized metadata (token counts, model used, latency) is stored for billing and analytics.
How do I migrate from OpenAI?
Change `base_url` in your OpenAI SDK to `https://promptoll.com/v1`. That's it. Your existing code keeps working. Switch `model` to a cheaper one when you're ready.