A thin, OpenAI-compatible proxy in front of Cloudflare Workers AI. Point any OpenAI SDK at the base URL and you can call gpt-oss-120b, Llama 3.3, DeepSeek R1, Qwen Coder, and more. All usage is billed against a shared budget with hard caps and live observability.
Live usage
Spend is limited by a single lifetime cap. Once the cap is hit, the service returns HTTP 429 until the cap is raised.
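Because the 429 here signals an exhausted budget rather than a short-lived rate limit, retrying is unlikely to help until the cap is raised. Below is a minimal sketch of how a client might surface that condition explicitly. The names `ProxyResponse`, `BudgetExhaustedError`, and `call_with_cap_check` are hypothetical and are not part of this service or of any SDK; the sketch also assumes the caller can observe the raw HTTP status code.

```python
from dataclasses import dataclass
from typing import Callable


class BudgetExhaustedError(RuntimeError):
    """Hypothetical error: the proxy answered 429, so the shared cap is assumed hit."""


@dataclass
class ProxyResponse:
    """Hypothetical stand-in for whatever response object your HTTP client returns."""
    status: int
    body: str


def call_with_cap_check(send: Callable[[], ProxyResponse]) -> ProxyResponse:
    """Run one request and fail fast on HTTP 429 instead of retrying.

    Assumption: a 429 from this proxy means the lifetime cap was reached,
    so the request will keep failing until the cap is raised.
    """
    response = send()
    if response.status == 429:
        raise BudgetExhaustedError(
            "Proxy returned HTTP 429: the shared budget cap appears to be exhausted."
        )
    return response
```

In practice `send` would wrap a real HTTP or SDK call. Failing fast keeps a retry loop from hammering a service that cannot succeed until the cap changes.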
Spend over time
Spend by model
Catalog
Use the short alias in the model field. Prices are per million tokens and are rough estimates based on Cloudflare Workers AI public pricing.
| Alias | Cloudflare slug | $ / 1M in | $ / 1M out | Spent | Requests | |
|---|---|---|---|---|---|---|
| loading… | | | | | | |
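Since the input and output prices are quoted per million tokens, the cost of a single request can be estimated with simple arithmetic. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, and the prices in the example are made-up placeholders, not values from the catalog.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_in: float,
    usd_per_million_out: float,
) -> float:
    """Estimate the cost of one request from per-million-token prices."""
    total = input_tokens * usd_per_million_in + output_tokens * usd_per_million_out
    return total / 1_000_000


# Placeholder prices (NOT real catalog values): $0.50 per 1M input, $1.50 per 1M output.
example_cost = estimate_cost_usd(
    input_tokens=2_000,
    output_tokens=500,
    usd_per_million_in=0.50,
    usd_per_million_out=1.50,
)
print(f"Estimated cost: ${example_cost:.5f}")
```

Like the catalog prices themselves, the result is a rough estimate. The Spent column in the table is the authoritative figure.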
Quickstart
Anywhere you can configure an OpenAI base_url + API key, you can use this. Pick your stack: