cf-ai-proxy
Live · auto-refresh 8s
OpenAI-compatible · /v1/chat/completions

One endpoint. 11 open-source LLMs. Zero infra.

A thin, OpenAI-compatible proxy in front of Cloudflare Workers AI. Drop the base URL into any OpenAI SDK and you can call gpt-oss-120b, Llama 3.3, DeepSeek R1, Qwen Coder, and more — billed against a shared budget with hard caps and live observability.

Base URL
https://…
Auth
Bearer token — ask the admin for a key

Live usage

Spend against the shared budget

A single lifetime cap; once it's hit, the service returns HTTP 429 until the cap is raised.

Spent
Remaining
of $ total
Requests
across all users
Status
last activity —

Spend over time

Last 24 hours · hourly buckets

Spend by model

Lifetime · top consumers

Catalog

Available models

Use the short alias in the model field. Pricing is per million tokens; rough estimates based on Cloudflare Workers AI public pricing.

Alias Cloudflare slug $ / 1M in $ / 1M out Spent Requests
loading…

Quickstart

Plug in from any language

Anywhere you can configure an OpenAI base_url + API key, you can use this. Pick your stack: