Drop-in compatibility. Point Claude Code or Codex at one endpoint. Cut inference costs without rewriting your stack.
One API key.
Every frontier model.
Built for developers, founders, and agencies routing production LLM traffic. Klint handles provider auth, cheapest-route selection, and metering — you ship instead of wiring.
p50 latency: 68ms
Requests today: 2.4M
Uptime: 99.97%
Works with your stack
Compatible tools and models — not partnerships or sponsorships. Point your existing workflow at Klint.
Drop-in compatible
Point any OpenAI SDK at api.tryklint.com
Claude Code, Codex, and custom agents — swap the base URL, keep your code. Cut provider fragmentation on day one.
curl https://api.tryklint.com/v1/chat/completions \
  -H "Authorization: Bearer $KLINT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Process
Production routing in under two minutes
Three steps from signup to routed inference — no stack rewrite required.
Select plan
Pick Pro, 5Max, or 20Max based on token volume. Agencies scale without renegotiating provider contracts.
Generate API key
Create production and staging keys from the dashboard. One key routes to every model.
Point Claude Code or Codex
Set base_url to api.tryklint.com/v1. Your existing SDK and agent configs keep working.
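The base-URL swap in the last step can be sketched in Python using only the standard library. This is a minimal illustration that mirrors the curl request shown earlier on this page; the helper name `build_chat_request` is hypothetical, and the response format is not documented here, so the sketch only builds the request.

```python
import json
import urllib.request

# Base URL taken from the setup step above.
KLINT_BASE_URL = "https://api.tryklint.com/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request.

    Hypothetical helper: it mirrors the curl example on this page.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{KLINT_BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it, read the key from the environment and open the request:
#   req = build_chat_request(os.environ["KLINT_API_KEY"], "gpt-5.5-mini", "Hello!")
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(resp.read().decode("utf-8"))
```

With an OpenAI-compatible SDK the same change is usually a single constructor argument (the base URL), with the rest of the calling code left as it is.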
Observability
Every request logged, traced, and ready to inspect.
Debug routing decisions and inference spend without leaving your workflow.
Models
One key across every major provider
GPT-5.5, Claude Opus, Grok 3, Gemini 2.5 Pro — route by model name, not provider account.
| Model | Description | Weight | Context |
|---|---|---|---|
| GPT-5.5 | OpenAI flagship multimodal model | 1× | 256K |
| GPT-5.5 Mini | Fast Codex-optimized variant | 0.3× | 128K |
| Claude Opus | Anthropic flagship reasoning model | 1.4× | 200K |
| Claude Sonnet | Balanced coding and reasoning for Claude Code | 0.9× | 200K |
| Grok 3 | xAI frontier model with real-time context | 1.1× | 131K |
| Gemini 2.5 Pro | Google advanced multimodal reasoning | 1.2× | 2000K |
| Llama 4 70B | Open-weight model via Klint proxy | 0.4× | 128K |
| Mistral Large 3 | Mistral flagship model | 0.8× | 128K |
Full catalog and pricing weights in model documentation.
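This page does not define how pricing weights are applied. One plausible reading, shown here purely as an assumption, is that each request's raw token count is multiplied by the model's weight before it counts against the monthly allowance. The `weighted_tokens` helper is hypothetical; the weights are copied from the table above.

```python
# Weights copied from the model table. The billing rule below is an
# ASSUMPTION for illustration, not documented Klint behavior.
MODEL_WEIGHTS = {
    "GPT-5.5": 1.0,
    "GPT-5.5 Mini": 0.3,
    "Claude Opus": 1.4,
    "Claude Sonnet": 0.9,
    "Grok 3": 1.1,
    "Gemini 2.5 Pro": 1.2,
    "Llama 4 70B": 0.4,
    "Mistral Large 3": 0.8,
}


def weighted_tokens(model: str, raw_tokens: int) -> float:
    """Return raw_tokens scaled by the model's weight (assumed billing rule)."""
    return raw_tokens * MODEL_WEIGHTS[model]


def total_weighted(usage: dict) -> float:
    """Sum weighted tokens across a {model: raw_tokens} mapping."""
    return sum(weighted_tokens(model, n) for model, n in usage.items())
```

Under that assumption, a workload that shifts from a 1.4× model to a 0.3× model would draw down the allowance several times more slowly for the same raw token count.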
Platform
Infrastructure that pays for itself
Every route includes observability, throttling, and security — built for teams optimizing inference spend.
Rate limiting
Per-key throttling with burst tokens. Protect infrastructure without blocking production traffic.
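Per-key throttling with burst tokens is commonly built on a token bucket. The sketch below is a generic token bucket, not Klint's implementation (which this page does not describe); the class name, parameters, and per-key registry are all hypothetical.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Generic token bucket: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None) -> None:
        self.rate = rate
        self.capacity = capacity          # maximum burst size
        self.tokens = capacity            # start with a full burst allowance
        self.updated = time.monotonic() if now is None else now

    def allow(self, cost: float = 1.0, now: Optional[float] = None) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per API key (hypothetical in-memory registry).
_buckets: Dict[str, TokenBucket] = {}


def allow_request(api_key: str, rate: float = 10.0, capacity: float = 20.0) -> bool:
    """Look up (or create) the bucket for this key and check one request."""
    bucket = _buckets.setdefault(api_key, TokenBucket(rate, capacity))
    return bucket.allow()
```

The capacity controls how large a burst a key can send at once, while the rate sets the sustained throughput once the burst allowance is spent.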
Cost analytics
Token usage, latency percentiles, and spend breakdown by model. See where inference budget goes.
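As a rough illustration of the kind of breakdown described here, the sketch below aggregates token usage by model and computes nearest-rank latency percentiles from request records. The record fields (`model`, `tokens`, `latency_ms`) and the sample values are hypothetical; the actual log schema is not shown on this page.

```python
import math
from collections import defaultdict

# Hypothetical request records; field names and values are assumptions.
SAMPLE_RECORDS = [
    {"model": "gpt-5.5-mini", "tokens": 1200, "latency_ms": 61},
    {"model": "gpt-5.5-mini", "tokens": 800, "latency_ms": 70},
    {"model": "claude-sonnet", "tokens": 3000, "latency_ms": 68},
    {"model": "claude-sonnet", "tokens": 500, "latency_ms": 120},
]


def tokens_by_model(records):
    """Total tokens per model name."""
    totals = defaultdict(int)
    for record in records:
        totals[record["model"]] += record["tokens"]
    return dict(totals)


def latency_percentile(records, pct):
    """Nearest-rank percentile of latency_ms (pct in the range 0-100)."""
    values = sorted(record["latency_ms"] for record in records)
    if not values:
        raise ValueError("no records")
    rank = max(1, math.ceil(pct / 100 * len(values)))
    return values[rank - 1]
```

On the sample records, `latency_percentile(SAMPLE_RECORDS, 50)` returns 68 and `tokens_by_model(SAMPLE_RECORDS)` returns 2000 tokens for `gpt-5.5-mini` and 3500 for `claude-sonnet`.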
Security
Hashed keys at rest, IP allowlists, webhook signing, and audit logs. Enterprise SSO on Max tiers.
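Two of these features, hashed keys at rest and webhook signing, follow well-known patterns. The sketch below uses SHA-256 and HMAC-SHA256 from the Python standard library. The exact algorithms, header names, and encodings Klint uses are not stated on this page, so every name here is an assumption for illustration.

```python
import hashlib
import hmac


def hash_api_key(api_key: str) -> str:
    """Store only a digest of the key, never the key itself.

    Assumes keys are long random strings, for which a plain SHA-256
    digest is a common choice.
    """
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Compute an HMAC-SHA256 signature over the raw webhook body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, payload: bytes, signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected, signature)
```

A webhook receiver would call `verify_signature` on the raw request body before parsing it, and reject the request if the check fails.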
Pricing
Cut inference costs at scale
Token-based billing for developers, founders, and agencies. Compare volume, routing tier, and cost-saving features side by side.
| Feature | Pro | 5Max (most popular) | 20Max |
|---|---|---|---|
| Monthly price | $39 | $89 | $199 |
| Token volume | 500M tokens/mo | 1.2B tokens/mo | 5B tokens/mo |
| Claude Code + Codex | | | |
| Enterprise routing | Standard | Optimized | Optimized |
| Agency / multi-project | — | Team billing | Unlimited keys |
| Token rollover | 30 days | 90 days | 90 days |
| Includes | 500M tokens per month | 1.2B tokens per month | 5B tokens per month |
| | Claude Code + Codex routing | Enterprise-grade routing | Custom SLAs |
| | Team billing and access controls | Dedicated edge nodes | Dedicated support |
| | Priority support | Advanced rate limits | Unlimited API keys |
| | Usage analytics | Audit logs | Enterprise SSO |
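For a quick side-by-side comparison, the listed monthly price can be divided by the included token volume to get an effective price per million tokens. The sketch below uses only the figures from the table above and ignores rollover and any model weighting.

```python
# (monthly price in USD, included tokens in millions), from the pricing table.
PLANS = {
    "Pro": (39, 500),
    "5Max": (89, 1200),
    "20Max": (199, 5000),
}


def price_per_million(plan: str) -> float:
    """Effective USD per one million included tokens for a plan."""
    price_usd, million_tokens = PLANS[plan]
    return price_usd / million_tokens


for name in PLANS:
    print(f"{name}: ${price_per_million(name):.4f} per 1M tokens")
# Pro: $0.0780 per 1M tokens
# 5Max: $0.0742 per 1M tokens
# 20Max: $0.0398 per 1M tokens
```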
Infrastructure
Global edge routing with sub-50ms gateway overhead.
- 24
- 8
- <50ms gateway overhead
- 99.97% uptime
Cost routing
Cheapest healthy path, every request
Stop overpaying for direct provider access. Klint routes to the lowest-cost endpoint that meets your latency SLA.
| Model | Description | Weight | Context |
|---|---|---|---|
| GPT-5.5 | OpenAI flagship multimodal model | 1× | 256K |
| GPT-5.5 Mini | Fast Codex-optimized variant | 0.3× | 128K |
| Claude Opus | Anthropic flagship reasoning model | 1.4× | 200K |
| Claude Sonnet | Balanced coding and reasoning for Claude Code | 0.9× | 200K |
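The "cheapest healthy path" idea can be sketched as a filter followed by a minimum: discard routes that are unhealthy or too slow for the latency target, then take the lowest-cost one that remains. This is an illustrative sketch only. The `Route` fields, the cost unit, and the sample numbers are hypothetical and do not describe Klint's actual routing logic.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Route:
    name: str              # hypothetical endpoint label
    cost_per_mtok: float   # assumed cost per million tokens
    p50_latency_ms: float  # recent median latency
    healthy: bool          # result of the latest health check


def pick_route(routes: Sequence[Route], max_latency_ms: float) -> Optional[Route]:
    """Return the cheapest healthy route within the latency target, or None."""
    eligible = [
        r for r in routes
        if r.healthy and r.p50_latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda r: r.cost_per_mtok, default=None)


ROUTES = [
    Route("endpoint-a", cost_per_mtok=0.9, p50_latency_ms=80, healthy=True),
    Route("endpoint-b", cost_per_mtok=0.6, p50_latency_ms=140, healthy=True),
    Route("endpoint-c", cost_per_mtok=0.5, p50_latency_ms=70, healthy=False),
]
```

With a 100ms target, `pick_route(ROUTES, 100)` selects `endpoint-a`: `endpoint-c` is cheaper but unhealthy, and `endpoint-b` is cheaper but too slow. Relaxing the target to 150ms selects `endpoint-b`.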
Teams
Built for production inference at scale
“We cut provider billing from four dashboards to one. Klint paid for itself in the first week.”
“Agency clients run Claude Code and Codex through one key. No more per-project provider setup.”
“Latency matches direct calls. Cheapest-route selection actually shows up in our cost reports.”
“Swapped base_url in our agent stack in ten minutes. Inference spend dropped 18% month one.”
FAQ
Common questions about Klint
Still have questions? support@tryklint.com