Klint is a unified AI API gateway with one key and one endpoint.

How does token counting work?

Klint uses weighted tokens based on model cost multipliers.

Do unused tokens roll over?

Yes, depending on your plan tier.

What are the rate limits?

Rate limits scale with your plan from 10 to 500+ req/min.

Which SDKs are supported?

Any OpenAI-compatible SDK pointed at api.tryklint.com.

What is your refund policy?

7-day money-back guarantee on paid plans.

All gateways operational

Drop-in compatibility. Point Claude Code or Codex at one endpoint. Cut inference costs without rewriting your stack.

One API key.
Every frontier model.

Built for developers, founders, and agencies routing production LLM traffic. Klint handles provider auth, cheapest-route selection, and metering — you ship instead of wiring.

Get API key Claude Code setup

base_url https://api.tryklint.com/v1

klint — bash

Status 200

Time 142ms

Route opus

p50 latency

68ms

Requests today

2.4M

Uptime

99.97%

Works with your stack

Compatible tools and models — not partnerships or sponsorships. Point your existing workflow at Klint.

VS Code

Terminal

GitHub

Anthropic

OpenAI

Google

Drop-in compatible

Point any OpenAI SDK at api.tryklint.com

Claude Code, Codex, and custom agents — swap the base URL, keep your code. Cut provider fragmentation on day one.

curl https://api.tryklint.com/v1/chat/completions \
  -H "Authorization: Bearer $KLINT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Process

Production routing in under two minutes

Three steps from signup to routed inference — no stack rewrite required.

Select plan

Pick Pro, 5Max, or 20Max based on token volume. Agencies scale without renegotiating provider contracts.

Generate API key

Create production and staging keys from the dashboard. One key routes to every model.

Point Claude Code or Codex

Set base_url to api.tryklint.com/v1. Your existing SDK and agent configs keep working.

Observability

Every request logged, traced, and ready to inspect.

Debug routing decisions and inference spend without leaving your workflow.

Live request streamConnected

POST/v1/chat/completions200142ms

GET/v1/models20024ms

POST/v1/embeddings20089ms

POST/v1/chat/completions42912ms

POST/v1/chat/completions200138ms

POST/v1/messages200201ms

Models

One key across every major provider

GPT-5.5, Claude Opus, Grok 3, Gemini 2.5 Pro — route by model name, not provider account.

Model	Provider	Description	Weight	Context
GPT-5.5	openai	OpenAI flagship multimodal model	1×	256K
GPT-5.5 Mini	openai	Fast Codex-optimized variant	0.3×	128K
Claude Opus	anthropic	Anthropic flagship reasoning model	1.4×	200K
Claude Sonnet	anthropic	Balanced coding and reasoning for Claude Code	0.9×	200K
Grok 3	xai	xAI frontier model with real-time context	1.1×	131K
Gemini 2.5 Pro	google	Google advanced multimodal reasoning	1.2×	2000K
Llama 4 70B	meta	Open-weight model via Klint proxy	0.4×	128K
Mistral Large 3	mistral	Mistral flagship model	0.8×	128K

Full catalog and pricing weights in model documentation.

Platform

Infrastructure that pays for itself

Every route includes observability, throttling, and security — built for teams optimizing inference spend.

Rate limiting

Per-key throttling with burst tokens. Protect infrastructure without blocking production traffic.

Learn more

Cost analytics

Token usage, latency percentiles, and spend breakdown by model. See where inference budget goes.

Learn more

Security

Hashed keys at rest, IP allowlists, webhook signing, and audit logs. Enterprise SSO on Max tiers.

Learn more

Pricing

Cut inference costs at scale

Token-based billing for developers, founders, and agencies. Compare volume, routing tier, and cost-saving features side by side.

MonthlyAnnual −10%

Feature	Pro	5MaxMost popular	20Max
Monthly price	$39	$89	$199
Token volume	500M tokens/mo	1.2B tokens/mo	5B tokens/mo
Claude Code + Codex
Enterprise routing	Standard	Optimized	Optimized
Agency / multi-project	—	Team billing	Unlimited keys
Token rollover	30 days	90 days	90 days
Includes	500M tokens per month	1.2B tokens per month	5B tokens per month
	Claude Code + Codex routing	Enterprise-grade routing	Custom SLAs
	Team billing and access controls	Dedicated edge nodes	Dedicated support
	Priority support	Advanced rate limits	Unlimited API keys
	Usage analytics	Audit logs	Enterprise SSO

Pro

For teams running AI in production at scale.

Get started

5Max

High-volume workloads with optimized infrastructure scaling.

Get started

20Max

Maximum throughput for mission-critical AI systems.

Contact sales

Infrastructure

Global edge routing with sub-50ms gateway overhead.

Edge nodes: 24
Regions: 8
Avg overhead: <50ms
Uptime: 99.97%

Cost routing

Cheapest healthy path, every request

Stop overpaying for direct provider access. Klint routes to the lowest-cost endpoint that meets your latency SLA.

GPT-5.5

OpenAI flagship multimodal model

Weight

1×

Context

256K

GPT-5.5 Mini

Fast Codex-optimized variant

Weight

0.3×

Context

128K

Claude Opus

Anthropic flagship reasoning model

Weight

1.4×

Context

200K

Claude Sonnet

Balanced coding and reasoning for Claude Code

Weight

0.9×

Context

200K

Teams

Built for production inference at scale

“We cut provider billing from four dashboards to one. Klint paid for itself in the first week.”
Priya Sharma
Staff Engineer

“Agency clients run Claude Code and Codex through one key. No more per-project provider setup.”
Jordan Ellis
Founder, AI Automation Agency

“Latency matches direct calls. Cheapest-route selection actually shows up in our cost reports.”
Zara Ahmed
CTO, Series A Startup

“Swapped base_url in our agent stack in ten minutes. Inference spend dropped 18% month one.”
Marcus Chen
Indie Developer

FAQ

Common questions about Klint

Still have questions? support@tryklint.com

One API key.Every frontier model.

Point any OpenAI SDK at api.tryklint.com

Production routing in under two minutes

Select plan

Generate API key

Point Claude Code or Codex

Every request logged, traced, and ready to inspect.

One key across every major provider

Infrastructure that pays for itself

Rate limiting

Cost analytics

Security

Cut inference costs at scale

Global edge routing with sub-50ms gateway overhead.

Cheapest healthy path, every request

GPT-5.5

GPT-5.5 Mini

Claude Opus

Claude Sonnet

Built for production inference at scale

Common questions about Klint

What is Klint?

How does Claude Code integration work?

Do unused tokens roll over?

What are the rate limits?

Which SDKs are supported?

What is your refund policy?

One API key.
Every frontier model.