Documentation
Get started in minutes.
Forge is a drop-in replacement for the OpenAI and Anthropic APIs. If you've used either one, you already know how to use Forge.
Get your API key
Sign up and generate an API key from your dashboard. Keys start with "rrt-burst-".
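A minimal sketch of reading the key from the environment instead of hard-coding it. It assumes the key is stored in `FORGE_API_KEY` (the variable name the cURL example in the next step uses) and treats the documented prefix as a quick sanity check. The helper name `load_forge_key` is hypothetical, not part of any Forge SDK.

```python
import os

KEY_PREFIX = "rrt-burst-"  # prefix stated in this step


def load_forge_key(env=os.environ):
    """Return the Forge API key from the environment, or raise if it looks wrong.

    FORGE_API_KEY is an assumed variable name; this helper is illustrative only.
    """
    key = env.get("FORGE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FORGE_API_KEY is not set")
    if not key.startswith(KEY_PREFIX):
        raise ValueError(f"expected a key starting with {KEY_PREFIX!r}")
    return key
```

Failing early on a missing or mistyped key tends to give a clearer error than an authentication failure from the first request.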
Point your SDK at Forge
```python
# pip install openai
from openai import OpenAI

client = OpenAI(
    base_url="https://forge.lanaai.io/v1",
    api_key="your-forge-api-key",
)

response = client.chat.completions.create(
    model="auto",  # or "fast", "gpt-4o", "claude-sonnet-4-20250514", etc.
    messages=[{"role": "user", "content": "Hello from Forge!"}],
)
print(response.choices[0].message.content)
```
```shell
curl https://forge.lanaai.io/v1/chat/completions \
  -H "Authorization: Bearer $FORGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```
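```python
# pip install anthropic
import anthropic

# Note: the Anthropic SDK appends "/v1/messages" to base_url. If this call
# returns a 404, try base_url="https://forge.lanaai.io" (without "/v1").
client = anthropic.Anthropic(
    base_url="https://forge.lanaai.io/v1",
    api_key="your-forge-api-key",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello from Forge!"}],
)
print(message.content[0].text)
```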
That's it.
Forge handles routing, cost optimization, and failover automatically. Monitor your usage from the dashboard.
API Endpoints
/v1/chat/completions
OpenAI-compatible chat completions. Supports streaming.
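As a rough illustration of consuming a streamed response without an SDK, the sketch below parses server-sent event lines into text. It assumes Forge mirrors OpenAI's stream format (one `data: <json chunk>` line per event, ending with `data: [DONE]`). The function name `iter_stream_text` and the sample lines are illustrative, not captured Forge output.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style streaming lines.

    Accepts an iterable of str or bytes lines, such as an HTTP response body.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Sample lines standing in for a real response body (made-up data).
sample = [
    b'data: {"choices": [{"delta": {"role": "assistant"}}]}\n',
    b"\n",
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}\n',
    b'data: {"choices": [{"delta": {"content": "lo"}}]}\n',
    b"data: [DONE]\n",
]
print("".join(iter_stream_text(sample)))  # Hello
```

In practice the `lines` argument could be the response object returned by `urllib.request.urlopen`, which yields the body line by line as bytes.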
/v1/messages
Compatible with the Anthropic Messages API. Full SSE streaming support.
/v1/embeddings
Generate embeddings with the same interface as OpenAI.
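One way to call this endpoint with only the standard library is sketched below. The request body follows OpenAI's embeddings schema (`model` and `input`), and the response is assumed to carry a `data` list of items with `index` and `embedding` fields. Both are assumptions about Forge, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://forge.lanaai.io/v1"


def build_embeddings_request(texts, api_key, model="embedding"):
    """Build (but do not send) a POST request to /v1/embeddings."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_vectors(response_json):
    """Return embeddings in input order from an OpenAI-style response dict."""
    items = sorted(response_json["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


req = build_embeddings_request(["hello", "world"], "your-forge-api-key")
# To send it: extract_vectors(json.load(urllib.request.urlopen(req)))
```

Sorting by `index` keeps each vector aligned with its input text even if the items arrive out of order.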
/v1/models
List available models and their capabilities.
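A short sketch of listing model IDs, assuming the response follows OpenAI's list shape (`{"object": "list", "data": [{"id": ...}, ...]}`). The capability fields are not documented here, so the sketch does not model them. `list_model_ids` and its `opener` parameter are hypothetical names.

```python
import json
import urllib.request

BASE_URL = "https://forge.lanaai.io/v1"


def list_model_ids(api_key, opener=urllib.request.urlopen):
    """GET /v1/models and return the model IDs found in the response.

    `opener` is injectable so the parsing can be exercised without a network call.
    """
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
    with opener(req) as resp:
        payload = json.load(resp)
    return [model["id"] for model in payload.get("data", [])]
```

Calling `list_model_ids("your-forge-api-key")` would perform the actual request.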
/dashboard/usage
Your token usage, cost breakdown, and forecasted spend.
Model Aliases
Let Forge pick the best model for you, or request a specific one.
| Alias | Routes to | Best for |
|---|---|---|
| "auto" | Cost-optimized pick | General use (cheapest that works) |
| "fast" | Smallest, fastest model | Low latency, simple tasks |
| "reasoning" | Strongest available model | Complex analysis, coding, research |
| "embedding" | Embedding model | Vector embeddings for RAG |
You can also use specific model names like "gpt-4o", "claude-sonnet-4-20250514", "llama-3.1-70b", etc.
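The choice between an alias and a specific model can be kept in one place in application code. The sketch below is one way to do that. The alias strings come from the table above, while the task labels and the `pick_model` helper are hypothetical.

```python
# Aliases from the table above; the task labels on the left are made up
# for illustration.
TASK_TO_ALIAS = {
    "general": "auto",
    "low_latency": "fast",
    "analysis": "reasoning",
    "rag_indexing": "embedding",
}


def pick_model(task=None, explicit_model=None):
    """Return the string to send in the request's "model" field."""
    if explicit_model:
        return explicit_model  # a specific model name always wins
    return TASK_TO_ALIAS.get(task, "auto")  # unknown tasks fall back to "auto"


print(pick_model("low_latency"))            # fast
print(pick_model(explicit_model="gpt-4o"))  # gpt-4o
print(pick_model("unknown-task"))           # auto
```

Falling back to "auto" for unrecognized tasks leaves the decision to Forge's routing.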