Pricing
Predictable AI costs.
Flat monthly rate with generous token limits. No per-token billing. No surprises.
Starter
first month
then $49/month
- 5M tokens/month
- 60 requests/min
- 8K context window
- OpenAI + Anthropic compatible
- Usage dashboard
Pro
/month
- 30M tokens/month
- 200 requests/min
- 8K context window
- $2.00/M overage tokens
- Priority routing
- Usage dashboard + API
Scale
/month
- 150M tokens/month
- 600 requests/min
- 16K context window
- $1.50/M overage tokens
- Higher priority routing
- Dedicated support
Enterprise
contact us
- 500M+ tokens/month
- 3,000+ requests/min
- 32K context window
- $1.00/M overage tokens
- Highest priority + SLA
- Custom model hosting
Frequently asked
What happens if I go over my token limit?
On Pro and above, you can keep going at the overage rate. On Starter, requests are rate-limited until the next billing cycle.
Can I use my existing OpenAI SDK code?
Yes. Just change the base_url to forge.lanaai.io/v1 and use your Forge API key. Everything else stays the same.
Which models are available?
Self-hosted open-source models (Llama, Qwen, Mistral), plus OpenAI (GPT-4o, GPT-4o-mini), Anthropic (Claude), and Google (Gemini) as fallbacks.
What does "auto" model do?
It lets Forge pick the best model based on cost and quality. Most requests go to fast, cheap self-hosted models. Complex requests route to commercial models automatically.
Can I upgrade or downgrade at any time?
Yes. Plan changes take effect immediately. When upgrading, you get access to higher limits right away. When downgrading, limits are adjusted at the start of your next billing cycle.