Pricing
Predictable AI costs.
Flat monthly rate with generous token limits. No per-token billing. No surprises.
Starter
first month
then $49/month
- 5M tokens/month
- 60 requests/min
- 8K context window
- OpenAI + Anthropic compatible
- Usage dashboard
- Audit log egress
- Zero-retention mode (optional +$15/mo)
Sovereign inference requires Pro or above
Pro
/month
- 30M tokens/month
- 200 requests/min
- 8K context window
- $2.00/M overage tokens
- Priority routing
- Usage dashboard + API
- Compliance audit log
- Audit log egress
- Zero-retention mode
- Sovereign mode
Scale
/month
- 150M tokens/month
- 600 requests/min
- 16K context window
- $1.50/M overage tokens
- Higher priority routing
- Dedicated support
- Sovereign mode
- Compliance audit log
- Audit log egress
- Zero-retention mode
Enterprise
contact us
- 500M+ tokens/month
- 3,000+ requests/min
- 32K context window
- $1.00/M overage tokens
- Highest priority + SLA
- Org-level sovereign lock
- Dedicated inference isolation
- Audit log egress + zero-retention
- Custom retention + SLA
Frequently asked questions
What happens if I go over my token limit?
On Pro and above, you can keep going at the overage rate. On Starter, requests are rate-limited until the next billing cycle.
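The overage arithmetic is simple enough to sketch. A hypothetical helper using the Pro numbers above (30M tokens included, $2.00 per additional million):

```python
def monthly_overage_cost(tokens_used: int, included: int, rate_per_million: float) -> float:
    """Cost of tokens beyond the included allotment, billed per million."""
    overage_tokens = max(0, tokens_used - included)
    return (overage_tokens / 1_000_000) * rate_per_million

# Pro plan: 30M tokens included, $2.00 per additional million.
cost = monthly_overage_cost(tokens_used=35_000_000,
                            included=30_000_000,
                            rate_per_million=2.00)
print(f"${cost:.2f}")  # 5M tokens over -> $10.00
```

Usage under the limit simply costs nothing extra: the same call with 20M tokens used returns $0.00.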
Can I use my existing OpenAI SDK code?
Yes. Just change the base_url to forge-api.lanaai.io/v1 and use your Forge API key. Everything else stays the same.
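At the wire level, the compatibility claim can be sketched with only the standard library: the request below follows the usual OpenAI chat-completions shape, and only the base URL and key differ from a stock OpenAI setup. `frg-example-key` is a placeholder, not a real key:

```python
import json
import urllib.request

BASE_URL = "https://forge-api.lanaai.io/v1"  # was https://api.openai.com/v1

def chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completions request against Forge.
    Only the base URL and the key change; the payload shape is identical."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = chat_request("frg-example-key", "auto", "Hello")
print(req.full_url)  # https://forge-api.lanaai.io/v1/chat/completions
```

With the official `openai` SDK, the equivalent change is passing `base_url` and your Forge key to the client constructor; no other code changes.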
Which models are available?
Self-hosted open-source models running on our sovereign infrastructure, plus commercial providers including OpenAI, Anthropic, Google, DeepSeek, Perplexity, and xAI as intelligent fallbacks.
What does the "auto" model do?
It lets Forge pick the best model based on cost and quality. Most requests go to fast, cheap self-hosted models. Complex requests route to commercial models automatically.
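A toy sketch of such a cost-first routing policy. The threshold, heuristic, and model identifiers below are illustrative assumptions, not Forge's actual logic:

```python
def route(prompt: str, force_sovereign: bool = False) -> str:
    """Toy 'auto' policy: cheap self-hosted model first, commercial
    fallback for long or complex prompts. Purely illustrative."""
    complex_request = len(prompt) > 2000 or "step by step" in prompt.lower()
    if force_sovereign or not complex_request:
        return "self-hosted/llama"   # hypothetical model id
    return "openai/gpt-4o"           # hypothetical fallback id

print(route("What's 2+2?"))              # self-hosted/llama
print(route("Prove this step by step"))  # openai/gpt-4o
```

Note how a sovereign flag would override the fallback entirely, matching the guarantee described under sovereign mode below.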
Can I upgrade or downgrade at any time?
Yes. Plan changes take effect immediately. When upgrading, you get access to higher limits right away. When downgrading, limits are adjusted at the start of your next billing cycle.
What is sovereign mode?
Sovereign mode ensures your prompts and data never leave LANA-controlled infrastructure. All inference runs on our self-hosted models — no data is sent to OpenAI, Anthropic, or any third-party API. Available on Pro plans and above, org-wide on Enterprise.
Is there an audit trail for compliance?
Yes. Every request is logged with its routing decision, provider used, sovereign enforcement status, and timestamp. Pro plans and above can query the audit log via API. Enterprise plans support custom data retention policies.
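The fields listed above can be sketched as a record type. The field names and types here are illustrative, not the actual wire schema:

```python
from dataclasses import dataclass

@dataclass
class AuditRecord:
    """One entry per request, mirroring the fields the FAQ lists.
    Illustrative only -- not the documented schema."""
    timestamp: str            # when the request was served
    routing_decision: str     # e.g. why "auto" chose this backend
    provider: str             # which provider handled the request
    sovereign_enforced: bool  # whether sovereign mode was in effect

rec = AuditRecord("2025-01-01T00:00:00Z", "cheap-first", "self-hosted", True)
```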
What is audit log egress?
Audit log egress forwards your compliance events in real time to a webhook URL you control, on your own infrastructure. You hold the record, not us. Available on Starter and above.
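A minimal sketch of a receiver you might point the egress webhook at, using only the standard library. The event payload schema is an assumption, and durable storage is stubbed with an in-memory list:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

RECEIVED = []  # in practice: append to durable storage you control

def record_event(event: dict) -> None:
    """Persist one audit event on your own infrastructure."""
    RECEIVED.append(event)

class AuditSink(BaseHTTPRequestHandler):
    """Accepts POSTed audit events and acknowledges with 204 No Content."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        record_event(json.loads(self.rfile.read(length)))
        self.send_response(204)
        self.end_headers()

# To run: HTTPServer(("", 8080), AuditSink).serve_forever()
```

Returning 204 quickly and persisting asynchronously is the usual pattern, so a slow disk never causes the sender to retry or drop events.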
What is zero-retention mode?
When enabled, no request content, prompts, or response data is stored anywhere on LANA infrastructure. Only billing counters (token counts, cost) are retained. Included on Pro and above, or available as a $15/mo add-on for Starter.
What is dedicated inference isolation?
Enterprise customers can run on their own isolated GPU infrastructure — no shared compute, no noisy neighbors, no cross-tenant exposure. Your requests never touch hardware shared with other organizations.