Works with the AI you already pay for
OpenAI, Anthropic, Google, Mistral, Groq, xAI — and more. Your provider keys go directly to them. Lexi never holds them.
Up and running in two minutes
One URL change in your configuration. Streaming, tool calls, structured output — all supported. Nothing else in your stack changes.
Pay only when you save
Lexi earns a share of what it saves you on each request. When there's no saving, there's no Lexi fee. You can never pay more than you would going direct.
Conversations that go further
Long AI sessions hit a wall — the context fills up and the model loses track of earlier decisions. Lexi keeps context bounded so sessions stay coherent for much longer. Context is restructured, not truncated — facts and decisions survive.
Powered by STONE
Semantic Token Optimization and Natural Encoding. A purpose-built engine that restructures context into a bounded representation — the amount sent to your provider stays constant regardless of conversation length.
Live end-to-end tests, GPT-4o-mini, March 2026. Results vary by content and conversation pattern.
As conversations grow, the amount sent to your provider stays bounded. A 9,000-token conversation was reduced to under 900 — and stayed there.
Every cent. In the response headers.
No estimates, no opaque bills. Every response carries the exact cost breakdown — savings, margin, balance — in HTTP headers you can log, alert on, or display to your users.
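A minimal sketch of logging that breakdown. The header names here (`x-lexi-savings`, `x-lexi-margin`, `x-lexi-balance`) are illustrative assumptions, not a confirmed API — check the headers your responses actually carry:

```javascript
// Read Lexi cost headers from a response.
// Works with both a fetch() Headers object and a plain object.
// Header names below are assumptions for illustration.
function readLexiCost(headers) {
  const get = (name) =>
    typeof headers.get === 'function' ? headers.get(name) : headers[name];
  return {
    savings: Number(get('x-lexi-savings') ?? 0),
    margin: Number(get('x-lexi-margin') ?? 0),
    balance: Number(get('x-lexi-balance') ?? 0),
  };
}

// Example with a plain object standing in for real response headers:
const cost = readLexiCost({
  'x-lexi-savings': '0.0042',
  'x-lexi-margin': '0.0008',
  'x-lexi-balance': '12.50',
});
console.log(cost); // { savings: 0.0042, margin: 0.0008, balance: 12.5 }
```

From here you can push the numbers into whatever logging or metrics pipeline you already run.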
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';

// OpenAI through Lexi — one baseURL change:
const openai = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey: 'lx_live_yourkey:sk-your-openai-key',
});

// Anthropic through Lexi:
const anthropic = new Anthropic({
  baseURL: 'https://api.lexisaas.com',
  apiKey: 'lx_live_yourkey:sk-ant-your-key',
});