Tessera developer documentation
Everything to start, route, audit.
SDKs · architecture · engineering blog.
Four SDK packages cover the most-used Python and Node entry points to LLM providers — vanilla SDK, LangChain, Vercel AI SDK, LlamaIndex. The full mechanic stack, quality SLA, and audit-grade cost provenance docs live one click away. Free Dev tier: 60M tokens/month, no card, 30-second signup.
SDK packages
Four packages, same proxy, same key.
Pick whichever matches your codebase. Install side by side without conflict — all four resolve to the same `tk_…` API key and same billing record.
Core SDK
tessera-sdk
Python + Node packages. One-line tessera.activate(key) patches OpenAI / Anthropic / Mistral / Groq / Cohere clients. Use without a framework.
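What "patching" a client means can be sketched in a few lines. This is an illustration, not the tessera-sdk source: FakeClient, PROXY_URL, and this version of activate are hypothetical stand-ins, and replacing the provider key with the Tessera key is an assumption. The real package may work differently.

```python
# Illustrative sketch only: FakeClient, PROXY_URL and activate() are
# hypothetical stand-ins, not the real tessera-sdk implementation.

PROXY_URL = "https://proxy.example.invalid/v1"  # placeholder, not a real endpoint


class FakeClient:
    """Stand-in for a provider client that takes an api_key and a base_url."""

    def __init__(self, api_key: str, base_url: str = "https://api.provider.example/v1"):
        self.api_key = api_key
        self.base_url = base_url


def activate(tessera_key: str, client_classes=(FakeClient,)) -> None:
    """Wrap each client's __init__ so new instances point at the proxy."""
    for cls in client_classes:
        original_init = cls.__init__

        def patched_init(self, *args, _orig=original_init, **kwargs):
            # Override the keyword arguments before the real constructor runs.
            # This sketch assumes callers pass api_key and base_url by keyword.
            kwargs["base_url"] = PROXY_URL
            kwargs["api_key"] = tessera_key  # assumption: the proxy key replaces the provider key
            _orig(self, *args, **kwargs)

        cls.__init__ = patched_init


activate("tk_example")
client = FakeClient(api_key="sk-original")
print(client.base_url)
```

Application code that constructs the client stays the same; only the constructor's effective arguments change.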
LangChain
tessera-langchain
Python + Node. tessera_openai_config / tessera_anthropic_config etc. wire ChatModel constructors to the Tessera proxy. Drop-in for any LangChain pipeline.
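A config helper of this kind can be pictured as a function that returns constructor keyword arguments. The sketch below is a hypothetical re-implementation for illustration: the signature, the returned keys, PROXY_URL, and StubChatModel are all assumptions, not the published tessera-langchain API.

```python
# Illustrative sketch: this tessera_openai_config is a hypothetical
# re-implementation, and PROXY_URL is a placeholder, not a real endpoint.

PROXY_URL = "https://proxy.example.invalid/v1"


def tessera_openai_config(tessera_key: str, **overrides) -> dict:
    """Return constructor kwargs that point a chat model at the proxy."""
    config = {"base_url": PROXY_URL, "api_key": tessera_key}
    config.update(overrides)  # caller-supplied settings such as model name
    return config


class StubChatModel:
    """Stand-in for a framework chat-model class; it only stores its kwargs."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


llm = StubChatModel(**tessera_openai_config("tk_example", model="gpt-4o-mini"))
print(llm.kwargs["base_url"])
```

Because the helper only produces keyword arguments, the rest of a pipeline (tools, streaming, structured output) would not need to know the proxy exists.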
Vercel AI SDK
@tessera-llm/vercel-ai
Node / TypeScript. tesseraOpenAIConfig spread into createOpenAI; generateText / streamText / generateObject unchanged underneath.
LlamaIndex
tessera-llamaindex
Python. tessera_openai_config etc. wire the LlamaIndex LLM constructors. RAG, query engines, agents route through the proxy unchanged.
Architecture & trust
How the proxy works + what we do with your data.
The mechanic stack, the quality contract, and the security posture. The non-prose part of the story behind every savings number.
How it works
Mechanics + quality SLA + reliability primitives
The nine cost mechanics (including auto-route, caches, prompt-cache headers, compress, context prune, output ceiling, and batch arbitrage), the composition cap, the quality canary, and the per-stack auto-rollback.
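The relationship between a quality canary and a per-stack rollback can be sketched as a threshold check. Everything here is an assumption made for illustration: the QUALITY_FLOOR value, the use of a mean score, and the shape of the stack record are not taken from Tessera's documentation.

```python
from statistics import mean

# Hypothetical threshold; the page does not state the actual SLA value.
QUALITY_FLOOR = 0.90


def should_roll_back(canary_scores: list[float], floor: float = QUALITY_FLOOR) -> bool:
    """Return True when the mean canary score falls below the floor."""
    if not canary_scores:
        return False  # no evidence yet, so keep the current configuration
    return mean(canary_scores) < floor


def apply_canary(stack: dict, canary_scores: list[float]) -> dict:
    """Return a copy of the stack with optimizations disabled if the canary fails."""
    if should_roll_back(canary_scores):
        return {**stack, "optimizations_enabled": False}
    return stack


healthy = apply_canary({"name": "support-bot", "optimizations_enabled": True}, [0.96, 0.97])
degraded = apply_canary({"name": "summarizer", "optimizations_enabled": True}, [0.70, 0.80])
print(healthy["optimizations_enabled"], degraded["optimizations_enabled"])
```

Keeping the decision per stack means a regression in one workload would not switch off savings for the others.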
Security
Data handling + retention + SOC 2 posture
What we log, what we never store, retention by class, encryption at rest via Supabase Vault, and SOC 2 Type 1 planned for Q3 2026.
Engineering blog
Worked examples + architecture deep dives.
Posts pulled from the proxy data. Concrete numbers, real workloads, source-grep-verifiable claims. Full archive at /blog.
Audit immutability for AI cost claims — what snapshot-pinning actually buys you
Every savings number references an immutable pricing_snapshot_id, verified against multiple sources (LiteLLM + tokencost + OpenRouter, consensus ≥ 0.95). Two engineers can re-derive any month's figures in three hours.
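One way to picture snapshot pinning is a content-addressed price table that is accepted only when sources agree. The sketch below is built on stated assumptions: the consensus formula (lowest quote divided by highest), the median as the pinned price, the hash-derived ID format, and the sample quotes are all illustrative. The page gives only the 0.95 threshold and the source names.

```python
import hashlib
import json
from statistics import median

CONSENSUS_THRESHOLD = 0.95  # from the text; the formula below is an assumption


def consensus(prices):
    """Agreement score in (0, 1]: lowest quote divided by highest quote.

    Assumes at least one quote and that all quotes are positive.
    """
    return min(prices) / max(prices)


def snapshot_id(price_table):
    """Content-addressed ID: identical tables always hash to the same ID."""
    canonical = json.dumps(price_table, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def pin_snapshot(quotes_by_model):
    """Pin the median quote per model, rejecting models whose sources disagree."""
    prices = {}
    for model, quotes in quotes_by_model.items():
        values = list(quotes.values())
        if consensus(values) < CONSENSUS_THRESHOLD:
            raise ValueError(f"sources disagree on {model}: {quotes}")
        prices[model] = median(values)
    return {"pricing_snapshot_id": snapshot_id(prices), "prices": prices}


# Made-up quotes for a made-up model; not real prices.
quotes = {"example-model": {"litellm": 10.0, "tokencost": 10.0, "openrouter": 10.2}}
snapshot = pin_snapshot(quotes)
print(snapshot["pricing_snapshot_id"], snapshot["prices"])
```

Because the ID is derived from the table's contents, anyone holding the same prices can recompute it and confirm that a reported figure used that exact snapshot.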
How an AI customer-support workload cuts its GPT-4 bill 38% without quality regression
Worked example: $24k → $9.4k via auto-route, exact + semantic cache, prompt cache headers, context pruning, output-length predictor, batch arbitrage. Quality canary mean 0.96.
The prompt bloat you don’t see — and what it’s costing you
M3 chars-capture telemetry reveals three patterns: stable bloat, drift bloat, silent regression. Per-role compression + per-stack canary as the operational response.
Using Tessera with LangChain in 30 seconds — drop-in cost optimization
One install, one line of config. Tools, streaming, eval, structured output pass through. Per-provider one-liners for OpenAI / Anthropic / Mistral / Groq / Cohere.
Try it on your workload.
60M tokens/month free. No card. 30-second signup. The Production tier costs 20% of measured savings and nothing else: zero savings, zero fee.
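The fee rule reduces to a short calculation. The sketch below assumes savings are measured as a baseline cost minus the actual cost, works in integer cents, and rounds down; the function name and those three choices are assumptions, since the page states only the 20% rate and the zero-savings case.

```python
FEE_RATE_PERCENT = 20  # from the text: 20% of measured savings


def production_fee_cents(baseline_cents: int, actual_cents: int) -> int:
    """Fee in cents; zero when there are no measured savings."""
    savings = max(baseline_cents - actual_cents, 0)
    return savings * FEE_RATE_PERCENT // 100  # rounding down is an assumption


# $10,000 baseline, $8,000 actual: $2,000 saved, so the fee is $400.
print(production_fee_cents(1_000_000, 800_000))  # 40000
# No savings means no fee.
print(production_fee_cents(1_000_000, 1_000_000))  # 0
```

Clamping savings at zero is what gives "zero savings, zero fee": a month where costs rise never produces a negative or positive charge.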
Get free API key