
Using Tessera with LangChain in 30 seconds — drop-in cost optimization

If your application is on LangChain and your LLM bill is real enough to care about, here is the shortest path from "I am paying for every token" to "I am paying for measured savings only." It is two lines of code, your existing ChatModel keeps working, and the proxy underneath does the optimization.

The integration (Python)

One install. One line of config in your ChatOpenAI / ChatAnthropic / ChatMistralAI constructor. That is the entire integration.

pip install tessera-langchain

# In your existing app code:

from langchain_openai import ChatOpenAI
from tessera_langchain import tessera_openai_config

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_key="sk-...",                          # your OpenAI key
    **tessera_openai_config(api_key="tk_..."),       # routes through Tessera
)

# Everything you already wrote with this llm — agents, chains,
# tool calling, structured output, streaming — works unchanged.
response = llm.invoke("Summarize this support ticket in 2 sentences.")

Get a free Tessera API key (60M tokens/month, no card) at tesseraai.io/dev — sign-up takes about 30 seconds and returns an instant tk_… key.

The integration (TypeScript / LangChain.js)

npm install @tessera-llm/langchain

// In your existing app code:

import { ChatOpenAI } from "@langchain/openai";
import { tesseraOpenAIConfig } from "@tessera-llm/langchain";

const llm = new ChatOpenAI({
  model: "gpt-4o",
  apiKey: process.env.OPENAI_API_KEY!,
  ...tesseraOpenAIConfig({ apiKey: process.env.TESSERA_API_KEY! }),
});

// Existing chains, agents, tool calls, streaming — all unchanged.
const response = await llm.invoke("Summarize this support ticket in 2 sentences.");

What changes underneath

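Every call your LangChain ChatModel makes now goes to api.tesseraai.io first. The Tessera proxy applies a stack of cost-optimization mechanics in real time before forwarding the request to the provider. The worked example further down itemizes the mechanics in play for one workload: auto-route, cache, prompt-cache headers, context pruning, the M9 ceiling, compression, and batch.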

On top of these, a per-provider circuit breaker tracks rolling 5xx rates per upstream and skips degraded providers in auto-route decisions until they recover. (Cross-provider failover — re-routing to a different provider entirely — is on the roadmap, not shipped yet. See /how-it-works → Reliability primitives for current status.)
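To make the circuit-breaker idea concrete, here is a minimal sketch of a rolling-window breaker. It is not Tessera's implementation: the class name, the 60-second window, the 50% error threshold, and the minimum-sample rule are all assumptions chosen for illustration.

```python
import time
from collections import deque


class RollingCircuitBreaker:
    """Tracks recent upstream outcomes per provider over a sliding time window."""

    def __init__(self, window_seconds=60.0, max_error_rate=0.5, min_samples=5):
        # All three defaults are illustrative assumptions, not Tessera's values.
        self.window_seconds = window_seconds
        self.max_error_rate = max_error_rate
        self.min_samples = min_samples
        self._events = {}  # provider -> deque of (timestamp, was_5xx)

    def _trim(self, events, now):
        # Drop outcomes that have aged out of the rolling window.
        while events and now - events[0][0] > self.window_seconds:
            events.popleft()

    def record(self, provider, status_code, now=None):
        now = time.monotonic() if now is None else now
        events = self._events.setdefault(provider, deque())
        events.append((now, 500 <= status_code <= 599))
        self._trim(events, now)

    def is_healthy(self, provider, now=None):
        now = time.monotonic() if now is None else now
        events = self._events.get(provider)
        if not events:
            return True
        self._trim(events, now)
        if len(events) < self.min_samples:
            return True  # too little recent data to call the provider degraded
        errors = sum(1 for _, was_5xx in events if was_5xx)
        return errors / len(events) <= self.max_error_rate

    def routable(self, providers, now=None):
        # Auto-route would consider only the providers returned here.
        return [p for p in providers if self.is_healthy(p, now)]
```

In this sketch a provider "recovers" simply because its old 5xx outcomes age out of the window; a production breaker would more likely probe the upstream with a trickle of traffic before restoring it.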

What does NOT change

Worked example: a customer-support LangChain agent

Concrete numbers from a real beta workload — customer-support agent on gpt-4o with a top-5 RAG retrieval per turn, 1.2 billion tokens/month aggregate, OpenAI list prices.

| Stage | Cost / mo | Saved |
| --- | --- | --- |
| LangChain → OpenAI direct (baseline) | $24,000 | |
| + LangChain → Tessera (auto-route + cache + prompt-cache headers) | $11,520 | $12,480 |
| + context pruning + M9 ceiling + compression + batch | $9,400 | $2,120 |
| Tessera-optimized total | $9,400 | $14,600 |
| Tessera fee (20% × savings) | $2,920 | |
| Customer net pay | $12,320 | $11,680 |

Quality canary across the full mechanic stack: mean score 0.96 against a 0.95 floor, so the 0.95 SLA held on all 30 days. The application code is one LangChain ChatOpenAI constructor plus the eight-line agent setup the customer already had. No prompt rewrites. No model swap by hand. Full breakdown by mechanic in the companion post.
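The fee and net-pay rows follow directly from the baseline and optimized costs. A small sketch of that arithmetic, using the figures from the table; the function name is illustrative, and clamping the fee to zero when there are no savings is an assumption based on the "pay only on measured savings" wording:

```python
def tessera_bill(baseline_cost, optimized_cost, fee_rate=0.20):
    """Reproduce the worked example's fee and net-pay arithmetic."""
    savings = baseline_cost - optimized_cost
    # Assumption: no fee is charged when savings are zero or negative.
    fee = round(fee_rate * max(savings, 0), 2)
    net_pay = optimized_cost + fee
    return {
        "savings": savings,
        "fee": fee,
        "net_pay": net_pay,
        "net_savings": baseline_cost - net_pay,
    }


bill = tessera_bill(baseline_cost=24_000, optimized_cost=9_400)
print(bill)
# {'savings': 14600, 'fee': 2920.0, 'net_pay': 12320.0, 'net_savings': 11680.0}
```

With a 20% fee, the customer keeps 80% of the gross savings: 0.8 × $14,600 = $11,680.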

Per-provider one-liners

from tessera_langchain import (
    tessera_openai_config,      # → langchain_openai.ChatOpenAI(...)
    tessera_anthropic_config,   # → langchain_anthropic.ChatAnthropic(...)
    tessera_mistral_config,     # → langchain_mistralai.ChatMistralAI(...)
    tessera_groq_config,        # → langchain_groq.ChatGroq(...)
    tessera_cohere_config,      # → langchain_cohere.ChatCohere(...)
)

# Or the generic dispatcher when provider is runtime-parameterized:
from tessera_langchain import tessera_config

cfg = tessera_config("anthropic", api_key="tk_...")

All five helpers are used the same way: **tessera_<provider>_config(api_key=...) in Python, or an object spread in TypeScript. The package ships separate functions per provider because each LangChain ChatModel uses a slightly different field name for the upstream base URL, and each per-provider function returns the correct keyword for its class.
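To illustrate why one function per provider is useful, here is a sketch of the dispatch idea. It is not the package's source. The keyword names in the mapping are assumptions that should be checked against each LangChain integration's documentation, and the /v1/<provider> URL pattern is extrapolated from the single api.tesseraai.io/v1/openai path mentioned in the FAQ. How the tk_ key is attached is left out because the post does not specify it for every class.

```python
# ASSUMED keyword each chat model uses for its upstream base URL.
# Verify against the installed langchain_* package before relying on these.
BASE_URL_KEYWORD = {
    "openai": "base_url",
    "anthropic": "anthropic_api_url",
    "mistral": "endpoint",
}

# ASSUMED URL pattern; only the "openai" path appears in the post.
PROXY_ROOT = "https://api.tesseraai.io/v1"


def sketch_base_url_kwargs(provider):
    """Return {keyword: proxy_url} for the given provider name."""
    try:
        keyword = BASE_URL_KEYWORD[provider]
    except KeyError:
        raise ValueError(f"no helper sketched for provider {provider!r}") from None
    return {keyword: f"{PROXY_ROOT}/{provider}"}


# Usage shape (not executed here; requires the matching LangChain package):
#   ChatMistralAI(model="...", **sketch_base_url_kwargs("mistral"))
print(sketch_base_url_kwargs("openai"))
```

A caller never has to remember which class wants which keyword; the helper's return value is splatted straight into the constructor.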

FAQ

Q: Does this break my eval set?

No. Your eval runs against the proxied output unchanged. Tessera additionally fires its own daily quality canary against a 10% production sample for SLA enforcement — that is independent of your eval pipeline.
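As a rough picture of what a sampled canary involves, the sketch below picks a deterministic 10% of requests by hashing a request ID and compares the mean score of the sample with a floor. The hashing scheme, the function names, and the idea that each sampled response gets a score between 0 and 1 are assumptions; the post does not describe how Tessera samples or scores.

```python
import hashlib
from statistics import mean


def in_canary_sample(request_id, sample_rate=0.10):
    """Deterministically place roughly sample_rate of request IDs in the sample."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < sample_rate


def canary_passes(scores, floor=0.95):
    """True when there is at least one score and the mean meets the floor."""
    return bool(scores) and mean(scores) >= floor


sampled = [rid for rid in (f"req-{i}" for i in range(10_000)) if in_canary_sample(rid)]
print(len(sampled) / 10_000)  # close to 0.10
print(canary_passes([0.96, 0.97, 0.95]))
```

Hash-based sampling keeps the decision stable for a given request ID, which makes a canary run reproducible without storing the sample membership.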

Q: My agent uses tool calling. Does that still work?

Yes. The proxy passes tools through. Auto-route gates on tool-calling capability — if a cheaper fallback model does not support function calling, the request stays on the original model for that call.
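The gating rule can be sketched as a filter applied before the cheapest model is picked. Everything here is hypothetical: the model names, the relative costs, and the ModelInfo type exist only to show the shape of the decision, not Tessera's routing logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    relative_cost: float  # illustrative number, not a real price
    supports_tools: bool


def choose_model(requested, candidates, request_has_tools):
    """Pick the cheapest eligible candidate, or stay on the requested model."""
    eligible = [
        m for m in candidates
        if m.relative_cost < requested.relative_cost
        and (m.supports_tools or not request_has_tools)
    ]
    if not eligible:
        return requested  # nothing cheaper can serve this call
    return min(eligible, key=lambda m: m.relative_cost)


big = ModelInfo("big-model", 10.0, supports_tools=True)
small_no_tools = ModelInfo("small-no-tools", 1.0, supports_tools=False)

# A tool-calling request stays on the original model...
print(choose_model(big, [small_no_tools], request_has_tools=True).name)   # big-model
# ...while a plain request can be routed to the cheaper one.
print(choose_model(big, [small_no_tools], request_has_tools=False).name)  # small-no-tools
```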

Q: Streaming?

Streamed responses pass through. Cache hits still stream from the proxy edge (fast). Provider failures mid-stream surface a terminal error marker rather than silently retrying.
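The "terminal error marker instead of a silent retry" behavior can be sketched as a generator that relays chunks and always ends with exactly one terminal event. The event names ("chunk", "done", "error") are made up for this sketch and are not Tessera's wire format.

```python
def relay_stream(upstream_chunks):
    """Relay chunks as ("chunk", text), then emit one terminal event."""
    try:
        for chunk in upstream_chunks:
            yield ("chunk", chunk)
    except Exception as exc:
        # Surface the failure to the client; do not restart the stream.
        yield ("error", str(exc))
        return
    yield ("done", None)


def flaky_upstream():
    yield "Hel"
    yield "lo"
    raise ConnectionError("upstream closed")


print(list(relay_stream(flaky_upstream())))
# [('chunk', 'Hel'), ('chunk', 'lo'), ('error', 'upstream closed')]
```

Retrying mid-stream would risk replaying tokens the client has already rendered, which is why an explicit terminal marker is the safer contract for the caller to handle.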

Q: Can I use this with the LangChain `init_chat_model()` helper?

Should work — pass the Tessera config kwargs as the LangChain init kwargs: init_chat_model("gpt-4o", model_provider="openai", **tessera_openai_config(api_key="tk_...")). This routes through the same constructor path as a direct ChatOpenAI(...) call, but we have not exhaustively tested every model_provider value. File an issue at tessera-llm/tessera-langchain if a specific provider/init combination misbehaves and we will ship a patch.

Q: What if I am using a LangChain provider class not in the list?

The Tessera proxy accepts the OpenAI wire format on api.tesseraai.io/v1/openai — any LangChain provider class that accepts a base_url override and default_headers works (Together, DeepSeek, Fireworks, OpenRouter pass-through, etc.). File an issue if your provider class is missing a first-class helper — we ship them in patch releases.
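For a provider class without a first-class helper, the manual wiring would look roughly like the sketch below. The base URL path comes from the answer above (the https scheme is assumed), but the header name is a hypothetical placeholder: the post does not say which header carries the Tessera key, so check the Tessera docs for the real one.

```python
TESSERA_OPENAI_BASE_URL = "https://api.tesseraai.io/v1/openai"


def passthrough_kwargs(tessera_key, header_name="X-Tessera-Key"):
    """Build constructor kwargs for an OpenAI-wire-format chat model.

    header_name is a HYPOTHETICAL placeholder, not a documented header.
    """
    return {
        "base_url": TESSERA_OPENAI_BASE_URL,
        "default_headers": {header_name: tessera_key},
    }


# Usage shape (not executed here; requires langchain_openai):
#   llm = ChatOpenAI(
#       model="your-model-name",
#       api_key="your-upstream-provider-key",
#       **passthrough_kwargs("tk_..."),
#   )
print(passthrough_kwargs("tk_...")["base_url"])
```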

Q: How is this different from the main `tessera-sdk`?

Same proxy. Same mechanics. Same billing record. `tessera-sdk` patches the underlying OpenAI / Anthropic / etc. SDK constructors via a one-line tessera.activate(key). `tessera-langchain` wires into LangChain's ChatModel constructors directly. Use whichever fits your codebase. Side-by-side install is supported.


Try Tessera + LangChain on your workload

60M tokens free, no card, one line of config

pip install tessera-langchain (or npm install @tessera-llm/langchain). Wire your existing ChatModel through the Tessera proxy. Kill-switch any time. Pay 20% only on measured savings.
