
Tessera × Vercel AI SDK

@tessera-llm/vercel-ai is a thin adapter for the Vercel AI SDK. Spread tesseraOpenAIConfig() (or the per-provider sibling) into the createOpenAI / createAnthropic / createMistral options bag and every generateText / streamText / generateObject / streamObject call routes through the Tessera substrate proxy.

The same Tessera mechanic stack (auto-route, exact and semantic cache, provider prompt cache, compression, output-length ceiling, batch arbitrage) fires server-side on every request. The adapter ships as an ESM/CJS dual package with no runtime dependencies of its own, and it is compatible with the Next.js App Router, Server Actions, the Edge runtime, and the Node runtime.

Install

npm install @tessera-llm/vercel-ai
# Plus whichever provider package you use:
npm install @ai-sdk/openai          # or @ai-sdk/anthropic / @ai-sdk/mistral / @ai-sdk/groq / @ai-sdk/cohere

Get a free Tessera API key at tesseraai.io/dev — 60M tokens/month, no card up front.

Quickstart

import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { tesseraOpenAIConfig } from "@tessera-llm/vercel-ai";

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
  ...tesseraOpenAIConfig({ apiKey: process.env.TESSERA_API_KEY! }),
});

const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "Summarize this customer support ticket in 2 sentences.",
});
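
The spread order in the quickstart matters: the provider key is set first, and the fields from the Tessera helper are layered on top. Below is a minimal sketch of how that merge behaves, assuming the helper returns only a `baseURL` and extra `headers` (both are standard `createOpenAI` options). The function `sketchTesseraConfig`, the `/v1` path, and the `x-tessera-api-key` header name are hypothetical stand-ins, not the adapter's actual output.

```typescript
// Hypothetical stand-in for tesseraOpenAIConfig(); the real helper's
// return value may differ. The path and header name are assumptions.
interface ProviderOptions {
  apiKey?: string;
  baseURL?: string;
  headers?: Record<string, string>;
}

function sketchTesseraConfig(opts: { apiKey: string }): ProviderOptions {
  return {
    baseURL: "https://api.tesseraai.io/v1", // assumed path
    headers: { "x-tessera-api-key": opts.apiKey }, // assumed header name
  };
}

// Same shape as the quickstart: provider key first, then the spread.
const mergedOptions: ProviderOptions = {
  apiKey: "sk-openai-example",
  ...sketchTesseraConfig({ apiKey: "tsr-example" }),
};

console.log(mergedOptions.apiKey); // the provider key survives the spread
console.log(mergedOptions.baseURL); // requests now target the proxy host
```

If a helper in this position also returned an `apiKey` field, the spread would overwrite the provider key, which is why the sketch keeps the two credentials in separate fields.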

Scenario 1 — Streaming chat API route

The most common pattern: a Next.js App Router /api/chat route that streams responses back to a useChat() hook. Tessera preserves SSE delta semantics — no buffering, no chunk reshape — so user-perceived latency on cache-miss paths is unchanged.

// app/api/chat/route.ts
import { streamText } from "ai";
import { tesseraOpenAI } from "@tessera-llm/vercel-ai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const openai = await tesseraOpenAI({
    openaiApiKey: process.env.OPENAI_API_KEY!,
    tesseraApiKey: process.env.TESSERA_API_KEY!,
  });

  const result = streamText({ model: openai("gpt-4o"), messages });
  return result.toDataStreamResponse();
}
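
To make "SSE delta semantics" concrete: OpenAI-style streams send each text delta as its own `data:` event followed by a blank line, and end with `data: [DONE]`. Preserving those semantics means the client sees the same event boundaries the provider emitted. The sketch below frames a list of deltas that way and reads them back. It illustrates the wire framing only, with a simplified chunk payload; it is not Tessera's proxy code, and `frameDeltas` and `readDeltas` are hypothetical names.

```typescript
// Illustration of OpenAI-style SSE framing only; not Tessera's proxy code.
// frameDeltas and readDeltas are hypothetical helper names.

// Frame each text delta as one SSE event, then append the end marker.
function frameDeltas(deltas: string[]): string[] {
  const framed = deltas.map(
    (content) => `data: ${JSON.stringify({ choices: [{ delta: { content } }] })}\n\n`,
  );
  return [...framed, "data: [DONE]\n\n"];
}

// Read the deltas back the way a streaming client would, one event at a time.
function readDeltas(sseEvents: string[]): string[] {
  const out: string[] = [];
  for (const sseEvent of sseEvents) {
    const payload = sseEvent.replace(/^data: /, "").trim();
    if (payload === "[DONE]") break; // end-of-stream marker
    out.push(JSON.parse(payload).choices[0].delta.content);
  }
  return out;
}

const sampleDeltas = ["Hel", "lo", " world"];
const framedEvents = frameDeltas(sampleDeltas);
console.log(framedEvents.length); // one event per delta, plus the end marker
console.log(readDeltas(framedEvents).join(""));
```

A proxy that buffered or re-chunked the stream would change the number or boundaries of events; in this sketch the deltas read back are exactly the deltas that went in.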

Scenario 2 — Structured output with zod schema

Use generateObject with a zod schema for classification, extraction, or routing. Auto-route can swap gpt-4o for gpt-4o-mini on schema-constrained tasks where the canary confirms equivalent output, which is often the single biggest saving on high-volume classification workloads.

import { generateObject } from "ai";
import { tesseraOpenAI } from "@tessera-llm/vercel-ai";
import { z } from "zod";

const openai = await tesseraOpenAI({
  openaiApiKey: process.env.OPENAI_API_KEY!,
  tesseraApiKey: process.env.TESSERA_API_KEY!,
});

const { object } = await generateObject({
  model: openai("gpt-4o-mini"),
  schema: z.object({
    intent: z.enum(["bug", "feature", "billing", "other"]),
    severity: z.enum(["low", "medium", "high", "critical"]),
    summary: z.string().max(140),
  }),
  // `ticket` is a placeholder for your own support-ticket record, loaded elsewhere.
  prompt: `Classify this support ticket: ${ticket.body}`,
});

Scenario 3 — Edge runtime + Anthropic tool use

Tessera runs as a Cloudflare Worker at api.tesseraai.io, so running your Next.js handler on the Edge runtime keeps the entire request path on edge infrastructure. Auto-route gates on tool-calling capability, so an agent that uses tools is never routed to a model that cannot call them.

// app/api/agent/route.ts — Next.js App Router on the Edge runtime
export const runtime = "edge";

import { streamText } from "ai";
import { createAnthropic } from "@ai-sdk/anthropic";
import { tesseraAnthropicConfig } from "@tessera-llm/vercel-ai";

const anthropic = createAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  ...tesseraAnthropicConfig({ apiKey: process.env.TESSERA_API_KEY! }),
});

export async function POST(req: Request) {
  const { prompt } = await req.json();
  const result = streamText({
    model: anthropic("claude-sonnet-4-6"),
    prompt,
    // `yourTools` is a placeholder for your own tool definitions, defined elsewhere.
    tools: yourTools,
  });
  return result.toDataStreamResponse();
}

Worked savings example

Scenario: a customer-support workflow on gpt-4o, running on the Next.js Edge runtime at 5B tokens/month, priced at OpenAI list prices.

| Stage | Cost / mo | Saved |
| --- | --- | --- |
| Baseline: OpenAI direct via Vercel AI SDK | $24,000 | |
| + Tessera (route, cache, prompt-cache, compress, M9 ceiling, batch) | $9,400 | $14,600 |
| Tessera fee (20% × measured savings) | $2,920 | |
| You net pay | $12,320 | $11,680 / mo saved |

The quality canary's mean score was 0.96 across the full mechanic stack, above the 0.95 floor, and the 0.95 SLA held on all 30 days.
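
The figures in the savings example follow from three subtractions and one multiplication. The sketch below reproduces them; the dollar amounts and the 20% fee rate come from the example, while the function name `netSavings` is hypothetical.

```typescript
// Reproduces the arithmetic of the worked savings example.
// The function name is hypothetical; the inputs come from the example.
function netSavings(baseline: number, withTessera: number, feeRate: number) {
  const measuredSavings = baseline - withTessera; // savings before the fee
  const fee = Math.round(feeRate * measuredSavings); // fee is a share of savings
  const netPay = withTessera + fee; // what you actually pay per month
  return { measuredSavings, fee, netPay, netSaved: baseline - netPay };
}

const monthly = netSavings(24000, 9400, 0.2);
console.log(monthly);
// { measuredSavings: 14600, fee: 2920, netPay: 12320, netSaved: 11680 }
```

Because the fee is charged on measured savings rather than on spend, the net saving here is roughly 49% of the baseline bill.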

Architecture and quality contract

Open-source adapter (Apache-2.0) ↔ closed-source proxy. The wire format is open so you can audit what we send; the mechanic implementations live in the Tessera Cloudflare Worker proxy at api.tesseraai.io.

The proxy enforces three guarantees: a composition cap (at most 2 content-mutators per request); a per-stack 0.95 quality floor with auto-rollback and a 10% credit on breach; and audit-immutable measurement, with two pricing snapshots per request (the model you asked for and the model that actually ran). The verified-savings ledger lives at ledger.tesseraai.io.
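
As a rough illustration of the composition cap and the quality floor, the sketch below shows one way such checks could be expressed. The proxy is closed-source, so every name here is hypothetical. Which mechanics count as content-mutators, how canary scores are computed, and what the 10% credit is applied to are not stated above, so the sketch takes them as inputs or leaves them out.

```typescript
// Sketch under stated assumptions; not the proxy's actual implementation.
// All type and function names are hypothetical.

interface Mechanic {
  name: string;
  mutatesContent: boolean; // supplied by the caller; the real classification is unknown
}

// Composition cap: keep every non-mutating mechanic, but at most
// `maxMutators` content-mutators, in the order given.
function applyCompositionCap(stack: Mechanic[], maxMutators = 2): Mechanic[] {
  const kept: Mechanic[] = [];
  let mutators = 0;
  for (const mechanic of stack) {
    if (mechanic.mutatesContent) {
      if (mutators >= maxMutators) continue; // cap reached, skip this mutator
      mutators += 1;
    }
    kept.push(mechanic);
  }
  return kept;
}

// Quality floor: a breach is a mean canary score below the floor.
// On breach the sketch signals a rollback and the 10% credit rate.
function checkQualityFloor(scores: number[], floor = 0.95) {
  if (scores.length === 0) throw new Error("no canary scores to evaluate");
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const breached = mean < floor;
  return { mean, breached, rollback: breached, creditRate: breached ? 0.1 : 0 };
}

const cappedStack = applyCompositionCap([
  { name: "non-mutator", mutatesContent: false },
  { name: "mutator-a", mutatesContent: true },
  { name: "mutator-b", mutatesContent: true },
  { name: "mutator-c", mutatesContent: true },
]);
console.log(cappedStack.map((m) => m.name)); // mutator-c is dropped by the cap

console.log(checkQualityFloor([0.97, 0.96, 0.95]).breached); // mean is above the floor
console.log(checkQualityFloor([0.9, 0.95, 0.94]).rollback); // mean is below the floor
```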

FAQ

Does this work with the Next.js App Router and Server Actions?
Yes. The adapter is a thin ESM/CJS dual package with no runtime dependencies of its own, so it has the same compatibility surface as the Vercel AI SDK itself. App Router routes, Server Actions, the Edge runtime, and the Node runtime are all supported.
Does streaming still work the same way?
Yes. Tessera passes SSE chunks through unchanged. streamText / streamObject / toDataStreamResponse() / useChat() all behave identically. Provider prompt cache and exact cache fire on the request shape and stream the cached response back with the same SSE framing.
What if I already use multiple providers in one app?
The same Tessera key covers them all. Wire up each @ai-sdk/* package with the matching tesseraXConfig() helper: tesseraOpenAIConfig, tesseraAnthropicConfig, tesseraMistralConfig, tesseraGroqConfig, or tesseraCohereConfig. Mechanic decisions and the quality canary run per workload, so each provider stack gets its own quality SLA.

Where to go next