Skip to main content

Tessera × AutoGen 0.4+

tessera-autogen wires Tessera's substrate proxy into AutoGen 0.4+ multi-agent teams. Factory functions return a pre-wired AutoGen ChatCompletionClient ready to pass into AssistantAgent, SelectorGroupChat, Swarm, or any other AutoGen team primitive.

v0.1 ships OpenAI + Anthropic — covers ~85% of customer LLM traffic per our outreach research. Mistral / Gemini queued for v0.2. AutoGen reflection loops, tool-use exchanges, and group-chat selector decisions all re-issue prompts often enough that exact + semantic cache contribute the bulk of savings on this framework specifically.

Install

pip install tessera-autogen autogen-core autogen-agentchat autogen-ext

Get a free Tessera API key at tesseraai.io/dev — 60M tokens/month, no card up front.

Quickstart — OpenAI

from autogen_agentchat.agents import AssistantAgent
from tessera_autogen import tessera_openai_client

client = tessera_openai_client(
    model="gpt-4o",
    openai_api_key="sk-...",   # your OpenAI key
    tessera_api_key="tk_...",  # get a free one at tesseraai.io/dev
)

agent = AssistantAgent(name="researcher", model_client=client)

# Rest of your AutoGen code runs unchanged — single-agent calls,
# SelectorGroupChat, Swarm, tool use all route through Tessera.

Scenario 1 — SelectorGroupChat with tiered model assignment

A two-agent team where the selector / triage agent runs on a cheap model and the reasoner runs on a capable one. One Tessera key powers both clients; per-workload auto-route + quality canary fire independently for each agent. The selector workload, which routinely outputs short routing decisions, often qualifies for further auto-route swaps that the reasoner cannot tolerate.

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.conditions import MaxMessageTermination
from tessera_autogen import tessera_openai_client, tessera_anthropic_client

triage_client = tessera_openai_client(
    model="gpt-4o-mini", openai_api_key="sk-...", tessera_api_key="tk_...",
)
reasoner_client = tessera_anthropic_client(
    model="claude-sonnet-4-6", anthropic_api_key="sk-ant-...", tessera_api_key="tk_...",
)

triage = AssistantAgent(name="triage", model_client=triage_client,
    system_message="Decide ticket priority and routing.")
reasoner = AssistantAgent(name="reasoner", model_client=reasoner_client,
    system_message="Investigate root cause and draft a fix proposal.")

team = SelectorGroupChat(
    [triage, reasoner],
    model_client=triage_client,  # cheap selector
    termination_condition=MaxMessageTermination(8),
)
await team.run(task="Investigate ticket #7733 and propose a fix.")

Scenario 2 — Swarm with handoffs

AutoGen Swarm agents hand off to each other based on a shared context. The same handoff topology runs through one Tessera ChatCompletionClient — cache hits on identical handoff messages return at zero upstream cost.

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import Swarm
from tessera_autogen import tessera_openai_client

shared_client = tessera_openai_client(
    model="gpt-4o", openai_api_key="sk-...", tessera_api_key="tk_...",
)

researcher = AssistantAgent(
    name="researcher", model_client=shared_client,
    handoffs=["writer"], system_message="Research and hand off to writer.",
)
writer = AssistantAgent(
    name="writer", model_client=shared_client,
    handoffs=["researcher"], system_message="Draft, hand back to researcher if more facts needed.",
)

swarm = Swarm([researcher, writer])
await swarm.run(task="Write a 300-word brief on substrate-layer LLM cost optimization.")

Scenario 3 — Explicit ChatCompletionClient with custom kwargs

When the factory function hides a kwarg you need (custom timeout, response_format, parallel tool calls), build the client yourself and spread tessera_openai_config() into the kwargs.

# Fine-grained client kwargs when the factory functions don't expose what
# you need — e.g. response_format, custom timeout, parallel_tool_calls.

from autogen_ext.models.openai import OpenAIChatCompletionClient
from tessera_autogen import tessera_openai_config

client = OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key="sk-...",
    **tessera_openai_config(api_key="tk_..."),
)

Worked savings example

Three-agent research team on AutoGen (planner → researcher → writer), 5B tokens/month, mix of gpt-4o-mini and claude-sonnet-4-6.

StageCost / moSaved
Baseline — direct providers via AutoGen$24,000
+ Tessera (route, cache, prompt-cache, compress, M9 ceiling, batch)$9,400$14,600
Tessera fee (20% × measured savings)$2,920
You net pay$12,320$11,680 / mo saved

AutoGen workloads see particularly high M2 + M5 cache hit rates because SelectorGroupChat re-issues the agent-selection prompt on every turn.

Architecture and quality contract

autogen-core + autogen-ext are peer dependencies — install them alongside this package. The factory builds an OpenAIChatCompletionClient (or Anthropic equivalent) with base_url + default_headers pointed at the Tessera proxy.

Composition cap, per-stack 0.95 quality floor with auto-rollback, audit-immutable measurement — all enforced on the proxy. Verified-savings ledger at ledger.tesseraai.io.

FAQ

Does this work with AutoGen 0.2 (the old asyncio-based API)?
No. v0.1 targets AutoGen 0.4+ (the autogen-core / autogen-agentchat / autogen-ext architecture). The 0.2 API surface is different enough that it would need a separate adapter. Open an issue if you need 0.2 compatibility.
Does the per-stack canary fire correctly on group chats?
Yes — but the canary keys on workload, not on conversation. Each AutoGen agent is one workload from Tessera's perspective; multi-agent conversations create multiple per-workload eval streams that get aggregated to a per-stack score. Auto-rollback fires per-workload, not per-conversation.
Can I use this with Microsoft's Magnetic-One example?
Yes — Magnetic-One uses the same ChatCompletionClient abstraction. Wire each constituent agent (the Orchestrator, WebSurfer, FileSurfer, Coder, ComputerTerminal) through its matching Tessera factory. Tool-using agents benefit from auto-route's tool-capability gate.

Where to go next