Status: Draft v1.1 Last updated: 2026-05-08 Owner: your name
v1.1 changes:
AdapterCapabilitiesextended withsupports_tools,supports_system_prompt,supports_structured_output(§7.2).block_droppeddowngraded from “trace event” to log-line only (§7.3), consistent with bus diagnostics.provider_overridesfield removed fromToolDefinition(unused across all specs).RoutingDecisionRecord.modemapping to routing-engine’s chain enum documented (§4.3).
Throughout: paths shown use
~/.yourtool/as a placeholder for the final config directory.
This document specifies the canonical internal representation of messages, content blocks, tool calls, and related metadata used throughout the system. All provider adapters (Anthropic, OpenAI, Ollama, OpenRouter, etc.) translate between this canonical format and their respective wire formats. All persistence layers store messages in this format. All routing, tooling, and analytics code consumes this format.
The canonical format is the single most load-bearing data contract in the system. Changes to it cascade everywhere; getting it wrong forces rewrites; getting it right makes mid-session model swapping, cross-provider replay, schema migration, and future provider additions tractable.
logit_bias) live in adapter-level options, not the canonical format.Both Anthropic and OpenAI converged on representing messages as ordered lists of typed content blocks rather than as strings. The shape is:
Message := role + ordered list of ContentBlock
ContentBlock := tagged union over { text, tool_use, tool_result, image, thinking, ... }
The canonical format adopts this shape directly. Every message — including system messages and tool result messages — is a role + content list.
Anthropic places tool results inside a user-role message; OpenAI uses a dedicated tool role. Internally we treat tool results as their own first-class message type with role: TOOL. Adapters figure out which on-the-wire shape to emit.
This decision avoids leaky abstractions: the rest of the system (routing, tracing, evaluation) can ask “what role is this message?” without knowing about per-provider conventions.
Anthropic carries system prompts as a top-level request parameter; OpenAI uses a system-role message. Internally, system prompts are role: SYSTEM messages stored in the same message list. This lets context-assembly code compose system prompts (concatenate base instructions + memory + skills + workspace info) without coupling to one provider’s API shape. Adapters handle hoisting.
class Message:
id: str # ULID, monotonic
session_id: str # FK to Session
role: Role
content: list[ContentBlock] # ordered, non-empty for non-system messages
metadata: MessageMetadata
created_at: datetime # microsecond precision UTC
schema_version: int # current: 1
class Role(StrEnum):
USER = "user"
ASSISTANT = "assistant"
SYSTEM = "system"
TOOL = "tool"
type)class TextBlock:
type: Literal["text"] = "text"
text: str
class ToolUseBlock:
type: Literal["tool_use"] = "tool_use"
id: str # canonical id we generate; see §6
name: str # canonical tool name (snake_case, no provider prefix)
input: dict # JSON-Schema-validated against tool definition
class ToolResultBlock:
type: Literal["tool_result"] = "tool_result"
tool_use_id: str # FK to a ToolUseBlock.id
content: list[ContentBlock] # usually [TextBlock]; may include ImageBlock
is_error: bool = False
class ImageBlock:
type: Literal["image"] = "image"
source: ImageSource
media_type: str # IANA media type, e.g. "image/png"
class ImageSource:
kind: Literal["base64", "url", "file_ref"]
data: str # base64 string, URL, or workspace-relative path
class ThinkingBlock:
type: Literal["thinking"] = "thinking"
text: str
signature: str | None = None # opaque provider token (Anthropic uses this for verifiability)
class RedactedThinkingBlock:
type: Literal["redacted_thinking"] = "redacted_thinking"
data: str # opaque provider-encoded blob
text and tool_use/tool_result: the conversational core. Universal.image: vision support. Both major providers support it; canonical form must too.thinking and redacted_thinking: extended-reasoning models (Anthropic specifically). If we don’t preserve these, mid-session swap from a thinking model loses signal needed for replay.New block types are additive only. They get a new type discriminator. Existing block types are never overloaded. If a future provider introduces audio or video as first-class content, those become audio and video blocks.
When an adapter encounters a canonical block type its provider can’t represent, it MUST write a structured log entry at WARN level (not a bus event — consistent with how bus.overflow and bus.handler_error are handled per event-bus-and-trace-catalog.md §3.5 and §5.2) with: session_id, message_id, block type, adapter name, reason. The adapter then either drops the block (default) or fails the request (if marked critical via metadata). The block dropping is a lossy projection by design — see §7.
class MessageMetadata:
# Provenance — who/what produced this message
model: str | None = None # canonical model id, e.g. "anthropic:claude-sonnet-4-6"
provider: str | None = None # provider key, e.g. "anthropic"
# Routing decision context (for ASSISTANT messages)
routing: RoutingDecisionRecord | None = None
# Resource accounting
usage: Usage | None = None
# Tool linkage (for TOOL messages)
parent_tool_use_id: str | None = None
# Status
status: MessageStatus = MessageStatus.COMPLETE
# Provider-specific opaque payload (round-trip aid; see §6.4)
provider_raw: dict | None = None
# Multi-user identity dimensions (multi-user.md §3, §4.4). Stable
# principal ids resolved from the gateway key at request entry; both
# `None` for agent-loop traffic and for pre-multi-user gateway keys.
# Pseudonymous — no plaintext PII; emails live in `users.json` only.
user_id: str | None = None
team_id: str | None = None
class MessageStatus(StrEnum):
COMPLETE = "complete"
PARTIAL = "partial" # streaming in progress
CANCELLED = "cancelled" # user cancelled mid-generation
ERROR = "error" # generation failed
class RoutingDecisionRecord:
"""Compact summary of a routing decision attached to an assistant
message's metadata. The full chain trace lives in the corresponding
`route.decided` event (see event-bus-and-trace-catalog.md §6.5).
The `mode` enum here is a coarse summary; it projects the routing
engine's chain enum (per routing-engine.md §4.1) down to the
user-facing classes most useful for analytics and `/model show`."""
mode: RoutingMode # see RoutingMode below
chosen_model: str # echoes metadata.model
reason: str # human-readable
rule_name: str | None = None # for RULE mode
confidence: float | None = None # for PATTERN mode (0..1)
alternatives_considered: list[str] = []
class RoutingMode(StrEnum):
"""Coarse summary of why this turn picked this model. The chain enum
in routing-engine.md §4.1 is finer-grained; this projection is what
persists on each assistant message. Mapping:
Chain policy → RoutingMode
─────────────────────────────────────
PER_MESSAGE_OVERRIDE → OVERRIDE
MANUAL_STICKY → MANUAL
RULE → RULE
PATTERN → PATTERN
DELEGATE_REQUEST → DELEGATE
WORKSPACE_DEFAULT → DEFAULT
GLOBAL_DEFAULT → DEFAULT
WORKSPACE_DEFAULT and GLOBAL_DEFAULT collapse to DEFAULT because the
user-facing 'why this model' rarely cares which level the default
came from. Inspect the route.decided event for that detail.
"""
OVERRIDE = "override" # per-message @model
MANUAL = "manual" # session sticky
RULE = "rule" # configured rule matched
PATTERN = "pattern" # pattern store recommendation
DELEGATE = "delegate" # planner delegation
DEFAULT = "default" # workspace or global default
class Usage:
input_tokens: int
output_tokens: int
cached_input_tokens: int = 0
cache_creation_input_tokens: int = 0
cost_usd: Decimal # computed from local price table; see §6.3
pricing_version: str # FK to price table entry
latency_ms: int
Tool definitions are not messages but are referenced by ToolUseBlock.name. Defined here for completeness:
class ToolDefinition:
name: str # canonical, snake_case, globally unique
description: str # used in system prompt when tool is exposed
input_schema: dict # JSON Schema (subset; see §5.4)
side_effects: SideEffects
requires_workspace: bool = True
class SideEffects(StrEnum):
NONE = "none" # pure read or computation
READ = "read" # filesystem or network read
WRITE = "write" # filesystem write
EXECUTE = "execute" # shell or arbitrary code execution
NETWORK = "network" # outbound network mutation (POST, etc.)
side_effects drives confirmation prompts, routing constraints, and trace classification. Every tool MUST declare it honestly.
These rules must hold at all times. Adapter implementations and core code violating them are bugs.
status: PARTIAL messages may violate other invariants. A streaming message in progress may have empty content, malformed tool input, etc. Validation runs only at status: COMPLETE.tool_input_invalid events.model and provider are set on every ASSISTANT message at status: COMPLETE. Required for replay correctness and analytics.routing is set on every ASSISTANT message at status: COMPLETE. Always traceable.usage is set on every ASSISTANT message at status: COMPLETE. Cost accounting depends on this.parent_tool_use_id is set on every TOOL message. Always non-null.provider_raw is opaque. Core code never reads it; only the adapter that wrote it reads it on round-trip.The input_schema of every ToolDefinition is a subset of JSON Schema:
string, number, integer, boolean, null, object, array), enum, required, properties, items, description, basic format annotations.$ref, oneOf, anyOf, allOf, not, if/then/else, patternProperties, additionalProperties: <schema> (boolean is OK).Rationale: this is the intersection of what Anthropic and OpenAI accept reliably. Tools using disallowed constructs fail at registration time with a clear error.
ULIDs. Monotonic per session. Generated by the core, never by adapters.
System-generated, format: tu_<ulid>. The core generates these when an adapter parses a streaming tool_use_start event from any provider. Adapters maintain a per-session bidirectional map between canonical tool ids and provider-issued tool ids:
canonical_id ↔ provider_id
"tu_01HZ..." ↔ "toolu_01ABC..." (Anthropic)
"tu_01HZ..." ↔ "call_xyz789" (OpenAI)
When serializing canonical → wire, the adapter looks up or generates a provider-side id. When parsing wire → canonical, it looks up the canonical id from the provider-side id.
This design lets canonical content survive provider switches: the model sees its own tool_use ids reflected back in tool_result content even after a session swaps providers.
Canonical form: <provider>:<model_name>.
anthropic:claude-sonnet-4-6
anthropic:claude-opus-4-7
anthropic:claude-haiku-4-5
openai:gpt-5
openai:gpt-5-mini
ollama:llama-3.3-70b
openrouter:deepseek/deepseek-v3
Model ids appear in:
MessageMetadata.modelsonnet → anthropic:claude-sonnet-4-6)Costs are computed by the core from a maintained price table, never parroted from the provider. Reasons:
pricing_version.Pricing table shape (illustrative):
pricing_version: "2026-05-08"
models:
anthropic:claude-sonnet-4-6:
input_per_mtok_usd: 3.00
output_per_mtok_usd: 15.00
cached_read_per_mtok_usd: 0.30
cache_write_per_mtok_usd: 3.75
openai:gpt-5:
input_per_mtok_usd: 2.50
output_per_mtok_usd: 10.00
ollama:llama-3.3-70b:
input_per_mtok_usd: 0.0
output_per_mtok_usd: 0.0
MessageMetadata.usage.pricing_version records which version was active when the cost was computed. Retroactive reprice is possible by walking the trace store.
provider_raw fieldRound-trip aid. When an adapter parses a provider response into canonical form, it MAY stash adapter-specific data in provider_raw — for example, Anthropic’s exact signature on a thinking block, or stop-reason strings. When that adapter later serializes the message back to the same provider, it consults provider_raw if present.
Rules:
provider_raw reads it.provider_raw entirely.provider_raw for any reason.provider_raw is not part of equality comparisons or hashing.This is the escape hatch that handles provider-specific quirks without polluting the canonical schema.
Every adapter implements:
class Adapter:
capabilities: AdapterCapabilities
def to_wire(self, messages: list[Message], tools: list[ToolDefinition], options: AdapterOptions) -> WireRequest:
"""Serialize canonical messages to provider wire format."""
def from_wire_response(self, response: WireResponse, request_context: RequestContext) -> Message:
"""Parse a non-streaming response into a canonical Message."""
async def stream_response(self, response_stream: AsyncIterator[WireChunk], context: RequestContext) -> AsyncIterator[StreamEvent]:
"""Translate provider stream chunks into canonical StreamEvents."""
def estimate_input_tokens(self, messages: list[Message], tools: list[ToolDefinition]) -> int:
"""Pre-flight estimate for routing and budget decisions."""
class AdapterCapabilities:
# Content type support
supports_thinking: bool
supports_images: bool
supports_tools: bool # added v1.1; required by routing-engine §4.4
supports_system_prompt: bool # added v1.1; required by routing-engine §4.4
supports_structured_output: bool # added v1.1; required by routing-engine §4.4
# Streaming
supports_streaming: bool
supports_streaming_tool_calls: bool
supports_parallel_tool_calls: bool
# Caching
supports_prompt_caching: bool # added v1.1; matches provider-adapter §3.4
# System prompt placement (provider-specific quirk)
supports_system_messages_in_list: bool # vs hoisted
# Limits
max_context_tokens: int
max_output_tokens: int
# Image format support (only meaningful if supports_images)
accepted_image_media_types: list[str]
Capability declarations are consulted by:
routing-engine.md §4.4).Declarations MUST be honest. If a model technically supports a feature but the adapter implementation doesn’t expose it, declare false. The capability surface is the substitutability boundary; lying about it breaks mid-session swaps.
The three fields added in v1.1 (supports_tools, supports_system_prompt, supports_structured_output) are required for honest validation of turns that genuinely need those capabilities — without them, a turn with tools could be routed to a tool-incapable model and fail at the adapter rather than cleanly fall through during routing.
When canonical content cannot be represented in a provider’s wire format, the adapter MUST:
event-bus-and-trace-catalog.md §3.5.)Examples:
ThinkingBlock sent to OpenAI: dropped, logged.ImageBlock sent to a text-only model: dropped, logged. (Routing should prevent this from happening; this is the safety net.)RedactedThinkingBlock cross-provider: dropped, logged.Adapters translate provider-specific stream chunks into canonical stream events:
class StreamEvent: # tagged union, type discriminator
type: Literal["text_delta", "tool_use_start", "tool_use_input_delta",
"tool_use_end", "thinking_delta", "message_complete",
"usage_update", "error"]
# type-specific payload fields
Specifics live in the streaming protocol spec (companion document). The canonical event shape is stable across providers.
User asks a question, assistant answers. Two messages.
[
Message(
id="01HZ001",
session_id="sess_42",
role=USER,
content=[TextBlock(text="What's a ULID?")],
metadata=MessageMetadata(),
created_at=...,
schema_version=1,
),
Message(
id="01HZ002",
session_id="sess_42",
role=ASSISTANT,
content=[TextBlock(text="A ULID is a 128-bit identifier...")],
metadata=MessageMetadata(
model="anthropic:claude-sonnet-4-6",
provider="anthropic",
routing=RoutingDecisionRecord(
mode=DEFAULT,
chosen_model="anthropic:claude-sonnet-4-6",
reason="workspace default",
),
usage=Usage(
input_tokens=8,
output_tokens=42,
cost_usd=Decimal("0.000654"),
pricing_version="2026-05-08",
latency_ms=820,
),
),
...
),
]
User asks the assistant to read a file. Assistant calls a tool, tool returns content, assistant responds.
[
# Turn 1: user message
Message(
role=USER,
content=[TextBlock(text="Summarize README.md")],
...
),
# Turn 1: assistant uses tool
Message(
role=ASSISTANT,
content=[
TextBlock(text="I'll read the file."),
ToolUseBlock(
id="tu_01HZ100",
name="read_file",
input={"path": "README.md"},
),
],
metadata=MessageMetadata(
model="anthropic:claude-sonnet-4-6",
provider="anthropic",
routing=...,
usage=...,
),
),
# Turn 1: tool result
Message(
role=TOOL,
content=[
ToolResultBlock(
tool_use_id="tu_01HZ100",
content=[TextBlock(text="# Project Foo\n\nA tool for...")],
),
],
metadata=MessageMetadata(
parent_tool_use_id="tu_01HZ100",
),
),
# Turn 1: assistant final response (same turn, second LLM call)
Message(
role=ASSISTANT,
content=[TextBlock(text="The README describes Project Foo, a tool for...")],
metadata=MessageMetadata(
model="anthropic:claude-sonnet-4-6",
provider="anthropic",
...
),
),
]
After turn 2, user runs /model openai:gpt-5. Turn 3 uses the new model.
[
# Turns 1-2 use anthropic (as above) ...
# Turn 3: user message
Message(role=USER, content=[TextBlock(text="What about the LICENSE file?")], ...),
# Turn 3: assistant — different provider, same canonical form
Message(
role=ASSISTANT,
content=[
ToolUseBlock(
id="tu_01HZ200", # new canonical id
name="read_file",
input={"path": "LICENSE"},
),
],
metadata=MessageMetadata(
model="openai:gpt-5",
provider="openai",
routing=RoutingDecisionRecord(
mode=MANUAL,
chosen_model="openai:gpt-5",
reason="user swap via /model command",
),
...
),
),
]
The OpenAI adapter, when serializing this list to GPT-5’s wire format:
role: tool message with tool_call_id mapped from the canonical id via the per-session id map.tool_calls array with function.arguments as JSON-stringified input.User has images in turn 3 history. User attempts /model anthropic:claude-haiku-4-5-text-only (hypothetical).
Adapter capability check fails: supports_images: false, but history contains ImageBlocks.
Routing engine refuses the swap. Emits routing_constraint_failure event. TUI shows: Cannot swap to Haiku-text-only: session contains images that this model can't process. Active model unchanged.
CREATE TABLE sessions (
id TEXT PRIMARY KEY,
workspace_path TEXT NOT NULL,
active_model TEXT,
routing_policy_json TEXT,
schema_version INTEGER NOT NULL,
created_at INTEGER NOT NULL,
updated_at INTEGER NOT NULL
);
CREATE TABLE messages (
id TEXT PRIMARY KEY,
session_id TEXT NOT NULL,
role TEXT NOT NULL,
content_json TEXT NOT NULL,
metadata_json TEXT NOT NULL,
created_at INTEGER NOT NULL,
schema_version INTEGER NOT NULL,
FOREIGN KEY (session_id) REFERENCES sessions(id)
);
CREATE INDEX idx_messages_session_created ON messages(session_id, created_at);
CREATE TABLE tool_calls (
id TEXT PRIMARY KEY, -- canonical tool_use id
session_id TEXT NOT NULL,
message_id TEXT NOT NULL, -- the ASSISTANT message containing the tool_use
result_message_id TEXT, -- the TOOL message answering it (nullable)
name TEXT NOT NULL,
status TEXT NOT NULL, -- pending | succeeded | failed | cancelled
provider_id TEXT, -- the provider's id at the time of generation
provider TEXT,
created_at INTEGER NOT NULL,
completed_at INTEGER,
FOREIGN KEY (session_id) REFERENCES sessions(id),
FOREIGN KEY (message_id) REFERENCES messages(id),
FOREIGN KEY (result_message_id) REFERENCES messages(id)
);
CREATE INDEX idx_tool_calls_session_status ON tool_calls(session_id, status);
tool_calls denormalized. “Find unanswered tool calls” and “count failed tools per session” are common queries that would require JSON extraction otherwise.messages.content_json is reserved for a future schema bump if needed.Message.schema_version records which version of this spec the row conforms to. Current: 1.
Code reading messages MUST tolerate unknown fields in metadata and unknown content block types (skip with warning rather than crash). This protects against schema drift during partial deployments.
HTTP cassettes (via pytest-recording or vcr.py) are committed to the repo. Re-record when:
Cassettes are reviewed in PRs the same as code.
The following are deliberately deferred and tracked here:
provider_raw size and retention. Currently unbounded. May need a size cap or TTL if it grows pathologically.| Date | Decision | Rationale |
|---|---|---|
| 2026-05-08 | TOOL as first-class role | Avoids leaky abstractions; routing/tracing don’t need to know per-provider role conventions |
| 2026-05-08 | SYSTEM as first-class role | Lets context assembly compose system prompts without coupling to provider API shape |
| 2026-05-08 | Generated tool ids, not provider ids | Tool calls survive cross-provider replay; canonical id stable across the session lifetime |
| 2026-05-08 | provider_raw opaque escape hatch |
Handles per-provider quirks without polluting canonical schema or making it lossy on round-trip |
| 2026-05-08 | JSON Schema subset for tool input | Intersection of what Anthropic and OpenAI reliably accept; fail loud rather than ship-then-discover |
| 2026-05-08 | Cost computed locally, not parroted | Pricing changes; some providers don’t return cost; need historical accuracy via pricing_version |
| 2026-05-08 | Content as JSON in single column | Query patterns are session-ordered reads; normalization adds complexity without benefit at our scale |
| 2026-05-08 | Lossy projection writes WARN logs, never crashes; not a bus event | Mid-session swap must remain resilient; observability over hard failure; consistent with bus diagnostics. |
| 2026-05-08 | AdapterCapabilities extended (supports_tools, supports_system_prompt, supports_structured_output, supports_prompt_caching) |
Required by routing-engine §4.4 capability validation; substitutability needs honest declaration of these. |
| 2026-05-08 | provider_overrides removed from ToolDefinition |
Field was unused across all specs; removing rather than carrying dead surface area. |
| 2026-05-08 | RoutingDecisionRecord.mode is a coarse summary; chain enum lives in event |
Persisted message metadata stays compact; full chain accessible via the route.decided event. |