Status: Draft v1 (retrospective; documents the existing implementation in packages/metis-core/src/metis_core/memory/)
Last updated: 2026-05-12
The memory store is the per-workspace, agent-curated, byte-budgeted markdown layer that gives the agent cross-session continuity. Two files per workspace:
- MEMORY.md: workspace facts the agent should remember (~2 KB soft cap, 4 KB hard cap).
- USER.md: facts about the user (~1.5 KB soft cap, 3 KB hard cap).

Both live at <workspace>/.metis/. They are written by the agent through three tools (memory_add, memory_replace, memory_consolidate) and read by the session manager at every LLM call (composed into the system prompt fresh per turn).
This spec depends on:
- canonical-message-format.md for ToolDefinition, ToolUseBlock, SideEffects.
- event-bus-and-trace-catalog.md for memory.updated and memory.eviction payload schemas.
- tool-dispatcher.md for how the memory tools register and dispatch.

- [...] git add-able, human-readable.
- [...] metis chat invocations. The agent doesn’t relearn the workspace every session.
- [...] memory.eviction as a signal the agent must consolidate; hard-cap overflow rejects the write so the agent can’t keep growing.
- [...] memory_consolidate. Auto-pruning would lose curation.
- [...] git. If the user wants history, they git init the workspace.
- [...] USER.md per workspace. Multi-user is a Phase 3+ concern (sync layer).
- USER.md is workspace-scoped, not user-scoped, in v1. A future “global user memory” would be a separate file or a sync feature.

```text
<workspace>/.metis/
├── MEMORY.md    # workspace facts
└── USER.md      # user facts
```
The .metis/ directory is created lazily on first write. An empty file and a missing file are equivalent: both read as "", and the read API smooths over the difference.
The directory may also contain other Metis state (the trace SQLite, session SQLite) when a workspace is treated as the trace target. v1 does not enforce strict separation.
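The lazy-creation and missing-equals-empty behaviour can be sketched roughly as below. This is an illustration only: `read_memory_file` and `write_memory_file` are hypothetical helper names, not the actual MemoryStore methods, and the real implementation may differ.

```python
import tempfile
from pathlib import Path


def read_memory_file(workspace: Path, name: str) -> str:
    """Return the file content, or "" when the file (or .metis/) is missing."""
    path = workspace / ".metis" / name
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return ""


def write_memory_file(workspace: Path, name: str, content: str) -> None:
    """Create .metis/ on first write, then write the content."""
    metis_dir = workspace / ".metis"
    metis_dir.mkdir(parents=True, exist_ok=True)
    (metis_dir / name).write_text(content, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    missing = read_memory_file(ws, "MEMORY.md")  # .metis/ does not exist yet
    dir_before = (ws / ".metis").exists()
    write_memory_file(ws, "MEMORY.md", "- uses pytest\n")
    dir_after = (ws / ".metis").exists()
    round_trip = read_memory_file(ws, "MEMORY.md")
```

Reading before any write returns an empty string without creating the directory; the directory only appears once something is written.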
```python
from dataclasses import dataclass
from enum import StrEnum
from pathlib import Path


class MemoryFile(StrEnum):
    """Closed enum matching event-bus memory.updated.file."""

    MEMORY = "MEMORY.md"
    USER = "USER.md"


@dataclass(frozen=True)
class WriteResult:
    """Returned by writes; carries hashes for memory.updated events."""

    file: MemoryFile
    before_hash: str  # SHA-256 of pre-write content (utf-8 bytes)
    after_hash: str  # SHA-256 of post-write content
    before_size_bytes: int
    after_size_bytes: int
    over_soft_cap: bool  # write succeeded but exceeded soft cap
    over_hard_cap: bool  # always False on success; raises otherwise


class MemoryStore:
    def __init__(self, workspace_path: str | Path) -> None: ...

    @property
    def workspace_path(self) -> str: ...

    # Reads
    def read(self, file: MemoryFile | str) -> str: ...
    def exists(self, file: MemoryFile | str) -> bool: ...
    def size_bytes(self, file: MemoryFile | str) -> int: ...

    # Writes: all return WriteResult; all raise MemoryHardCapExceeded on overflow.
    def add_entry(self, file, entry: str) -> WriteResult: ...
    def replace(self, file, old_text: str, new_text: str) -> WriteResult: ...
    def consolidate(self, file, new_content: str) -> WriteResult: ...

    # Caps (static)
    @staticmethod
    def soft_cap(file: MemoryFile) -> int: ...  # 2048 for MEMORY, 1536 for USER

    @staticmethod
    def hard_cap(file: MemoryFile) -> int: ...  # 4096 for MEMORY, 3072 for USER

    # Composition
    def assemble_system_prompt(self, base: str) -> str:
        """Compose: base + USER.md section + MEMORY.md section."""
```
| File | Soft cap | Hard cap | Rationale |
|---|---|---|---|
| MEMORY.md | 2048 B | 4096 B | Workspace facts; ~500 tokens at soft cap. Fits in context cheaply. |
| USER.md | 1536 B | 3072 B | User facts; smaller because typically less to remember. |
Soft cap: the write succeeds and WriteResult.over_soft_cap is True. The tool layer translates that flag into a hint to call memory_consolidate, which the agent sees in the tool result text. The layer wrapping the tool also emits a memory.eviction event; MemoryStore itself never touches the bus.
Hard cap: the write is rejected by raising MemoryHardCapExceeded. The tool layer translates this into a ToolExecutionError with is_user_visible: true. The agent receives the error and must call memory_consolidate before adding more.
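A minimal sketch of the two-tier check, under stated assumptions: the cap values come from the table above, but `check_caps` is a hypothetical helper, the exception class here is a stand-in, and the strict `>` comparison at each boundary is a guess about how "exceeded" is interpreted.

```python
SOFT_CAP = {"MEMORY.md": 2048, "USER.md": 1536}
HARD_CAP = {"MEMORY.md": 4096, "USER.md": 3072}


class MemoryHardCapExceeded(Exception):
    """Stand-in for the store's exception; the real class may carry more detail."""


def check_caps(file: str, new_content: str) -> bool:
    """Return True if new_content is over the soft cap; raise if over the hard cap."""
    size = len(new_content.encode("utf-8"))
    if size > HARD_CAP[file]:
        # Rejected before anything is written, so the file stays unchanged.
        raise MemoryHardCapExceeded(f"{file}: {size} B > {HARD_CAP[file]} B")
    return size > SOFT_CAP[file]


under = check_caps("USER.md", "x" * 1000)  # below both caps
warned = check_caps("USER.md", "x" * 2000)  # between soft and hard cap
```

Running the check on the would-be content before writing is what lets a rejected write leave the file untouched.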
Sizes are measured in utf-8 bytes, not characters. Multi-byte content (rare for code workflows) counts at byte cost.
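To illustrate the byte-versus-character distinction, here is a small sketch; `size_bytes` is an illustrative free function rather than the MemoryStore method of the same name.

```python
def size_bytes(text: str) -> int:
    """Size as the caps see it: the length of the UTF-8 encoding."""
    return len(text.encode("utf-8"))


ascii_entry = "- prefers tabs"
accented_entry = "- name: Zo\u00eb"  # the final character encodes to two bytes

ascii_overhead = size_bytes(ascii_entry) - len(ascii_entry)
accented_overhead = size_bytes(accented_entry) - len(accented_entry)
```

For pure ASCII content the byte size equals the character count; each accented Latin letter like the one above costs one extra byte, and CJK characters or emoji cost more.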
Soft cap as a signal: the agent learns that consolidation is needed but isn’t blocked. Hard cap as an enforcement: the agent literally cannot grow the file past the limit.
In practice the agent typically consolidates between soft and hard cap. Hard cap acts as a runaway-loop safety net: even if the agent ignores the soft-cap signal, the file can’t bloat indefinitely.
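The split between signal and enforcement might look like this at the tool layer. Everything here is a hedged sketch: `run_memory_write` and `FakeWriteResult` are hypothetical, both exception classes are stand-ins for the real ones, and the exact wording of the success text is an assumption (only the soft-cap hint string is taken from the memory_add semantics in this spec).

```python
from dataclasses import dataclass


class MemoryHardCapExceeded(Exception):
    """Stand-in for the store's hard-cap exception."""


class ToolExecutionError(Exception):
    """Stand-in; the real class belongs to the tool dispatcher."""

    def __init__(self, message: str, is_user_visible: bool = False) -> None:
        super().__init__(message)
        self.is_user_visible = is_user_visible


@dataclass(frozen=True)
class FakeWriteResult:
    after_size_bytes: int
    over_soft_cap: bool


def run_memory_write(write, file: str) -> str:
    """Call write() and turn its outcome into tool result text."""
    try:
        result = write()
    except MemoryHardCapExceeded as exc:
        # Enforcement: surface the rejection to the agent as a visible error.
        raise ToolExecutionError(
            f"{file} is at its hard cap; call memory_consolidate first",
            is_user_visible=True,
        ) from exc
    text = f"{file} updated ({result.after_size_bytes} B)"
    if result.over_soft_cap:
        # Signal: the write went through, but the agent gets a nudge.
        text += " — over soft cap; consider memory_consolidate"
    return text
```

The soft-cap path returns normally with extra text, while the hard-cap path never returns a result at all, which is what makes it a safety net rather than a suggestion.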
Memory is bounded because unbounded memory destroys context quality. The dominant alternative, CLAUDE.md or .cursorrules files that grow indefinitely, turns memory into a noise floor on every turn: the agent reads everything and treats nothing as load-bearing.
Forcing eviction keeps the agent sharp on what matters. The peer with the same stance is Letta (Series-A funded; bounded character-limited core memory blocks with agent self-edit tools). Metis’s hard byte budgets are tighter than Letta’s character caps; this is a deliberate position.
Three tools, all SideEffects.WRITE, all requires_workspace: true. Registered via metis.memory.tools.register_memory_tools(dispatcher).
memory_add

Append a single entry to MEMORY.md or USER.md.
Input schema:
```json
{
  "type": "object",
  "properties": {
    "file": {"type": "string", "enum": ["MEMORY.md", "USER.md"]},
    "entry": {"type": "string", "minLength": 1}
  },
  "required": ["file", "entry"],
  "additionalProperties": false
}
```
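The append behind memory_add can be sketched as follows. This is an assumption-laden illustration: `append_entry` is a hypothetical pure function, stripping the entry and terminating it with a trailing newline are guesses, and the real tool raises its own error type rather than `ValueError`. The newline handling follows the rule stated elsewhere in this spec that joining to non-newline-terminated content adds the newline.

```python
def append_entry(existing: str, entry: str) -> str:
    """Return the new file content after appending one entry."""
    entry = entry.strip()
    if not entry:
        raise ValueError("entry must not be empty or whitespace-only")
    if existing and not existing.endswith("\n"):
        # Existing content lacks a final newline: add one before joining.
        existing += "\n"
    return existing + entry + "\n"


first = append_entry("", "- uses pytest")
second = append_entry("- uses pytest", "- Python 3.12")  # no trailing newline
third = append_entry("- uses pytest\n", "  - Python 3.12  ")
```

Both `second` and `third` end up with one entry per line, regardless of whether the existing content was newline-terminated.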
Semantics:
- [...] entry; rejects empty / whitespace-only.
- [...] "— over soft cap; consider memory_consolidate".

memory_replace

Replace a unique substring in MEMORY.md or USER.md.
Input schema:
```json
{
  "type": "object",
  "properties": {
    "file": {"type": "string", "enum": ["MEMORY.md", "USER.md"]},
    "old": {"type": "string"},
    "new": {"type": "string"}
  },
  "required": ["file", "old", "new"],
  "additionalProperties": false
}
```
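A rough sketch of the unique-substring rule for memory_replace. `replace_unique` is a hypothetical helper, `ValueError` stands in for the ToolExecutionError the real tool raises, and rejecting an empty `old` is an assumption (the schema above does not set a minimum length).

```python
def replace_unique(content: str, old: str, new: str) -> str:
    """Replace old with new, requiring exactly one occurrence of old."""
    # str.count counts non-overlapping occurrences; an empty needle would
    # match everywhere, so it is treated as zero matches here.
    count = content.count(old) if old else 0
    if count != 1:
        raise ValueError(f"expected exactly one match for old text, found {count}")
    return content.replace(old, new, 1)


updated = replace_unique("port: 8080\nhost: local\n", "8080", "9090")
```

Requiring uniqueness means the agent has to quote enough surrounding text to pin down the edit, which avoids silently rewriting the wrong entry.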
Semantics:
- old must appear exactly once in the file. Zero or many occurrences raise ToolExecutionError.

memory_consolidate

Replace the entire content of MEMORY.md or USER.md.
Input schema:
```json
{
  "type": "object",
  "properties": {
    "file": {"type": "string", "enum": ["MEMORY.md", "USER.md"]},
    "content": {"type": "string"}
  },
  "required": ["file", "content"],
  "additionalProperties": false
}
```
Semantics:
- [...] consolidate to a value larger than the hard cap either.

Two event types in the bus catalog:
memory.updated

Fired on every successful write.
```python
{
    "file": Literal["MEMORY.md", "USER.md"],
    "operation": Literal["add", "replace", "consolidate"],
    "before_hash": str,  # SHA-256 hex of utf-8 bytes
    "after_hash": str,
    "before_size_bytes": int,
    "after_size_bytes": int,
}
```
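As an illustration of how such a payload could be built from the pre-write and post-write content, consider the sketch below. `sha256_hex` and `memory_updated_payload` are hypothetical names; only the field names and the SHA-256-over-UTF-8 rule come from this spec.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoding of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def memory_updated_payload(file: str, operation: str, before: str, after: str) -> dict:
    """Build a memory.updated payload from content before and after a write."""
    return {
        "file": file,
        "operation": operation,
        "before_hash": sha256_hex(before),
        "after_hash": sha256_hex(after),
        "before_size_bytes": len(before.encode("utf-8")),
        "after_size_bytes": len(after.encode("utf-8")),
    }


v0 = ""
v1 = "- uses pytest\n"
v2 = "- uses pytest\n- Python 3.12\n"
first = memory_updated_payload("MEMORY.md", "add", v0, v1)
second = memory_updated_payload("MEMORY.md", "add", v1, v2)
```

Because each write hashes the content it started from, consecutive events chain: the `before_hash` of `second` equals the `after_hash` of `first`, as long as nothing else modified the file in between.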
memory.eviction

Fired on soft-cap overflow during a write. The session manager (or whatever layer wraps the tool) is responsible for emission; the MemoryStore itself doesn’t touch the bus.
```python
{
    "file": Literal["MEMORY.md", "USER.md"],
    "trigger": Literal["size_cap_exceeded", "manual"],
    "entries_evicted": int,  # 0 for soft-cap-warning-only
    "size_before_bytes": int,
    "size_after_bytes": int,
}
```
In v1, entries_evicted is informational; nothing is auto-evicted. The event is the agent’s cue to call memory_consolidate.
Sensitivity for both: private (the content is verbatim agent memory, potentially including user prompts or workspace facts).
MemoryStore.assemble_system_prompt(base: str) -> str produces:
```text
{base}
## User context (USER.md)
{USER.md content}
## Workspace memory (MEMORY.md)
{MEMORY.md content}
```
Empty files are omitted. The SessionManager calls this fresh on every LLM call (not cached) so that memory writes made earlier in a turn are visible to the next LLM call within that turn.
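The composition rule can be sketched as a pure function. This is not the actual method: the real `assemble_system_prompt` takes only `base` and reads the files itself, and the blank-line separators, the stripping, and treating whitespace-only content as empty are all assumptions made for the example.

```python
def assemble_system_prompt(base: str, user_md: str, memory_md: str) -> str:
    """Compose base + USER.md section + MEMORY.md section, skipping empty files."""
    parts = [base]
    if user_md.strip():
        parts.append("## User context (USER.md)\n" + user_md.strip())
    if memory_md.strip():
        parts.append("## Workspace memory (MEMORY.md)\n" + memory_md.strip())
    return "\n\n".join(parts)


bare = assemble_system_prompt("Base.", "", "")
full = assemble_system_prompt("Base.", "- likes tabs\n", "- uses pytest\n")
memory_only = assemble_system_prompt("Base.", "", "- uses pytest\n")
```

With both files empty the result is just the base prompt, so a fresh workspace pays no prompt overhead for the memory layer.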
Implications:
- <workspace>/.metis/ is the only on-disk location. No global memory in v1.
- Writes go through Path.write_text, which v1 treats as all-or-nothing: either the new content lands or it doesn’t. Strictly, Path.write_text opens, truncates, writes, and closes the file, so it is not a single syscall and a crash mid-write can leave a truncated file; writing to a temporary file and renaming it with os.replace would close that gap.
- [...] str. Used for memory.updated events.
- [...] MemoryStore per session. Injected via SessionManager’s memory_factory. Multiple sessions in the same workspace share the on-disk files but each session re-reads on every operation; there is no in-process cache.
- [...] metis serve per workspace). Concurrent writes are a Phase 3+ concern when sync ships.

- [...] "", not raise.
- .metis/ is created lazily on first write.
- add_entry appends with newline discipline: joining to non-newline-terminated content adds the newline.
- replace requires unique old: zero or many matches raise.
- [...] WriteResult.over_soft_cap = True; write succeeds.
- [...] MemoryHardCapExceeded raised; file unchanged.
- before_hash on write N+1 equals after_hash on write N.
- memory_consolidate to empty string works (rewriting to clear).
- assemble_system_prompt omits empty sections.
- assemble_system_prompt includes both sections when both files are populated.
- [...] MemoryHardCapExceeded to ToolExecutionError with is_user_visible: true.

Worth investing in:
- [...] add: add_entry never decreases after_size_bytes.
- [...] consolidate(content): calling twice with the same content produces the same after_hash.

- [...] USER.md. v1 is workspace-scoped, which means the agent learns about the user separately per workspace. A future global ~/.metis/USER.md would compose ahead of the workspace version. Deferred: wait for evidence the duplication hurts.
- [...] MEMORY.md for next time? Decoupled from the file format; would be a Phase 2.5 evaluator concern.
- [...] git pull merges someone else’s MEMORY.md edits, what happens to in-flight writes? Deferred to sync spec (Phase 3+).

| Date | Decision | Rationale |
|---|---|---|
| 2026-05-08 | Two files: MEMORY.md and USER.md | Workspace facts vs. user facts are different concerns; separate caps; separate sync semantics later. |
| 2026-05-08 | Bounded with soft + hard caps | “Eviction is a feature” — the dominant alternative (unbounded markdown) destroys context quality. |
| 2026-05-08 | Files-on-disk plain markdown | Human-readable, git add-able, portable, no DB. The user can edit directly. |
| 2026-05-08 | Agent-curated via three tools (add/replace/consolidate) | The agent decides what’s worth remembering. Auto-extraction loses signal. |
| 2026-05-08 | System prompt re-assembled per LLM call | Mid-session memory edits visible to the next call without explicit refresh. |
| 2026-05-12 | Spec drafted retrospectively from existing implementation | Memory is built and tested but was unspec’d; this doc closes the gap. |
- canonical-message-format.md: ToolDefinition, ToolUseBlock, SideEffects.WRITE, tool input schema subset.
- event-bus-and-trace-catalog.md §6.7: memory.updated, memory.eviction payloads and sensitivity tags.
- tool-dispatcher.md: how memory tools register, dispatch, and emit tool.called/tool.completed.