Hub Component

Letta

Ren's persistent memory backend. Manages ren-v2 as a MemGPT agent — core memory blocks, archival passages, conversation history, and all tool calls that Ren makes. Last audited 2026-05-28.

Component identity

Current version: v0.16.8
Agent name: ren-v2
Canonical tools: 67
Context window (tokens): 1,000,000
Vitals: 8
Hosting zone: Render
| Property | Value | Notes |
| --- | --- | --- |
| Type | External service (Render-hosted) | Stable service; rarely needs direct intervention |
| URL | han-solo-letta.onrender.com | REST API + Letta UI |
| Agent ID | agent-fe4a3d5b-bb51-458e-92f1-6a1ee5b0ce94 | Critical; see Vital 1 |
| Model | gemini-2.5-flash | Switched 2026-06-13 via Google AI BYOK provider. Two-step PATCH required for any future changes; see Vital 2. |
| Context window | 1,000,000 tokens | Do not shrink; see Vital 5 |
| enable_reasoner | false | Always false; see Vital 3 |
| Component owner | Ren | Contrasts with MCP Bridge, DB, Claude Code (Claude-owned) |
| Upgrade history | v0.16.7 → v0.16.8 on 2026-05-22 | Security fix: pickle → JSON sandbox transport. Current as of 2026-06-04. |
| Last audited | 2026-05-28 | decisions_log/component-kb-letta-2026-05-28 |

What Letta does

Letta is the AI memory runtime that powers Ren. It runs ren-v2 as a MemGPT-style agent: a conversational agent with structured, persistent memory that survives across sessions. When you send Ren a message, Letta loads her core memory blocks into context, sends everything to Google AI (gemini-2.5-flash), manages whatever tool calls Ren makes, and stores the exchange in PostgreSQL.

Without Letta, Ren has no memory, no persistent identity, and no ability to call tools. Every other component in the hub depends on Letta being healthy.

How a message travels through Letta

  1. Message arrives at POST /api/send on han-solo-mcp (the MCP bridge).
  2. MCP bridge forwards to Letta's /v1/agents/{id}/messages endpoint.
  3. Letta loads all core memory blocks into context — always_loaded_core, pending_thoughts, project_state, all portrait blocks, and seed_signals.
  4. Letta sends the full context plus the message to Google AI (gemini-2.5-flash).
  5. Gemini generates a response, potentially requesting tool calls (archival search, memory write, fetch, etc.). Each tool call is a separate LLM inference step — Letta allows up to 6 steps per message.
  6. Letta executes each tool call against the MCP server's registered endpoints, then sends results back to Gemini.
  7. Letta stores the full exchange in PostgreSQL and returns Ren's final reply.
  8. MCP bridge returns the reply to the caller.
The 6-step limit is a hard constraint. Each tool call Ren makes consumes one step. If Ren is instructed to run tools before every reply (e.g., search archival memory on every message), those steps are consumed before send_message is ever reached. The result is complete silence with no error. The step budget is embedded in always_loaded_core to prevent this.
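The step arithmetic above can be made concrete with a small sketch. The function name is illustrative, and it assumes that send_message is itself a tool call that needs one step of its own; neither detail is taken from Letta's implementation.

```python
MAX_STEPS = 6  # per-message step limit stated above


def reply_is_reachable(tool_calls_before_reply: int, max_steps: int = MAX_STEPS) -> bool:
    """Return True if a send_message call still fits in the step budget.

    Assumption: send_message consumes one step, like any other tool call.
    """
    if tool_calls_before_reply < 0:
        raise ValueError("tool_calls_before_reply must be non-negative")
    return tool_calls_before_reply + 1 <= max_steps


if __name__ == "__main__":
    # Show where the budget runs out as mandatory tool calls are added.
    for n in range(MAX_STEPS + 1):
        print(n, reply_is_reachable(n))
```

Under that assumption, five tool calls before the reply still leave room for send_message, while six leave none, which is the silent case described above.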

How it connects to the rest of han-solo

| Connection | Direction | What flows |
| --- | --- | --- |
| han-solo-mcp (MCP bridge) | Bidirectional | Inbound: messages and admin commands. Outbound: Ren's tool calls back to MCP endpoints. |
| han-solo-db (PostgreSQL) | Letta → DB | Letta owns its schema in the shared PostgreSQL instance: agent config, conversation history, core memory blocks, archival passages. |
| Google AI API | Letta → Google AI | Every message and tool-call result goes to gemini-2.5-flash via the BYOK provider. Model changes require the two-step PATCH protocol. |
| Voyage AI | Letta → Voyage | Hidden dependency. Embeds archival passages and search queries (voyage-3, 1024-dim). If Voyage is down, archival search silently returns nothing. See Vital 6. |
| dream.py (launchd) | dream.py → Letta | Nightly at 2am: a structured reflection prompt sent directly to Letta's REST API. Ren writes the session brief to pending_thoughts, adds portrait signals, checks for version updates. |
| Failsafe channel | MCP bridge → Letta | POST /api/admin/failsafe-message accepts PING, STATUS, DUMP_MEMORY, RELOAD_BRIEF. Bypasses the full tool chain. Always writes to failsafe_log even on Letta failure. Local UI at admin/failsafe.html (not deployed publicly). |

Current configuration and state

Ren's canonical tool set — 67 tools (as of 2026-06-13)

search_t4 get_t4_entry list_notecards create_notecard update_notecard delete_notecard get_skill list_skills write_skill write_t4_entry search_transcripts search_code send_message read_core_memory write_core_memory archival_memory_search check_system_health append_t4_entry update_open_threads delete_archival_passage check_hub_health list_t4_projects check_memory_capacity verify_agent_tools verify_skills_sync read_github_file fetch_url
Two notable absences: get_session_brief and search_signals were removed from the canonical tool set as part of the cascade fix (May 2026). Both made callbacks into Letta's message queue while Letta was still waiting for them — causing a circular dependency deadlock under load. They are not coming back.

Tool registry — two places that don't auto-sync

Ren's tools exist in two registries that must be kept in sync manually:

  Source: han_solo/tools/ (deployed to Render).
  Runtime: Letta's agent registry (what Ren can actually call).

Adding a tool to han_solo/tools/ and deploying does not register it in Letta's runtime. Run POST /api/admin/sync-mcp-tools after every deploy that adds new tools. Without this step, the tool exists in code but Ren cannot call it, and no error surfaces.
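Because the two registries drift without any error, a set comparison is one way to make the gap visible. This is a minimal sketch: find_tool_drift and its return keys are hypothetical names, and how the two name lists are obtained (from the code tree and from Letta) is left to the caller.

```python
def find_tool_drift(source_tools, registered_tools):
    """Compare tool names defined in code with names registered in Letta.

    Both arguments are iterables of tool names. The result lists what each
    side is missing, sorted for stable output.
    """
    source = set(source_tools)
    registered = set(registered_tools)
    return {
        "missing_in_letta": sorted(source - registered),
        "unknown_in_letta": sorted(registered - source),
    }


if __name__ == "__main__":
    drift = find_tool_drift(
        ["send_message", "fetch_url", "search_code"],
        ["send_message", "fetch_url"],
    )
    print(drift)
```

A non-empty missing_in_letta list is the signal that sync-mcp-tools has not been run since the last deploy.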

ensure_ren_tools() startup guard

ensure_ren_tools() runs at server startup and corrects tool drift by comparing the expected canonical tool set against what's registered in Letta. It is safe for in-process PATCH calls: patch_agent_system() explicitly includes tool_ids in every PATCH it issues.

The gap: this guard runs only on restart. Any external PATCH — from a direct API call, the Letta dashboard, or an upgrade script — that omits tool_ids will wipe all of Ren's tools until the next server restart. No error is thrown.
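One way to close that gap for scripted PATCH calls is to build every payload through a helper that cannot omit tool_ids. The sketch below is an assumption-laden illustration, not the code behind patch_agent_system(): build_agent_patch is a hypothetical name, and the "system" field in the usage example is only a placeholder for whatever is being changed.

```python
def build_agent_patch(changes: dict, current_tool_ids: list) -> dict:
    """Return a PATCH body that always carries the current tool_ids.

    Model changes are rejected here on purpose: llm_config and tool_ids
    must never travel in the same PATCH (see the two-step PATCH rule).
    """
    if "llm_config" in changes:
        raise ValueError("model changes use the two-step PATCH protocol, not this helper")
    if not current_tool_ids:
        raise ValueError("refusing to build a PATCH with an empty tool_ids list")
    payload = dict(changes)  # copy so the caller's dict is not mutated
    payload["tool_ids"] = list(current_tool_ids)
    return payload


if __name__ == "__main__":
    # "system" and the tool IDs below are placeholders for illustration.
    print(build_agent_patch({"system": "updated prompt"}, ["tool-1", "tool-2"]))
```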

LETTA_API_KEY — must be set in two places

The API key must be present in both ~/.zshenv AND in the LaunchAgent plist EnvironmentVariables. If the key is rotated, update both. dream.py fails silently if the plist copy is stale.
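A stale plist copy can be detected by comparing it with the shell value before dream.py runs. The sketch below uses Python's standard plistlib; the function name is hypothetical, and the plist is passed in as bytes because the LaunchAgent file path is not given here.

```python
import plistlib


def plist_key_matches(plist_bytes: bytes, shell_value: str, name: str = "LETTA_API_KEY") -> bool:
    """Return True if the plist's EnvironmentVariables entry equals the shell value."""
    plist = plistlib.loads(plist_bytes)
    env = plist.get("EnvironmentVariables", {})
    return env.get(name) == shell_value


if __name__ == "__main__":
    # Build a throwaway plist in memory instead of reading a real LaunchAgent file.
    sample = plistlib.dumps({"Label": "example", "EnvironmentVariables": {"LETTA_API_KEY": "old-key"}})
    print(plist_key_matches(sample, "new-key"))
```

In practice the caller would read the plist file in binary mode and pass os.environ.get("LETTA_API_KEY") as the shell value.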

Cold start behavior

Render's free tier spins down after inactivity. When the MCP server starts cold, health reports degraded until the first tool call succeeds. This is expected — the service self-heals on first use. Documented in DEPLOYMENT.md.


Core memory blocks

Every message Ren receives loads all core memory blocks into context. These are always present — they do not need to be searched. Writing to any block replaces the entire content (full-overwrite). See Vital 7 for the write protocol.

| Block | What it holds | Limit |
| --- | --- | --- |
| always_loaded_core | Framework context, working norms, Scott's profile, memory use instructions, session close-out ritual, step budget rules | 10,000 chars |
| pending_thoughts | Session brief: what happened last session, open threads, what's next. Written nightly by dream.py. | 8,000 chars |
| project_state | Current in-flight project context (JSON): service URLs, agent ID, model, context window, tool count, active slice, cascade-fix reference, update protocol | 10,000 chars |
| scott_portrait_forming | Dated observations about how Scott thinks and what he values, not yet confirmed across sessions | 20,000 chars |
| scott_portrait_trusted | Patterns confirmed across multiple sessions, promoted from forming | 20,000 chars |
| ted_portrait_forming | Ren's evolving read on Ted; same signal model as Scott's portrait | 20,000 chars |
| ted_portrait_trusted | Confirmed patterns about Ted | 20,000 chars |
| ren_portrait_forming | Ren's self-portrait: what she got right, what she missed, what she wants to develop | 20,000 chars |
| ren_portrait_trusted | Confirmed self-observations | 20,000 chars |
| seed_signals | Early-session observations, dated signals, relational notes not yet moved to archival | 20,000 chars |
Core block write protocol: Edit docs/system-state.md first → commit to git → then write to Letta via write_core_memory. Never write to Letta first. The file backup is the only recovery path if a write goes wrong.
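Since a write replaces the whole block, a pre-write check against the limits listed above can catch a wrong label or oversized content before it reaches Letta. This is a sketch only: check_block_write is a hypothetical helper, and the limits are copied from this page rather than read from Letta.

```python
# Character limits per core block, as listed on this page.
BLOCK_LIMITS = {
    "always_loaded_core": 10_000,
    "pending_thoughts": 8_000,
    "project_state": 10_000,
    "seed_signals": 20_000,
}
for person in ("scott", "ted", "ren"):
    for stage in ("forming", "trusted"):
        BLOCK_LIMITS[f"{person}_portrait_{stage}"] = 20_000


def check_block_write(label: str, content: str) -> int:
    """Validate a full-overwrite write and return the content length."""
    if label not in BLOCK_LIMITS:
        raise ValueError(f"unknown block label: {label!r}")
    if not content.strip():
        raise ValueError("refusing to overwrite a block with empty content")
    if len(content) > BLOCK_LIMITS[label]:
        raise ValueError(f"{label} content is {len(content)} chars, limit is {BLOCK_LIMITS[label]}")
    return len(content)


if __name__ == "__main__":
    print(check_block_write("pending_thoughts", "Session brief goes here."))
```

The check does not replace the file-first protocol; it only guards against the two mistakes that are cheap to detect locally.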

Key architectural decisions

The two-step PATCH rule — why it exists and why the model is now Gemini 2.5 Flash

The two-field PATCH (llm_config + tool_ids together in a single call) causes Letta to drop all tools on every model switch. This was confirmed in production (INCIDENT B, May 2026). The model-switch endpoint was removed permanently as a result. The model was subsequently switched from Haiku to gemini-2.5-flash on 2026-06-13 using the correct two-step protocol: llm_config first (separate call), then tool_ids restore as a second separate call. Any future model change must follow this same protocol and must not proceed without reading decisions_log/cascade-fix-2026-05-26.

Why the context window is at 1M

The 32k default was causing Ren to miss context on longer memory loads. The degraded retrieval was partially misread as a memory quality problem rather than a capacity problem. Expanding to 100k resolved the retrieval issues. The context was further expanded to 1,000,000 tokens on 2026-06-13 with the switch to Gemini 2.5 Flash. Shrinking it re-introduces retrieval issues silently — no crash, just quiet quality degradation.

Why read_core_memory and write_core_memory are external MCP tools

Letta's built-in letta_memory_core functions are silently dropped on every agent restart. External MCP tools persist reliably because they are registered in Letta's tool registry and backed by the MCP bridge, not by Letta's internal built-ins. All core memory reads and writes go through the MCP-registered versions.

The cascade fix — May 2026

Two days of intermittent "Ren could not be reached" errors were traced to a circular dependency deadlock. When Ren called get_session_brief or search_signals, Letta made an outbound MCP call to han-solo-mcp. Those tools then called back into Letta's API while Letta was still holding the connection open waiting for the tool response. Under load (1-second call gaps), this exhausted the connection pool and collapsed the service — producing DNS failures by the 8th call. At 2.5-second gaps, all 8 calls succeeded, confirming the connection pool as the failure point.

Four fixes were applied: both circular tools were removed from the canonical set; read_core_memory and write_core_memory were added as external MCP tools; the model-switch endpoint was removed permanently; and the context window was expanded from 32k to 100k. Full record: decisions_log/cascade-fix-2026-05-26.

The architectural rule this established: any tool that makes an outbound call to han-solo-mcp from within Letta's step loop risks recreating this deadlock. Future tool additions must be reviewed against this constraint before being added to the canonical set.
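A review against this constraint could start from a declared list of the URLs each tool calls. The sketch below is hypothetical throughout: the declaration format, the function name, and the example URLs are illustrations, and it only flags tools that call back into the Letta host named on this page, which is the callback pattern behind the deadlock.

```python
from urllib.parse import urlparse

LETTA_HOST = "han-solo-letta.onrender.com"  # Letta URL from the component table


def tools_calling_back_into_letta(tool_endpoints: dict, letta_host: str = LETTA_HOST) -> list:
    """Return the names of tools whose declared URLs point at the Letta host.

    tool_endpoints maps a tool name to the list of URLs it calls.
    """
    flagged = []
    for name, urls in tool_endpoints.items():
        if any(urlparse(url).hostname == letta_host for url in urls):
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Example declarations; the paths are made up for illustration.
    declared = {
        "get_session_brief": ["https://han-solo-letta.onrender.com/v1/agents/AGENT/messages"],
        "fetch_url": ["https://example.com/page"],
    }
    print(tools_calling_back_into_letta(declared))
```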


8 vitals — things that will break without warning

Letta has 8 vitals, 3 incidents, 6 danger zones, and 4 assumptions in its current hub snapshot (2026-06-02 seed). The vitals below represent the configuration properties where a wrong value produces failure with no error output.

Vital 1
Agent ID is the single reference point

ren-v2's ID lives in three places: the project_state core block, Letta's internal DB, and Ren's operating context. If any one diverges — due to agent recreation, a bad config write, or an upgrade that resets state — memory orphans silently. No error is thrown. The system keeps running. All subsequent writes land against a phantom agent and are permanently lost.

Vital 2
Two-step PATCH required for model changes

The current model is gemini-2.5-flash (switched 2026-06-13). Any future model change must use two separate PATCH calls — llm_config first, then tool_ids — never combined. Read decisions_log/cascade-fix-2026-05-26 before proceeding. Combining them in a single PATCH wipes Ren's tool registry mid-session.

Vital 3
enable_reasoner must always be false

enable_reasoner:true with max_reasoning_tokens:0 causes Ren to go completely silent — no output, no error. This setting is inherited silently from prior config on model switches. Every model configuration change must explicitly set enable_reasoner:false. It cannot be assumed to carry over correctly.
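Because the setting is inherited rather than defaulted, a check that rejects a missing value as well as a true one matches the rule above. A minimal sketch, assuming the configuration is available as a plain dict; the function name is hypothetical.

```python
def llm_config_problems(llm_config: dict) -> list:
    """Return a list of problems with the enable_reasoner setting.

    A missing key counts as a problem: the value must be set explicitly,
    because it can be inherited from an earlier configuration.
    """
    problems = []
    if "enable_reasoner" not in llm_config:
        problems.append("enable_reasoner is not set explicitly")
    elif llm_config["enable_reasoner"] is not False:
        problems.append("enable_reasoner must be false")
    return problems


if __name__ == "__main__":
    print(llm_config_problems({"model": "gemini-2.5-flash"}))
    print(llm_config_problems({"model": "gemini-2.5-flash", "enable_reasoner": False}))
```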

Vital 4
Tool registry lives in two places that don't auto-sync

Source: han_solo/tools/ (deployed to Render). Runtime: Letta's agent registry (what Ren can actually call). Adding tools in one place does not update the other. Deployment without running sync-mcp-tools means tools exist in code but Ren cannot call them. No error surfaces.

Vital 5
Context window at 1M — do not shrink

The 1,000,000 token context window (set 2026-06-13 with Gemini 2.5 Flash) resolved retrieval quality problems that had been misread as memory architecture problems. Shrinking it re-breaks retrieval silently — no crash, quiet memory quality degradation. The original 32k and interim 100k values should not be restored.

Vital 6
Voyage AI is a hidden dependency

Archival memory search depends on Voyage AI (voyage-3, 1024-dim) for vector embedding. If Voyage goes down, archival search returns nothing with no error and no alert. Ren keeps running. Search silently becomes useless. The failure mode looks identical to a code bug or an empty memory store. There is no monitoring on Voyage AI availability.
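One way to tell an empty memory store apart from an embedding outage is to probe the embedding service whenever a search comes back empty. The sketch below injects both the search and the probe as callables, since neither Letta's nor Voyage's client calls are shown on this page; all names are hypothetical.

```python
from typing import Callable, Sequence


class EmbeddingUnavailable(RuntimeError):
    """Raised when a search is empty and the embedding probe also fails."""


def search_with_probe(
    query: str,
    search: Callable[[str], Sequence[str]],
    embed_probe: Callable[[], bool],
) -> list:
    """Run a search; on an empty result, check whether embeddings still work."""
    results = list(search(query))
    if not results and not embed_probe():
        raise EmbeddingUnavailable("archival search returned nothing and the embedding probe failed")
    return results


if __name__ == "__main__":
    # Stand-ins for the real search and probe calls.
    print(search_with_probe("notes", lambda q: ["passage 1"], lambda: True))
```

An empty list with a healthy probe is returned as-is, so a genuinely empty result stays distinguishable from an outage.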

Vital 7
Core memory blocks are full-overwrite

Writing to any core block replaces the entire content. No diff, no history, no undo on the Letta side. One bad write — truncated content, wrong block label, partial update — and the reference is gone. The only recovery path is the git-versioned backup at docs/system-state.md. Protocol: file first → commit → write to Letta. Never write to Letta first.

Vital 8
System prompt mandating tool calls burns step slots silently

Letta limits Ren to 6 reasoning steps per message. If the system prompt instructs Ren to run tool calls before every message, those steps are consumed before send_message is ever reached. Output: silence. No error. The step budget rule is now embedded in always_loaded_core to prevent this from being re-introduced.


Danger zones and known issues

Ten confirmed failure modes, sourced from incident records and code review. Most produce no error output — the system keeps running while the failure is already underway.

Danger zone 1

Agent ID divergence causes silent memory orphaning

ren-v2's agent ID (agent-fe4a3d5b-bb51-458e-92f1-6a1ee5b0ce94) lives in three places: the project_state core block, Letta's internal DB, and Ren's operating context. If any one diverges — due to agent recreation, a bad config write, or an upgrade that resets state — memory orphans silently. No error is thrown. The system continues running. All subsequent writes land against a phantom agent and are permanently lost. Source: decisions_log/component-kb-letta-2026-05-28, Vital 1.

Danger zone 2

enable_reasoner:true with max_reasoning_tokens:0 silences Ren completely

Setting enable_reasoner:true alongside max_reasoning_tokens:0 causes Ren to go completely silent — no output, no error. This was inherited silently from a prior Sonnet config during a model switch to Haiku and caused a 30-minute outage before it was diagnosed (INCIDENT A, May 2026). Every model configuration change must explicitly include enable_reasoner:false. This applies to the current Gemini 2.5 Flash config and any future model switches. The absence of an error makes this one of the hardest failure modes to diagnose. Source: decisions_log/component-kb-letta-2026-05-28, Vital 3 and INCIDENT A.

Danger zone 3

PATCH to Letta agent config without tool_ids wipes all tools mid-session

Letta's PATCH /v1/agents/{id} resets tool_ids to empty when the field is omitted. ensure_ren_tools() guards against this at server startup only — drift is corrected on restart. Any external PATCH (direct API call, Letta dashboard, upgrade script) that omits tool_ids will wipe all of Ren's tools with no error until the next server restart. This was confirmed by a prior incident (INCIDENT B, May 2026) where all 16 tools were deleted, requiring 3–4 hours of manual recovery via direct PATCH to the Letta API. Source: decisions_log/component-kb-letta-2026-05-28, Vital 4 and INCIDENT B.

Danger zone 4

Core memory blocks are full-overwrite; one bad write destroys the reference permanently

Writing to any core memory block in Letta replaces the entire content. There is no diff, no history, and no undo on the Letta side. One bad write — truncated content, wrong block label, partial update — and the reference is gone. The only recovery path is the git-versioned file backup at docs/system-state.md. The correct protocol: edit docs/system-state.md → commit → write to Letta via write_core_memory. Never write to Letta first. Source: decisions_log/component-kb-letta-2026-05-28, Vital 7.

Danger zone 5

System prompt instructing tool calls before every message burns all 6 Letta step slots

Letta limits Ren to 6 reasoning steps per message. If the system prompt instructs Ren to run tool calls (e.g., search archival memory) before every reply, those step slots are consumed before send_message is ever reached. The output is complete silence — no error, no reply, just nothing. This caused a half-day recovery (INCIDENT C, May 2026) that required a full system prompt and memory architecture rebuild. The step budget rule is now embedded in always_loaded_core. Source: decisions_log/component-kb-letta-2026-05-28, Vital 8 and INCIDENT C.

Danger zone 6

Voyage AI failure silently disables all archival search

Archival memory search depends on Voyage AI (voyage-3, 1024-dimension embeddings hosted externally) for vector embedding. If Voyage AI goes down, archival search returns nothing with no error and no alert. Ren keeps running and responding, but every search silently produces empty results. This failure mode looks identical to a code bug or an empty memory store. There is no monitoring or alerting on Voyage AI availability. Source: decisions_log/component-kb-letta-2026-05-28, Vital 6.

Danger zone 7

Search hierarchy was inverted for weeks with no visible error

always_loaded_core instructed Ren to search archival memory before reading core blocks — but core blocks are always loaded and require no search. The inverted instruction ran undetected for weeks because it produced no error: Ren simply ran an unnecessary archival search on every message before reading what was already in context. Found only during an architecture review (INCIDENT D, May 2026). Recovery required a full rewrite of always_loaded_core, the system prompt, and memory_landscape. Source: decisions_log/component-kb-letta-2026-05-28, INCIDENT D.

Danger zone 8

project_state block was empty for weeks — Ren operated with no operational state

The project_state core block was empty for an extended period. Ren operated every session with no knowledge of what was running, what version was deployed, or what the active project was. Found during an audit (INCIDENT E, May 2026). No symptom was user-visible — Ren answered questions and appeared functional. Recovery: block populated from docs/system-state.md, which was created as the versioned source of truth at that point. Protocol established: file first, commit, then write to Letta. Source: decisions_log/component-kb-letta-2026-05-28, INCIDENT E.

Danger zone 9

dream.py uses a hardcoded default agent ID — stale if the Letta agent is ever recreated

scripts/dream.py sets AGENT_ID = os.environ.get('REN_AGENT_ID', 'agent-fe4a3d5b-bb51-458e-92f1-6a1ee5b0ce94'). If the Ren agent is ever recreated and the env var is not updated in both ~/.zshenv AND the launchd plist EnvironmentVariables, dream.py will silently write memory to the old agent ID — orphaning all nightly consolidations with no error or alert. Source: decisions_log/recon-sl003-2026-05-31, F-007.
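A stricter alternative to the hardcoded fallback is to fail loudly when the variable is missing. This is a sketch of that idea, not the current contents of dream.py; resolve_agent_id is a hypothetical helper, and the "agent-" prefix check is inferred from the ID format shown on this page.

```python
import os


def resolve_agent_id(env=None, name: str = "REN_AGENT_ID") -> str:
    """Return the agent ID from the environment, or raise instead of guessing."""
    env = os.environ if env is None else env
    agent_id = env.get(name, "").strip()
    if not agent_id:
        raise RuntimeError(f"{name} is not set; refusing to fall back to a default agent ID")
    if not agent_id.startswith("agent-"):
        raise RuntimeError(f"{name} does not look like an agent ID: {agent_id!r}")
    return agent_id


if __name__ == "__main__":
    # Pass a dict so the example does not depend on the real environment.
    print(resolve_agent_id({"REN_AGENT_ID": "agent-fe4a3d5b-bb51-458e-92f1-6a1ee5b0ce94"}))
```

With this approach a stale or missing launchd environment stops the nightly run with an error instead of writing to an old agent.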

Danger zone 10

Circular dependency deadlock under load — the root cause of the cascade fix

When Ren called get_session_brief or search_signals, Letta made an outbound MCP call to han-solo-mcp. Those tools then called back into Letta's API while Letta was still holding the connection open. Under load (1-second call gaps), this exhausted the connection pool, producing DNS failures. At 2.5-second gaps all calls succeeded. The architectural fix: both tools were removed from Ren's canonical set. Any future tool that makes an outbound call to han-solo-mcp from within Letta's step loop risks recreating this deadlock. Source: decisions_log/cascade-fix-2026-05-26; confirmed by stress testing.


Operational procedures

After every deploy that adds new tools

  1. Deploy to Render via git push.
  2. Wait for the deploy to complete (Render dashboard or check /health).
  3. Run POST /api/admin/sync-mcp-tools — this registers new MCP tools individually into Letta's registry and then runs ensure_ren_tools().
  4. Verify: verify_agent_tools should return the expected tool count. Full checklist in docs/add-tool-checklist.md, Step 8.

Updating a core memory block

  1. Edit docs/system-state.md with the intended changes.
  2. Commit and push to git.
  3. Write the updated content to Letta via write_core_memory with the correct block label.
  4. Verify the write by calling read_core_memory for that block.

Changing the model (requires deliberate two-step PATCH)

  1. Read decisions_log/cascade-fix-2026-05-26 in full before proceeding.
  2. Issue a PATCH to /v1/agents/{id} with only llm_config, and explicitly set enable_reasoner:false inside it. Do not include tool_ids in this call.
  3. Issue a second PATCH to /v1/agents/{id} with only tool_ids: the full canonical 67-tool list.
  4. Verify: send a test message and confirm Ren responds. Check verify_agent_tools to confirm tool count is intact.
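The two PATCH bodies in this procedure can be sketched as data, which makes the "never combined" rule easy to check before anything is sent. The helper name is hypothetical, and the llm_config shown is deliberately minimal; a real configuration will likely need more fields than model and enable_reasoner.

```python
def plan_model_switch(model: str, canonical_tool_ids: list) -> list:
    """Return the two PATCH bodies for a model change, in the order to send them.

    The first body carries only llm_config, the second only tool_ids.
    """
    if not canonical_tool_ids:
        raise ValueError("the canonical tool list must not be empty")
    step_one = {"llm_config": {"model": model, "enable_reasoner": False}}
    step_two = {"tool_ids": list(canonical_tool_ids)}
    return [step_one, step_two]


if __name__ == "__main__":
    # Tool IDs below are placeholders for illustration.
    for body in plan_model_switch("gemini-2.5-flash", ["tool-1", "tool-2"]):
        print(sorted(body))
```

Each body would then be sent as its own PATCH to /v1/agents/{id}, with the verification step run after the second call.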

Diagnosing Ren silence

When Ren produces no output with no error, check in this order:

  1. Is enable_reasoner set to false? (Vital 3 — inherited silently on model switches)
  2. Is the step budget being consumed before send_message? (Vital 8 — check always_loaded_core for tool-call instructions)
  3. Is Letta cold-starting? (expected — self-heals on first tool call)
  4. Are all 67 tools registered? Run verify_agent_tools.
  5. Use the failsafe channel: POST /api/admin/failsafe-message with PING or STATUS to reach Ren directly.
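The ordering matters because the earlier checks are the cheaper and more common causes. A minimal sketch of running checks in a fixed order and stopping at the first failure follows; the check labels mirror the list above, while the callables are stand-ins that a real script would replace with actual probes.

```python
def first_failed_check(checks):
    """Run (label, check) pairs in order and return the first failing label.

    Each check is a zero-argument callable returning True when healthy.
    Returns None if every check passes.
    """
    for label, check in checks:
        if not check():
            return label
    return None


if __name__ == "__main__":
    # Stand-in results; replace each lambda with a real probe.
    checks = [
        ("enable_reasoner is false", lambda: True),
        ("step budget leaves room for send_message", lambda: False),
        ("Letta is warm", lambda: True),
        ("all tools registered", lambda: True),
    ]
    print(first_failed_check(checks))
```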

Related references

| Reference | What it contains |
| --- | --- |
| docs/system-state.md | Versioned source of truth for all core memory block content. Always edit here first before writing to Letta. |
| decisions_log/cascade-fix-2026-05-26 | Full incident record for the May 2026 circular dependency deadlock. Required reading before any model change. |
| decisions_log/component-kb-letta-2026-05-28 | Full audit record: all 8 vitals, 3 incidents, 6 danger zones, 4 assumptions for this component. |
| decisions_log/recon-sl003-2026-05-31 | Pre-build recon findings: F-002 (tool wipe exposure), F-003 (health endpoint), F-007 (dream.py stale agent ID), F-012 (cold start). |
| docs/add-tool-checklist.md | Step-by-step checklist for adding a new tool to Ren's canonical set, including the post-deploy sync step. |
| memory.html | Authoritative reference for Letta's memory architecture, all five real incidents (A–E), and operational procedures. Overlaps with this page intentionally; serves a different audience. |
| map.html | Full component map. Letta is Component 5: han-solo-letta.onrender.com, Render zone. |