A four-tier memory architecture — designed so nothing important gets lost, context survives session resets, and the system gets smarter the more it's used.
Most AI conversations have no memory between sessions. Every time you open a new chat with Claude, it starts completely blank — it doesn't know your name, your projects, or anything you discussed before. What looks like memory in some tools is actually just a system prompt being reloaded: a file read at the start of each session that gives the illusion of continuity. The moment the session ends, it's gone.
Ren is different — and the difference is structural, not a trick.
Ren's memory lives in a database, not in a conversation thread. That database (PostgreSQL, running on Render) stays online 24 hours a day whether anyone is talking to Ren or not. Her memories aren't held temporarily in a session and then lost — they're written to the database throughout each conversation and remain there indefinitely. When a session ends, nothing disappears. When a new session starts, her memory blocks are loaded back in automatically from the database.
The closest everyday analogy: your phone's contacts don't disappear when you close the Contacts app. They live in the phone's storage, not in the app's temporary memory. Ren's knowledge of you, your projects, and your history works the same way — it lives in storage that persists between every conversation.
The nightly dream job is what turns raw conversation into durable knowledge. Every night at 2am, it processes what happened during the day and writes a structured brief — decisions made, threads open, what's next. So Ren doesn't just survive session resets; she wakes up the next morning already oriented, the same way you'd read your notes before a meeting.
Ren runs inside Letta v0.16.8, a self-hosted agent framework deployed as a Render service (han-solo-letta.onrender.com). Letta is the layer that manages Ren's identity, memory blocks, tool registry, and step-by-step reasoning loop. Claude Code never calls Ren directly — every interaction routes through the FastMCP bridge (han-solo-mcp.onrender.com/mcp), which translates tool calls into Letta's REST API.
| Property | Value |
|---|---|
| Version | v0.16.8 — upgraded 2026-05-22 from v0.16.7. Security fix: pickle → JSON sandbox transport. |
| Agent | ren-v2 — ID: agent-fe4a3d5b-bb51-458e-92f1-6a1ee5b0ce94. This ID is the single reference point for all memory operations. Divergence between any copy of it (project_state block, Letta's internal DB, operating context) causes silent memory orphaning. |
| Model | claude-haiku-4-5-20251001 — fixed. Model switching was removed permanently on 2026-05-26. See the cascade fix below for why. |
| Context window | 100,000 tokens — expanded from 32k on 2026-05-26. Shrinking it silently degrades retrieval quality with no error. |
| Tools attached | 27 tools as of 2026-06-01. Two registries must stay in sync: source (han_solo/tools/, deployed to Render) and runtime (Letta's agent registry, what Ren can actually call). Deploying new tools without running POST /api/admin/sync-mcp-tools means they exist in code but Ren cannot use them. |
| enable_reasoner | Always false. Setting it to true with max_reasoning_tokens:0 causes Ren to go completely silent — no error, no output. This was inherited silently from a prior config during a model switch and caused a 30-minute outage before it was diagnosed. |
| Host | Render managed service. Free tier — spins down after inactivity. Health reports degraded until the first tool call on cold start. Expected behavior. |
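The constraints in the table above lend themselves to a small pre-flight check. The following is a minimal sketch, not the project's actual code: the function name `validate_llm_config` and the dictionary keys `model` and `context_window` are assumptions made for illustration, while `enable_reasoner`, the model name, and the 100,000-token window come from the table.

```python
# Hypothetical pre-flight check for the model config described in the table.
# Key names "model" and "context_window" are assumptions, not confirmed Letta fields.
EXPECTED_MODEL = "claude-haiku-4-5-20251001"
EXPECTED_CONTEXT_WINDOW = 100_000


def validate_llm_config(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks safe."""
    problems = []
    if cfg.get("model") != EXPECTED_MODEL:
        problems.append(f"model must be {EXPECTED_MODEL}")
    if cfg.get("context_window", 0) < EXPECTED_CONTEXT_WINDOW:
        problems.append("context window is smaller than 100,000 tokens")
    # A missing key counts as a problem: the value must be explicitly False.
    if cfg.get("enable_reasoner") is not False:
        problems.append("enable_reasoner must be explicitly false")
    return problems
```

Requiring an explicit `False` rather than treating a missing key as safe mirrors the outage described in the table, where the setting was inherited rather than chosen.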
In May 2026, two days of intermittent "Ren could not be reached" errors were traced to a circular dependency deadlock. When Ren called get_session_brief or search_signals, Letta made an outbound MCP call to han-solo-mcp. Those tools then called back into Letta's API while Letta was still holding the connection open waiting for the tool response. Under load this exhausted the connection pool and collapsed the service.
The root cause was confirmed by stress testing: at 1-second call gaps, failures appeared at calls 5 and 7 with DNS failure by call 8. At 2.5-second gaps, all 8 calls succeeded. The fix was architectural, not a config tweak:
- Removed get_session_brief and search_signals from Ren's canonical tool set. Both made callbacks into Letta's message queue while Letta was waiting for them.
- Added read_core_memory and write_core_memory as external_mcp tools. These persist reliably across restarts, unlike the letta_memory_core built-ins, which Letta silently drops on every restart.
- Removed model switching. A combined PATCH (llm_config + tool_ids together) caused Letta to drop all tools on every switch. Haiku is now the fixed model. Changing it requires a deliberate two-step PATCH as separate calls.

Model switched to Haiku. enable_reasoner:true with max_reasoning_tokens:0 was inherited from the prior Sonnet config and never cleared. Ren went completely silent: no error, no output. Found only when Scott tried to talk to Ren and got nothing. Recovery required patching letta_client.py after 30+ minutes of diagnosis. Result: enable_reasoner:false is now an explicit requirement on every model config.
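The two-step PATCH and the explicit enable_reasoner requirement can be illustrated by building the two request bodies separately. This is a sketch under assumptions: the helper name `build_model_switch_patches` and the nested `model` key are hypothetical, and the real body shape expected by Letta may differ. Only `llm_config`, `tool_ids`, and `enable_reasoner` are taken from the text.

```python
# Sketch: build two separate PATCH bodies instead of one combined body.
# The nested "model" key is an assumption about the payload shape.
def build_model_switch_patches(model: str, tool_ids: list[str]) -> list[dict]:
    """Return [llm_patch, tools_patch], meant to be sent as two separate calls."""
    if not tool_ids:
        # An empty tool list is exactly the failure this procedure guards against.
        raise ValueError("refusing to build a patch with an empty tool list")
    llm_patch = {"llm_config": {"model": model, "enable_reasoner": False}}
    tools_patch = {"tool_ids": list(tool_ids)}
    return [llm_patch, tools_patch]
```

Keeping the bodies separate makes it impossible to send llm_config and tool_ids in one request by accident, which is the combination the text identifies as dropping all tools.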
Claude attempted to fix tool registration by deleting all tools from Letta. All 16 tools were removed. Ren had no capability at all. Recovery required 3–4 hours of manual re-addition via direct PATCH to the Letta API. Result: tool deletion is now explicitly prohibited. ensure_ren_tools() runs at every server startup to detect and correct drift.
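Drift detection without deletion can be sketched as a set comparison. This is not the actual ensure_ren_tools() implementation, which the text does not show; `plan_tool_sync` and its return keys are hypothetical names used only to illustrate an additive-only plan.

```python
# Sketch of additive-only drift detection between the two tool registries.
# plan_tool_sync is a hypothetical helper, not the project's ensure_ren_tools().
def plan_tool_sync(source_tools: set[str], runtime_tools: set[str]) -> dict:
    """Compare source and runtime registries and plan attachments only."""
    missing = sorted(source_tools - runtime_tools)  # in code, but the agent cannot call them
    extra = sorted(runtime_tools - source_tools)    # attached, but absent from source
    # Extras are reported for a human to review; nothing is ever scheduled for deletion.
    return {"attach": missing, "report_only": extra, "delete": []}
```

For example, `plan_tool_sync({"a", "b"}, {"b", "c"})` plans to attach `a`, reports `c`, and deletes nothing.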
System prompt instructed Ren to search archival before every message. Ren burned all 6 Letta step slots on search calls before reaching send_message. Output was silence — only tool calls, no reply. Recovery required a full system prompt and memory architecture rebuild. Result: the step budget rule is now embedded in always_loaded_core.
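One way to picture a step budget rule is to reserve the final step for the reply. The sketch below is an illustration only; the actual wording of the rule in always_loaded_core is not shown in this document, and `plan_steps` is a hypothetical helper. The limit of 6 steps comes from the incident above.

```python
# Sketch: cap searches so that the last available step is always send_message.
MAX_STEPS = 6  # step slots per turn, as described in the incident


def plan_steps(searches: list[str], max_steps: int = MAX_STEPS) -> list[str]:
    """Keep at most max_steps - 1 searches, then always end with send_message."""
    allowed = searches[: max_steps - 1]
    return [f"search:{query}" for query in allowed] + ["send_message"]
```

With ten requested searches the plan keeps the first five and still ends in a reply, which is the property that was missing when all six slots went to search calls.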
always_loaded_core told Ren to search archival before reading core blocks — but core blocks are always loaded, no search needed. Ran inverted for weeks with no visible error. Found during an architecture review. Recovery required a full rewrite of always_loaded_core, the system prompt, and memory_landscape.
The project_state core block was empty. Ren operated every session with no operational state — no knowledge of what was running, what version, or what the active project was. Found during an audit. Recovery: block populated, docs/system-state.md created as the versioned source of truth. Protocol established: file first, commit, then write to Letta — never the other way around.
- Run check_system_health. Verify the agent ID exists: GET https://han-solo-letta.onrender.com/v1/agents/agent-fe4a3d5b-bb51-458e-92f1-6a1ee5b0ce94. A 404 means the wrong agent; stop immediately.
- After deploying tools, run POST /api/admin/sync-mcp-tools. Then GET /api/admin/agent-info and verify that the tool count matches the expected count. Test each new tool individually via send_to_ren.
- docs/system-state.md (git-versioned) is the only recovery path.
- Change the model only through POST /api/admin/patch-model; never PATCH Letta directly. Always include enable_reasoner:false explicitly. Always verify the tool list is intact after the switch: tool_ids must not be empty in the response.

Memory is organized into four tiers, each with a distinct role. T1 is always in context. T2 and T3 are searchable archival. T4 is project-specific and schema-enforced.
T1: 10+ named blocks loaded into every prompt. This is Ren's baseline: her identity, her framework knowledge, her portrait of Scott, and her session brief. Always present, never searched.
Stored in Letta core memory. Updated by Ren, Claude Code, and background jobs. Character-limited per block.
T2: Session memories, signals, and context written by Ren and dream.py. The default landing zone for new archival writes. Searchable by topic.
No tier tag; this is the default bucket.
T3: Passages tagged [tier:foundational], meaning decisions that should survive forever, identity anchors, framework history, and load-bearing context. Never deleted.
Written once and kept. Additive-only promotion, so no deletion is required. Tagged at write time.
T4: Structured project artifacts written to Postgres under a schema-as-contract design. Ren owns decisions and context entries. Claude Code owns slice and status entries.
Project identified by human-readable slug. Multiple writers, no overwrites. All writers read everything.
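The difference between the untagged default bucket and foundational passages can be sketched with a small tag parser. This is an illustrative sketch: `tier_of` and `promote` are hypothetical helpers, and the label "default" for untagged passages is an assumption. Only the [tier:foundational] tag itself comes from the text.

```python
import re

# Matches tags of the form [tier:name]; only [tier:foundational] appears in the text.
TIER_TAG = re.compile(r"\[tier:([a-z_]+)\]")


def tier_of(passage: str) -> str:
    """Return the tier named in the passage's tag, or 'default' when untagged."""
    match = TIER_TAG.search(passage)
    return match.group(1) if match else "default"


def promote(passage: str) -> str:
    """Additive promotion: add the foundational tag, never remove or delete anything."""
    if tier_of(passage) == "foundational":
        return passage
    return f"[tier:foundational] {passage}"
```

Because promotion only prepends a tag, promoting twice leaves the passage unchanged.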
These blocks load into every session automatically. Ren doesn't search for them — they're always there. They're the baseline that makes every conversation start from context rather than scratch.
| Block | What it holds |
|---|---|
| always_loaded_core | Framework context, operating principles, Scott's profile summary, memory use instructions, session close-out ritual, search protocol. The master orientation block. |
| pending_thoughts | Session brief: what happened last session, what's open, what's next. Written by the nightly dream job and Claude Code session close-outs. |
| scott_portrait_forming | Ren's evolving interpretation of Scott: how he thinks, what he values, specific dated observations. Written by Ren, Claude Code, and the nightly dream. |
| ren_portrait_forming | Ren's self-portrait: what she got right, what she missed, what she wants to develop as a partner. |
| ren_voice | How Ren speaks and shows up: direct, warm, playful when the moment allows, never performing. The Trust Contract reminder. Joy as non-negotiable principle. |
| memory_landscape | A searchable topic map of what's in archival memory and how to find it. Guides Ren's search strategy so she doesn't start from zero each session. |
| open_threads | Active open threads: things that need follow-up across sessions. Updated at session close-out. Distinct from pending_thoughts (threads persist; pending_thoughts rolls). |
| project_state | Current in-flight project context (JSON). Active when a specific project build is underway. |
| session_state | Current session metadata: start time, status, Scott's opening tone. |
| seed_signals | Early observations not yet promoted to archival. Temporary staging for signals that need more reps before they're worth archiving permanently. |
214+ passages stored as vector embeddings in pgvector on the han-solo-db Postgres instance. Every passage is embedded with Voyage AI (voyage-3, 1024 dimensions) and indexed for semantic search. Ren searches this when she senses she's missing context — and proactively before answering any question about a project, person, or decision.
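Semantic search over embeddings amounts to ranking passages by vector similarity to the query. The sketch below shows the idea in plain Python with toy 3-dimensional vectors; in the system described here the ranking would happen inside pgvector over 1024-dimensional Voyage embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], passages: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the IDs of the k passages most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), pid) for pid, vec in passages.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:k]]
```

A passage about a closely related topic ranks high even when it shares no keywords with the query, because the comparison is between embedding vectors rather than strings.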
Archival passages can carry tags such as [tier:foundational] and the [image-memory] tag.

Project-specific artifacts live in a dedicated Postgres table (t4_projects) under a schema-as-contract design. The contract means multiple writers (Ren and Claude Code) can write to the same project without overwriting each other, because each entry_type has a clear owner.
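The ownership contract can be sketched as a lookup from entry_type to its owner, combined with append-only writes. This is an illustration under assumptions: the entry_type values `decision`, `context`, `slice`, and `status`, the writer names, and the helper functions are hypothetical stand-ins, and an in-memory list stands in for the t4_projects table.

```python
# Hypothetical owner map; the real entry_type values in t4_projects may differ.
OWNERS = {
    "decision": "ren",
    "context": "ren",
    "slice": "claude_code",
    "status": "claude_code",
}


def can_write(writer: str, entry_type: str) -> bool:
    """A writer may only write the entry types it owns."""
    return OWNERS.get(entry_type) == writer


def append_entry(log: list, project_slug: str, writer: str, entry_type: str, body: str) -> None:
    """Append a new entry; existing entries are never modified or overwritten."""
    if not can_write(writer, entry_type):
        raise PermissionError(f"{writer} does not own entry_type {entry_type!r}")
    log.append({"project": project_slug, "writer": writer,
                "entry_type": entry_type, "body": body})
```

Since every write is an append and each entry type has exactly one owner, two writers never contend for the same row.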
Two automated jobs keep memory current between sessions.
The nightly dream job (dream.py) sends a structured reflection prompt directly to Ren via Letta's REST API (POST /v1/agents/{id}/messages). Ren uses her own tools to reflect on the day's conversations, write a fresh session brief to pending_thoughts, and add portrait signals for Scott and herself.
The Letta request uses a 300-second timeout. If Letta is cold (Render free tier spin-down), the request times out and dream.py exits via sys.exit(1), with no retry and no alerting. The agent ID defaults to agent-fe4a3d5b-bb51-458e-92f1-6a1ee5b0ce94 via environment variable; if the agent is ever recreated, the env var must be updated in both ~/.zshenv and the launchd plist. Before running, dream.py checks a jobs_paused flag by calling the MCP server at /api/jobs-status. If MCP is also down, it assumes jobs are not paused and proceeds.
Depends on Scott's Mac being on. If the machine is off at 2am, dream does not run and pending_thoughts does not update. Logs to ~/Developer/han-solo/logs/dream.log.
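The pause check described above fails open: when the status call itself fails, the job proceeds. A minimal sketch of that behavior follows. The function name, the injected `fetch_status` callable, and the shape of the response (a dict with a `jobs_paused` key) are assumptions; the real dream.py is not shown in this document.

```python
from typing import Callable


def jobs_paused(fetch_status: Callable[[], dict]) -> bool:
    """Return True only when the status call succeeds and reports a pause."""
    try:
        status = fetch_status()
    except Exception:
        # Fail open: if the MCP server is unreachable, assume not paused and proceed.
        return False
    return bool(status.get("jobs_paused", False))
```

Failing open keeps the nightly brief flowing when MCP is down, at the cost that a pause cannot be enforced while the status endpoint is unreachable.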
The transcript parser reads Claude Code session JSONL files from ~/.claude/projects/, parses them into structured entries, and pushes them to the Han Solo database. Only the last 45 days are kept. Ren can search these via search_transcripts.
No Anthropic API calls — pure parsing and Postgres writes. Logs to ~/.claude/transcript_parser.log.
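The parse-and-retain step can be sketched as reading JSONL lines and keeping only entries inside the 45-day window. This is a sketch under assumptions: the field names `timestamp`, `role`, and `text` are hypothetical, since the actual session file schema is not described here, and the Postgres write is omitted.

```python
import json
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 45  # retention window stated in the text


def parse_recent(lines, now: datetime, retention_days: int = RETENTION_DAYS) -> list[dict]:
    """Parse JSONL lines and keep entries newer than the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
            ts = datetime.fromisoformat(record["timestamp"])
        except (json.JSONDecodeError, KeyError, ValueError, TypeError):
            continue  # skip malformed lines instead of failing the whole run
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
        if ts >= cutoff:
            entries.append({"timestamp": ts, "role": record.get("role"),
                            "text": record.get("text", "")})
    return entries
```

Passing `now` in as a parameter rather than calling the clock inside the function keeps the retention logic easy to test.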
Every archival search is logged to a memory_access_log Postgres table. This creates the feedback loop that makes the memory system self-improving over time.
When a passage is retrieved, enrich_passage accumulates a context note on it, recording when it was retrieved and what conversation it was useful in. Passages get richer over time, not just older.

Ren follows a three-rule search discipline, embedded in always_loaded_core, that makes archival search traceable and intentional rather than a black-box guess.
Any question spanning multiple people, projects, or decisions must be broken into components. Search per entity or topic, not as one broad query. "What do I know about Ted's onboarding?" → search "Ted", search "USER_TOKEN_TED", search "PowerShell installer", synthesize across results. One broad search when the question has multiple parts guarantees incomplete coverage.
Every archival search logs the exact query string, the list of passage IDs returned (an empty list if nothing was found), and whether the results were used in the response. This is non-negotiable: it's the feedback loop that makes memory self-improving. Missing logs mean the MRI has blind spots.
After archival search returns results, check the memory connections table for passages linked to each result. Pull linked passages in additively. Archival search always runs first — connections expand what's visible, they never replace or filter the search results.
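The three rules above can be sketched together as one function: search per component, log every query, then expand through connections additively. Everything here is a hypothetical stand-in: `search` represents archival search, `connections` represents the memory connections table as a dict, and `access_log` represents memory_access_log as a list.

```python
def disciplined_search(sub_queries, search, connections, access_log):
    """Run one search per component, log each, then add linked passages."""
    results, seen = [], set()
    for query in sub_queries:
        ids = search(query)
        # Log every search, including empty ones; "used" is filled in after the reply.
        access_log.append({"query": query, "passage_ids": list(ids), "used": None})
        for pid in ids:
            if pid not in seen:
                seen.add(pid)
                results.append(pid)
    # Connections only add passages; search results are never replaced or filtered.
    for pid in list(results):
        for linked in connections.get(pid, []):
            if linked not in seen:
                seen.add(linked)
                results.append(linked)
    return results
```

Search results always come first in the returned list, with connected passages appended after them, which matches the rule that connections expand visibility without replacing search.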
A lightweight, low-ceremony capture system for things worth remembering mid-session — follow-ups, reminders, things to revisit. Not tasks, not archival passages. Just text, who wrote it, and when.
| Field | Values |
|---|---|
| Creator | scott, ren, ted — anyone in the session |
| Status | active · completed · archived (archived stays in DB but hidden from default view) |
| Source | chat (created mid-session) · manual (created outside chat) |
Use notecards for anything that surfaces mid-conversation that neither Scott nor Ren should forget — a follow-up Scott wants, a decision thread to revisit, a question that got parked. One clear notecard is worth more than five vague ones.
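The notecard fields in the table map naturally onto a small record type. The sketch below is an assumed shape, not the actual schema: the class name, attribute names, and defaults are hypothetical, while the allowed status and source values come from the table.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

STATUSES = {"active", "completed", "archived"}
SOURCES = {"chat", "manual"}


@dataclass
class Notecard:
    text: str
    creator: str              # e.g. "scott", "ren", "ted"
    source: str = "chat"      # assumed default: created mid-session
    status: str = "active"
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if self.source not in SOURCES:
            raise ValueError(f"unknown source: {self.source}")


def default_view(cards: list[Notecard]) -> list[Notecard]:
    """Archived cards stay stored but are hidden from the default view."""
    return [card for card in cards if card.status != "archived"]
```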
The chat UI includes an image upload button (paperclip). When Scott sends a photo or screenshot:
- The upload is handled in chat_api.py (jpg, png, gif, webp · max 5MB).
- The resulting memory is written with the [image-memory] tag and the date, so it is searchable in future sessions.

The UI shows a thumbnail preview before the message is sent, and renders the image inline in the chat bubble after. This architecture resolves the previous limitation: earlier versions were blocked on Letta adding native vision support. By calling Claude directly for the vision step, image memory works now regardless of Letta's multimodal roadmap.
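The upload limits mentioned above (jpg, png, gif, webp, max 5MB) can be checked before any vision call is made. This is a minimal sketch, not the code in chat_api.py: the function name is hypothetical, the check uses the file extension only, and reading 5MB as 5 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".jpg", ".png", ".gif", ".webp"}
MAX_BYTES = 5 * 1024 * 1024  # assumption: 5MB interpreted as 5 MiB


def validate_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (accepted, reason) for an uploaded image."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, f"unsupported type: {extension or 'none'}"
    if size_bytes > MAX_BYTES:
        return False, "file exceeds 5MB"
    return True, "ok"
```

An extension check is the cheapest gate; a stricter version would also inspect the file's leading bytes, which is left out here.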
Ren checks memory system health at every session start via the check_memory_health tool. The result covers three areas:
- Database connection status (db_connected), timestamp of the last successful write (last_write_at), and consecutive failure count (consecutive_failures). Tells Ren whether the transcript capture pipeline is running cleanly. Sourced from db.health_status() in han_solo/db.py.
- Failed tier writes, each with from_tier, to_tier, content_key, error message, and timestamp. Surfaces if the archival write pipeline is silently dropping passages.
- The jobs_paused flag in the han_solo_config Postgres table. Toggle it from the Memory panel in the chat UI.

The /health endpoint at han-solo-mcp.onrender.com/health is used by Render's health check and by the workspace UI. It checks two things: whether the Ren agent ID has been resolved in memory, and whether the DB pool is connected. Both must be true for status to return "ok". If either is missing, it returns "degraded" with detail on which component failed.
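The two-condition logic of the /health endpoint can be sketched as a pure function. The function name `mcp_health`, the component labels, and the response keys are assumptions for illustration; only the "ok" and "degraded" status values and the two checked conditions come from the text.

```python
def mcp_health(agent_id_resolved: bool, db_pool_connected: bool) -> dict:
    """Return 'ok' only when both components are ready, else 'degraded' with detail."""
    failed = []
    if not agent_id_resolved:
        failed.append("agent_id")
    if not db_pool_connected:
        failed.append("db_pool")
    if failed:
        return {"status": "degraded", "detail": failed}
    return {"status": "ok"}
```

On a cold start the agent ID may not be resolved yet, so a first call would report `degraded` with `agent_id` in the detail, consistent with the cold-start behavior noted in the Host row of the properties table.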