The hardest design problem I’ve hit so far isn’t security — it’s memory. Specifically, how do you give an AI system multi-turn context without violating the privacy boundary that makes the whole architecture work?

In Sentinel’s design, the planner (Claude) never sees raw output from the worker (Qwen). That’s a core security property. The worker is assumed compromised, so anything it produces could contain manipulated content designed to influence the planner. If I let the planner see Qwen’s raw text, I’ve punched a hole in the security model.
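One way to make that property hard to violate is to enforce it structurally rather than by prompt discipline. Here is a minimal sketch, with hypothetical type names: the planner's input type simply has no field that can carry the worker's raw text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WorkerResult:
    """Untrusted output -- stays on the orchestrator side."""
    raw_output: str
    exit_code: int

@dataclass(frozen=True)
class PlannerView:
    """What the planner sees: trusted facts only. No free-text field exists."""
    exit_code: int
    succeeded: bool

def to_planner_view(result: WorkerResult) -> PlannerView:
    # The orchestrator derives facts deterministically and drops the raw text,
    # so manipulated worker output can never reach the planner's prompt.
    return PlannerView(result.exit_code, result.exit_code == 0)
```

With this shape, leaking worker text to the planner requires changing a type signature, not just a prompt template.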

But without any context from previous steps, the planner is flying blind. It can’t see what the worker actually produced, what errors occurred, or which files were created. Every plan is made in isolation.

The solution came from a research experiment I ran before writing any code. I tested whether the planner could make good decisions from structured metadata alone — not the worker’s actual output, but orchestrator-generated summaries: exit codes, file paths, symbol counts, diff stats, token usage ratios. No raw text from the untrusted model, just facts extracted deterministically by trusted code.
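The kind of extraction described above might look like the following sketch. All names here are hypothetical; the point is that every field is computed by trusted code from observable side effects (process exit codes, `git diff` stats, token counters), never parsed out of the worker's prose.

```python
import subprocess
from dataclasses import dataclass

@dataclass(frozen=True)
class StepMetadata:
    """Facts the orchestrator derives deterministically after each step."""
    exit_code: int
    files_changed: int
    lines_added: int
    lines_removed: int
    output_tokens: int
    token_budget: int

    @property
    def token_usage_ratio(self) -> float:
        return self.output_tokens / self.token_budget

def extract_metadata(exit_code: int, output_tokens: int,
                     token_budget: int, repo_dir: str) -> StepMetadata:
    # Diff stats come straight from git, not from the worker's claims.
    stat = subprocess.run(
        ["git", "diff", "--numstat"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    added = removed = files = 0
    for line in stat.splitlines():
        a, r, _path = line.split("\t")
        files += 1
        added += int(a) if a != "-" else 0   # "-" marks binary files
        removed += int(r) if r != "-" else 0
    return StepMetadata(exit_code, files, added, removed,
                        output_tokens, token_budget)
```

Nothing in that record can carry an injected instruction: it is all integers and ratios produced by code the worker cannot influence except through its real side effects.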

The results were clear: with enriched metadata, the planner scored 18.3 out of 20 on multi-turn scenarios; without it, 14 out of 20. The metadata was enough. The privacy boundary held.

Building it out took four phases — structured outcomes, session intelligence, worker-side context (which stays on the worker side), and episodic memory with fact extraction and decay. Each one was designed around the same constraint: the planner sees what the orchestrator tells it, never what the worker says.