Part 10: The Sandbox

One of the features I’m most proud of is also one of the most paranoid: every shell command the worker model generates runs inside its own disposable container.

The idea came from a simple question — if I assume Qwen is compromised at all times, why am I letting it run commands on the same filesystem as everything else? Even with the scan pipeline catching malicious output, defence in depth means the execution environment itself should be hostile territory for an attacker.

So I built a sandbox. Each command gets a fresh container: spun up, command runs, output captured, container destroyed. No state leaks between commands. If the worker tries something nasty, it’s trapped in a box that ceases to exist the moment the command finishes. The workspace is mounted read-write so legitimate file operations still work, but the container has no network access, no Linux capabilities, and a read-only root filesystem.
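
To make those settings concrete, here is a minimal sketch of what such a launch could look like. It assumes the Docker CLI as the container runtime, which this post doesn't actually name. The function names, the `alpine:3.20` image and the `/workspace` mount point are all illustrative, not the project's real code.

```python
import shlex
import subprocess
from pathlib import Path


def build_sandbox_argv(command: str, workspace: str, image: str = "alpine:3.20") -> list[str]:
    """Build a `docker run` argv for one disposable, locked-down container.

    Hypothetical helper: the image name and /workspace mount point are assumptions.
    """
    workspace_path = Path(workspace).resolve()
    return [
        "docker", "run",
        "--rm",                 # destroy the container as soon as the command exits
        "--network", "none",    # no network access
        "--cap-drop", "ALL",    # drop every Linux capability
        "--read-only",          # read-only root filesystem
        "--volume", f"{workspace_path}:/workspace:rw",  # workspace stays writable
        "--workdir", "/workspace",
        image,
        "sh", "-c", command,    # the worker's command runs only inside the container
    ]


def run_sandboxed(command: str, workspace: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run one command in a fresh container and capture its output.

    The timeout limits the docker client process started here.
    """
    argv = build_sandbox_argv(command, workspace)
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)


if __name__ == "__main__":
    # Print the command line instead of running it, so no Docker daemon is needed.
    print(shlex.join(build_sandbox_argv("ls -la", ".")))
```

Every call starts from the same image, so nothing outside the mounted workspace carries over from one command to the next.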

Or at least, that’s what I thought.

The design came together cleanly — 34 tests, no regressions. It was one of those rare sessions where everything just clicked. Clean abstractions, clear boundaries. I was genuinely proud of it.

That pride would come back to bite me in a way I never expected. But at the time, it felt like the cleanest thing I’d built in the entire project. And the principle behind it — don’t trust the environment, contain it — became a foundation for everything that followed.