Part 12: The Trust Ladder

Trust levels are how Sentinel learns to let go.

At Trust Level 0, every single action requires human approval. The system plans, the worker executes, the scanners check, and then it stops and waits for you to say “yes, go ahead.” It’s safe, but it’s exhausting. You can’t leave it alone for five minutes.

Trust Level 1 introduced auto-approval for safe reads. If the plan only involves reading files, checking statuses, listing directories — things that can’t change anything — the system handles it without asking. Small step, but it meant I could ask “what’s in this directory?” without clicking an approve button.
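To make that rule concrete, here is a minimal sketch of an auto-approval gate. It is not Sentinel's actual code: the action names, the Step type, and the needs_human_approval function are all hypothetical, chosen only to illustrate "auto-approve the plan if every step is read-only."

```python
from dataclasses import dataclass

# Hypothetical action names; the real Sentinel action vocabulary is not shown here.
READ_ONLY_ACTIONS = {"read_file", "check_status", "list_directory"}


@dataclass(frozen=True)
class Step:
    action: str  # what the step does, e.g. "list_directory"
    target: str  # what it operates on, e.g. a path


def needs_human_approval(plan: list[Step], trust_level: int) -> bool:
    """Return True when a human must approve the plan before it runs."""
    if trust_level < 1:
        # Level 0: every plan waits for a human, no exceptions.
        return True
    # Level 1 and up: skip approval only if *every* step is read-only.
    return not all(step.action in READ_ONLY_ACTIONS for step in plan)
```

The important design choice is that the check applies to the whole plan: a single step that could change something sends the entire plan back to a human.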

Trust Level 2 added file reads to the auto-approve list. The system tracks where data came from (the provenance system) and classifies read operations as safe. More autonomy, same security guarantees, because reading a file can’t cause harm.

Trust Level 3 was the first genuinely scary one. File writes. The system can now create and modify files, but only within the workspace, only after the content has been scanned with Semgrep, and only once a human has approved the actual write. It can prepare the content autonomously, but it still checks with you before committing anything to disk.
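The three conditions on a write can be sketched as a single decision function. This is an illustration under assumptions, not the real implementation: may_write and is_inside_workspace are hypothetical names, and the scanner and the human prompt are passed in as plain callables rather than calling Semgrep or a UI directly.

```python
from pathlib import Path
from typing import Callable


def is_inside_workspace(workspace: Path, candidate: str) -> bool:
    """True if candidate, resolved against the workspace, stays inside it."""
    root = workspace.resolve()
    # Resolving collapses ".." segments; an absolute candidate replaces root entirely.
    target = (root / candidate).resolve()
    return root in target.parents


def may_write(
    workspace: Path,
    candidate: str,
    content: str,
    scan_ok: Callable[[str], bool],         # stand-in for the content scanner
    human_approves: Callable[[str], bool],  # stand-in for the approval prompt
) -> bool:
    """Decide whether a write may proceed. This sketch never touches the disk."""
    if not is_inside_workspace(workspace, candidate):
        return False
    if not scan_ok(content):
        return False
    # The human is asked last, and only about writes that passed the other checks.
    return human_approves(candidate)
```

Ordering the checks this way means the human only ever sees writes that are already confined to the workspace and have passed scanning.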

Trust Level 4 is the big one. Autonomous command execution. The planner specifies exactly which commands and paths each step is allowed to use, the constraint validator enforces those boundaries at runtime, and a constitutional denylist blocks certain things regardless — no rm -rf, no privilege escalation, no network exfiltration, ever.
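Here is a rough sketch of how per-step constraints and a fixed denylist could fit together. Everything in it is an assumption made for illustration: the StepConstraints type, the function names, and the denylist entries are hypothetical. It also assumes commands run without a shell and that every non-flag argument is an absolute path. Network exfiltration is not modelled at all.

```python
import posixpath
import shlex
from dataclasses import dataclass

# Illustrative only; the real constitutional denylist is not reproduced here.
DENYLISTED_PROGRAMS = {"sudo", "su"}


@dataclass(frozen=True)
class StepConstraints:
    allowed_commands: frozenset[str]        # programs the planner permits for this step
    allowed_path_prefixes: tuple[str, ...]  # e.g. ("/workspace/",)


def is_denylisted(argv: list[str]) -> bool:
    """Checks that apply no matter what the planner allowed."""
    if not argv or argv[0] in DENYLISTED_PROGRAMS:
        return True
    if argv[0] == "rm":
        # Collect short flags so "-rf", "-fr" and "-r -f" are all caught.
        flags = "".join(
            a[1:] for a in argv[1:] if a.startswith("-") and not a.startswith("--")
        ).lower()
        return "r" in flags and "f" in flags
    return False


def validate_command(command: str, constraints: StepConstraints) -> bool:
    """Return True only if the command passes the denylist and the step's constraints."""
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes and similar: refuse rather than guess
    if is_denylisted(argv):
        return False
    if argv[0] not in constraints.allowed_commands:
        return False
    # Normalise paths first so "/workspace/../etc" cannot sneak past the prefix check.
    paths = [posixpath.normpath(a) for a in argv[1:] if not a.startswith("-")]
    return all(p.startswith(constraints.allowed_path_prefixes) for p in paths)
```

Note that the denylist runs before the planner's allowlist is consulted, so a plan that explicitly permits rm still cannot get rm -rf through.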

Each level was its own project. Design, implement, benchmark, red team, fix, re-benchmark. Then activate.