Security

Part 46: Seventy-Eight Findings

A systematic audit of every API endpoint, middleware layer, and frontend component. Seventy-eight findings. Some embarrassing. All fixable.

Part 45: More Than One User

Sentinel was built for one person. Making it work for multiple users meant rethinking auth, isolation, and how the system tracks who’s who.

Part 37: Trust Laundering

The injection benchmark found 11 exploits. All shared the same root cause — files in the workspace inherited trusted status regardless of who put them there.

Part 36: The Injection Benchmark

A custom-built injection benchmark with real email, real calendars, real web pages. No simulated backends. 130 tests designed to break the trust architecture.

Part 35: The Stress Test

38 hours, 1,588 probes, zero human intervention. The first comprehensive validation with everything deployed.

Part 34: Tightening the Screws

The features were built. Now came the hardening: false-positive reduction, credential scanner expansion, metadata enrichment, and 600 new tests before the big run.

Part 30: Multi-User

JWT authentication, per-user trust levels, encrypted credentials, and proof that two users can’t see each other’s data.

Part 28: Bug Hunt Three

The third full security audit. 13 batches of fixes, from API hardening to dead code removal.

Part 27: Hardening the Database

Row-level security, role separation, and a red team that tried SQL injection, LISTEN/NOTIFY attacks, and privilege escalation through PL/pgSQL.

Part 21: The Second Audit

199 findings across 7 units. 19 fix batches. 7 systemic improvements. The most thorough review the codebase has ever had.

Part 18: The False Positive Problem

Without a human to override scanners, false positives become functional failures. Risk decay was the fix.

Part 16: The Sandbox Wasn’t Real

Every sandbox field was snake_case. Podman’s API requires PascalCase. The result: HTTP 201 Created and zero containment.

Part 15: The Red Team

Four attack scenarios, including a simulated compromised planner. Six clean runs before trusting it.

Part 12: The Trust Ladder

Five trust levels, from full human approval to autonomous execution. Each one its own project.

Part 10: The Sandbox

Every shell command runs in a disposable container. No state leaks, no network, no capabilities. Or so I thought.

Part 7: The Bug Hunt

99 findings from a systematic security audit of my own code. Zero critical — but 16 high-severity.

Part 5: Three Out of Five

A security system that scores 3/5 on security is failing. The score became a to-do list.

Part 3: Why Sentinel Exists

From a custom PC build to containers, local LLMs, and the paper that started it all — CaMeL.

Part 1: The Privacy Panic

How a Google Maps feature turned a close protection worker into a privacy obsessive — and laid the foundation for everything that came after.