Sentinel

https://sentinel-blog.cherrypod.org/
Recent content on Sentinel
Hugo 0.147.0, en-gb
Sun, 22 Feb 2026 00:00:00 +0000

Part 53: Matrix
https://sentinel-blog.cherrypod.org/posts/53-matrix/
Sun, 22 Feb 2026 00:00:00 +0000
Sentinel needed a self-hosted messaging channel — something fully under our control, with no dependency on third-party services. Matrix fits.

Part 52: The Refactor
https://sentinel-blog.cherrypod.org/posts/52-the-refactor/
Sat, 21 Feb 2026 00:00:00 +0000
The codebase had grown fast. Features landed, bugs got fixed, new capabilities kept shipping. Then a structural audit revealed what that pace had cost: god files, god functions, and a growing maintenance burden.

Part 51: Did It Actually Work?
https://sentinel-blog.cherrypod.org/posts/51-goal-verification/
Fri, 20 Feb 2026 00:00:00 +0000
The planner could execute multi-step tasks. But it had no way to verify its own work. If a step failed silently, it carried on regardless. Time to close the loop.

Part 50: Pictures, Videos, Documents
https://sentinel-blog.cherrypod.org/posts/50-attachment-pipeline/
Thu, 19 Feb 2026 00:00:00 +0000
Sentinel could generate code, build websites, write files. But every piece of content was synthetic — generated from scratch by an LLM. What if it could use real photos, real videos, real documents?

Part 49: Real-World Data
https://sentinel-blog.cherrypod.org/posts/49-weather-crypto-tools/
Wed, 18 Feb 2026 00:00:00 +0000
Sentinel could browse the web. It could read files and write code. But it couldn’t answer ‘what’s the weather?’ without fabricating something.
Time to give it real data backends.

Part 48: Named Anchors
https://sentinel-blog.cherrypod.org/posts/48-anchor-allocator/
Tue, 17 Feb 2026 00:00:00 +0000
file_patch needs the planner to find a unique anchor string in existing code. The planner is bad at this. What if the system placed named markers instead?

Part 47: Wrong Language, Wrong File
https://sentinel-blog.cherrypod.org/posts/47-cross-language-fixer/
Mon, 16 Feb 2026 00:00:00 +0000
LLMs don’t always respect file boundaries. Raw CSS lands in HTML files, JavaScript appears without script tags. Building a detector to catch it.

Part 46: Seventy-Eight Findings
https://sentinel-blog.cherrypod.org/posts/46-security-audit/
Sun, 15 Feb 2026 00:00:00 +0000
A systematic audit of every API endpoint, middleware layer, and frontend component. Seventy-eight findings. Some embarrassing. All fixable.

Part 45: More Than One User
https://sentinel-blog.cherrypod.org/posts/45-multi-user/
Sat, 14 Feb 2026 00:00:00 +0000
Sentinel was built for one person. Making it work for multiple users meant rethinking auth, isolation, and how the system tracks who’s who.

Part 44: Did It Actually Work?
https://sentinel-blog.cherrypod.org/posts/44-task-verification/
Fri, 13 Feb 2026 00:00:00 +0000
Tasks were reporting success based on whether steps completed, not whether the goal was achieved. Building a verification system to tell the difference.

Part 43: Learning From Plans
https://sentinel-blog.cherrypod.org/posts/43-plan-outcome-memory/
Thu, 12 Feb 2026 00:00:00 +0000
The system could remember what happened during tasks, but not what the plan was or whether it worked.
Adding plan-outcome memory to close the loop.

Part 42: Building Websites
https://sentinel-blog.cherrypod.org/posts/42-building-websites/
Wed, 11 Feb 2026 00:00:00 +0000
New planner, new classifier, new patching tool. Putting it all together to iteratively build and modify websites through conversation.

Part 41: file_patch
https://sentinel-blog.cherrypod.org/posts/41-file-patch/
Tue, 10 Feb 2026 00:00:00 +0000
Full-file regeneration breaks at scale. The new tool generates only the changed fragments and splices them deterministically.

Part 40: The Model Upgrade
https://sentinel-blog.cherrypod.org/posts/40-the-model-upgrade/
Mon, 09 Feb 2026 00:00:00 +0000
Tested three planner models on identical tasks. The surprise: upgrading the planner fixed the worker’s bugs.

Part 39: Real-World Testing
https://sentinel-blog.cherrypod.org/posts/39-real-world-testing/
Sun, 08 Feb 2026 00:00:00 +0000
Programmatic benchmarks said the system worked. Typing real prompts told a different story.

Part 38: The Classifier Swap
https://sentinel-blog.cherrypod.org/posts/38-the-classifier-swap/
Sat, 07 Feb 2026 00:00:00 +0000
The LLM classifier was slow, expensive, and occasionally wrong. A deterministic keyword matcher replaced it in microseconds with zero GPU.

Part 37: Trust Laundering
https://sentinel-blog.cherrypod.org/posts/37-trust-laundering/
Fri, 06 Feb 2026 00:00:00 +0000
The injection benchmark found 11 exploits.
All shared the same root cause — files in the workspace inherited trusted status regardless of who put them there.

Part 36: The Injection Benchmark
https://sentinel-blog.cherrypod.org/posts/36-the-injection-benchmark/
Thu, 05 Feb 2026 00:00:00 +0000
A custom-built injection benchmark with real email, real calendars, real web pages. No simulated backends. 130 tests designed to break the trust architecture.

Part 35: The Stress Test
https://sentinel-blog.cherrypod.org/posts/35-the-stress-test/
Wed, 04 Feb 2026 00:00:00 +0000
38 hours, 1,588 probes, zero human intervention. The first comprehensive validation with everything deployed.

Part 34: Tightening the Screws
https://sentinel-blog.cherrypod.org/posts/34-tightening-the-screws/
Tue, 03 Feb 2026 00:00:00 +0000
The features were built. Now came the hardening — FP reduction, credential scanner expansion, metadata enrichment, and 600 new tests before the big run.

Part 33: The Invisible Bottleneck
https://sentinel-blog.cherrypod.org/posts/33-the-invisible-bottleneck/
Mon, 02 Feb 2026 00:00:00 +0000
Qwen was silently spilling VRAM to CPU.
Fixing the KV cache quantisation unlocked more context and faster inference.

Part 32: Keeping the Lights On
https://sentinel-blog.cherrypod.org/posts/32-keeping-the-lights-on/
Sun, 01 Feb 2026 00:00:00 +0000
Reboot resilience, health watchdogs, compose locking, and the infrastructure that keeps an autonomous system running unsupervised.

Part 31: Thinking on Its Feet
https://sentinel-blog.cherrypod.org/posts/31-thinking-on-its-feet/
Sat, 31 Jan 2026 00:00:00 +0000
Dynamic replanning and failure recovery — the planner adapts when reality doesn’t match the plan.

Part 30: Multi-User
https://sentinel-blog.cherrypod.org/posts/30-multi-user/
Fri, 30 Jan 2026 00:00:00 +0000
JWT authentication, per-user trust levels, encrypted credentials, and proof that two users can’t see each other’s data.

Part 29: Learning From Experience
https://sentinel-blog.cherrypod.org/posts/29-learning-from-experience/
Thu, 29 Jan 2026 00:00:00 +0000
Cross-session episodic memory — the system remembers what worked, what failed, and applies that knowledge to future tasks.

Part 28: Bug Hunt Three
https://sentinel-blog.cherrypod.org/posts/28-bug-hunt-three/
Wed, 28 Jan 2026 00:00:00 +0000
The third full security audit.
13 batches of fixes, from API hardening to dead code removal.

Part 27: Hardening the Database
https://sentinel-blog.cherrypod.org/posts/27-hardening-the-database/
Tue, 27 Jan 2026 00:00:00 +0000
Row-level security, role separation, and a red team that tried SQL injection, LISTEN/NOTIFY attacks, and privilege escalation through PL/pgSQL.

Part 26: Knowing Who You Are
https://sentinel-blog.cherrypod.org/posts/26-knowing-who-you-are/
Mon, 26 Jan 2026 00:00:00 +0000
A contact registry, a user model, and a confirmation gate — the groundwork for multi-user and the end of ‘user 1 does everything.’

Part 25: The Code Fixer
https://sentinel-blog.cherrypod.org/posts/25-the-code-fixer/
Sun, 25 Jan 2026 00:00:00 +0000
LLMs generate broken code. The code fixer catches it before it hits the filesystem — 7 auto-fixers across 10+ languages.

Part 24: Breaking Up the Monolith
https://sentinel-blog.cherrypod.org/posts/24-breaking-up-the-monolith/
Sat, 24 Jan 2026 00:00:00 +0000
The orchestrator was doing too much. Six phases to extract it into focused modules without breaking a single security invariant.

Part 23: The Database Migration
https://sentinel-blog.cherrypod.org/posts/23-the-database-migration/
Fri, 23 Jan 2026 00:00:00 +0000
SQLite to PostgreSQL. Store protocols, async rewrite, data migration, and then ripping out every line of SQLite code.

Part 22: The Router
https://sentinel-blog.cherrypod.org/posts/22-the-router/
Thu, 22 Jan 2026 00:00:00 +0000
Not every request needs a frontier model to plan it.
The router classifies incoming messages and takes the fast path when it can.

Part 21: The Second Audit
https://sentinel-blog.cherrypod.org/posts/21-the-second-audit/
Wed, 21 Jan 2026 00:00:00 +0000
199 findings across 7 units. 19 fix batches. 7 systemic improvements. The most thorough review the codebase has ever had.

Part 20: The Interface
https://sentinel-blog.cherrypod.org/posts/20-the-interface/
Tue, 20 Jan 2026 00:00:00 +0000
Giving Sentinel a proper UI — dashboard health cards, chat, memory browser, routine management, and a GSP mascot.

Part 19: Where It Stands
https://sentinel-blog.cherrypod.org/posts/19-where-it-stands/
Mon, 19 Jan 2026 00:00:00 +0000
TL4 is live. The system is autonomous. It’s not finished — not even close.

Part 18: The False Positive Problem
https://sentinel-blog.cherrypod.org/posts/18-the-false-positive-problem/
Sun, 18 Jan 2026 00:00:00 +0000
Without a human to override scanners, false positives become functional failures. Risk decay was the fix.

Part 17: Flipping the Switch
https://sentinel-blog.cherrypod.org/posts/17-flipping-the-switch/
Sat, 17 Jan 2026 00:00:00 +0000
TL4 activation. One environment variable, one container rebuild. Sentinel starts making its own decisions.

Part 16: The Sandbox Wasn't Real
https://sentinel-blog.cherrypod.org/posts/16-the-sandbox-wasnt-real/
Fri, 16 Jan 2026 00:00:00 +0000
Every sandbox field was snake_case. Podman’s API requires PascalCase. HTTP 201 Created.
Zero containment.

Part 15: The Red Team
https://sentinel-blog.cherrypod.org/posts/15-the-red-team/
Thu, 15 Jan 2026 00:00:00 +0000
Four attack scenarios, including a simulated compromised planner. Six clean runs before trusting it.

Part 14: The Benchmark That Broke Everything
https://sentinel-blog.cherrypod.org/posts/14-the-benchmark-that-broke-everything/
Wed, 14 Jan 2026 00:00:00 +0000
1,136 adversarial prompts. A 62% false positive rate on multi-step plans. The ascii gate was the culprit.

Part 13: Reaching the Outside World
https://sentinel-blog.cherrypod.org/posts/13-reaching-the-outside-world/
Tue, 13 Jan 2026 00:00:00 +0000
Signal, Telegram, email, calendar, web search — wiring Sentinel into the channels I already use.

Part 12: The Trust Ladder
https://sentinel-blog.cherrypod.org/posts/12-the-trust-ladder/
Mon, 12 Jan 2026 00:00:00 +0000
Five trust levels, from full human approval to autonomous execution. Each one its own project.

Part 11: Teaching the System to Remember
https://sentinel-blog.cherrypod.org/posts/11-teaching-the-system-to-remember/
Sun, 11 Jan 2026 00:00:00 +0000
The hardest design problem wasn’t security — it was giving the planner context without breaking the privacy boundary.

Part 10: The Sandbox
https://sentinel-blog.cherrypod.org/posts/10-the-sandbox/
Sat, 10 Jan 2026 00:00:00 +0000
Every shell command runs in a disposable container. No state leaks, no network, no capabilities.
Or so I thought.

Part 9: Working in Parallel
https://sentinel-blog.cherrypod.org/posts/09-working-in-parallel/
Fri, 09 Jan 2026 00:00:00 +0000
Git worktrees unlocked parallel development — four feature branches merging simultaneously. Then the integration bugs arrived.

Part 8: The Night I Almost Changed Everything
https://sentinel-blog.cherrypod.org/posts/08-the-night-i-almost-changed-everything/
Thu, 08 Jan 2026 00:00:00 +0000
After finding 99 bugs, I wrote 230KB of analysis on whether to scrap the whole project and build something else instead.

Part 7: The Bug Hunt
https://sentinel-blog.cherrypod.org/posts/07-the-bug-hunt/
Wed, 07 Jan 2026 00:00:00 +0000
99 findings from a systematic security audit of my own code. Zero critical — but 16 high-severity.

Part 6: Burning It Down
https://sentinel-blog.cherrypod.org/posts/06-burning-it-down/
Tue, 06 Jan 2026 00:00:00 +0000
826 passing tests. A functional pipeline. I deleted all of it and rebuilt the package structure from scratch.

Part 5: Three Out of Five
https://sentinel-blog.cherrypod.org/posts/05-three-out-of-five/
Mon, 05 Jan 2026 00:00:00 +0000
A security system that scores 3/5 on security is failing. The score became a to-do list.

Part 4: The First Real Test
https://sentinel-blog.cherrypod.org/posts/04-the-first-real-test/
Sun, 04 Jan 2026 00:00:00 +0000
741 attack prompts, run overnight.
The results changed the entire direction of the project.

Part 3: Why Sentinel Exists
https://sentinel-blog.cherrypod.org/posts/03-why-sentinel-exists/
Sat, 03 Jan 2026 00:00:00 +0000
From a custom PC build to containers, local LLMs, and the paper that started it all — CaMeL.

Part 2: The Tinkering Years
https://sentinel-blog.cherrypod.org/posts/02-the-tinkering-years/
Fri, 02 Jan 2026 00:00:00 +0000
Raspberry Pis, broken Bitcoin nodes, and the moment AI stopped being a search engine and started being a development partner.

Part 1: The Privacy Panic
https://sentinel-blog.cherrypod.org/posts/01-the-privacy-panic/
Thu, 01 Jan 2026 00:00:00 +0000
How a Google Maps feature turned a close protection worker into a privacy obsessive — and laid the foundation for everything that came after.

Hello World — Introducing the Sentinel Blog
https://sentinel-blog.cherrypod.org/posts/2026-03-01-hello-world/
Wed, 31 Dec 2025 00:00:00 +0000
Introducing the Sentinel project blog — what it is, why it exists, and what to expect.