Part 46: Seventy-Eight Findings

With multi-user in place, the system needed a proper security review. Not a quick scan — a methodical walk through every API route, every middleware check, every frontend component. Seventeen modules examined. Seventy-eight findings documented.

Some were expected. Cross-user isolation gaps in streaming endpoints — the SSE event stream and log viewer were broadcasting events for all users, not filtering by the authenticated user. WebSocket routine events had the same problem. These are the kind of bugs that don’t exist in a single-user system and only appear when user boundaries actually matter.
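
As a rough illustration of the fix, not the project's actual code, the filter can sit between the event source and the SSE response so that nothing belonging to another user ever reaches the wire. The `Event` type, its `user_id` field, and the function names here are all hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass(frozen=True)
class Event:
    user_id: str  # hypothetical: owner of the task or routine that emitted the event
    payload: str


async def events_for_user(source: AsyncIterator[Event], user_id: str) -> AsyncIterator[str]:
    """Yield SSE-formatted frames only for events owned by user_id."""
    async for event in source:
        if event.user_id != user_id:
            continue  # drop other users' events before they are serialised
        yield f"data: {event.payload}\n\n"


async def _demo_source() -> AsyncIterator[Event]:
    # Stand-in for the shared event bus that every connection reads from.
    for event in (
        Event("alice", "task started"),
        Event("bob", "log line"),
        Event("alice", "task done"),
    ):
        yield event


async def collect(user_id: str) -> List[str]:
    return [frame async for frame in events_for_user(_demo_source(), user_id)]
```

The `user_id` passed in should come from the authenticated session, never from a query parameter the client controls.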

Some were embarrassing. The rate limiter was a no-op. SlowAPIMiddleware had never been registered in the ASGI stack. Every @limiter.limit() decorator in the codebase — every carefully considered rate limit on login attempts, API calls, file operations — was decorative. The middleware registration was a one-line fix. The fact that it had been missing since rate limiting was first added was harder to accept.

The IDOR vulnerabilities followed a pattern. List endpoints for memory, webhooks, and routines returned all records, not just the authenticated user’s. The database had row-level security as a backstop, but the API layer wasn’t filtering. Defence in depth means both layers should enforce the boundary, not just the one you hope catches it.
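
A minimal sketch of what API-layer filtering looks like, using an in-memory SQLite table as a stand-in for the real database. The `webhooks` table, the `owner_id` column, and the function names are assumptions made for the example.

```python
import sqlite3
from typing import List, Tuple


def list_webhooks(conn: sqlite3.Connection, user_id: str) -> List[Tuple[int, str]]:
    """Return only the webhooks owned by user_id."""
    # The owner filter lives in the query itself, so the API layer enforces
    # the boundary even if a database-level policy is missing or misconfigured.
    cursor = conn.execute(
        "SELECT id, url FROM webhooks WHERE owner_id = ? ORDER BY id",
        (user_id,),
    )
    return cursor.fetchall()


def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE webhooks ("
        "id INTEGER PRIMARY KEY, owner_id TEXT NOT NULL, url TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO webhooks (owner_id, url) VALUES (?, ?)",
        [
            ("alice", "https://example.test/a"),
            ("bob", "https://example.test/b"),
            ("alice", "https://example.test/c"),
        ],
    )
    return conn
```

The same shape applies to the memory and routine list endpoints: the authenticated user's ID is a required argument to the query, not an optional filter.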

Frontend hardening turned up the usual suspects. The HTML sanitiser wasn’t stripping style attributes — a vector for CSS-based data exfiltration. Status values rendered in the UI weren’t escaped, creating a stored XSS path if a task name contained markup. The health endpoint was verbose enough to fingerprint the backend stack.

The CSRF findings were more nuanced. The content security policy exempted /sites from same-origin enforcement so that user-built websites could load resources. But the exemption was too broad: it covered every path under /sites, not just static file serving. Narrowing it to /sites/static/ kept the functionality without exposing the rest of the API surface. Similarly, the webhook endpoint had a blanket CSRF exemption that should have been scoped to requests with valid webhook signatures.
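
Both scoping rules can be sketched as small predicates. This assumes the path has already been URL-decoded, and that webhooks are signed with a hex-encoded HMAC-SHA256 over the raw body. The signing scheme and the function names are illustrative assumptions, not the project's confirmed design.

```python
import hashlib
import hmac
from posixpath import normpath

STATIC_PREFIX = "/sites/static/"


def is_static_site_path(path: str) -> bool:
    """True only for paths that resolve to something under /sites/static/."""
    # Normalise first so "/sites/static/../api/tasks" cannot borrow the exemption.
    return normpath(path).startswith(STATIC_PREFIX)


def has_valid_signature(body: bytes, signature_hex: str, secret: bytes) -> bool:
    """True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))
```

A request would skip the check only when the matching predicate returns True. Everything else under /sites, and any unsigned webhook call, goes through the normal enforcement path.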

The authentication hardening was incremental. Constant-time PIN comparison to prevent timing attacks. JTI (JWT ID) validation on every request, not just revocation checks. A logout endpoint that actually invalidates the token server-side. Periodic cleanup of the revocation set to prevent unbounded memory growth. None of these is dramatic on its own, but together they close the gap between “has authentication” and “authentication is robust.”
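
Two of these are easy to show in miniature. The sketch below uses the standard library's `hmac.compare_digest` for the constant-time comparison, and a revocation set that remembers each token's expiry so that entries can be purged once the token would have been rejected anyway. The class and method names are hypothetical, and a real system would compare PIN hashes rather than raw PINs.

```python
import hmac
import time
from typing import Dict, Optional


def pin_matches(submitted: str, expected: str) -> bool:
    """Compare two secrets without leaking where the first mismatch occurs."""
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))


class RevocationSet:
    """Revoked token IDs (JTIs), each kept only until its token expires."""

    def __init__(self) -> None:
        self._expiry_by_jti: Dict[str, float] = {}

    def revoke(self, jti: str, expires_at: float) -> None:
        self._expiry_by_jti[jti] = expires_at

    def is_revoked(self, jti: str) -> bool:
        return jti in self._expiry_by_jti

    def purge_expired(self, now: Optional[float] = None) -> int:
        """Drop entries whose tokens have expired; return how many were removed."""
        now = time.time() if now is None else now
        expired = [jti for jti, exp in self._expiry_by_jti.items() if exp <= now]
        for jti in expired:
            del self._expiry_by_jti[jti]
        return len(expired)
```

Calling `purge_expired()` on a timer keeps the set bounded by the number of tokens revoked within one token lifetime.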

Structured audit logging went in alongside the fixes. Security-critical events — login attempts, PIN changes, session revocations, task submissions — now produce structured log entries with consistent fields. Not for compliance (this is a personal project), but for debuggability. When something goes wrong with auth, the logs should tell the story without needing to reproduce the issue.
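
A minimal sketch of what "consistent fields" can mean in practice: one JSON object per event, always carrying the same top-level keys. The field names and the `audit_event` helper are assumptions for illustration.

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("audit")


def audit_event(event: str, user_id: str, outcome: str, **details: object) -> str:
    """Emit one structured audit entry and return the serialised line."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event,      # e.g. "login_attempt", "pin_change"
        "user_id": user_id,
        "outcome": outcome,  # e.g. "success" or "failure"
        "details": details,  # free-form, event-specific context
    }
    # default=str keeps logging from failing on values JSON cannot encode natively.
    line = json.dumps(entry, sort_keys=True, default=str)
    audit_logger.info(line)
    return line
```

A failed login would then be recorded with `audit_event("login_attempt", "alice", "failure", reason="bad_pin")`, and a log search on `event` and `outcome` is enough to reconstruct the sequence.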

The policy engine remediation was a separate pass. ANSI-C quoting in shell commands wasn’t being decoded, so $'\t' was being treated as a literal string instead of a tab character. Environment variable prefixes weren’t being normalised. Small robustness bugs, the kind that only surface under specific input patterns.
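
To make the ANSI-C quoting issue concrete, here is a small decoder for a single `$'...'` token. It is a sketch covering the common backslash escapes plus hex and octal. It deliberately omits `\u`, `\U`, and `\c` sequences, and the function names are hypothetical rather than taken from the policy engine.

```python
import re

_SIMPLE_ESCAPES = {
    "a": "\a", "b": "\b", "e": "\x1b", "E": "\x1b", "f": "\f", "n": "\n",
    "r": "\r", "t": "\t", "v": "\v", "\\": "\\", "'": "'", '"': '"', "?": "?",
}

# A whole token of the form $'...', allowing backslash-escaped characters inside.
_ANSI_C_TOKEN = re.compile(r"\$'((?:[^'\\]|\\.)*)'", re.DOTALL)

# One escape: \xHH (1-2 hex digits), \NNN (1-3 octal digits), or \<any char>.
_ESCAPE = re.compile(r"\\(x[0-9A-Fa-f]{1,2}|[0-7]{1,3}|.)", re.DOTALL)


def _decode_escape(match: "re.Match[str]") -> str:
    body = match.group(1)
    if body[0] == "x" and len(body) > 1:
        return chr(int(body[1:], 16))
    if body[0] in "01234567":
        return chr(int(body, 8) & 0xFF)
    # Unknown escapes are left as written, backslash included.
    return _SIMPLE_ESCAPES.get(body, "\\" + body)


def decode_ansi_c(token: str) -> str:
    """Decode a $'...' token; return any other token unchanged."""
    match = _ANSI_C_TOKEN.fullmatch(token)
    if match is None:
        return token
    return _ESCAPE.sub(_decode_escape, match.group(1))
```

With this in place, a policy check sees the character the shell would actually see, so a rule written against a tab or newline cannot be sidestepped by spelling it as an escape.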

Seventy-eight findings sounds like a lot. It is. But the majority were variants of the same underlying patterns: single-user assumptions that didn’t survive multi-user activation, and defence-in-depth layers where only one layer was actually enforcing. The fixes were systematic — once the pattern was identified, every instance could be found and addressed. The audit itself took longer than the remediation.