Sentinel had a growing collection of tools — file operations, code generation, web browsing, search. But certain categories of question exposed a gap. “What’s the weather in London?” would produce a confident, entirely fabricated answer. “What’s the Bitcoin price?” would get a number that was plausible but wrong. The planner had no way to fetch structured, real-time data from external APIs. It could browse a weather website and try to scrape the answer, but that’s fragile, slow, and produces unstructured text that’s hard to verify.

The solution was purpose-built data backends. Not general web scraping, but direct API integrations that return structured, typed results. Weather was the first target, crypto prices the second. Both follow the same architectural pattern: an abstract backend interface, multiple concrete implementations, automatic fallback between them, and provenance tagging on the output.
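A minimal sketch of that pattern, assuming hypothetical names (`DataBackend`, `TaggedResult`, `fetch_with_fallback` are illustrative, not Sentinel's actual classes):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any

@dataclass
class TaggedResult:
    """A backend response with provenance attached."""
    data: dict[str, Any]
    source: str            # which backend answered
    trust: str = "UNTRUSTED"

class DataBackend(ABC):
    name: str = "base"

    @abstractmethod
    def fetch(self, query: str) -> dict[str, Any]:
        """Return structured data for the query, or raise on failure."""

def fetch_with_fallback(backends: list[DataBackend], query: str) -> TaggedResult:
    """Try each backend in order; first success wins, tagged with its source."""
    last_error: Exception | None = None
    for backend in backends:
        try:
            return TaggedResult(data=backend.fetch(query), source=backend.name)
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"all backends failed for {query!r}") from last_error
```

The tag travels with the data, so whatever consumes the result knows both where it came from and that it is external content.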

The weather system has two backends. Met Office covers the UK with high accuracy — their Weather DataHub Global Spot API returns detailed forecasts with specific weather codes that map to human-readable conditions. Open-Meteo covers everywhere else, using the WMO standard weather code set. It’s free, requires no API key, and provides reasonable global coverage. A geocoding service resolves place names to coordinates before either backend is called, with an in-memory cache to avoid hammering the geocoder for repeated lookups.
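The caching step in front of the geocoder might look something like this (a sketch under assumed names — `GeocodeCache` and the resolver signature are not the actual implementation):

```python
from typing import Callable

Coords = tuple[float, float]

class GeocodeCache:
    """In-memory cache in front of a geocoder, so repeated lookups of
    the same place name hit the network only once."""

    def __init__(self, resolver: Callable[[str], Coords]):
        self._resolver = resolver          # e.g. a wrapper around a geocoding API
        self._cache: dict[str, Coords] = {}

    def resolve(self, place: str) -> Coords:
        key = place.strip().lower()        # normalise so "London" and "london " match
        if key not in self._cache:
            self._cache[key] = self._resolver(key)
        return self._cache[key]
```

Normalising the key before the lookup is what makes casual variants of the same place name collapse into a single cache entry.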

Routing between backends is automatic. UK locations go to Met Office first. If the Met Office API key isn’t configured, or the request fails, it falls back to Open-Meteo. Non-UK locations go straight to Open-Meteo. The planner doesn’t need to know which backend answered — both normalise into the same WeatherResult structure with temperature, conditions, wind speed, and humidity. A three-day forecast is included alongside current conditions.
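The routing rule reduces to a few lines. In this sketch the `WeatherResult` field names and the `is_uk` flag are illustrative guesses, and each backend is just a callable that either returns a result or raises:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class WeatherResult:
    temperature_c: float
    conditions: str
    wind_speed_kmh: float
    humidity_pct: float
    source: str

Backend = Callable[[float, float], WeatherResult]

def get_weather(lat: float, lon: float, is_uk: bool,
                met_office: Backend, open_meteo: Backend,
                met_office_key: Optional[str] = None) -> WeatherResult:
    """UK locations try Met Office first (if a key is configured),
    falling back to Open-Meteo; everywhere else goes straight to Open-Meteo."""
    if is_uk and met_office_key:
        try:
            return met_office(lat, lon)
        except Exception:
            pass                           # fall through to Open-Meteo
    return open_meteo(lat, lon)
```

Because both callables return the same `WeatherResult`, the caller never needs to branch on which service actually answered.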

Crypto prices follow the same dual-backend pattern. CoinGecko provides rich data — prices in both GBP and USD, 24-hour percentage change, market cap, volume. But it has rate limits and a slight lag. Binance gives real-time USD ticker prices with minimal latency, but only USD and no market cap data. A map of ten major coins (BTC, ETH, SOL, XRP, ADA, DOT, LINK, AVAX, DOGE, MATIC) with aliases handles the translation between user input (“bitcoin”, “btc”, “BTC”) and each backend’s identifier format.
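The alias map might be sketched like this, with a three-coin subset for illustration (CoinGecko does use slug ids like `bitcoin` and Binance uses ticker pairs like `BTCUSDT`, but the exact table and lookup function here are assumptions):

```python
# Illustrative subset of the coin map; the real table covers ten coins.
COINS = {
    "btc": {"names": {"bitcoin"},  "coingecko": "bitcoin",  "binance": "BTCUSDT"},
    "eth": {"names": {"ethereum"}, "coingecko": "ethereum", "binance": "ETHUSDT"},
    "sol": {"names": {"solana"},   "coingecko": "solana",   "binance": "SOLUSDT"},
}

def resolve_coin(user_input: str, backend: str) -> str:
    """Map 'bitcoin', 'btc', or 'BTC' to the identifier a backend expects."""
    key = user_input.strip().lower()
    for symbol, entry in COINS.items():
        if key == symbol or key in entry["names"]:
            return entry[backend]
    raise KeyError(f"unsupported coin: {user_input!r}")
```

Raising on an unknown coin, rather than passing the raw string through, keeps a typo from turning into a confusing upstream API error.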

The default is Binance for speed, with CoinGecko available when richer data is needed. If the user asks for a coin that Binance doesn’t support, it automatically falls back to CoinGecko. Output is formatted for readability — currency symbols, directional arrows for 24-hour change, human-readable market cap (“$1.52T” rather than raw numbers).
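The readability formatting is mechanical; a sketch of what the market-cap and change formatters could look like (not the actual formatter):

```python
def format_market_cap(value: float) -> str:
    """Render a raw market cap as a compact figure like $1.52T."""
    for threshold, suffix in ((1e12, "T"), (1e9, "B"), (1e6, "M")):
        if value >= threshold:
            return f"${value / threshold:.2f}{suffix}"
    return f"${value:,.0f}"

def format_change(pct: float) -> str:
    """Render 24-hour change with a directional arrow."""
    arrow = "↑" if pct >= 0 else "↓"
    return f"{arrow} {abs(pct):.2f}%"
```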

Both tools are wired through the standard executor pipeline. The planner sees them as tool descriptions in its context. When invoked, the handler validates parameters, calls the appropriate backend, and returns the result. Every piece of data that comes back is tagged as UNTRUSTED with a DataSource.WEB provenance marker. This means weather forecasts and price data flow through the same security pipeline as any other external content — the judge sees the provenance tag and treats the data accordingly.

The backends are deliberately read-only. No authentication tokens are stored for services that could modify state. Met Office needs an API key for access, but the key only grants read permission. CoinGecko and Open-Meteo don’t require authentication at all. Binance uses its public ticker endpoint. The attack surface is minimal — even if the tool execution were somehow compromised, the worst case is fetching public data.

What makes this different from just telling the planner to browse a weather website is reliability and verifiability. A structured API response either parses correctly or fails cleanly. There’s no ambiguity about whether the planner correctly extracted “15°C” from a page full of HTML. The temperature is a number in a JSON field. The conditions are a code that maps to a known string. The judge can verify that the tool returned valid data without understanding weather.

These tools also have a fast path. Sentinel’s classifier recognises certain messages as direct data requests and routes them straight to the API — bypassing the planner entirely. Sending “bitcoin” gets a price back from Binance in under a second. Sending “weather” hits the Met Office API with a preconfigured default location and returns a forecast just as fast. No planning step, no tool selection, no LLM reasoning about what the user wants. The classifier sees a single-word data request, calls the backend directly, and returns the result. For queries that need more context — “compare the weather in three cities” or “track BTC over the last hour” — the full planner pipeline handles it. But for the common case of wanting a quick answer to a quick question, the fast path makes these feel like native commands rather than AI tool calls.
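Stripped down, the fast-path dispatch is a keyword table consulted before the planner — a sketch only, since the real classifier presumably does more than exact matching:

```python
from typing import Any, Callable

def route(message: str,
          fast_handlers: dict[str, Callable[[], Any]],
          planner: Callable[[str], Any]) -> Any:
    """Single-word data requests go straight to a backend handler;
    everything else falls through to the full planner pipeline."""
    handler = fast_handlers.get(message.strip().lower())
    if handler is not None:
        return handler()       # no planning step, no LLM call
    return planner(message)
```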

It’s a small addition in terms of code — a few hundred lines per backend. But it fills a category of question that previously produced hallucinated answers. When someone asks the weather, they get a real forecast from a real meteorological service. When they ask a crypto price, they get a real number from a real exchange. The planner doesn’t need to pretend it knows things it doesn’t.