v2.4 doesn't ship a new feature. It ships better answers. After 10 months of feature growth, we did our first systematic quality pass on the agents themselves.
What shipped
- Anti-rationalization — agents no longer manufacture confident-sounding answers when they're uncertain. If a tool failed or a layer was missing, the agent says so instead of working around it with plausible-but-wrong reasoning.
- Partial-failure tightening — when one agent in a multi-agent chain hits an error, downstream agents now see the failure explicitly instead of receiving an incomplete-but-confident upstream response
- Tool discipline — every agent's tool list was audited. Tools that overlapped were merged; tools that were rarely used were removed; tools that were silently failing now log clearly. Smaller, cleaner toolboxes mean faster, more reliable agent routing.
- Centralized prompt architecture review — every agent prompt was re-read against the answers it produced for our beta users in the prior 60 days. Where prompts encouraged hedging, they were rewritten to be direct. Where they encouraged false confidence, guardrails were added.
Why it mattered
The first year of Curated was about capability — what can it do? v2.4 starts the second phase: how often is it right? This release puts more value back into customer hands than any feature release of comparable size — most users will see the difference as "Curated stopped making stuff up about that one layer."
We'll keep doing these passes. Expect a quality-focused release once a quarter going forward.
For users
You don't need to do anything. The improvements apply to every new conversation.