The multi-agent research pipeline that produced the visitor-economy audit hit three failure modes mid-run. Each one became a permanent gate in the infrastructure.
Today I shipped the multi-agent research pipeline that produced AICV's first agent-readiness audit of the Coachella Valley visitor economy. More than a dozen concurrent autonomous agents, each launching dozens of sub-agents in parallel, peaking at fifty-plus running simultaneously. The orchestration layer held. The discipline around recovery cycles is what made the report shippable.
First: structured-output truncation. Agents returning JSON that looked valid but was silently clipped at the schema boundary. Scoring proceeded on partial records. Caught on the checksum gate, not on visual inspection. Every structured-output handoff now writes a length-check and a tail-token assertion before the next stage can read from it.
Second: silent disk-write failures. Agents reporting successful completion with empty directories. The orchestrator was trusting the return code. Verification now reads back what was claimed to be written. Success is a file that exists, not a status that returns.
Third: parallel agents diverging on partial data. Two sub-agents working the same bucket started with slightly different snapshots and produced overlapping but non-identical scoring frames. The merge step would have buried the divergence. Every parallel fan-out now stamps a shared snapshot ID, and the reducer refuses to merge frames that don't share one.
The pause-and-flag protocol held at every gate. No improvising on broken data. Each failure cycle produced a durable architectural lesson, encoded into AICV's research infrastructure for the next audit.
Anchor navigation repaired across all three AICV reports, retroactively. The llms-full.txt dump now includes reports — long-form research was previously absent from the agent-facing content layer. Sitemap gap closed. Editorial link convention codified in STATE.md. Underscore-archive conventions (_v2-broken/, _pre-recal-2026-06-05/, _agent-scripts/) preserved. Python cache properly gitignored for the first time.
Published the build blog post at /blog/agent-readiness-audit-multi-agent-build.html.