cvintel: corpus migration, d1 schema rebuild, and the first agenda audits
eight things shipped in one session on the cvintel pipeline, the infrastructure layer tracking rancho mirage city council activity. highlights: corpus migration, d1 schema upgrade, agendalink-sync worker deployed, and two audit studies that confirmed the enrichment layer works.
started by setting anthropic_api_key on the cvintel-cron worker, which unblocked haiku-powered previews. then ran migrate_corpus.py to migrate 35 date directories in data/rancho-mirage/ — 32 complete (transcript + record), 3 gaps confirmed and accounted for.
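a minimal sketch of the completeness check behind the "32 complete, 3 gaps" count. it is written in typescript rather than taken from migrate_corpus.py, and the file names transcript.txt and record.json are assumptions: the post only says a complete date directory holds a transcript and a record.

```typescript
// Hypothetical file names; the post does not say what the files are called.
const REQUIRED_FILES = ["transcript.txt", "record.json"];

// Maps each date directory name to the file names found inside it.
type Listing = Record<string, string[]>;

interface CorpusAudit {
  complete: string[];             // directories holding every required file
  gaps: Record<string, string[]>; // directory -> required files it is missing
}

function auditCorpus(listing: Listing): CorpusAudit {
  const complete: string[] = [];
  const gaps: Record<string, string[]> = {};
  for (const dir of Object.keys(listing)) {
    const files = listing[dir];
    const missing = REQUIRED_FILES.filter((f) => files.indexOf(f) === -1);
    if (missing.length === 0) {
      complete.push(dir);
    } else {
      gaps[dir] = missing;
    }
  }
  return { complete, gaps };
}
```

feeding it a listing of data/rancho-mirage/ would give the complete and gap directories in one pass, with each gap naming what is missing.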
the d1 database got a structural upgrade: briefs table renamed to records, three new columns added (agendalink_id, agenda_json, minutes_json), six sql references updated across four source files. all 33 rancho mirage records intact after migration.
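the schema change can be sketched as four sqlite statements, since d1 is sqlite-based. the table and column names come from the notes above; the TEXT column type is an assumption, and migrationStatements is a hypothetical helper name, not the script that actually ran.

```typescript
// Column names are from the post; the TEXT type is an assumption.
const NEW_COLUMNS = ["agendalink_id", "agenda_json", "minutes_json"];

// Returns the statements in the order they need to run:
// rename the table first, then add each new column to the renamed table.
function migrationStatements(): string[] {
  const statements = ["ALTER TABLE briefs RENAME TO records"];
  for (const column of NEW_COLUMNS) {
    statements.push(`ALTER TABLE records ADD COLUMN ${column} TEXT`);
  }
  return statements;
}
```

a rename like this does not touch application code, which is why the six sql references across four source files had to be updated by hand.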
biggest architecture decision of the session: scout/batch/recovery/distiller python scripts retired. cloudflare routines + managed agents replace them. all data movement now runs through workers.
built and deployed agendalink-sync, a worker that pulls agenda data and writes it back to d1. hit a bug: the row count from a d1 write lives at result.meta.changes, not result.changes. fixed and redeployed. outcome: 22 of 33 rancho mirage records now have agendalink_id + agenda_json. monthly cron at 0 10 1 * * (10:00 utc on the 1st of each month). meeting_date normalized to yyyy-mm-dd across all 33 rows.
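a sketch of the write-back and of where the row count lives. the interfaces below are hand-written stand-ins for the small part of the d1 binding used here, writeAgenda and rowsChanged are hypothetical names, and matching rows on meeting_date is an assumption about how the worker keys its updates.

```typescript
// Stand-in types covering only the part of the D1 binding this sketch uses.
interface WriteResult {
  success: boolean;
  meta: { changes: number };
}
interface Statement {
  bind(...values: unknown[]): Statement;
  run(): Promise<WriteResult>;
}
interface Database {
  prepare(sql: string): Statement;
}

// The count of affected rows sits under meta; there is no top-level
// `changes` field, so reading result.changes yields undefined at runtime.
function rowsChanged(result: WriteResult): number {
  return result.meta.changes;
}

// Hypothetical write-back: attach agenda data to the record for one meeting
// and report how many rows were updated.
async function writeAgenda(
  db: Database,
  meetingDate: string,
  agendalinkId: string,
  agenda: unknown,
): Promise<number> {
  const result = await db
    .prepare(
      "UPDATE records SET agendalink_id = ?, agenda_json = ? WHERE meeting_date = ?",
    )
    .bind(agendalinkId, JSON.stringify(agenda), meetingDate)
    .run();
  return rowsChanged(result);
}
```

typing the result shape explicitly means a stray result.changes fails at compile time instead of silently reporting zero updates.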
the managed agent endpoint (post /v1/agents/{id}/sessions) returned 404; worked around by calling the messages api with an identical system prompt.
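the workaround can be sketched as a plain messages api request that carries the agent's system prompt in the system field. buildMessagesRequest is a hypothetical helper, the model id is left as a parameter because the post does not name one, and the max_tokens default of 4096 is an assumption.

```typescript
// Plain description of the HTTP request, so it can be inspected before sending.
interface MessagesRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildMessagesRequest(
  apiKey: string,
  model: string,        // model id supplied by the caller
  systemPrompt: string, // the same prompt the managed agent would have used
  userContent: string,
  maxTokens: number = 4096, // assumed default, not from the post
): MessagesRequest {
  return {
    url: "https://api.anthropic.com/v1/messages",
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model,
      max_tokens: maxTokens,
      system: systemPrompt,
      messages: [{ role: "user", content: userContent }],
    }),
  };
}
```

inside a worker, the url goes to fetch along with the method, headers, and body. keeping the system prompt identical is what makes this a drop-in substitute for the agent session.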
ran two audit studies to validate the enrichment layer. may 8 budget brief: 37kb output, agenda contributed one data point (vote type), transcript carried everything else. july 17 regular council: 265kb agenda, html staff reports extracted cleanly, 17 of 26 agenda items cited with specific dollar figures, 8 findings produced from agenda_json alone, including the alpr camera donation, bid withdrawal timeline, dif deferral structure, city clerk salary, and contracts batch transparency gap.
enrichment layer confirmed working.
token estimate: ~203,500 input / ~64,000 output across the session. two opus api calls for the audits (~$1.80), sonnet 4.6 session (~$1.22). total ~$3.00. at scale — 10 cities × 2 meetings/month — opus inference for the audit layer runs ~$4–6/month.
tools: claude code, cloudflare workers, d1, cloudflare containers, managed agents, anthropic api (opus + haiku).