field notes

build logs, tighter lessons

short notes from work on agentic analytics, compounding datasets, and signal extraction across live systems.

March 20, 2026

Open-Source Agent Runs Need Stronger Evaluation

Open agent stacks are improving fast, but most failures still come from weak evaluation and orchestration rather than raw model quality.

agents, evaluation, orchestration

analyse.cefr