The Boring Infrastructure of Reliable Agents
Why reliability isn’t just prompt engineering: it’s boring DevOps, transport adapters, and handling messy API quirks.
The Dual-Loop: Deterministic Gates vs. Adversarial Review
Why passing the linter isn’t enough: pairing fast deterministic checks with slow, adversarial AI reviews.
AgentRig-on-AgentRig: Self-Verifying AI Workflows
How we test the harness itself: running the full agent lifecycle inside a unit test.
The Inversion of Responsibility: Onboarding AI as an Architect
Why ‘Creative Assistant’ mode fails for real software, and how we inverted the context flow.
From Script to Platform: Decoupling Domain Logic
How we transformed AgentRig from a ContextLab-specific script into a reusable, language-agnostic AI governance platform.
Evidence Contracts Over Placeholder Artifacts
Why strategic tasks fail during agent execution, and how we fixed it with mandatory evidence contracts.
Designing Tasks for Low-Context Coding Agents
Why tasks that are valid JSON still fail during execution, and the strict briefing contracts that low-context agents require.
From Plan to Backlog: Deterministic Decomposition That Keeps Traceability
How we fixed plan-to-task decomposition so generated backlogs stayed accountable, traceable, and complete.
Why "Looks Good" Plans Fail: Building Enforceable Planning Gates
What broke in our early planning loop, and how we moved from subjective review to enforceable planning gates.
Introducing AgentRig: A Practical AI Dev Tooling Harness
Why we built AgentRig, what it does, and what we are learning while building AI-assisted engineering workflows.