ai-standards

An orchestration framework for Claude Code that builds full-stack applications. Nine specialised AI agents cover spec, implementation, review, and testing, each with its own context window, strict standards, and a single role.


What it does

You describe a feature in plain language. The framework splits the work across nine isolated AI agents, runs them through a deterministic pipeline (spec → implement → DoD-check → review → test), and produces implemented, reviewed, and tested code following Hexagonal Architecture, DDD, CQRS, and Event-Driven design.

Each agent runs in its own context window. Handoffs between agents are structured markdown files, and failures fail loudly via a four-value Status block (complete / blocked / failed / incomplete). The orchestrator never advances on a missing signal.
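As a minimal sketch of that gating, assume a hypothetical handoff layout in which the Status block is a `## Status` heading followed by a `status: <value>` line. The real file format is not shown here, so the regex and function names below are illustrative only. Under that assumption, an orchestrator could refuse to advance unless it reads an explicit `complete`:

```python
import re
from enum import Enum


class Status(Enum):
    """The four values named in the Status block."""
    COMPLETE = "complete"
    BLOCKED = "blocked"
    FAILED = "failed"
    INCOMPLETE = "incomplete"


# Hypothetical layout: a "## Status" heading followed by "status: <value>".
STATUS_RE = re.compile(r"^## Status\s*\n+status:\s*(\w+)\s*$", re.MULTILINE)


def read_status(handoff_markdown: str) -> Status:
    """Return the declared status, or raise if the signal is missing or unknown."""
    match = STATUS_RE.search(handoff_markdown)
    if match is None:
        raise ValueError("handoff has no Status block")
    return Status(match.group(1).lower())  # raises ValueError on an unknown value


def may_advance(handoff_markdown: str) -> bool:
    """Advance only on an explicit 'complete'; anything else stops the pipeline."""
    try:
        return read_status(handoff_markdown) is Status.COMPLETE
    except ValueError:
        return False
```

The point of the design is that a missing or malformed block is treated the same as a failure, so a silent subagent can never be mistaken for a successful one.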

6 commands

/init-project, /create-specs, /refine-specs, /build-plan, /update-specs, /check-web — every step of the pipeline is a slash command.

9 isolated agents

Spec Analyzer, Backend Developer, Frontend Developer, DoD-checker, Backend Reviewer, Frontend Reviewer, Tester, DevOps, Web Auditor — each with strict tools and a single role.

33 standards

Backend, frontend, security, performance, observability, GDPR/PII, payments, LLM integration, file storage, geo-search, audit log, feature flags, …

11 critical paths

Reviewers load only the rules relevant to the current diff via a coverage-aware protocol — no defensive full-checklist reads.
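One way to picture this, as a sketch only: the critical-path names, glob patterns, and rule groupings below are invented for illustration (only the rule-ID format comes from the project), and the real protocol may match diffs differently.

```python
from fnmatch import fnmatchcase

# Hypothetical critical paths: name -> (file globs, rule IDs to load).
CRITICAL_PATHS = {
    "payments": (["src/Payment/*", "src/Billing/*"], ["PA-006"]),
    "authorization": (["src/Security/*"], ["AZ-001"]),
    "backend-core": (["src/*"], ["BE-015"]),
}


def rules_for_diff(changed_files):
    """Return (rule IDs to load, files that no critical path covers)."""
    rules = []
    uncovered = []
    for path in changed_files:
        matched = False
        for globs, rule_ids in CRITICAL_PATHS.values():
            if any(fnmatchcase(path, g) for g in globs):
                matched = True
                for rule_id in rule_ids:
                    if rule_id not in rules:  # keep first-seen order, no duplicates
                        rules.append(rule_id)
        if not matched:
            uncovered.append(path)
    return rules, uncovered
```

In this sketch the second return value plays the role of a coverage gap: only files that match no critical path would justify opening a section of the full checklist.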

Why it exists

Most “AI builds an app” frameworks operate on vibes and a single big prompt. ai-standards constrains the AI with:

Spec before code. No agent writes a line without a validated spec.
Stable rule IDs. Every architectural rule has a stable identifier (BE-015, AZ-001, PA-006, …) that reviewers cite by ID, never as paraphrased prose.
Per-phase bundles. The orchestrator distills only the rules relevant to the current feature into two bundles (dev + tester). Subagents read the bundle, not the full standards directory.
Coverage-aware checklist loading. Reviewers walk matched critical paths first; the full review checklist is consulted per-section only on real coverage gaps, with mandatory citation.
Real cost telemetry. Every subagent reports total_tokens from the Anthropic SDK; the orchestrator emits a per-phase cost table at the end of every /build-plan run.
Self-validating. 23 static smoke checks + dynamic smoke fixtures keep the framework’s contracts from drifting silently.
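The per-phase cost table mentioned in the list above could be assembled roughly as follows. The sketch assumes each subagent report has already been reduced to a record with `phase`, `agent`, and `total_tokens` keys. Only `total_tokens` is named by the project; the other keys and the table layout are hypothetical.

```python
from collections import defaultdict


def cost_table(reports):
    """Sum total_tokens per phase and render a plain-text table.

    `reports` is an iterable of dicts with the hypothetical keys
    "phase", "agent", and "total_tokens".
    """
    per_phase = defaultdict(int)
    for report in reports:
        per_phase[report["phase"]] += report["total_tokens"]

    lines = [f"{'phase':<12}{'total_tokens':>14}"]
    for phase, tokens in per_phase.items():  # dicts keep first-seen order
        lines.append(f"{phase:<12}{tokens:>14,}")
    lines.append(f"{'all':<12}{sum(per_phase.values()):>14,}")
    return "\n".join(lines)
```

Calling `print(cost_table(reports))` at the end of a run would give one row per phase plus a grand total, which is enough to spot the phase that dominates spend.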

Status

Active. Pre-1.0 (0.42.x). Reviewer savings of 30-50k Sonnet tokens per phase have been measured in real consumer use, at a small sample size (N=2). See Token economics for the full numbers.

Honest limitations

Single stack. Today this builds PHP/Symfony + Vue 3 applications. Standards, scaffolds, and reference files would need rewriting for a different stack. The orchestration patterns (agents, handoffs, spec-first, coverage-aware loading) are portable; the implementation details are not.
Claude Code only. Agent orchestration relies on Claude Code’s subagent system. Adapting to another AI tool requires significant work.
Developer in the loop. This does not replace the developer. You describe features, approve specs, confirm git operations, and make decisions the AI cannot make alone.
Opinionated architecture. Hexagonal + DDD + CQRS is enforced, not suggested. If your project does not follow these patterns, the standards will fight your codebase.

Built and maintained by Mario Marco Esteve. MIT licensed.