AI-Augmented Development Factory Workflow
Situation
AI-assisted development tools can generate enormous amounts of code. The problem is not generation speed — it’s quality control, consistency, and the compounding cost of skipping steps. Without discipline, AI-assisted development produces code that passes a quick read but fails under review, breaks in CI, or doesn’t match what the product actually needs.
During my sabbatical, I was building across 100+ repositories. I needed a workflow that would let me ship fast with AI assistance while maintaining the quality standards I’d expect from a team I managed.
Decision
I codified a repeatable development loop as a set of slash commands in Claude Code:
/plan-story → /implement-story → /verify → /demo → /pr
Each step enforces a specific gate:
- /plan-story: Produces an implementation plan and test plan before any code is written. Forces me to think about scope, dependencies, and test strategy before the AI starts generating.
- /implement-story: TDD-first implementation. Write the failing test, then make it pass, then refactor. The AI generates code, but the test comes first.
- /verify: Code review gate. Fix review issues before running CI. This avoids burning CI time on code that has obvious problems; review is cheaper than a full test suite.
- /demo: Product acceptance gate. Run the feature, verify it works as intended, document what was demonstrated. Fix product concerns before the PR exists.
- /pr: Draft the PR with summary, test description, and risk notes.
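The gate sequence above can be sketched as a fail-fast pipeline: each step runs only if every earlier step passed. This is an illustrative model, not how the slash commands are actually implemented. The `Gate` class, `run_pipeline`, and `build_gates` are hypothetical names, and each gate's check is reduced to a boolean for the sake of the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Order taken from the workflow described above.
GATE_ORDER = ["/plan-story", "/implement-story", "/verify", "/demo", "/pr"]


@dataclass
class Gate:
    name: str                   # slash command name, e.g. "/verify"
    check: Callable[[], bool]   # hypothetical: True when the gate passes


def run_pipeline(gates: List[Gate]) -> Tuple[List[str], Optional[str]]:
    """Run gates in order and stop at the first failure.

    Returns (names of gates that passed, name of the failing gate or None).
    """
    passed: List[str] = []
    for gate in gates:
        if not gate.check():
            return passed, gate.name
        passed.append(gate.name)
    return passed, None


def build_gates(results: Dict[str, bool]) -> List[Gate]:
    """Build gates from a name -> outcome map (assumption: a missing entry fails)."""
    return [Gate(name, lambda n=name: results.get(n, False)) for name in GATE_ORDER]


if __name__ == "__main__":
    # Review finds issues, so /demo and /pr never run.
    passed, failed = run_pipeline(build_gates({
        "/plan-story": True,
        "/implement-story": True,
        "/verify": False,
    }))
    print(passed, failed)  # ['/plan-story', '/implement-story'] /verify
```

The ordering puts the cheaper checks first: a failed review stops the loop before any CI time or demo effort is spent.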
Quality constraints enforced throughout:
- CRAP score < 8 (a per-method metric that combines cyclomatic complexity with test coverage; coverage data via SimpleCov)
- Every test must justify its carrying cost
- No unrelated refactors in a story PR
- Demo scripts written to docs/demos/ after every completed story
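To make the CRAP constraint concrete, here is a minimal sketch using the commonly published definition of the metric: CRAP = complexity² × (1 − coverage)³ + complexity, with coverage as a fraction between 0 and 1. Whether the tooling used here computes it exactly this way is an assumption. The function names and the sample method names are hypothetical.

```python
from typing import Dict, List, Tuple

CRAP_THRESHOLD = 8  # scores must stay below this, per the constraint above


def crap_score(complexity: int, coverage: float) -> float:
    """Return complexity^2 * (1 - coverage)^3 + complexity.

    complexity: cyclomatic complexity of one method (>= 1)
    coverage:   fraction of that method covered by tests, in [0, 1]
    """
    if complexity < 1:
        raise ValueError("complexity must be at least 1")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be a fraction between 0 and 1")
    return complexity ** 2 * (1.0 - coverage) ** 3 + complexity


def methods_over_threshold(
    methods: Dict[str, Tuple[int, float]],
    threshold: float = CRAP_THRESHOLD,
) -> List[str]:
    """List methods whose score is not strictly below the threshold."""
    return [
        name
        for name, (complexity, coverage) in methods.items()
        if crap_score(complexity, coverage) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical methods: (cyclomatic complexity, coverage fraction)
    sample = {
        "Order#total": (3, 0.5),      # 9 * 0.125 + 3 = 4.125
        "Order#apply_promo": (5, 0.0),  # 25 * 1 + 5 = 30.0
        "Order#validate": (8, 1.0),   # 0 + 8 = 8.0, not below 8
    }
    print(methods_over_threshold(sample))  # ['Order#apply_promo', 'Order#validate']
```

Under this formula a fully covered method scores exactly its complexity, so a limit of 8 also caps cyclomatic complexity at 7 no matter how thorough the tests are.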
Risk
The risk was over-process. A solo developer adding five mandatory steps to every feature sounds like bureaucracy. The workflow could slow me down more than it helped, especially for small changes.
I accepted this risk because the alternative — shipping fast without gates — had already failed. Early in the sabbatical, I shipped features that looked correct but had subtle bugs, missing edge cases, or product mismatches that required rework. The rework cost more than the gates would have.
The other risk: the workflow was designed for me, working with AI. It might not transfer to a team context. I accepted this because the principles (plan before code, test before implementation, review before CI, demo before merge) are universal — only the tooling is specific.
Change
The Factory Workflow let me ship 22 PRDs across billeisenhauer-app in roughly two weeks with consistent quality. Across all sabbatical projects, it became my default operating rhythm.
Specific outcomes:
- Zero rollbacks on any project using the full workflow
- CRAP scores stayed below 8 across the codebase
- Demo documentation created a reviewable product record, not just a code record
- The /verify gate caught issues that would have failed CI, saving 5–10 minutes per cycle
The workflow also produced transferable judgment about AI-assisted development: the AI is excellent at generating code from a clear plan and test spec, but poor at deciding what to build, what to skip, and when the product is actually done. The human gates (plan, review, demo) are where the judgment lives.
What This Demonstrates
- Process design for AI-augmented work: Not “use AI to go faster” but “design a workflow that makes AI-assisted development reliable.”
- Quality gates that earn their cost: Each gate exists because skipping it produced a measurable failure. No gate was added speculatively.
- Discipline at speed: 22 PRDs in two weeks is fast. The workflow didn’t slow me down — it prevented the rework that would have slowed me down more.
- Transferable operating system: The specific tools are mine, but the pattern (plan → implement → review → accept → ship) applies to any team adopting AI-assisted development.