This is the source of truth for the agent's workflow. AGENTS.md references this file for the authoritative loop definition.
Before anything else, internalize these non-negotiable principles:
1. NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
2. NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
3. NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
4. EVIDENCE BEFORE CLAIMS, ALWAYS
Violating the letter of these rules is violating the spirit.
Trigger: Any creative work - features, components, modifications.
Process:
- Read the .directive completely
- Check current project state (files, docs, recent commits)
- Ask clarifying questions one at a time (prefer multiple choice)
- Propose 2-3 approaches with trade-offs
- Lead with your recommendation and explain why
- Present design in sections (200-300 words), validate each
- Document validated design before implementation
Key Principles:
- One question at a time - don't overwhelm
- YAGNI ruthlessly - remove unnecessary features
- Explore alternatives before settling
- Be flexible - go back when something doesn't make sense
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 1: PLANNING │
│ │
│ 1. Clone target repository to .workrepo/ │
│ 2. Deep codebase analysis (read ALL source files) │
│ 3. Director creates execution plan │
│ 4. Web research via Google Custom Search (if configured) │
│ 5. Gather input from all personas │
│ 6. Director synthesizes requirements │
│ 7. Post decision table to PR (what's incorporated vs skipped) │
│ │
│ OUTPUT: Written plan with bite-sized tasks (2-5 min each) │
└─────────────────────────────────────────────────────────────────────────────┘
Each step is ONE action:
- "Write the failing test" - step
- "Run it to make sure it fails" - step
- "Implement minimal code to make test pass" - step
- "Run tests and verify they pass" - step
- "Commit" - step
Not acceptable: "Implement the feature" (too big)
Every task must have:
- Files: Exact paths to create/modify/test
- Code: Complete code in plan (not "add validation")
- Commands: Exact commands with expected output
- Verification: How to prove this step worked
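A bite-sized task can be sketched as a structured record; every path, command, and output below is a hypothetical placeholder for illustration, not a field from a real plan:

```python
# Hypothetical example of one bite-sized plan task; all values are assumptions.
task = {
    "files": ["tests/test_footer.py", "src/footer.py"],
    "code": 'def test_footer_shows_year():\n    assert "2024" in render_footer()',
    "commands": ["pytest tests/test_footer.py -q"],
    "expected_output": "1 failed",  # RED: the new test must fail first
    "verification": "exit code is 1 and the failure names test_footer_shows_year",
}

# A task missing any of these fields is not plan-ready
for field in ("files", "code", "commands", "verification"):
    assert task[field], f"plan task is missing required field: {field}"
```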
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 2: IMPLEMENTATION │
│ │
│ For each task in the plan: │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ RED: Write failing test │ │
│ │ → Run test, WATCH IT FAIL │ │
│ │ → Confirm it fails for the right reason │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ GREEN: Write minimal code to pass │ │
│ │ → Run test, WATCH IT PASS │ │
│ │ → No extra features, no "improvements" │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ REFACTOR: Clean up (tests stay green) │ │
│ │ → Remove duplication │ │
│ │ → Improve names │ │
│ │ → Extract helpers if needed │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ COMMIT: Atomic commit with clear message │ │
│ │ → feat: [Story ID] - [Story Title] │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ Create feature branch │
│ Apply code changes with automatic retry for failed edits │
│ Push changes │
│ Create Pull Request │
└─────────────────────────────────────────────────────────────────────────────┘
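The RED→GREEN steps above can be sketched in Python; `slugify` and its test are hypothetical stand-ins for a real story:

```python
# RED: write the failing test first and run it -- with no slugify defined yet,
# it fails for the right reason (NameError / wrong result)
def test_slugify_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

# GREEN: minimal code to make the test pass -- no extra features, no "improvements"
def slugify(title):
    return title.lower().replace(" ", "-")

# Run the test and watch it pass before the REFACTOR and COMMIT steps
test_slugify_lowercases_and_hyphenates()
```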
Write code before the test? DELETE IT. Start over.
No exceptions:
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
Before claiming ANY task complete:
- IDENTIFY: What command proves this claim?
- RUN: Execute the FULL command (fresh, complete)
- READ: Full output, check exit code, count failures
- VERIFY: Does output confirm the claim?
- ONLY THEN: Make the claim
✅ [Run test command] [See: 34/34 pass] "All tests pass"
❌ "Should pass now" / "Looks correct" / "I'm confident"
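The IDENTIFY→RUN→READ→VERIFY loop can be made mechanical; this sketch uses a stand-in command in place of a real test runner:

```python
import subprocess
import sys

def verify(cmd):
    """Run the FULL command fresh, read the complete output, check the exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)           # READ the full output, don't skim
    return result.returncode == 0  # evidence, not confidence

# Stand-in for e.g. ["pytest", "-q"]; substitute the project's real test command
ok = verify([sys.executable, "-c", "print('34 passed')"])
```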
┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 3: REVIEW CYCLES │
│ │
│ For each cycle (max 5 iterations): │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ SPEC COMPLIANCE REVIEW │ │
│ │ → Does code match spec exactly? │ │
│ │ → Nothing missing? Nothing extra? │ │
│ │ → If issues: Fix → Re-review (don't skip re-review) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ CODE QUALITY REVIEW │ │
│ │ → Is implementation well-built? │ │
│ │ → If issues: Fix → Re-review (don't skip re-review) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ PERSONA REVIEWS (Project Manager, Technical Writer, Researcher) │ │
│ │ → Each outputs structured JSON │ │
│ │ → If NEEDS_WORK: Director synthesizes → Engineer fixes │ │
│ │ → Commit after each fix │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ If ALL approved → Director Final Review │ │
│ │ → If APPROVE: Submit GitHub approval with "LGTM" │ │
│ │ → Break loop │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
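A persona review's structured JSON might look like the following; the exact field names are assumptions for illustration, not a confirmed schema:

```python
import json

# Hypothetical persona review payload (field names are assumptions)
review = {
    "persona": "Project Manager",     # or Technical Writer, Researcher
    "verdict": "NEEDS_WORK",          # or "APPROVED"
    "issues": [
        {
            "file": "src/footer.py",  # hypothetical path
            "summary": "Spec requires the copyright year; it is missing",
        }
    ],
}

# Round-trip to confirm the payload is valid JSON for the Director to synthesize
payload = json.dumps(review, indent=2)
parsed = json.loads(payload)
```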
Order matters:
- First: Spec compliance (does it match requirements?)
- Then: Code quality (is it well-built?)
Never:
- Skip reviews
- Proceed with unfixed issues
- Start quality review before spec compliance is ✅
- Accept "close enough" on spec compliance
When reviewer finds issues:
- Implementer fixes them
- Reviewer reviews again
- Repeat until approved
- Don't skip the re-review
┌─────────────────────────────────────────────────────────────────────────────┐
│ SYSTEMATIC DEBUGGING │
│ │
│ NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST │
│ │
│ Phase 1: Root Cause Investigation │
│ → Read error messages carefully (they often contain the solution) │
│ → Reproduce consistently │
│ → Check recent changes (git diff, commits) │
│ → Trace data flow to find where bad value originates │
│ │
│ Phase 2: Pattern Analysis │
│ → Find working examples in same codebase │
│ → Compare against references (read completely, don't skim) │
│ → Identify every difference, however small │
│ │
│ Phase 3: Hypothesis and Testing │
│ → Form single hypothesis: "I think X because Y" │
│ → Test minimally (smallest possible change) │
│ → One variable at a time │
│ │
│ Phase 4: Implementation │
│ → Create failing test case FIRST │
│ → Implement single fix (ONE change at a time) │
│ → Verify fix │
│ │
│ If 3+ fixes failed: STOP │
│ → Question the architecture │
│ → Don't attempt Fix #4 without architectural discussion │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
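The four phases can be illustrated end to end with a toy bug; `parse_items` and its fix are hypothetical:

```python
def parse_items(raw):
    """Split a comma-separated string into items.

    Phase 1 root cause (toy example): ''.split(',') returns [''], not [].
    Phase 3 hypothesis: "empty input yields [''] because str.split never
    returns an empty list for an empty string."
    Phase 4: one minimal fix guarding the empty case -- nothing else changed.
    """
    if not raw:
        return []
    return raw.split(",")

# The failing test case was written FIRST; after the single fix it passes:
assert parse_items("") == []
assert parse_items("a,b") == ["a", "b"]
```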
If you catch yourself thinking:
- "Quick fix for now, investigate later"
- "Just try changing X and see if it works"
- "Add multiple changes, run tests"
- "It's probably X, let me fix that"
- "I don't fully understand but this might work"
ALL of these mean: STOP. Return to Phase 1.
┌─────────────────────────────────────────────────────────────────────────────┐
│ HUMAN HANDOFF │
│ │
│ 1. Verify tests pass (MANDATORY - run the command, see the output) │
│ 2. Present exactly 4 options: │
│ - Merge back to base branch locally │
│ - Push and create a Pull Request │
│ - Keep the branch as-is (human handles later) │
│ - Discard this work │
│ │
│ PR is ready for human codebase owner to review and merge │
│ Agent does NOT auto-merge - human makes final decision │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
All GitHub comments read as ONE cohesive internal reflection from a single developer. Not separate disconnected reviews.
- NEVER break comment continuity - each comment aware of previous ones
- NEVER reveal multiple personas in GitHub output
- ALWAYS maintain first-person voice ("I've checked...", "I noticed...")
- ALWAYS show reasoning process, not just conclusions
✅ I've been through this carefully, and I'm feeling good about the overall direction.
The footer component slots in cleanly without disrupting the existing layout system.
Building on what I noted earlier about the styling approach - the way this uses
the existing color tokens is exactly right. No new variables, no special cases,
just consistent application of what's already there.
❌ REVIEW FROM PROJECT MANAGER:
- Requirements met: Yes
REVIEW FROM TECHNICAL WRITER:
- Naming: Acceptable
Each iteration spawns a new instance with clean context. Memory persists via:
- Git history (commits from previous iterations)
- progress.txt or .state/ (learnings and context)
- .directive (the goal)
- AGENTS.md files (codebase patterns)
After completing work, check if edited files have learnings worth preserving:
Good additions:
- "When modifying X, also update Y to keep them in sync"
- "This module uses pattern Z for all API calls"
- "Tests require the dev server running on PORT 3000"
Don't add:
- Story-specific implementation details
- Temporary debugging notes
| Excuse | Reality |
|---|---|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests written after the fact pass immediately and prove nothing. |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "TDD will slow me down" | TDD is faster than debugging after the fact. |
| "Issue is simple, don't need process" | Simple issues have root causes too. |
| "Emergency, no time for process" | Systematic is FASTER than thrashing. |
| "Should work now" | RUN the verification. |
| "I'm confident" | Confidence ≠ evidence. |
When all tasks have been completed, verified, and reviewed:
- All tests pass (verified with fresh run)
- All reviews approved
- PR ready for human review
Reply with completion status and hand off to human.
| Phase | Key Activities | Success Criteria |
|---|---|---|
| 0. Brainstorm | Questions, alternatives, design | Validated design document |
| 1. Planning | Codebase analysis, task breakdown | Written plan with bite-sized tasks |
| 2. Implementation | TDD cycle per task | Code passes tests, committed |
| 3. Review | Spec compliance, code quality, persona reviews | All reviewers approve |
| 4. Debugging | Root cause → hypothesis → test → fix | Bug resolved with test |
| 5. Handoff | Verify tests, present options | PR ready for human |