The Architect-Builder Pattern: A Production Workflow for AI Development
Claude Code amazes me with its power. It also frustrates me with its limitations.
It built complex implementations fast: a 500-line Python scraper in 5 minutes. Yet it misunderstood simple questions, treating them as implementation requests. “How does this work?” would trigger “Let me change that for you!”
It jumped ahead with assumptions (“if you want x, you must want y too”), misinterpreted ambiguous instructions (turns out English is lossy; coding languages exist for a reason), and sometimes produced code a junior developer wouldn’t ship.
It often implemented partial solutions (applying a fix to one file but not all relevant files) and failed to finish without additional instructions. Sometimes bugs only appeared in CI, or worse, in production.
Attempt 1: Two Claude Code Windows
I tried separating design from execution using two Claude Code windows: one for planning, one for implementation. Better, but still messy. Context wasn’t always right. I wanted to iterate on designs without dropping into .md files. And because Claude Code uses Sonnet 4.5, I couldn’t always get the deep reasoning I needed for complex architectural decisions.
Attempt 2: Opus for Design
Then I switched to Opus in a separate window for design. Much better. Superior reasoning, easier iteration with Google Docs. But it lacked codebase context, forcing me to upload snippets constantly.
The Solution: Three-Phase Workflow
Finally, I landed on a three-phase approach: design in Opus, validate against the codebase in Claude Code, implement in a fresh Claude Code session.
Claude’s outputs matched my expectations every time. What often took 5+ turns now only took 1.
The Pattern Emerges
Around the same time, I noticed the pattern emerging elsewhere. In my product leadership Slack (Supra), fellow PMs shared similar workflows in our time-saving-ai-workflow channel. One PM described moving Claude Code from “fast and ugly” to “well-architected” by having it document current architecture, feeding that to Claude in research mode for strategic planning, then back to Claude Code for execution.
Colleagues share it. PMs discuss it in Slack. Engineers adopt it independently. There’s a growing consensus: this pattern works.
I call it the Architect-Builder Pattern.
The Problem: Context Loss and Premature Solutions
AI coding assistants fail in two ways: they lose context and they jump to solutions. This results in inconsistent code, integration issues, and costly rewrites.
The Solution: Three-Phase Separation of Concerns
The Architect-Builder Pattern uses three distinct phases with clear boundaries:
Phase 1: Strategic Architecture (The Architect)
- Purpose: High-level design and architectural decisions
- Model: Claude Opus (or equivalent reasoning-focused model)
- Focus: The “why” and “what” of your solution
- Output: Detailed architecture blueprint with clear specifications
I need to design [SYSTEM]. Provide:
1. Architecture Blueprint (components, data flow, interfaces)
2. Implementation Checklist (critical path vs nice-to-have)
3. Validation Criteria (tests, benchmarks, edge cases)
4. Technical Instructions (files, dependencies, config)
Context: [constraints, tech stack, team size]
Principles: Be concrete, show don't tell, define boundaries
💡 Phase 1 Tips:
Iterate first: “Help me improve this prompt, but don’t execute it yet”
Design partner mode: “Ask clarifying questions before we begin”
Pattern library: “What are the best practices for [X]? Include real-world examples”
Phase 2: Pattern Validation (The Validator)
- Purpose: Catch integration issues before implementation
- Model: Claude Code with full codebase context
- Focus: Align the design to your context
- Output: Detailed implementation prompt
Validate this design against the codebase:
[Paste Phase 1 design]
Check:
1. Pattern Consistency (alignment with existing code)
2. Dependency Analysis (blast radius, hidden coupling)
3. Risk Assessment (what breaks, rollback plan)
4. Optimization (reusable components, simplifications)
Output: Refined plan with file paths, patterns to follow, test strategy
Phase 3: Autonomous Execution (The Builder)
- Purpose: Clean implementation without context pollution
- Model: Fresh Claude Code session
- Focus: Clean execution
- Output: Implemented new features and functionality
Implement this validated design:
[Paste Phase 2 output]
Parameters: Mode [careful/standard/fast], Testing [unit/integration/e2e]
Order:
1. Core Infrastructure → Checkpoint: compilation
2. Business Logic → Checkpoint: unit tests
3. UI/API Layer → Checkpoint: integration tests
4. Full test suite → Checkpoint: all green
Proceed autonomously but pause at checkpoints.
📋 Full templates with detailed instructions and examples: GitHub
Real-World Example: Adding Consumption into Product Health Analysis
I’m building an AI-powered customer intelligence platform that automatically collects ~5,000 monthly signals from 14+ sources (GitHub, Slack, Salesforce, Zendesk, etc.) and generates weekly product health reports in 20 seconds using Claude Opus and RAG. It helps product teams identify at-risk ARR, track customer feedback patterns, and make data-driven decisions, reducing report creation time from days to minutes. I’ll share more about this app in the future (and open-source a version of it).
I added product consumption signals using this pattern:
Phase 1: Architectural Design (Opus)
I'm building a customer intelligence platform that collects signals from multiple sources (GitHub, Slack, Salesforce, etc.) to generate product health reports. I need to add consumption/usage signals as a new data source to better understand customer engagement patterns.
Current architecture:
- Signal collectors run on schedule to gather data from various APIs
- Data is normalized and stored for analysis
- RAG pipeline processes signals to generate insights
I want to add consumption metrics that track:
- Customer resource usage patterns
- Credit consumption trends
Can you help me design the architecture for this consumption signal collector? I need:
1. Data collection strategy: How should I structure the collection of consumption metrics? What patterns should I follow to ensure consistency with my existing collectors?
2. Schema design: What key metrics and dimensions should I capture? How should I model the data for both real-time monitoring and historical trend analysis?
3. Integration approach: How should this new collector integrate with my existing signal processing pipeline?
4. Performance considerations: What are the key architectural decisions I need to make around caching, batching, and rate limiting when dealing with potentially large volumes of consumption data?
5. Flexibility: How can I design this to easily adapt to different cloud providers or consumption models in the future?
Please provide a high-level architectural design with clear separation of concerns and extensibility in mind.
The design took 30 minutes and three iterations. Manual research would have taken multiple days.
Phase 2: Codebase Validation (Claude Code)
Analyze our codebase and validate this design:
[Pasted Opus design]
Validation caught 10+ major issues; I’ll highlight three:
1. Framework Integration Gap
The design proposed creating a new abstract base class hierarchy:
# ❌ WRONG - Design proposed:
class ConsumptionCollector(ABC):
    @abstractmethod
    async def collect_metrics(self) -> List[ConsumptionSignal]:
        pass
But the codebase already had an established pattern:
# ✅ CORRECT - Existing pattern:
class BaseCollector(ABC):
    @abstractmethod
    async def collect(self) -> List[Signal]:
        pass

    def _build_signal(self, signal_data: dict) -> Signal:
        """Standardized signal building method"""
        pass
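The validated fix was simply to extend BaseCollector rather than introduce a parallel hierarchy. Here is a minimal sketch of how the new collector slots into that pattern (illustrative only; the _fetch_usage_rows helper and the metadata fields are my assumptions, not the actual code):

from typing import List

class ConsumptionCollector(BaseCollector):  # BaseCollector and Signal come from the existing codebase
    async def collect(self) -> List[Signal]:
        rows = await self._fetch_usage_rows()  # hypothetical helper wrapping the provider's usage API
        # Reuse the standardized _build_signal helper instead of constructing signals ad hoc
        return [
            self._build_signal({
                "source": "consumption_collector",
                "signal_type": "consumption",
                "metadata": {"metric_type": row["metric"], "value": row["value"]},
            })
            for row in rows
        ]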
2. Database Schema Conflicts
Design proposed new tables:
-- ❌ WRONG - Would create parallel storage:
CREATE TABLE consumption_signals (
    id UUID PRIMARY KEY,
    customer_id TEXT,
    provider TEXT,
    metric_type TEXT,
    value DECIMAL
);

CREATE TABLE consumption_aggregates (
    period TEXT,
    aggregated_value DECIMAL
);
But existing schema already supported this:
-- ✅ CORRECT - Existing unified schema:
CREATE TABLE signals (
    id UUID PRIMARY KEY,
    customer_id TEXT,
    source TEXT,        -- Already handles provider tracking
    signal_type TEXT,   -- Can be 'consumption'
    severity TEXT,
    metadata JSONB,     -- Flexible storage for consumption data
    created_at TIMESTAMP,
    occurred_at TIMESTAMP
);
3. Testing Strategy Absent
Design had no testing specifications, but codebase had patterns:
# ✅ CORRECT - Required test structure:
import pytest
from unittest.mock import Mock, patch

class TestSnowflakeConsumptionCollector:
    @pytest.fixture
    def mock_snowflake_connection(self):
        with patch('snowflake.connector.connect') as mock:
            mock_cursor = Mock()
            mock_cursor.fetchall.return_value = [
                {
                    'START_TIME': '2024-01-15T10:00:00Z',
                    'WAREHOUSE_NAME': 'COMPUTE_WH',
                    'CREDITS_USED': 23.5
                }
            ]
            yield mock

    async def test_customer_mapping_not_found(self, mock_db):
        """Test fallback when no mapping exists"""
        collector = SnowflakeConsumptionCollector()
        # Should use account_id as customer_id with warning

    async def test_retry_on_transient_failure(self):
        """Test exponential backoff retry logic"""
        with patch('snowflake.connector.connect') as mock:
            mock.side_effect = [TimeoutError, TimeoutError, Mock()]
            # Should succeed on third attempt
Phase 3: Implementation (Fresh Claude Code Session)
In a new window, implement this validated design:
Implementation Prompt: Data Collection Feature
Task: Implement Data Collection Feature
Objective: Add a new collector that gathers consumption metrics and integrates them into the existing platform as signals, following all established patterns.
Critical Context - Must Follow Existing Architecture
1. Extend Existing Base Classes
# CORRECT: Use existing pattern
class ConsumptionCollector(BaseCollector):
    async def collect(self) -> List[Signal]:
        # Implementation
        ...
2. Use Unified Data Model
# Store in existing signals table, NOT new tables
Signal(
    id=str(uuid.uuid4()),
    customer_id="CUST-001",  # from mapping
    source="consumption_collector",
    signal_type="consumption",
    severity=Severity.INFO,
    metadata={
        "provider": "cloud_provider",
        "metric_type": "compute_credits",
        "value": 123.45,
        "dimensions": {...}
    }
)
Implementation Requirements
Data Collection
- Query last 7 days on first run
- Incremental: last 48 hours (accounts for data latency)
- Handle 3-hour reporting lag in source system
- Aggregate by hour
Testing Requirements
class TestConsumptionCollector:
    async def test_collect_returns_signals(self):
        """Test successful collection"""

    async def test_customer_mapping(self):
        """Test account to customer mapping"""

    async def test_handles_missing_mapping(self):
        """Test fallback behavior"""

    async def test_retry_on_transient_failure(self):
        """Test retry logic"""
File Structure
collectors/
├── consumption_collector.py # Main implementation
tests/collectors/
├── test_consumption_collector.py
├── fixtures/
│ └── sample_data.json
config/
├── collectors.py # Add config class
Success Criteria
✅ Extends existing BaseCollector pattern
✅ Stores in unified signals table
✅ Maps provider accounts to customer IDs
✅ All tests pass with >80% coverage
Out of Scope - Do NOT Implement
❌ New database tables for this feature
❌ Real-time streaming
❌ Anomaly detection (Phase 2)
❌ Custom caching layer
❌ New configuration formats
Key Design Decisions
- Why extend BaseCollector: Maintains architectural consistency
- Why unified storage: Prevents schema fragmentation
- Why 48-hour lookback: Accounts for source system latency
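To make the 48-hour lookback and the 3-hour reporting lag concrete, here is a minimal sketch of the window calculation those requirements imply (illustrative; the function name and return shape are mine, not the actual implementation):

from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

def collection_window(last_run: Optional[datetime]) -> Tuple[datetime, datetime]:
    """Return the (start, end) window to query for consumption data."""
    now = datetime.now(timezone.utc)
    end = now - timedelta(hours=3)         # source system reports with ~3 hours of latency
    if last_run is None:
        start = end - timedelta(days=7)    # first run: backfill the last 7 days
    else:
        start = end - timedelta(hours=48)  # incremental run: re-read 48 hours to absorb late-arriving data
    return start, end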
Implementation took less than 1 hour. No rollbacks. No hotfixes. First deployment worked.
Advanced Patterns
Progressive Enhancement
Start minimal, layer complexity:
Iteration 1: Core functionality only
Iteration 2: Add error handling and edge cases
Iteration 3: Optimize performance and add monitoring
This prevents over-engineering while ensuring robustness.
Documentation-Driven Development
Maintain two documents per feature:
feature.md: Clear specifications that outlive implementation
feature_TASKS.md: Task breakdowns for parallelization
This creates automatic documentation for future development.
Context Preservation
Create a PROJECT_CONTEXT.md file that travels between phases:
# Project Context
## Phase 1 Decisions
- [Architectural choices made]
- [Key constraints identified]
## Phase 2 Discoveries
- [Integration points found]
- [Risks identified]
## Phase 3 Implementation Notes
- [Deviations from plan]
- [Lessons learned]
Measuring Success
Since adopting this pattern, I’ve seen my metrics transform:
| Metric | Before | After | Improvement |
|---|---|---|---|
| First-time success rate | 45% | 82% | +82% |
| Average implementation time | 8 hours | 3 hours | -62% |
| Post-deployment bugs | 3-5 per feature | 0-1 per feature | -80% |
| LLM API Cost per feature | ~$25.00 | ~$9.00 | -64% |
When to Use This Pattern
Ideal Use Cases
✅ Complex architectural decisions where mistakes are expensive
✅ Integration with existing codebases where conflicts lurk
✅ Team projects requiring clear documentation
✅ High-stakes deployments where rollback is complex
✅ Regulatory environments requiring audit trails
When to Skip
❌ Simple scripts or utilities
❌ Isolated proof-of-concepts
❌ Pure greenfield projects with no constraints
❌ Emergency hotfixes (though validation phase still valuable)
Design Becomes Documentation
Phase 1 designs become stakeholder-ready documentation. The validator teaches you your codebase’s own patterns. Single-focus phases reduce cognitive load. And the artifacts each phase produces make knowledge transfer easy.
What’s Next: Feedback Loop & Automation Opportunities
Each project refines the pattern library, compounding efficiency. The 50th feature takes a fraction of the time of the first. The manual process could be automated:
- API orchestration for programmatic task routing
- Structured handoffs using JSON schemas instead of markdown
- Parallel validation running multiple checks simultaneously
- Learning systems improving prompts based on success patterns
Keep the separation of concerns. Keep clear boundaries. Automate the repetitive tasks.
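As a rough illustration, programmatic orchestration could look something like this (a sketch assuming the Anthropic Python SDK; the model IDs are placeholders, and in practice Phases 2 and 3 still need codebase context, e.g. via Claude Code, rather than a bare API call):

import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

def run_phase(model: str, prompt: str) -> str:
    """Send one phase's prompt to a model and return the text of its reply."""
    response = client.messages.create(
        model=model,
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

# Placeholder model IDs and abbreviated prompts -- substitute the Phase 1-3 templates above.
design = run_phase("OPUS_MODEL_ID", "I need to design [SYSTEM]. Provide: ...")
plan = run_phase("SONNET_MODEL_ID", f"Validate this design against the codebase:\n{design}")
build = run_phase("SONNET_MODEL_ID", f"Implement this validated design:\n{plan}")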
Connection to Multi-Model Orchestration
This pattern is part of a broader shift in how we use AI for development. As I wrote in How I Built My Blog, strategic model selection matters more than any single technical choice.
Different models excel at different tasks:
- Opus: Complex reasoning, architectural decisions, ambiguous requirements
- Sonnet: Rapid implementation, codebase analysis, pattern matching
- Haiku: Validation, testing, simple refactors
The Architect-Builder Pattern formalizes this multi-model approach into a repeatable workflow.
What Patterns Are You Seeing?
This pattern emerged from collective experimentation across communities. I’ve seen variations of it independently discovered by PMs, engineers, and technical leaders.
I’m curious about your experience:
- What patterns have you discovered for AI-assisted development?
- How do you handle the handoff between planning and implementation?
- What metrics matter most in your workflow?
The best patterns emerge from practice, not theory.
P.S. If you want another example of this pattern in action, check out how I used it to add copy-to-clipboard functionality to code blocks on this website.