GSD vs SpecKit, BMAD, and Compound Engineering: Picking the Right AI Coding Framework
Everyone agrees vibe coding doesn’t work for serious software. “Describe a thing, let the model rip, ship it” produces brittle code that falls apart the moment requirements get complex. The UC San Diego study confirmed what practitioners already knew: professionals control their agents, they don’t let agents control them.
So the question isn’t whether you need structure. It’s how much structure, and what kind.
Four frameworks have emerged to fill this gap: GSD (Get Shit Done), SpecKit, BMAD, and Compound Engineering. They all attack the same root problem but at different layers of the stack and with wildly different amounts of ceremony. After testing GSD extensively with Claude Code and studying the others, here’s what I’ve found.
What Each Framework Actually Does
Before comparing, it helps to understand the core loop of each one.
GSD wraps context engineering and sub-agent orchestration behind a small set of /gsd:* commands. The philosophy: complexity lives inside the system, not in your daily workflow. You get a fixed set of Markdown artifacts in your repo (PROJECT.md, REQUIREMENTS.md, ROADMAP.md, STATE.md, PLAN.md) and a repeatable loop:
milestone → phases → discuss → plan → execute → verify → complete → next milestone
SpecKit is GitHub’s open-source toolkit for Spec-Driven Development (SDD). It scaffolds .github/prompts and .specify directories with templates, scripts, and a “constitution” to constrain AI behavior. The workflow is more linear:
constitution → specify → plan → tasks → implement
BMAD (Breakthrough Method for Agile AI-Driven Development) models a full cross-functional team with role-based AI agents: Analyst, Product Manager, Architect, Scrum Master, Developer, and QA. These agents collaboratively generate PRDs, architecture documents, and QA plans.
Compound Engineering is a philosophy from Every that argues each unit of work should make the next unit easier. Their Claude Code plugin exposes /workflows:* commands for planning, executing, reviewing, and compounding learnings.
The Ceremony Spectrum
This is where the frameworks diverge most sharply. I think of it as a spectrum from “just enough structure” to “full enterprise SDLC”:
| Framework | Ceremony Level | Setup Time | Artifacts Generated |
|---|---|---|---|
| GSD | Low to medium | 5-10 min | 6-8 Markdown files |
| Compound Engineering | Medium | 10-15 min | Plans + codified learnings |
| SpecKit | Medium to high | 15-20 min | Templates, prompts, constitution, specs |
| BMAD | High | 30+ min | PRDs, architecture docs, QA plans, compliance traces |
Why this matters: More ceremony isn’t better or worse. It’s a trade-off between structure and velocity. A solo developer building a SaaS prototype has fundamentally different needs than a regulated enterprise migrating a legacy system.
GSD: Speed With Guardrails
GSD’s core insight is that context rot kills AI-assisted projects. Instead of stuffing everything into a single chat that degrades over time, GSD constantly refreshes AI context from structured files.
The command surface is intentionally small:
| Command | Purpose |
|---|---|
| /gsd:new-project | Initialize with research, requirements, roadmap |
| /gsd:discuss-phase N | Capture decisions before planning |
| /gsd:plan-phase N | Research and plan with verification steps |
| /gsd:execute-phase N | Execute plans (often in parallel) |
| /gsd:verify-work N | UAT against REQUIREMENTS.md |
| /gsd:complete-milestone | Archive, tag, start next cycle |
What makes it feel fast in practice:
- Context engineering over context stuffing. Each phase refreshes what the AI sees from the project files, so you don’t lose coherence at conversation turn 47.
- Git-native workflow. Artifacts live in the repo. They get committed, branched, and reviewed like code.
- Multi-agent execution. GSD can fan out tasks to multiple agents via worktrees and manage them through notifications.
- Deterministic loops. Every phase has explicit discuss/plan/execute/verify stages. You can resume work, audit decisions, and avoid spaghetti prompting.
Best for: Solo developers and small teams who want structured AI workflows without drowning in process. If you’re building with Claude Code and want just enough guardrails to keep things coherent across sessions, GSD is the natural fit.
SpecKit: Formal Specs for Multi-Agent Teams
SpecKit takes a different approach. Instead of optimizing for a single developer’s flow, it builds a generic SDD scaffolding layer that works across Claude Code, Copilot, Amazon Q, Gemini CLI, and others.
The key differentiator is the constitution: a set of principles and non-negotiables that constrain AI behavior project-wide. Think of it as a CLAUDE.md on steroids, designed to work with any agent.
The workflow is more sequential than GSD’s:
- Define the constitution (principles and constraints)
- /speckit.specify generates structured specs and user stories
- /speckit.plan produces technical plans and quickstarts
- /speckit.tasks breaks plans into executable tasks
- /speckit.implement lets agents execute under those constraints
Where SpecKit shines is in environments where multiple AI tools coexist. If your team uses Copilot for some work, Claude Code for others, and Amazon Q for infrastructure, SpecKit gives you one spec-driven process that spans all of them.
Best for: Teams that want formal spec/plan/task pipelines and work across multiple AI coding assistants. If your workflow already mirrors PRD-to-architecture processes, SpecKit will feel natural.
BMAD: Enterprise-Grade Agent Orchestration
BMAD is the heavyweight. Where GSD models a developer workflow and SpecKit models a spec lifecycle, BMAD models an entire software team.
The framework assigns specialized AI agents to distinct roles:
```mermaid
%%{init: {"layout": "dagre"}}%%
flowchart TB
    A[Analyst] --> B[Product Manager]
    B --> C[Architect]
    C --> D[Scrum Master]
    D --> E[Developer]
    E --> F[QA Agent]
    F -->|"feedback"| C
    F -->|"approved"| G[Ship]
```
Each agent generates and refines artifacts. The QA and Architect agents “argue” over edge cases before implementation starts. Everything gets version-controlled for traceability.
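The role hand-offs can be sketched as a sequential pipeline. This is an illustration of the pattern, not BMAD's implementation: the role names come from the framework's description, and everything else (the artifact structure, the hand-off logic) is an assumption:

```python
from dataclasses import dataclass

@dataclass
class Artifact:
    """A versionable work product passed between role agents."""
    kind: str          # e.g. "PRD", "architecture", "QA report"
    produced_by: str
    content: str

@dataclass
class RoleAgent:
    role: str
    produces: str

    def run(self, upstream: list[Artifact]) -> Artifact:
        # A real agent would call an LLM with a role-specific prompt;
        # here we only record the hand-off, which is what gives you
        # the traceability BMAD is built around.
        inputs = ", ".join(a.kind for a in upstream) or "raw requirements"
        return Artifact(self.produces, self.role,
                        f"{self.produces} derived from: {inputs}")

PIPELINE = [RoleAgent("Analyst", "analysis"),
            RoleAgent("Product Manager", "PRD"),
            RoleAgent("Architect", "architecture"),
            RoleAgent("Scrum Master", "sprint plan"),
            RoleAgent("Developer", "code"),
            RoleAgent("QA Agent", "QA report")]

def run_pipeline() -> list[Artifact]:
    artifacts: list[Artifact] = []
    for agent in PIPELINE:
        artifacts.append(agent.run(artifacts))
    return artifacts
```

Because every artifact records who produced it and what it was derived from, the chain from QA report back to the original analysis is reconstructible — the paper trail the next section describes.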
This is where BMAD’s value proposition becomes clear: provable traceability. You can trace code back to business logic and regulatory constraints. For SOC 2 compliance, HIPAA requirements, or audit-heavy environments, this paper trail is not optional. It’s the whole point.
Best for: Enterprises and regulated industries where governance, compliance, and auditability matter as much as throughput. If you need to prove why a piece of code exists and what requirement it satisfies, BMAD was built for that.
Compound Engineering: Making Work Compound
Compound Engineering attacks a different angle entirely. The premise: most engineering is linear. You build feature A, then feature B, and each one adds complexity. Compound engineering flips this so each unit of work makes the next one easier.
The Claude Code plugin provides four core commands:
| Command | Purpose | Effort Share |
|---|---|---|
/workflows:plan | Turn ideas into implementation plans | ~40% |
/workflows:work | Execute with worktrees and task tracking | ~20% |
/workflows:review | Multi-agent code reviews | ~30% |
/workflows:compound | Document learnings for reuse | ~10% |
Notice the ratio: roughly 80% of effort goes to planning, review, and learning capture, and only 20% to execution. This is deliberate. The philosophy is that the value isn’t in writing code faster. It’s in making each change leave the codebase better than it found it.
Where Compound Engineering differs from GSD is the knowledge model. GSD organizes around projects, milestones, and phases. Compound Engineering organizes around changes. Each change captures patterns and learnings that feed forward into future work.
Best for: Product teams working on complex, long-lived codebases where accumulated knowledge is the real asset. If your biggest cost isn’t writing code but relearning context and repeating mistakes, this is the framework to study.
When to Choose What
Choose GSD if you:
- Work primarily with Claude Code
- Are a solo developer or small team
- Want structured workflows without heavy process
- Need to ship fast while keeping context coherent across sessions
- Value git-native, milestone-driven development
Choose SpecKit if you:
- Use multiple AI coding assistants
- Want a formal spec-driven pipeline
- Need a constitution that constrains AI behavior project-wide
- Prefer the spec-to-plan-to-tasks-to-implement flow
- Work on teams where specs are already part of the culture
Choose BMAD if you:
- Need traceability from requirements to code
- Work in regulated industries (healthcare, finance, government)
- Want role-based AI agents that mirror a cross-functional team
- Have compliance requirements that demand an artifact paper trail
- Are willing to invest in heavier upfront process for downstream auditability
Choose Compound Engineering if you:
- Maintain complex, long-lived codebases
- Want each engineering cycle to improve future cycles
- Value review and learning capture over raw execution speed
- Work on product teams where institutional knowledge compounds
- Already use Claude Code and want a planning-heavy workflow
They’re Not Mutually Exclusive
Here’s the thing most comparisons miss: these frameworks operate at different layers. GSD is a context-engineering and execution layer. SpecKit is a specification layer. BMAD is a governance layer. Compound Engineering is a knowledge-compounding layer.
You could realistically use GSD’s milestone-driven execution with Compound Engineering’s learning capture. Or SpecKit’s constitution approach with BMAD’s role-based agents. The frameworks solve adjacent problems, not the same problem at different quality levels.
The real question isn’t “which framework is best?” It’s “what’s actually breaking in my AI-assisted workflow right now?”
If your AI loses context after 20 minutes, you have a context engineering problem. GSD fixes that.
If your specs are ambiguous and agents interpret them differently, you have a specification problem. SpecKit fixes that.
If you can’t prove why code exists or trace it back to requirements, you have a governance problem. BMAD fixes that.
If your team keeps solving the same problems and each project starts from zero, you have a knowledge compounding problem. Compound Engineering fixes that.
Pick the one that matches your actual pain. Or pick two.
The Bottom Line
The vibe coding era is ending. Not because AI agents aren’t capable, but because professional software development demands more than “describe and ship.” These four frameworks represent different bets on what structure matters most: context coherence (GSD), formal specifications (SpecKit), organizational governance (BMAD), or compounding knowledge (Compound Engineering).
For most solo developers and small teams building with Claude Code, GSD hits the sweet spot. Low ceremony, high leverage, and just enough structure to keep projects coherent across sessions without drowning in process. But the best framework is the one that matches the problem you actually have, not the one with the most GitHub stars.
Experimenting with AI coding frameworks? I’d love to hear what’s working for you and what isn’t. Reach out on LinkedIn.