GitHub Spec Kit crossed 90,000 GitHub stars within six months of its open-source release in late 2025. BMAD-METHOD has been adopted by engineering teams at Amazon, Google, and Shopify. GSD, built natively for Claude Code, is described by its practitioners as the most rigorous execution framework available. Yet most development teams using AI coding tools are prompting ad hoc, without a structured workflow of any kind.
That gap is closing, and the decision about how to close it has become a genuine architectural choice. A new category — spec-driven development — has emerged from the failure modes of unstructured AI coding: agents that produce plausible code that drifts from intent, hallucinates APIs, and degrades in quality as projects grow. The four frameworks compared here — GSD, BMAD, OpenSpec, and GitHub Spec Kit — each claim to solve that problem. They do not solve it in the same way, and the differences determine which one is right for a given organisation.
What You Are Really Choosing Between
The question "which framework should we use?" usually conceals a more fundamental one: where in the development process does your biggest quality risk sit? The answer should drive the choice — not feature lists.
Spec-driven development is not a single thing. It is a family of approaches that share one principle but diverge sharply in scope, ceremony, and what they optimise for. GSD is primarily a context management system. BMAD is a full-lifecycle agent orchestration platform. OpenSpec is a change documentation standard. GitHub Spec Kit is a cross-agent workflow normalisation tool. Choosing without understanding that distinction leads to either over-engineering — deploying BMAD to write a microservice — or under-engineering — using Spec Kit for a twelve-month brownfield migration with audit requirements.
The choice also carries commitment. These frameworks shape how specifications are written, how implementation tasks are broken down, and what artefacts are produced. Switching mid-project is expensive. The right time to make this decision is before the first sprint, not after the third.
When to Use Which Framework
The variables that matter most when choosing a spec-driven framework are not feature lists. They are team size, project type, and compliance requirements. The matrix below maps the most common scenarios to a recommended starting point.
| Scenario | Team size | Project type | Compliance | Framework |
|---|---|---|---|---|
| Pre-PMF startup MVP | 1–3 | Greenfield | Low | OpenSpec or GSD |
| Funded startup, scaling team | 4–15 | Greenfield product | Medium | GitHub Spec Kit |
| Complex enterprise greenfield | 10–15 | New platform or major product | Medium–High | BMAD |
| Brownfield modernisation | Any | Legacy refactor | Medium | OpenSpec + selective BMAD |
| Regulated industry (FinTech, healthcare) | Any | Either | High — SOC 2, HIPAA, EU AI Act | BMAD |
| Solo developer, fast iteration | 1 | Anything small | Low | GSD |
GSD: Maximum execution rigour for Claude Code environments
GSD — Get Stuff Done — is a specification and execution framework built entirely on Claude Code's native capabilities: slash commands, CLAUDE.md files, hooks, and agent spawning. Its core argument is that AI output quality degrades as context windows fill. GSD's own documentation describes the pattern: at 50 percent context, Claude starts cutting corners; at 70 percent, hallucinations and forgotten requirements become common. The solution is aggressive atomicity.
Each plan is broken into tasks sized to occupy roughly 50 percent of a fresh context window. GSD then spawns isolated subagents for each task, so task 50 in a project gets the same model quality as task one. The workflow spans six commands: project initialisation, phase discussion, phase planning, parallel execution, verification, and milestone archiving. Each produces structured artefacts — PLAN.md, SPEC.md, VERIFY.md — that accumulate into a coherent record of the project.
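As a rough illustration of the sizing rule described above, the sketch below greedily packs plan steps into tasks that stay under half of a context window. The `Step` type, the 200,000-token window, and the per-step token estimates are all hypothetical. GSD's actual planner is not shown in this article and may work quite differently.

```python
from dataclasses import dataclass

CONTEXT_WINDOW_TOKENS = 200_000  # assumed window size, not a GSD constant
TASK_BUDGET_RATIO = 0.5          # "roughly 50 percent", as described in the text


@dataclass
class Step:
    name: str
    estimated_tokens: int  # hypothetical estimate of the context a step needs


def pack_into_tasks(steps, window=CONTEXT_WINDOW_TOKENS, ratio=TASK_BUDGET_RATIO):
    """Greedily group consecutive steps so each task fits the token budget."""
    budget = int(window * ratio)
    tasks, current, used = [], [], 0
    for step in steps:
        if step.estimated_tokens > budget:
            raise ValueError(f"{step.name} exceeds the task budget; split it further")
        # Start a new task when adding this step would overflow the budget.
        if used + step.estimated_tokens > budget:
            tasks.append(current)
            current, used = [], 0
        current.append(step)
        used += step.estimated_tokens
    if current:
        tasks.append(current)
    return tasks


steps = [
    Step("schema migration", 40_000),
    Step("repository layer", 50_000),
    Step("API handlers", 60_000),
    Step("integration tests", 30_000),
]
tasks = pack_into_tasks(steps)
for number, task in enumerate(tasks, start=1):
    print(number, [step.name for step in task])
```

Each resulting task would then be handed to its own fresh subagent, which is the property that keeps late tasks at the same quality as early ones.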
The constraints are clear. GSD is Claude Code-specific. It does not port to GitHub Copilot, Cursor, or Gemini CLI. Teams running mixed AI coding environments will find it creates a two-tier workflow. The structured artefact generation also adds ceremony that is proportionate on complex multi-phase projects and disproportionate on small feature work.
For enterprise teams standardised on Claude Code who are building substantial applications, GSD offers the most rigorous execution pipeline of any framework in this comparison.
BMAD: Full agile team simulation for greenfield builds
BMAD — Breakthrough Method for Agile AI-Driven Development — is the most architecturally ambitious framework in this space. Where other frameworks augment a single developer's workflow, BMAD simulates an entire agile software team using 12 or more specialised AI agents: Business Analyst, Product Manager, Architect, UX Designer, Scrum Master, Developer, QA Engineer, Technical Writer, and others. Each agent operates from a defined persona with specific responsibilities, handoff protocols, and review gates.
The appeal for enterprise is that BMAD mirrors organisational structures that large development teams already use. Projects managed through it produce the full set of artefacts one would expect from a staffed programme: requirements documents, architecture specifications, test strategies, and release notes — all generated and reviewed through structured agent interactions. For organisations that need a development process AI can participate in rather than replace, BMAD provides the scaffolding.
The cost is real and measurable. In our benchmarking, BMAD averages roughly 31,600 tokens per workflow run, with large projects consuming 230 million tokens per week. That translates to monthly API costs of $800 to more than $2,000 per developer, with peak observed spend of $3,200 on Claude Opus. The overhead is not just financial: in the same benchmarks, a comparable CRM dashboard build takes 12 minutes under OpenSpec, 90 minutes under Spec Kit, and 5.5 hours under BMAD. The thoroughness that makes BMAD valuable on complex greenfield builds is the same property that makes it expensive on anything smaller.
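The build-time figures above are easier to compare as ratios. The short calculation below uses only the three durations quoted in this section; the variable names are illustrative.

```python
# Build times for the comparable CRM dashboard, in minutes (figures from the text).
build_minutes = {
    "OpenSpec": 12,
    "Spec Kit": 90,
    "BMAD": 5.5 * 60,  # 5.5 hours
}

baseline = build_minutes["OpenSpec"]
ratios = {name: minutes / baseline for name, minutes in build_minutes.items()}

for name, ratio in ratios.items():
    print(f"{name}: {build_minutes[name]:.0f} min, {ratio:.1f}x OpenSpec")
```

On these figures, BMAD takes 27.5 times as long as OpenSpec and roughly 3.7 times as long as Spec Kit for the same build.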
BMAD is best suited to new product development where the full lifecycle overhead pays for itself in structured output and where the development team is experienced enough to maintain the workflow under pressure.
OpenSpec: Brownfield change management with an audit trail
OpenSpec occupies a narrower position than the others. Rather than orchestrating the full development lifecycle, it addresses one specific failure mode: changes to existing systems that are poorly documented, making it difficult to understand what changed, why, and what the intended state was before and after. Its proposal-centred workflow uses delta markers — ADDED, MODIFIED, REMOVED — to track precisely what each change introduces relative to the current system state.
The openspec/ directory structure separates stable current-state documentation from active proposals. Each proposal carries its own proposal.md, tasks.md, and delta specifications. Changes require documentation before implementation begins. OpenSpec is also tool-agnostic: it does not require a specific AI coding assistant and imposes less process overhead than BMAD or GSD.
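To make the delta idea concrete, here is a minimal sketch of applying ADDED, MODIFIED, and REMOVED entries to a current-state spec. It models requirements as a plain dictionary and a proposal's delta as a list of tuples. This in-memory representation, the `apply_delta` function, and the requirement names are assumptions for illustration, not OpenSpec's actual file format or tooling.

```python
def apply_delta(current, delta):
    """Return a new spec dict with ADDED/MODIFIED/REMOVED entries applied."""
    updated = dict(current)  # leave the current-state spec untouched
    for marker, name, text in delta:
        if marker == "ADDED":
            if name in updated:
                raise ValueError(f"cannot add existing requirement: {name}")
            updated[name] = text
        elif marker == "MODIFIED":
            if name not in updated:
                raise ValueError(f"cannot modify missing requirement: {name}")
            updated[name] = text
        elif marker == "REMOVED":
            if name not in updated:
                raise ValueError(f"cannot remove missing requirement: {name}")
            del updated[name]
        else:
            raise ValueError(f"unknown delta marker: {marker}")
    return updated


# Hypothetical current state and proposal, for illustration only.
current_spec = {
    "session-timeout": "Sessions expire after 30 minutes of inactivity.",
    "remember-me": "Users can opt in to a 30-day persistent session.",
}
proposal_delta = [
    ("ADDED", "two-factor-auth", "Users confirm sign-in with a one-time code."),
    ("MODIFIED", "session-timeout", "Sessions expire after 15 minutes of inactivity."),
    ("REMOVED", "remember-me", None),
]

new_spec = apply_delta(current_spec, proposal_delta)
print(sorted(new_spec))
```

Rejecting a delta that does not match the current state (for example, modifying a requirement that does not exist) is a design choice in this sketch: it is one simple way a change record could be kept honest against the documented baseline.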
For organisations with formal change approval processes — regulated industries, public sector bodies, or teams where architecture review boards must sign off on technical modifications — that audit trail is not optional. It is what enables the AI coding workflow to exist within governance constraints. OpenSpec is the only framework in this comparison designed with that requirement explicitly in mind.
The limitation is scope. OpenSpec does not address context management, agent orchestration, or quality gates beyond documentation. Teams using it will need a separate execution approach for the implementation work itself — typically Spec Kit or a tool-specific workflow alongside it.
GitHub Spec Kit: Cross-agent standardisation at scale
GitHub's Spec Kit, open-sourced in late 2025, reached more than 90,000 GitHub stars and over 8,000 forks by May 2026, making it the most widely adopted framework in this comparison by that measure. It implements a four-phase workflow — Spec, Plan, Tasks, Implement — through slash commands that work across 29 named AI coding agent integrations, including Claude Code, GitHub Copilot, Cursor, Windsurf, and Gemini CLI, with a generic option covering others.
The positioning is deliberately broad. Spec Kit is not the deepest framework for any single AI tool, but it is the only one that works the same way regardless of which tool a team is using. For organisations running mixed AI coding environments — which describes most large enterprises with varied team preferences or evolving procurement relationships — that portability is the primary differentiator. A standard specification format that travels across tools means the governance and review process does not need to change when a team switches assistants.
A growing ecosystem of 70 or more community extensions adds integrations with Jira, Azure DevOps, and GitHub Issues, along with quality gates for security, testing, and specification drift detection. GitHub's direct involvement provides reasonable confidence in long-term maintenance — a consideration that matters when a framework is being embedded into development standards across a large organisation.
The trade-off is depth. Cross-agent compatibility requires Spec Kit to operate at a level of abstraction that tool-specific frameworks exceed. Teams using Claude Code exclusively will find GSD more rigorous. Teams with complex greenfield builds will find BMAD more structured. Spec Kit earns its position for environments where standardisation across tools and teams matters more than optimisation for any single one.
Five Questions That Determine the Right Framework
Framework features are the wrong starting point. The right starting point is your organisation's actual constraints. These five questions, answered honestly, surface the constraints that matter most.
1. Greenfield or brownfield? Brownfield work tilts toward OpenSpec as the primary framework, with selective BMAD involvement for components being rebuilt from scratch. Greenfield opens up the full range: BMAD for complex new platforms, Spec Kit for teams that need cross-agent portability, GSD for Claude Code environments where execution rigour is the priority.
2. What is your team size and structure? A solo developer or a small pair working at speed is better served by GSD or OpenSpec, both of which add structure without adding ceremony. A scaling team of 4 to 15, especially one that has recently hired its first product manager or architect, benefits from Spec Kit's standardised workflow, which functions regardless of which AI tool each person uses. Multi-team enterprise environments with explicit role separation are where BMAD's agent personas (Business Analyst, Architect, QA Engineer) map onto real organisational structure rather than simulating one.
3. What are your compliance requirements? Two distinct scenarios produce different answers. For internal governance — architecture review boards, change approval workflows, teams where a technical change must be documented before implementation — OpenSpec's delta specification approach is the right fit; it treats the change record as the primary output. For externally mandated compliance (SOC 2, HIPAA, EU AI Act, or the Colorado AI Act enforceable from June 2026), BMAD is the stronger choice. Its comprehensive agent-generated documentation — requirements, architecture decisions, test strategy, release notes — provides the kind of structured evidence trail that external auditors expect. OpenSpec documents what changed; BMAD documents why every decision was made and against what requirements.
4. What is your API cost tolerance? BMAD's benchmark costs of $800 to more than $2,000 per developer per month put it out of reach for many teams; if the project budget cannot support that range, BMAD is off the table regardless of its other merits. OpenSpec and GSD carry the lowest API overhead: they add structure without increasing model call frequency. Spec Kit sits in the middle. Run the numbers before choosing, not after.
5. How tied are you to a specific AI coding tool? GSD is the only framework in this comparison with a hard tool dependency: it requires Claude Code. If your teams use multiple AI coding assistants, or if that is likely to change, Spec Kit's 29 named integrations or BMAD V6's cross-platform compatibility with Claude Code, Cursor, Codex, Copilot, and Windsurf are the practical options. OpenSpec is likewise tool-agnostic and imposes no tooling constraint.
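One way to read the five questions together is as an ordered rule chain. The sketch below encodes the recommendations from this section in a hypothetical `recommend_framework` function. The field names, the use of $800 as a budget cut-off, and the order in which the rules fire are our simplification for illustration, not a formal decision procedure.

```python
from dataclasses import dataclass

BMAD_MIN_MONTHLY_COST_USD = 800  # lower bound of the benchmark range quoted above


@dataclass
class Project:
    brownfield: bool
    team_size: int
    compliance: str              # "none", "internal", or "external"
    monthly_budget_per_dev: int  # USD, hypothetical input
    claude_code_only: bool


def recommend_framework(p: Project) -> str:
    """Return a starting-point framework; a simplification of the five questions."""
    bmad_affordable = p.monthly_budget_per_dev >= BMAD_MIN_MONTHLY_COST_USD
    # Q3 + Q4: externally mandated compliance favours BMAD, if the budget allows.
    if p.compliance == "external" and bmad_affordable:
        return "BMAD"
    # Q1 + Q3: brownfield work and internal change governance favour OpenSpec.
    if p.brownfield or p.compliance == "internal":
        return "OpenSpec"
    # Q2 + Q5: very small teams; GSD only applies if everyone is on Claude Code.
    if p.team_size <= 3:
        return "GSD" if p.claude_code_only else "OpenSpec"
    # Q2 + Q4: scaling teams, or larger teams that cannot absorb BMAD's cost.
    if p.team_size <= 15 or not bmad_affordable:
        return "Spec Kit"
    return "BMAD"


solo = Project(False, 1, "none", 100, True)
scaleup = Project(False, 8, "none", 300, False)
regulated = Project(True, 20, "external", 2000, False)
legacy = Project(True, 6, "internal", 300, False)

for project in (solo, scaleup, regulated, legacy):
    print(recommend_framework(project))
```

A real decision will weigh these factors rather than short-circuit on the first match, but writing the rules down this way makes the priority order explicit and easy to challenge.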
How Organisations Move Between Frameworks
Because these are methodologies rather than deeply coupled platforms, migration at a natural project boundary is relatively painless; in our own workflows we switch often. The artefacts (PRDs, specs, stories) port across tools because the underlying data is just Markdown and JSON.
- OpenSpec to BMAD. As a brownfield modernisation succeeds and the team starts building new features on top, OpenSpec's delta-only model can feel thin. Carry the archived spec forward as the BMAD Architect agent's input document.
- Spec Kit to BMAD. Common when a startup raises a Series A, hires its first PM, and needs explicit role separation. The Spec Kit constitution maps cleanly onto BMAD's master agent prompts.
- BMAD to GSD. The reverse direction is real. When a complex BMAD project ships and the team enters maintenance mode, GSD's lean two-prompt loop is often better suited to the bug-fix-and-small-feature cadence that follows.
If you are working on transforming how your engineering teams use AI — moving beyond autocomplete to spec-driven, agentic workflows — our SDLC Enablement service is where to start.