Schema-Driven Development
Most enterprise systems fail the same way. Not because the code is bad, but because nobody agrees on what the system actually is. The entity model lives in one team's head. The state machines are implicit in conditional logic. The event contracts are scattered across docs that haven't been updated in months. Permissions are tribal knowledge. And when an AI agent shows up to help build the next feature, it has to reverse-engineer all of this from source code.
I've been working on an approach I'm calling Schema-Driven Development. The idea: model a system's entire semantics as 10 interlocking JSON schemas before writing application code. Entities, state machines, events, tools, workflows, data ownership, architecture, roles, compliance, integrations. All cross-referenced. All machine-readable. All validatable.
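To make "cross-referenced" concrete, here is a minimal Python sketch of three layers that point at each other, plus a check for dangling references. The layer shapes, the field names (`entity`, `emits`), and the `Order` example are illustrative assumptions, not the methodology's actual schema format.

```python
# Hypothetical, heavily simplified layers; real schemas would be richer.
ENTITIES = {
    "Order": {"fields": ["id", "status", "total"]},
    "Customer": {"fields": ["id", "email"]},
}

STATE_MACHINES = {
    "OrderLifecycle": {
        "entity": "Order",
        "transitions": [
            {"from": "draft", "to": "submitted", "emits": "OrderSubmitted"},
            {"from": "submitted", "to": "paid", "emits": "OrderPaid"},
        ],
    },
}

EVENTS = {
    "OrderSubmitted": {"entity": "Order"},
    # "OrderPaid" is deliberately missing to show a dangling reference.
}


def find_dangling_references(entities, state_machines, events):
    """Return human-readable errors for references that point at nothing."""
    errors = []
    for name, machine in state_machines.items():
        if machine["entity"] not in entities:
            errors.append(f"state machine {name}: unknown entity {machine['entity']}")
        for t in machine["transitions"]:
            if t["emits"] not in events:
                errors.append(f"state machine {name}: unknown event {t['emits']}")
    for name, event in events.items():
        if event["entity"] not in entities:
            errors.append(f"event {name}: unknown entity {event['entity']}")
    return errors


errors = find_dangling_references(ENTITIES, STATE_MACHINES, EVENTS)
for e in errors:
    print(e)
```

Run in CI, a check like this would fail the build until the missing `OrderPaid` event is declared in the events layer.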
The full methodology is at steig.io/schema-driven-development. This post is about why I think it matters, where the ideas come from, and where it probably breaks down.
None of this is new
OMG's Model Driven Architecture said "the model is the source of truth" in 2001. Dietz formalized enterprise ontology modeling at TU Delft. Palantir has been shipping ontology-driven software for years. The term "Schema-Driven Development" itself has been around since 2017, when it meant API contract-first. The Thoughtworks Radar put spec-driven development on the map for 2025, and now Kiro, spec-kit, and Tessl are all shipping tooling for it.
What I did was take these existing ideas and wire them into a single cross-referenced graph across 10 semantic layers. The pieces aren't mine. The assembly is.
Didn't MDA fail?
Yes. Fowler called it "the night of the living CASE tools." Den Haan documented eight failure modes. The ones that killed it:
- No round-trip engineering. Generate code from a model, touch the code, regenerate, and your changes are gone. SDD schemas live in the same repo as the code and change in the same PR.
- UML was the wrong abstraction. Nobody wants to draw sequence diagrams instead of writing code. JSON is already the language your APIs speak.
- The tooling didn't exist. In 2003 there were no schema validators in CI, no breaking-change detection, no AI agents consuming structured specs.
But the real difference is ambition. MDA tried to replace code with models. I'm trying to constrain code with specifications. The code still exists, developers still write it, but the spec defines the boundaries it operates within. Less ambitious, probably more realistic.
What schemas can and can't do
JSON Schema validates structure, not behavior. It can tell you a field is required. It cannot tell you a state transition is only valid after a credit check passes. The schema layers define what the system is. Enforcing how it behaves at runtime requires separate machinery: state machine engines for transition guards, policy engines like Cedar for authorization, contract testing like Pact for integration safety, agent constraints like AgentSpec for LLM guardrails.
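The split between structure and behavior can be sketched in a few lines of Python. The structural check below is a hand-rolled stand-in for JSON Schema's `required` keyword, and the guard table is a stand-in for a real state machine engine. All names (`LOAN_SCHEMA`, `GUARDS`, `credit_check_passed`) are hypothetical.

```python
# Structural rule, in the spirit of JSON Schema's "required" keyword.
LOAN_SCHEMA = {"required": ["id", "status", "credit_check_passed"]}

# Behavioral rule: lives outside the schema, as a guard on a transition.
GUARDS = {
    ("submitted", "approved"): lambda loan: loan["credit_check_passed"] is True,
}


def structurally_valid(doc, schema):
    """True if every required field is present. Says nothing about behavior."""
    return all(field in doc for field in schema["required"])


def can_transition(loan, target):
    """True only if a guard exists for (current state, target) and it passes."""
    guard = GUARDS.get((loan["status"], target))
    return guard is not None and guard(loan)


loan = {"id": "L-1", "status": "submitted", "credit_check_passed": False}
print(structurally_valid(loan, LOAN_SCHEMA))  # True: the shape is fine
print(can_transition(loan, "approved"))       # False: the guard blocks it
```

The document passes validation and is still not allowed to move forward, which is exactly the gap the runtime machinery has to cover.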
Purpose-built languages like AWS Smithy and TypeSpec are more expressive than JSON Schema for describing APIs and services. I chose JSON because everyone already has tooling for it. That's a trade-off.
Schemas that can't evolve get abandoned
Stripe encapsulates each breaking API change in a small, self-contained module. Nearly a hundred versions over six years, each one a fixed cost, not a compounding one. Confluent enforces compatibility at the broker level. Buf runs 53 breaking-change rules in CI.
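A CI check in the spirit of those breaking-change rules might look like the sketch below. It compares two versions of a deliberately simplified schema shape (`properties` mapped to type names, plus `required`). The three rules are an illustrative subset I picked for the example, not Buf's or Confluent's actual rule sets.

```python
def breaking_changes(old, new):
    """Compare two versions of a simplified schema and list breaking changes.

    Each schema is {"properties": {name: type_name}, "required": [names]}.
    """
    problems = []
    old_props, new_props = old["properties"], new["properties"]
    for name, old_type in old_props.items():
        if name not in new_props:
            problems.append(f"removed field: {name}")
        elif new_props[name] != old_type:
            problems.append(f"changed type of {name}: {old_type} -> {new_props[name]}")
    for name in new.get("required", []):
        if name not in old.get("required", []):
            problems.append(f"new required field: {name}")
    return problems


v1 = {"properties": {"id": "string", "total": "number"}, "required": ["id"]}
# Adding an optional field is compatible.
v2 = {
    "properties": {"id": "string", "total": "number", "note": "string"},
    "required": ["id"],
}
# Removing a field and requiring a new one are not.
v3 = {
    "properties": {"id": "string", "currency": "string"},
    "required": ["id", "currency"],
}

print(breaking_changes(v1, v2))
print(breaking_changes(v1, v3))
```

An empty list means the change can merge as-is. Anything else would need a version bump or an explicit override.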
SDD needs the same discipline. Within a layer, changes like adding a field should be backward-compatible by default. Across layers, adding a new state to an entity means you also need new transitions, new events, and new tools. Cross-reference validators should catch incomplete cascades before merge. That's the theory. I haven't battle-tested it yet.
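Here is a rough Python sketch of what a cascade check could look like for the "new state" case. The data shapes and the rule that every non-initial state needs an inbound transition and a tool are assumptions made for the example. The real validators may define completeness differently.

```python
def incomplete_cascades(states, initial, transitions, events, tools):
    """List gaps left behind when a state is added to an entity.

    states:      set of state names declared on the entity
    initial:     name of the initial state (needs no inbound transition)
    transitions: list of {"from", "to", "emits"} dicts
    events:      set of declared event names
    tools:       dict of tool name -> state the tool moves an entity into
    """
    gaps = []
    reachable = {t["to"] for t in transitions}
    tool_targets = set(tools.values())
    for state in sorted(states - {initial}):
        if state not in reachable:
            gaps.append(f"state {state}: no transition leads to it")
        if state not in tool_targets:
            gaps.append(f"state {state}: no tool moves an entity into it")
    for t in transitions:
        if t["emits"] not in events:
            gaps.append(
                f"transition {t['from']}->{t['to']}: event {t['emits']} is not declared"
            )
    return gaps


# A pull request adds the state "refunded" and nothing else.
states = {"draft", "paid", "refunded"}
transitions = [{"from": "draft", "to": "paid", "emits": "OrderPaid"}]
events = {"OrderPaid"}
tools = {"pay_order": "paid"}

for gap in incomplete_cascades(states, "draft", transitions, events, tools):
    print(gap)
```

The pull request that adds `refunded` would be blocked until it also adds a transition, an event, and a tool for it.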
Where this probably doesn't work
It's probably good for systems with 50+ entities and multiple service boundaries, regulated domains like finance and healthcare, long-lived systems that will outlast the people who built them, and AI-assisted development where agents need structured context to operate safely.
It's probably overhead for discovery-phase products, single-service CRUD apps, UI-heavy work, small teams that communicate well enough without it, and prototypes where learning speed matters more than architectural correctness.
Isoform nails it: most projects have both. Stable contracts at system boundaries, adaptive iteration within them. Schema-drive the parts that are expensive to misunderstand. Leave the rest lighter.
Not everything needs a schema on day one
The methodology has 10 layers. That doesn't mean you formalize all of them upfront.
| Tier | Layers | Why |
|---|---|---|
| Always formal | Ontology, State Machines, Events, Data Ownership | Getting these wrong is expensive. An entity owned by two services or a missing state transition will surface as a production bug at 2am. |
| Formal when regulated | Compliance Rules, Roles/Policies | These must be auditable in finance and healthcare. Code comments are fine until the auditor shows up. |
| Lightweight is fine | Workflows, C4, Integrations, Tool Registry | Start with markdown. Formalize when the system stabilizes. Over-formalizing early is the architecture astronaut problem. |
By one estimate, schema drift costs enterprises $2.1M annually. That makes the case for formalization where it counts. Hohpe's architect elevator makes the case for not turning it into a bureaucracy.
Will it work?
This hasn't shipped. The unfulfilled promise critique is fair. Nobody has built an integrated cross-layer schema system that works in production. MDA didn't deliver on this promise. Marmelab argues that spec-driven development is waterfall in disguise, and that's worth taking seriously.
The thing that might make it different this time is that AI agents are now a real consumer of specifications, not a hypothetical one. Right now, an AI coding agent performs better with a structured ontology and tool registry than with scattered docs and source code. Whether that's enough to justify the upfront cost is the experiment I'm running.
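As a rough illustration of what "structured context" could mean in practice, the Python sketch below renders an ontology and a tool registry into one compact, deterministic text block that could be placed in an agent's prompt. The layer shapes, the `submit_order` tool, and the output format are all hypothetical. This is not a real agent framework's API.

```python
import json

# Hypothetical, simplified layers.
ONTOLOGY = {"Order": {"fields": ["id", "status", "total"]}}
TOOL_REGISTRY = {
    "submit_order": {
        "description": "Move an order from draft to submitted.",
        "entity": "Order",
        "input": {"order_id": "string"},
    },
}


def agent_context(ontology, registry):
    """Render the two layers as one compact, deterministic text block."""
    lines = ["# Entities"]
    for name in sorted(ontology):
        fields = ", ".join(ontology[name]["fields"])
        lines.append(f"- {name}: {fields}")
    lines.append("# Tools")
    for name in sorted(registry):
        tool = registry[name]
        args = json.dumps(tool["input"], sort_keys=True)
        lines.append(f"- {name}({args}) on {tool['entity']}: {tool['description']}")
    return "\n".join(lines)


print(agent_context(ONTOLOGY, TOOL_REGISTRY))
```

Sorting keys keeps the output stable between runs, so a change in the rendered context always traces back to a change in the schemas.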
Sources
Academic
- OMG MDA - Model-driven architecture (2001)
- Dietz, DEMO - Enterprise ontology, TU Delft
- OMG UML 2.5.1 / PSSM - State machine semantics
- CMU FOSD / SEI SPL - Software product lines
- Ostroff et al., XP 2004 - Agile spec-driven development
- arXiv 2602.00180 - SDD formalization
- arXiv 2602.02584 - Constitutional SDD for security
- ISO/IEC/IEEE 29148:2018 - Requirements engineering
Industry
- Birgitta Böckeler - SDD tooling critique (martinfowler.com)
- Simon Willison - Agentic engineering patterns
- Kent Beck - Canon TDD
- Thoughtworks Radar - SDD as emerging technique
- Kiro / Tessl / spec-kit
- Palantir Foundry / AWS Smithy / Cedar / Stripe / Pact
- Anthropic MCP
Counter-arguments worth reading
- Fowler on MDA / Den Haan's 8 failure modes
- Marmelab: SDD is waterfall
- Spolsky: architecture astronauts
- larner.dev: unfulfilled promise
- Isoform: limits of SDD
Full methodology: steig.io/schema-driven-development