How to Split a Monolith GraphQL Schema into Subgraphs
Transitioning from a tightly coupled GraphQL monolith to a federated architecture requires systematic extraction, strict boundary enforcement, and zero-downtime routing. Monolithic schemas at scale introduce deployment bottlenecks, resolver contention, and cross-team merge conflicts. This guide provides a production-tested migration workflow to decompose your schema while preserving backward compatibility and client stability. Before initiating extraction, verify your routing layer and composition pipeline are provisioned according to GraphQL Federation Architecture & Design standards.
1. Audit Existing Schema & Map Resolver Dependencies
Do not extract blindly. Begin by exporting the complete SDL and tracing resolver execution paths to establish a dependency baseline.
Diagnostic Workflow:
- Export SDL: Run `rover graph introspect <monolith-endpoint> > monolith.graphql`.
- Trace Execution: Enable Apollo Studio tracing or attach OpenTelemetry to capture resolver latency, N+1 queries, and cross-service calls.
- Map Data Ownership: Document which resolvers hit specific databases, external APIs, or caches. Flag shared utility types and circular references.
- Identify Extraction Candidates: High-coupling fields (e.g., `User.orders`, `Order.user`) will require federation directives. Types with single ownership are primary extraction targets.
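To turn the audit into numbers, even a dependency-free script can surface coupling hot spots. A minimal sketch (regex-based; a real audit should parse the SDL with graphql-js instead):

```typescript
// Minimal SDL coupling scan: counts how often each object type is
// referenced from fields of *other* types. Regex-based for brevity;
// it only surfaces extraction candidates, not a full dependency graph.
const BUILTINS = new Set(["ID", "String", "Int", "Float", "Boolean"]);

function couplingReport(sdl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const [, typeName, body] of sdl.matchAll(/type\s+(\w+)[^{]*\{([^}]*)\}/g)) {
    // Grab the named type after each colon, ignoring list/non-null wrappers.
    for (const [, target] of body.matchAll(/:\s*\[?(\w+)/g)) {
      if (!BUILTINS.has(target) && target !== typeName) {
        counts[target] = (counts[target] ?? 0) + 1;
      }
    }
  }
  return counts;
}

// The monolith fragment used later in this guide:
const monolithSdl = `
type User { id: ID! name: String! orders: [Order!]! }
type Order { id: ID! total: Float! user: User! items: [OrderItem!]! }
`;
console.log(couplingReport(monolithSdl));
// Mutual references (User <-> Order) flag entities that will need @key joins.
```

High mutual counts mark the boundaries where federation directives will be required; zero-reference types are the cheapest first extractions.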
2. Define Domain-Aligned Subgraph Boundaries
Splitting by technical layer (queries vs. mutations) creates distributed monoliths. Group types and resolvers by business domain, data ownership, and team responsibility. Establish clear ownership contracts before writing code.
When mapping types to services, reference Defining Subgraph Boundaries for Microservices to prevent cross-domain coupling. Each subgraph must be independently deployable and resilient to partial outages in sibling services.
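One lightweight way to make ownership contracts explicit is a checked-in ownership map that CI can validate against the SDL. The file name and fields below are an illustrative convention, not a tool-defined format:

```yaml
# subgraph-ownership.yaml (hypothetical convention, not a standard)
subgraphs:
  user-service:
    team: identity
    owns: [User, UserProfile]
  order-service:
    team: commerce
    owns: [Order, OrderItem]
    contributes:
      User: [orders]   # fields this subgraph adds to entities it does not own
```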
3. Extract Types & Implement Federation v2 Directives
Migrate SDL fragments and resolvers into independent service repositories. Apply Apollo Federation v2 directives to establish entity references and shared type contracts.
Original Monolith Fragment

```graphql
type User {
  id: ID!
  name: String!
  orders: [Order!]!
}

type Order {
  id: ID!
  total: Float!
  user: User!
  items: [OrderItem!]!
}

type Query {
  user(id: ID!): User!
  ordersByStatus(status: String!): [Order!]!
}
```
Extracted User Subgraph

```graphql
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])

type User @key(fields: "id") {
  id: ID!
  name: String!
}

type Query {
  user(id: ID!): User
}
```
Extracted Order Subgraph (Referencing User)

```graphql
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])

type Order @key(fields: "id") {
  id: ID!
  total: Float!
  user: User!
}

# In Federation v2, contributing a field to another subgraph's entity only
# requires redeclaring the entity with its @key; the v1-style
# extend / @external / @requires boilerplate is unnecessary here.
type User @key(fields: "id") {
  id: ID!
  orders: [Order!]!
}
```
Critical Implementation Notes:
- Use `@key` to designate primary identifiers for cross-subgraph joins.
- Use `@external` to declare fields resolved by another subgraph (in Federation v2 this is mostly needed inside `@requires`/`@provides` selections).
- Use `@shareable` only when multiple subgraphs legitimately own identical field logic.
- Every subgraph that defines an entity must implement a reference resolver (`__resolveReference` in Apollo Server). A missing reference resolver does not fail composition; it fails at runtime when the router issues `_entities` queries.
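The `_entities` plumbing is generated by the subgraph library; what you supply per entity is the reference resolver. A sketch of the order subgraph's resolver map (the in-memory data and the `ordersByUserId` lookup are illustrative, not part of any library API):

```typescript
// Illustrative in-memory data store for the order subgraph.
const ordersByUserId: Record<string, { id: string; total: number }[]> = {
  usr_456: [{ id: "ord_123", total: 149.99 }],
};

// Resolver map in the shape @apollo/subgraph consumes.
// __resolveReference receives the @key fields ({ id }) from the router's
// _entities query and returns the stub this subgraph can expand.
const resolvers = {
  User: {
    __resolveReference(ref: { id: string }) {
      return { id: ref.id };
    },
    orders(user: { id: string }) {
      return ordersByUserId[user.id] ?? [];
    },
  },
  Order: {
    user(order: { userId: string }) {
      // Return only the key fields; the router fetches name, etc.
      // from the user subgraph.
      return { id: order.userId };
    },
  },
};

// Simulate the router resolving User.orders via an entity reference:
const userStub = resolvers.User.__resolveReference({ id: "usr_456" });
console.log(resolvers.User.orders(userStub)); // logs the seeded order
```

The key contract: whatever `__resolveReference` returns must carry the `@key` fields, or downstream field resolvers in the same subgraph cannot complete the join.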
4. Configure Gateway Routing & Schema Composition
Deploy a composition engine to validate subgraph schemas before routing traffic. The supergraph build pipeline merges SDLs, resolves type conflicts, and generates a unified query plan.
Minimal Viable Composition Config (`supergraph.yaml`)

```yaml
federation_version: 2
subgraphs:
  user-service:
    routing_url: http://user-svc:4001/graphql
    schema:
      file: ./user.graphql
  order-service:
    routing_url: http://order-svc:4002/graphql
    schema:
      file: ./order.graphql
```

Note that `rover supergraph compose` fails fast on composition errors by default and writes the composed supergraph SDL to stdout, so redirect it to a file in your pipeline.
Pipeline Execution:

```shell
rover supergraph compose --config supergraph.yaml > supergraph.graphql
```
Gateway Routing Strategy:
- Deploy the gateway alongside the legacy monolith.
- Implement readiness probes (`/health/ready`) that block traffic until composition succeeds.
- Configure weighted routing (e.g., 10% federated, 90% monolith) using your ingress controller or API gateway.
- Maintain a fallback mechanism to the legacy monolith during the transition.
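The weighted split itself lives in your ingress configuration (NGINX, Envoy, etc.), but the decision it encodes is simple. A deterministic sketch with an injected random source so the split is testable (backend names are illustrative):

```typescript
// Pick a backend for one request given the federated traffic weight (0..1).
// `rand` is injected for testability; in production the ingress controller
// performs this selection.
function pickBackend(
  federatedWeight: number,
  rand: () => number
): "gateway" | "monolith" {
  return rand() < federatedWeight ? "gateway" : "monolith";
}

// Simulate a 10% federated / 90% monolith split over 10,000 requests,
// using a small seeded LCG so the run is reproducible.
let seed = 42;
const lcg = () =>
  (seed = (seed * 1664525 + 1013904223) % 2 ** 32) / 2 ** 32;

let gateway = 0;
for (let i = 0; i < 10_000; i++) {
  if (pickBackend(0.1, lcg) === "gateway") gateway++;
}
console.log(gateway); // roughly 1,000 of 10,000 requests hit the gateway
```

As stability thresholds are met, the same weight is the single knob you turn from 0.1 toward 1.0.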
5. Validate, Test & Incrementally Migrate Traffic
Execute integration tests against the composed supergraph to verify query resolution, entity stitching, and error propagation.
Entity Stitching Verification:
```graphql
query {
  ordersByStatus(status: "PENDING") {
    id
    total
    user {
      id
      name
    }
  }
}
```
Expected Response Structure:
```json
{
  "data": {
    "ordersByStatus": [
      {
        "id": "ord_123",
        "total": 149.99,
        "user": {
          "id": "usr_456",
          "name": "Jane Doe"
        }
      }
    ]
  }
}
```
Traffic Shift Protocol:
- Enable feature flags for specific query paths.
- Monitor resolver latency, error rates (`5xx`, `GRAPHQL_VALIDATION_FAILED`), and cache hit ratios.
- Once stability thresholds are met (P99 < 200 ms, error rate < 0.1%), keep widening the federated share until it serves 100% of traffic.
- Decommission legacy endpoints and enforce CI/CD schema validation gates.
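The promotion decision at each step can be automated against the thresholds above. A sketch of the gate (sample window and metric source are illustrative; wire it to your real telemetry):

```typescript
// Gate each traffic-shift step on the stability thresholds from the text:
// P99 latency under 200 ms and error rate under 0.1%.
function p99(latenciesMs: number[]): number {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  return sorted[Math.ceil(sorted.length * 0.99) - 1];
}

function canPromote(latenciesMs: number[], errors: number, total: number): boolean {
  return p99(latenciesMs) < 200 && errors / total < 0.001;
}

// Illustrative window: 1,000 requests, mostly 20-69 ms, one 450 ms outlier.
const latencies = Array.from({ length: 999 }, (_, i) => 20 + (i % 50));
latencies.push(450); // a single outlier sits above the P99 cutoff
console.log(canPromote(latencies, 0, 1000)); // true: both thresholds met
```

Failing the gate should hold (or roll back) the ingress weight rather than page a human for every window.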
Troubleshooting: Common Composition & Routing Failures
| Symptom | Exact Error Payload | Diagnostic Workflow | Resolution |
|---|---|---|---|
| Ambiguous field ownership | `FEDERATION_ERROR: Field "User.email" is defined in multiple subgraphs without @shareable or @key` | Run `rover supergraph compose --config supergraph.yaml` locally to reproduce the composition error | Add `@shareable` to duplicated fields or consolidate to a single owner. |
| Broken entity joins | `QUERY_ERROR: Cannot resolve entity for type "User" with keys ["id"]` | Check the reference resolver implementation in the target subgraph | Ensure `__resolveReference` returns the correct entity shape and matches `@key` fields. |
| Composition validation failure | `COMPOSITION_FAILED: Type "Order" does not have a valid @key directive in subgraph "order-service"` | Inspect SDL for missing `@key` or malformed syntax | Add `@key(fields: "id")` to the entity type definition. |
| Connection exhaustion post-extraction | `NET_ERR: ECONNREFUSED / POOL_EXHAUSTED` | Monitor DB connection pools in newly isolated services | Implement connection pooling (PgBouncer/Prisma) and tune `max_connections` per subgraph. |
| Client breaking changes | `GRAPHQL_VALIDATION_FAILED: Cannot query field "legacyField" on type "User"` | Run a schema diff in CI against the previous supergraph version | Enforce backward-compatibility gates; deprecate fields before removal. |
FAQ
How do I handle shared enums and input types across multiple subgraphs?
Define shared enums/inputs in a dedicated contract package or use @shareable if duplication is unavoidable. Ensure identical SDL definitions across all consumers. Version these contracts explicitly and enforce automated schema diff checks in CI/CD.
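For example, a status enum duplicated in two subgraphs must stay textually identical for composition to behave predictably (the enum name here is illustrative):

```graphql
# Declared identically in both user-service and order-service.
# Enums compose by intersection/union rules depending on input vs. output
# usage; keeping definitions byte-identical across subgraphs avoids surprises.
enum OrderStatus {
  PENDING
  PAID
  SHIPPED
}
```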
Can I run the monolith and subgraphs simultaneously during migration?
Yes. Route specific queries/mutations to new subgraphs via the gateway while keeping the legacy monolith active for unextracted paths. Implement field-level routing or query rewriting to shift traffic incrementally without client disruption.
What happens to existing resolvers during extraction?
Resolvers migrate to their respective subgraph services. The gateway intercepts queries, resolves entity references via `_entities`, and delegates execution to subgraph-specific resolvers. Legacy resolvers remain active until fully extracted and validated.
How do I prevent schema drift in a distributed GraphQL architecture?
Enforce strict validation in CI/CD using composition checks, contract testing, and automated diff tools. Require domain-owner approvals for SDL changes and maintain a centralized supergraph registry to track versioned schema states.
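A CI gate for breaking changes can start as simple as comparing field sets between supergraph versions. A sketch (regex-based for brevity; a production gate should use `rover subgraph check` or diff parsed schemas):

```typescript
// Naive breaking-change detector: lists Type.field entries present in the
// previous supergraph SDL but missing from the candidate version.
function fieldSet(sdl: string): Set<string> {
  const fields = new Set<string>();
  for (const [, type, body] of sdl.matchAll(/type\s+(\w+)[^{]*\{([^}]*)\}/g)) {
    // Match field names, skipping argument lists like (id: ID!).
    for (const [, field] of body.matchAll(/(\w+)\s*(?:\([^)]*\))?\s*:/g)) {
      fields.add(`${type}.${field}`);
    }
  }
  return fields;
}

function removedFields(prevSdl: string, nextSdl: string): string[] {
  const kept = fieldSet(nextSdl);
  return [...fieldSet(prevSdl)].filter((f) => !kept.has(f));
}

const previousVersion = `type User { id: ID! name: String! legacyField: String }`;
const candidateVersion = `type User { id: ID! name: String! }`;
console.log(removedFields(previousVersion, candidateVersion));
// Flags User.legacyField as a breaking removal.
```

Wiring this into CI as a required check makes drift visible at review time, before the router ever sees the new schema.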
Next Steps
- Run
rover graph introspectand generate a dependency matrix. - Extract the highest-value, lowest-coupling domain first (e.g.,
UserorCatalog). - Implement
@keyand_entitiesresolvers. - Validate composition locally before pushing to staging.
- Shift 5% traffic, monitor P95/P99 latency, and scale extraction iteratively.