How to Split a Monolith GraphQL Schema into Subgraphs

Transitioning from a tightly coupled GraphQL monolith to a federated architecture requires systematic extraction, strict boundary enforcement, and zero-downtime routing. Monolithic schemas at scale introduce deployment bottlenecks, resolver contention, and cross-team merge conflicts. This guide provides a production-tested migration workflow to decompose your schema while preserving backward compatibility and client stability. Before initiating extraction, verify your routing layer and composition pipeline are provisioned according to GraphQL Federation Architecture & Design standards.

1. Audit Existing Schema & Map Resolver Dependencies

Do not extract blindly. Begin by exporting the complete SDL and tracing resolver execution paths to establish a dependency baseline.

Diagnostic Workflow:

  1. Export SDL: Run rover graph introspect <monolith-endpoint> --output monolith.graphql
  2. Trace Execution: Enable Apollo Studio tracing or attach OpenTelemetry to capture resolver latency, N+1 queries, and cross-service calls.
  3. Map Data Ownership: Document which resolvers hit specific databases, external APIs, or caches. Flag shared utility types and circular references.
  4. Identify Extraction Candidates: High-coupling nodes (e.g., User.orders, Order.user) will require federation directives. Types with single ownership are primary extraction targets.
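As a rough first pass over the exported SDL, cross-type references can be pulled out with a few lines of string matching. This is a sketch only (the typeReferences helper is hypothetical, and regexes cannot handle full SDL); a real audit should parse the schema with graphql-js:

```typescript
// Sketch: approximate a type-coupling map from an SDL string.
// Assumes simple SDL with no nested braces inside type bodies.
const SCALARS = new Set(["ID", "String", "Int", "Float", "Boolean"]);

function typeReferences(sdl: string): Record<string, string[]> {
  const refs: Record<string, string[]> = {};
  // Match each object type block: "type Name ... { <fields> }".
  for (const [, name, body] of sdl.matchAll(/type\s+(\w+)[^{]*\{([^}]*)\}/g)) {
    // Collect the named type after each ":" (stripping list brackets).
    const fieldTypes = [...body.matchAll(/:\s*\[?(\w+)/g)].map((m) => m[1]);
    // Keep only object-type references; drop built-in scalars.
    refs[name] = fieldTypes.filter((t) => !SCALARS.has(t));
  }
  return refs;
}
```

Run over the monolith fragment in Section 3, this flags the User ↔ Order cycle as a high-coupling pair that will need entity directives.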

2. Define Domain-Aligned Subgraph Boundaries

Splitting by technical layer (queries vs. mutations) creates distributed monoliths. Group types and resolvers by business domain, data ownership, and team responsibility. Establish clear ownership contracts before writing code.

When mapping types to services, reference Defining Subgraph Boundaries for Microservices to prevent cross-domain coupling. Each subgraph must be independently deployable and resilient to partial outages in sibling services.

3. Extract Types & Implement Federation v2 Directives

Migrate SDL fragments and resolvers into independent service repositories. Apply Apollo Federation v2 directives to establish entity references and shared type contracts.

Original Monolith Fragment

type User {
 id: ID!
 name: String!
 orders: [Order!]!
}
type Order {
 id: ID!
 total: Float!
 user: User!
 items: [OrderItem!]!
}
type Query {
 user(id: ID!): User!
 ordersByStatus(status: String!): [Order!]!
}

Extracted User Subgraph

type User @key(fields: "id") {
 id: ID!
 name: String!
}
extend type Query {
 user(id: ID!): User
}

Extracted Order Subgraph (Referencing User)

type Order @key(fields: "id") {
 id: ID!
 total: Float!
 user: User!
}
type User @key(fields: "id") {
 id: ID!
 orders: [Order!]!
}

Critical Implementation Notes:

  • Use @key to designate primary identifiers for cross-subgraph joins.
  • Use @external to declare fields resolved by another subgraph.
  • Use @shareable only when multiple subgraphs legitimately own identical field logic.
  • Every subgraph that defines an entity must implement a reference resolver (__resolveReference), which the router invokes through the auto-generated _entities query. Omitting it composes cleanly but breaks entity resolution at runtime.
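With @apollo/subgraph, the reference resolver is a plain function on the entity type. The sketch below substitutes a hypothetical in-memory store for the user service's database; only the __resolveReference shape matters:

```typescript
// Hypothetical in-memory store standing in for the user service's database.
const usersById: Record<string, { id: string; name: string }> = {
  usr_456: { id: "usr_456", name: "Jane Doe" },
};

// The router calls this via the auto-generated _entities query, passing the
// representation it collected from other subgraphs ({ __typename, id }).
const resolvers = {
  User: {
    __resolveReference(ref: { id: string }) {
      return usersById[ref.id] ?? null; // null signals "entity not found"
    },
  },
};
```

Returning null (rather than throwing) lets the router surface a partial result instead of failing the whole query plan.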

4. Configure Gateway Routing & Schema Composition

Deploy a composition engine to validate subgraph schemas before routing traffic. The supergraph build pipeline merges SDLs, resolves type conflicts, and generates a unified query plan.

Minimal Viable Composition Config (supergraph.yaml)

federation_version: 2
subgraphs:
  user-service:
    routing_url: http://user-svc:4001/graphql
    schema:
      file: ./user.graphql
  order-service:
    routing_url: http://order-svc:4002/graphql
    schema:
      file: ./order.graphql

Note that Rover's supergraph config supports only federation_version and subgraphs; composition validation is strict by default, and the composed supergraph SDL is written to stdout.

Pipeline Execution:

rover supergraph compose --config supergraph.yaml > supergraph.graphql

Gateway Routing Strategy:

  • Deploy the gateway alongside the legacy monolith.
  • Implement readiness probes (/health/ready) that block traffic until composition succeeds.
  • Configure weighted routing (e.g., 10% federated, 90% monolith) using your ingress controller or API gateway.
  • Maintain a fallback mechanism to the legacy monolith during the transition.
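The weighted split itself lives in the ingress controller, but the decision it makes reduces to a one-line function. A sketch, with the random sample injected so it stays deterministic and testable (the function name and shape are illustrative, not any gateway's API):

```typescript
type Backend = "federated" | "monolith";

// federatedWeight is the fraction of traffic routed to the supergraph
// (0.1 = 10%); roll is a uniform sample from [0, 1), e.g. Math.random().
function pickBackend(federatedWeight: number, roll: number): Backend {
  return roll < federatedWeight ? "federated" : "monolith";
}
```

Ramping the migration is then just raising federatedWeight as the stability thresholds in Section 5 continue to hold.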

5. Validate, Test & Incrementally Migrate Traffic

Execute integration tests against the composed supergraph to verify query resolution, entity stitching, and error propagation.

Entity Stitching Verification:

query {
  ordersByStatus(status: "PENDING") {
    id
    total
    user {
      id
      name
    }
  }
}

Expected Response Structure:

{
  "data": {
    "ordersByStatus": [
      {
        "id": "ord_123",
        "total": 149.99,
        "user": {
          "id": "usr_456",
          "name": "Jane Doe"
        }
      }
    ]
  }
}

Traffic Shift Protocol:

  1. Enable feature flags for specific query paths.
  2. Monitor resolver latency, error rates (5xx, GRAPHQL_VALIDATION_FAILED), and cache hit ratios.
  3. Once stability thresholds are met (P99 < 200 ms, error rate < 0.1%), shift traffic incrementally until the federated path serves 100%.
  4. Decommission legacy endpoints and enforce CI/CD schema validation gates.
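The promotion criterion in step 3 can be encoded as an explicit gate in whatever drives the rollout. A minimal sketch (the thresholds mirror the ones above; the function name is illustrative):

```typescript
// Gate a traffic-weight increase on the stability thresholds from step 3:
// P99 latency under 200 ms and error rate under 0.1%.
function readyToShift(p99Ms: number, errorRate: number): boolean {
  return p99Ms < 200 && errorRate < 0.001;
}
```

Keeping the gate as code (rather than a dashboard eyeball check) makes the traffic-shift protocol auditable and repeatable per subgraph.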

Troubleshooting: Common Composition & Routing Failures

Ambiguous field ownership
  • Error: FEDERATION_ERROR: Field "User.email" is defined in multiple subgraphs without @shareable or @key
  • Diagnose: Run rover supergraph compose locally and inspect the composition errors.
  • Fix: Add @shareable to duplicated fields or consolidate them under a single owner.

Broken entity joins
  • Error: QUERY_ERROR: Cannot resolve entity for type "User" with keys ["id"]
  • Diagnose: Check the reference resolver implementation in the target subgraph.
  • Fix: Ensure __resolveReference returns the correct entity shape and matches the @key fields.

Composition validation failure
  • Error: COMPOSITION_FAILED: Type "Order" does not have a valid @key directive in subgraph "order-service"
  • Diagnose: Inspect the subgraph SDL for a missing @key or malformed directive syntax.
  • Fix: Add @key(fields: "id") to the entity type definition.

Connection exhaustion post-extraction
  • Error: NET_ERR: ECONNREFUSED / POOL_EXHAUSTED
  • Diagnose: Monitor database connection pools in the newly isolated services.
  • Fix: Implement connection pooling (PgBouncer/Prisma) and tune max_connections per subgraph.

Client breaking changes
  • Error: GRAPHQL_VALIDATION_FAILED: Cannot query field "legacyField" on type "User"
  • Diagnose: Run a schema diff in CI against the previous supergraph version.
  • Fix: Enforce backward-compatibility gates; deprecate fields before removal.

FAQ

How do I handle shared enums and input types across multiple subgraphs?

Define shared enums and input types in a dedicated contract package and keep their SDL definitions identical across all consumers; composition reconciles enums by value matching and input types by field intersection, so drift surfaces as build errors or silently narrowed types. (@shareable applies to object fields, not enums or inputs.) Version these contracts explicitly and enforce automated schema diff checks in CI/CD.
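For example, a shared enum is simply declared with the same values in every subgraph that uses it; because composition merges enums by matching values, the copies must not drift (the enum below is illustrative):

```graphql
# Declared verbatim in every subgraph that references it.
enum OrderStatus {
  PENDING
  SHIPPED
  DELIVERED
}
```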

Can I run the monolith and subgraphs simultaneously during migration?

Yes. Route specific queries/mutations to new subgraphs via the gateway while keeping the legacy monolith active for unextracted paths. Implement field-level routing or query rewriting to shift traffic incrementally without client disruption.

What happens to existing resolvers during extraction?

Resolvers migrate to their respective subgraph services. The gateway intercepts queries, resolves entity references via _entities, and delegates execution to subgraph-specific resolvers. Legacy resolvers remain active until fully extracted and validated.

How do I prevent schema drift in a distributed GraphQL architecture?

Enforce strict validation in CI/CD using composition checks, contract testing, and automated diff tools. Require domain-owner approvals for SDL changes and maintain a centralized supergraph registry to track versioned schema states.

Next Steps

  1. Run rover graph introspect and generate a dependency matrix.
  2. Extract the highest-value, lowest-coupling domain first (e.g., User or Catalog).
  3. Implement @key and _entities resolvers.
  4. Validate composition locally before pushing to staging.
  5. Shift 5% traffic, monitor P95/P99 latency, and scale extraction iteratively.