When to Use Schema Stitching vs Apollo Federation
Engineering teams evaluating GraphQL composition strategies must weigh operational complexity against architectural scalability. Both patterns unify disparate services into a single graph, but their routing mechanics, type resolution models, and failure modes diverge significantly. Schema stitching relies on runtime resolver delegation and manual type merging at the gateway layer. Apollo Federation moves composition to build time: subgraph schemas are composed and validated into a supergraph before the router serves it, with declarative contracts enforced via @key directives and distributed _entities resolution. Selecting the wrong pattern can lead to cascading routing failures, unmanageable type collisions, and unpredictable query planning latency. Understanding the foundational principles of GraphQL Federation Architecture & Design is critical before committing to a composition strategy. This guide provides diagnostic workflows, minimal viable configurations, and step-by-step migration paths to help you decide when to use schema stitching vs Apollo Federation.
Core Architectural Differences: Manual vs Declarative Composition
| Dimension | Schema Stitching | Apollo Federation |
|---|---|---|
| Composition Phase | Runtime (gateway merges schemas on startup) | Build-time (composition validates subgraph schemas before deployment) |
| Routing Logic | Explicit delegateToSchema resolver chains | Query planner computes execution paths across subgraphs from the composed supergraph |
| Type Resolution | Manual type mapping & field merging | @key directives + _entities query for distributed ownership |
| Failure Surface | Runtime resolver errors, silent field drops | Schema validation failures, directive mismatches, planning timeouts |
Diagnostic Workflow: Identify Your Current Composition Model
- Run an introspection query against your gateway and against the services behind it.
- Check each service's Query type for `_entities` and `_service` fields. If present, the service is a Federation subgraph. A Federation router does not expose these fields in its client-facing schema.
- Inspect the gateway code for `delegateToSchema` or `mergeSchemas` calls. If present, you are using stitching.
- Query a shared type (e.g., `User`) across two services. If the gateway throws `Cannot return null for non-nullable field`, stitching is failing to merge resolvers correctly.
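The first checks in this workflow can be scripted. The sketch below is a minimal illustration, not a complete detector: `hasFederationRootFields` and `usesStitching` are hypothetical helper names, and the inputs (a list of Query field names taken from an introspection result, and the gateway source as text) are assumptions about how you collect the data.

```typescript
// Hypothetical helper: true when a schema's Query type carries the two root
// fields that Federation subgraphs add (_entities and _service).
function hasFederationRootFields(queryFieldNames: string[]): boolean {
  return (
    queryFieldNames.indexOf("_entities") !== -1 &&
    queryFieldNames.indexOf("_service") !== -1
  );
}

// Hypothetical helper: scan gateway source text for stitching API calls.
function usesStitching(gatewaySource: string): boolean {
  return /\b(delegateToSchema|stitchSchemas|mergeSchemas)\s*\(/.test(gatewaySource);
}

// Field names as they might appear in an introspection result.
console.log(hasFederationRootFields(["product", "_entities", "_service"]));
console.log(usesStitching("const schema = stitchSchemas({ subschemas });"));
```

A text scan only finds direct calls, so treat a negative result as inconclusive if the gateway wraps these functions.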
Minimal Viable Configurations
Schema Stitching Gateway (Node.js / @graphql-tools/stitch)
```typescript
import { stitchSchemas } from '@graphql-tools/stitch';
// Called introspectSchema in older @graphql-tools/wrap releases
import { schemaFromExecutor } from '@graphql-tools/wrap';
import { buildHTTPExecutor } from '@graphql-tools/executor-http';

// Remote executors for Service A and B (endpoint variables are placeholders)
const serviceAExecutor = buildHTTPExecutor({ endpoint: serviceAEndpoint });
const serviceBExecutor = buildHTTPExecutor({ endpoint: serviceBEndpoint });

const gatewaySchema = stitchSchemas({
  subschemas: [
    {
      // Schema loaded via introspection; SDL files work as well
      schema: await schemaFromExecutor(serviceAExecutor),
      executor: serviceAExecutor,
    },
    {
      schema: await schemaFromExecutor(serviceBExecutor),
      executor: serviceBExecutor,
      // Manual type merging required for shared types
      merge: {
        User: {
          selectionSet: '{ id }',
          // Root field on Service B that returns the rest of the User
          fieldName: 'userById',
          args: (originalObject) => ({ id: originalObject.id }),
        },
      },
    },
  ],
});
```
Apollo Federation Subgraph Definition (GraphQL SDL)
```graphql
# Base type in the owning subgraph (e.g., Products)
type Query {
  product(id: ID!): Product
}

type Product @key(fields: "id") {
  id: ID!
  name: String!
  price: Float!
}

# Extended type in a different subgraph (e.g., Inventory)
extend type Product @key(fields: "id") {
  id: ID! @external
  inStock: Boolean!
}
```
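To make the `_entities` contract concrete, here is a minimal sketch of how an Inventory subgraph might turn entity representations into the `inStock` field it owns. It uses plain TypeScript with no Apollo imports. `stockById`, `resolveProductReference`, and `resolveEntities` are hypothetical names, and the in-memory table stands in for a real data store. In an `@apollo/subgraph` service this logic would typically live in the type's `__resolveReference` resolver.

```typescript
// An entity representation: __typename plus the fields named in @key.
interface ProductRepresentation {
  __typename: "Product";
  id: string;
}

// The part of Product that the Inventory subgraph contributes.
interface InventoryProduct {
  __typename: "Product";
  id: string;
  inStock: boolean;
}

// Hypothetical in-memory stock table standing in for the Inventory data store.
const stockById: Record<string, number> = { p1: 4, p2: 0 };

// Reference resolver: maps one representation to this subgraph's fields.
function resolveProductReference(ref: ProductRepresentation): InventoryProduct {
  const units = stockById[ref.id] || 0; // unknown ids count as out of stock
  return { __typename: "Product", id: ref.id, inStock: units > 0 };
}

// Simplified stand-in for Query._entities: resolve representations in order.
function resolveEntities(representations: ProductRepresentation[]): InventoryProduct[] {
  return representations.map((ref) => resolveProductReference(ref));
}

console.log(resolveEntities([{ __typename: "Product", id: "p1" }]));
```

The router sends a list of representations per fetch, which is why the stand-in accepts an array rather than a single key.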
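Federated Subgraph Server Initialization (TypeScript / @apollo/server)
```typescript
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';
import { buildSubgraphSchema } from '@apollo/subgraph';

// productTypeDefs (parsed SDL) and productResolvers are assumed to be
// defined elsewhere in the Products subgraph.
// This starts a single subgraph server, not the router. The router is
// deployed separately and routes to this server using the composed supergraph.
const server = new ApolloServer({
  schema: buildSubgraphSchema([{ typeDefs: productTypeDefs, resolvers: productResolvers }]),
});

const { url } = await startStandaloneServer(server, {
  listen: { port: 4001 },
  // Forward the caller's Authorization header to resolvers
  context: async ({ req }) => ({ token: req.headers.authorization }),
});
```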
Decision Thresholds
When Schema Stitching Still Makes Sense
- Legacy Monolith Decomposition: You are wrapping REST endpoints with a lightweight GraphQL facade and need to merge schemas without refactoring downstream services.
- Centralized Ownership: A single platform team controls all services, making manual resolver mapping manageable.
- Low Query Volume / Rapid Prototyping: Traffic is low enough that gateway overhead is negligible, and your CI/CD pipelines do not yet have schema validation gates.
- Operational Constraint: You cannot enforce strict subgraph contracts or deploy a dedicated router layer.
When Apollo Federation Is Mandatory
- Cross-Team Domain Ownership: Multiple squads own distinct bounded contexts and require independent deployment cycles.
- High-Traffic Microservices Ecosystems: You need deterministic query planning, automatic batching, and predictable latency SLAs.
- Automated Contract Testing: Your CI/CD requires strict schema diffing, directive validation, and breaking-change prevention.
- Shared Type Complexity: You are struggling with overlapping definitions and need a single source of truth enforced by the router. For teams hitting overlapping type definitions, refer to Resolving Schema Conflicts in Apollo Federation to implement strict contract validation before scaling.
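The thresholds above can be condensed into a small decision helper. This is an illustrative sketch only: the `TeamProfile` fields and the order of the checks are assumptions made for this example, not a rule from either project.

```typescript
// Hypothetical inputs describing the organization and its tooling.
interface TeamProfile {
  owningTeams: number;            // teams that deploy services independently
  hasSchemaChecksInCi: boolean;   // CI can gate merges on schema validation
  canRunDedicatedRouter: boolean; // ops can deploy and operate a router layer
}

type Recommendation = "schema-stitching" | "apollo-federation";

function recommendComposition(profile: TeamProfile): Recommendation {
  // Operational constraint: without a router layer, Federation is off the table.
  if (!profile.canRunDedicatedRouter) return "schema-stitching";
  // Cross-team ownership or CI contract gates favor Federation.
  if (profile.owningTeams > 1 || profile.hasSchemaChecksInCi) return "apollo-federation";
  // A single team without validation gates can keep manual merging manageable.
  return "schema-stitching";
}

console.log(
  recommendComposition({ owningTeams: 4, hasSchemaChecksInCi: true, canRunDedicatedRouter: true })
);
```

Treat the output as a starting point for discussion; traffic volume and shared-type complexity are deliberately left out to keep the sketch short.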
Migration Path & Step-by-Step Resolution
Transitioning from stitching to Federation requires a phased, zero-downtime approach. Do not attempt a big-bang rewrite.
- Isolate Domain Boundaries: Map existing types to owning services. Identify `@key` fields and `_entities` resolution requirements.
- Deploy Parallel Router: Spin up an Apollo Router alongside your existing stitching gateway. Route 1% of traffic via feature flag or header injection.
- Shift Resolver Logic: Move `delegateToSchema` chains into subgraph resolvers. Replace the manual type merging configuration with `@key` and `@external` directives.
- Enable Schema Validation Gates: Integrate `rover subgraph check` into CI. Block merges that introduce breaking changes or directive conflicts.
- Cutover & Decommission: Gradually increase router traffic to 100%. Monitor `_entities` resolution latency and query planner cache hit rates. Retire the stitching gateway.
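The gradual traffic shift in the steps above is easier to reason about when each client is assigned to a gateway deterministically. The following is a sketch under assumptions: `bucketFor` and `chooseGateway` are hypothetical names, and the client identifier could be a user ID, API key, or session ID depending on your setup.

```typescript
type Gateway = "stitching" | "federation";

// Deterministic bucket in the range 0..99, so a client keeps the same bucket
// across requests. Simple multiplicative string hash, kept unsigned 32-bit.
function bucketFor(clientId: string): number {
  let hash = 0;
  for (let i = 0; i < clientId.length; i++) {
    hash = (hash * 31 + clientId.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// Route to Federation when the client's bucket falls below the rollout percentage.
function chooseGateway(clientId: string, federationPercent: number): Gateway {
  return bucketFor(clientId) < federationPercent ? "federation" : "stitching";
}

// Raising the percentage only ever moves clients toward Federation.
for (const percent of [1, 25, 100]) {
  console.log(percent, chooseGateway("client-42", percent));
}
```

Because the bucket is fixed per client, a client that has moved to the router stays there as the percentage grows, which keeps behavior stable during the cutover.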
Gateway Routing Overlap During Transition
Running both gateways simultaneously introduces routing complexity. Use a reverse proxy (Envoy/Nginx) to split traffic based on X-GraphQL-Router: federation headers. Ensure both gateways resolve identical query shapes to prevent client-side cache fragmentation.
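A sketch of both ideas in this paragraph follows: header-based upstream selection and a shape comparison you could run in a shadow-traffic test. In production the header split would normally live in the proxy configuration. `pickUpstream`, `shapeOf`, and `sameShape` are hypothetical helpers, and the responses are assumed to be parsed JSON.

```typescript
type Upstream = "federation" | "stitching";

// Select the upstream from request headers. Header names are matched
// case-insensitively, as in HTTP.
function pickUpstream(headers: Record<string, string>): Upstream {
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() === "x-graphql-router" && headers[name] === "federation") {
      return "federation";
    }
  }
  return "stitching";
}

// Reduce a JSON value to its shape: sorted keys and value types, no concrete values.
// Arrays are compared element by element, so list lengths must match too.
function shapeOf(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => shapeOf(item));
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const shape: Record<string, unknown> = {};
    for (const key of Object.keys(obj).sort()) shape[key] = shapeOf(obj[key]);
    return shape;
  }
  return value === null ? "null" : typeof value;
}

function sameShape(a: unknown, b: unknown): boolean {
  return JSON.stringify(shapeOf(a)) === JSON.stringify(shapeOf(b));
}

console.log(pickUpstream({ "X-GraphQL-Router": "federation" }));
console.log(sameShape({ data: { user: { id: "1" } } }, { data: { user: { id: "2" } } }));
```

Comparing shapes rather than values tolerates differences in data freshness between the two gateways while still catching missing fields and type changes.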
Troubleshooting & Diagnostic Workflows
| Symptom | Representative Error Payload | Root Cause | Resolution Path |
|---|---|---|---|
| Duplicate Type Definition | `GRAPHQL_VALIDATION_FAILED: Type "User" was defined multiple times with different fields.` | Stitching merging conflicting SDLs or Federation missing @extends | In Federation, designate one subgraph as the base owner. Use `extend type User @key(fields: "id")` in others. |
| Missing _entities Resolution | `ApolloRouterError: Subgraph "inventory" failed to resolve _entities for type Product` | Missing `@key` directive, missing reference resolver, or a result without `__typename` | Ensure the subgraph defines a reference resolver for the type and that `__typename` is present. Verify `@key` matches the exact field path. |
| Query Planner Timeout | `ApolloRouterError: Query planning timed out after 5000ms` | Circular `@key` dependencies or unbounded field selection | Flatten subgraph boundaries and review `@requires` usage, since each `@requires` can add a cross-subgraph fetch. Enable query plan caching in the router (the option name varies by router version). |
| Stitching Resolver Drop | `[Stitching] Field "User.email" dropped: no resolver mapped in subschema` | Manual `delegateToSchema` missing field mapping | Explicitly map the field in the type merging configuration or migrate to Federation's declarative resolution. |
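The duplicate type symptom can often be caught before deployment. The sketch below assumes you have already summarized each subgraph's types into plain data, for example from parsed SDL. `SchemaSummary` and `findConflicts` are hypothetical, and the rule it applies (same type name, different field sets, at least one definition without `@key`) is a simplification of what real composition checks do.

```typescript
// Hypothetical summary format: subgraph name -> type name -> definition summary.
interface TypeSummary {
  fields: string[];
  hasKey: boolean; // true when the definition carries an @key directive
}
type SchemaSummary = Record<string, Record<string, TypeSummary>>;

interface Conflict {
  typeName: string;
  subgraphs: string[];
}

interface SeenEntry {
  fieldSets: string[];
  subgraphs: string[];
  allKeyed: boolean;
}

// Report types defined in several subgraphs with different fields where at
// least one definition is not declared as an entity via @key.
function findConflicts(summary: SchemaSummary): Conflict[] {
  const seen: Record<string, SeenEntry> = {};
  for (const subgraph of Object.keys(summary)) {
    const types = summary[subgraph];
    for (const typeName of Object.keys(types)) {
      const def = types[typeName];
      const fieldSet = def.fields.slice().sort().join(",");
      if (!Object.prototype.hasOwnProperty.call(seen, typeName)) {
        seen[typeName] = { fieldSets: [], subgraphs: [], allKeyed: true };
      }
      const entry = seen[typeName];
      entry.subgraphs.push(subgraph);
      if (entry.fieldSets.indexOf(fieldSet) === -1) entry.fieldSets.push(fieldSet);
      if (!def.hasKey) entry.allKeyed = false;
    }
  }
  const conflicts: Conflict[] = [];
  for (const typeName of Object.keys(seen)) {
    const entry = seen[typeName];
    if (entry.subgraphs.length > 1 && entry.fieldSets.length > 1 && !entry.allKeyed) {
      conflicts.push({ typeName: typeName, subgraphs: entry.subgraphs });
    }
  }
  return conflicts;
}

console.log(
  findConflicts({
    accounts: { User: { fields: ["id", "email"], hasKey: false } },
    reviews: { User: { fields: ["id", "reviews"], hasKey: false } },
  })
);
```

A check like this does not replace `rover subgraph check`; it only illustrates the kind of overlap that triggers the error.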
Diagnostic Checklist for Production Incidents
- Run `rover subgraph introspect <endpoint>` to verify schema shape.
- Check router logs for `query_planner.execution_time_ms` spikes.
- Validate `_entities` payloads against the subgraph directly, since the router does not expose `_entities` to clients: `curl -X POST -H "Content-Type: application/json" -d '{"query": "query { _entities(representations: [{__typename: \"User\", id: \"123\"}]) { ... on User { id name } } }"}' <subgraph-url>`
- If latency exceeds SLA, enable `queryPlanner.experimental_parallelism: true` (an experimental setting; confirm the exact key for your router version) and audit subgraph network hops.
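Hand-escaping the `_entities` query inside a shell string is error-prone. As an alternative sketch, the request body can be built in code and the representations passed as a GraphQL variable. `buildEntitiesRequest` and its `selection` parameter are hypothetical. The `[_Any!]!` argument type follows the Federation subgraph specification.

```typescript
// An entity representation: __typename plus the key fields for that type.
interface EntityRepresentation {
  __typename: string;
  [keyField: string]: string | number | boolean;
}

// Build the JSON body for an _entities request. Representations travel as a
// variable, so no manual quote escaping is needed.
function buildEntitiesRequest(
  representations: EntityRepresentation[],
  selection: string // hypothetical parameter, e.g. "... on User { id name }"
): string {
  const query =
    "query ($representations: [_Any!]!) { " +
    "_entities(representations: $representations) { " + selection + " } }";
  return JSON.stringify({ query: query, variables: { representations: representations } });
}

const body = buildEntitiesRequest(
  [{ __typename: "User", id: "123" }],
  "... on User { id name }"
);
console.log(body);
```

The resulting string can be sent as the POST body to the subgraph endpoint with a `Content-Type: application/json` header.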
FAQ
Can I run schema stitching and Apollo Federation in the same architecture?
Technically yes, but it introduces severe routing complexity and query planning conflicts. It is only recommended as a temporary migration bridge. Long-term, standardize on one composition model to maintain predictable performance and clear ownership boundaries.
Does Apollo Federation replace all schema stitching use cases?
No. Federation excels in distributed, multi-team environments with strict CI/CD validation. Schema stitching remains practical for lightweight internal APIs, legacy monolith decomposition, or scenarios where a single team controls all services and prefers manual resolver mapping.
How do I handle shared types like User or Product across subgraphs?
In Federation, one subgraph owns the base type definition. Others extend it using extend type and @external. This prevents duplicate definitions and ensures a single source of truth. Proper boundary mapping is essential to avoid circular dependencies.
What is the performance impact of switching from stitching to Federation?
Federation composes and validates the supergraph at build time, so the router plans each operation against a known schema and can cache the resulting query plans. However, it introduces additional network hops for _entities resolution. Proper subgraph design and query batching mitigate latency, whereas stitching often suffers from N+1 resolver chains at runtime unless merged types are fetched in batches.
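The difference between an N+1 resolver chain and a batched fetch can be shown with a small simulation. This is a sketch, not a benchmark: `CountingUserService` is a hypothetical stand-in for a downstream service, and `fetchMany` plays the role of one network request, comparable to one batched `_entities` call.

```typescript
interface Review { id: string; authorId: string }
interface User { id: string; name: string }

// Hypothetical user service that counts the requests it receives.
class CountingUserService {
  calls = 0;
  private users: User[];
  constructor(users: User[]) {
    this.users = users;
  }
  // One request that can return many users at once.
  fetchMany(ids: string[]): User[] {
    this.calls++;
    return this.users.filter((u) => ids.indexOf(u.id) !== -1);
  }
}

// N+1 pattern: one user request per review.
function resolveAuthorsNaive(reviews: Review[], svc: CountingUserService): string[] {
  const names: string[] = [];
  for (const review of reviews) {
    for (const user of svc.fetchMany([review.authorId])) names.push(user.name);
  }
  return names;
}

// Batched pattern: collect distinct ids, then issue a single request.
function resolveAuthorsBatched(reviews: Review[], svc: CountingUserService): string[] {
  const ids: string[] = [];
  for (const review of reviews) {
    if (ids.indexOf(review.authorId) === -1) ids.push(review.authorId);
  }
  const nameById: Record<string, string> = {};
  for (const user of svc.fetchMany(ids)) nameById[user.id] = user.name;
  const names: string[] = [];
  for (const review of reviews) {
    const name = nameById[review.authorId];
    if (name !== undefined) names.push(name);
  }
  return names;
}

const sampleUsers: User[] = [{ id: "u1", name: "Ann" }, { id: "u2", name: "Bob" }];
const sampleReviews: Review[] = [
  { id: "r1", authorId: "u1" },
  { id: "r2", authorId: "u2" },
  { id: "r3", authorId: "u1" },
];
const naiveService = new CountingUserService(sampleUsers);
const batchedService = new CountingUserService(sampleUsers);
resolveAuthorsNaive(sampleReviews, naiveService);
resolveAuthorsBatched(sampleReviews, batchedService);
console.log("naive calls:", naiveService.calls, "batched calls:", batchedService.calls);
```

Both functions return the same names, but the naive version issues one request per review while the batched version issues one request in total.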