Apollo Router vs @apollo/gateway: Production Trade-offs
Choosing between the Rust-based Apollo Router and the Node-based @apollo/gateway is the most consequential runtime decision for a production federated graph, and for new deployments the answer is almost always the Apollo Router. This page lays out the performance, memory, query-plan caching, extensibility, and operational trade-offs so you can defend the choice — or justify staying on the gateway during a migration.
When This Comparison Matters
- You are standing up a new federated graph and need to pick a runtime before writing deployment config.
- You run @apollo/gateway today and want to quantify what migrating to the Apollo Router buys you.
- You have deep Node-based gateway customisation (plugins, willSendRequest hooks) and need to know whether the router’s extensibility model can replace it.
This is a runtime choice only; both options serve the same composed supergraph and assume the architecture and subgraph work in Apollo Router Configuration and Deployment is already done.
Prerequisites
- A composed Federation v2 supergraph (subgraphs using @link to the federation spec)
- The rover CLI for composition (rover supergraph compose)
- For the router: the router binary or container image, version pinned
- For the gateway: Node 18+, @apollo/gateway and @apollo/server
The Two Runtimes
@apollo/gateway is the original federation runtime: a Node library you embed in an Apollo Server process. It composes (or loads) a supergraph, plans queries in JavaScript, and executes fetches against subgraphs. Because it is a Node library, you extend it with JavaScript plugins and you operate it like any Node service.
The Apollo Router is a standalone binary written in Rust. It serves the same supergraph but plans and executes in native code, configures via router.yaml rather than code, and extends via Rhai scripts or external coprocessors rather than in-process JavaScript. Apollo positions the router as the recommended production runtime and the gateway as legacy for high-throughput use.
The decisive differences come down to where the work runs (native Rust versus the Node event loop) and how you customise behaviour (declarative config plus out-of-process hooks versus in-process JavaScript).
Comparison Matrix
The diagram summarises how each runtime scores on the dimensions that drive the decision; the table that follows gives the detail.
| Dimension | Apollo Router (Rust) | @apollo/gateway (Node) |
|---|---|---|
| Runtime | Standalone native binary | Library inside Apollo Server |
| Throughput | High; native execution, no GC pauses | Bounded by the single-threaded event loop |
| Tail latency | Stable p99 under load | More variable under GC and load |
| Memory | Low, predictable | Higher; V8 heap overhead |
| Query-plan cache | In-memory and distributed (Redis) | In-memory per process only |
| Configuration | Declarative router.yaml | JavaScript code |
| Extensibility | Rhai scripts, external coprocessors | In-process JS plugins/hooks |
| Persisted queries | Native APQ + persisted query manifest | Supported via Apollo Server |
| Recommended for production | Yes | Legacy / migration only |
Performance and Memory
The router executes query planning and response merging in native Rust with no garbage collector, so its tail latency stays flat as concurrency rises. The gateway runs that same work on Node’s single-threaded event loop, where CPU-bound query planning competes with I/O and where V8 garbage collection introduces periodic latency. For graphs with complex query plans or high request rates, this is the difference that matters most: the router holds p99 steady where the gateway’s p99 drifts upward under the same load. Memory follows the same pattern — the router’s footprint is low and predictable, while the gateway carries V8 heap overhead that grows with concurrency.
The practical consequence is in how you scale each. Because the gateway is single-threaded, you scale it by running more Node processes (or more pods), one process saturating roughly one core; CPU-bound planning on a complex graph can make a single gateway process a bottleneck well before the subgraphs are stressed. The router uses all available cores within one process, so a single router instance handles far more concurrency on the same hardware, and you scale out for availability and headroom rather than to work around a per-process ceiling. For a cost-sensitive deployment this often means materially fewer router pods than gateway pods to serve the same traffic, which compounds the memory advantage. The flat tail latency also simplifies capacity planning: you can provision against a stable p99 rather than budgeting for GC-driven spikes.
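The pod-count arithmetic behind that claim can be sketched as follows. This is a rough estimate only: the function name `podsNeeded`, the 30% headroom, the two-replica floor, and every throughput figure are hypothetical placeholders, not measurements of either runtime. Substitute numbers from your own load tests.

```javascript
// Rough pod-count estimate. All figures below are hypothetical placeholders.
function podsNeeded({ targetRps, rpsPerPod, headroomPct = 30, minReplicas = 2 }) {
  if (rpsPerPod <= 0) throw new RangeError('rpsPerPod must be positive');
  // Provision for peak traffic plus headroom; integer math avoids float drift.
  const raw = Math.ceil((targetRps * (100 + headroomPct)) / (100 * rpsPerPod));
  // Never drop below the availability floor, however small the load.
  return Math.max(raw, minReplicas);
}

// Hypothetical inputs: a one-core gateway process sustaining 500 rps and a
// four-core router pod sustaining 4000 rps, both serving a 10,000 rps peak.
const gatewayPods = podsNeeded({ targetRps: 10000, rpsPerPod: 500 });
const routerPods = podsNeeded({ targetRps: 10000, rpsPerPod: 4000 });
console.log({ gatewayPods, routerPods });
```

The useful part is the shape of the calculation rather than the outputs: per-pod throughput sits in the denominator, so any per-process ceiling translates directly into replica count.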
Query-Plan Caching
Both runtimes cache query plans in memory to avoid re-planning identical operations. The router goes further: it offers a distributed (Redis) plan cache shared across the whole fleet, so a newly scaled-up pod starts warm instead of re-planning every operation from cold. The gateway’s plan cache is per-process and dies with the process. On a large autoscaling deployment this is a real operational advantage for the router; cold-start planner CPU spikes are a common gateway pain point. The distributed plan cache is covered in Configuring Query Plan Caching in the Apollo Router.
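To make the cold-start cost concrete, here is a minimal sketch of a per-process plan cache. It is an illustration of the general technique (an LRU map keyed by a hash of the operation), not the implementation used by either runtime; `PlanCache` and the `planQuery` callback are hypothetical names standing in for the real planner.

```javascript
const { createHash } = require('node:crypto');

// Minimal LRU plan cache. `planQuery` is a hypothetical stand-in for the
// expensive planner; neither runtime exposes its cache through this interface.
class PlanCache {
  constructor(limit, planQuery) {
    this.limit = limit;
    this.planQuery = planQuery;
    this.plans = new Map(); // Map keeps insertion order, which gives us LRU
    this.misses = 0;
  }

  // Key on operation name plus query text so identical operations share a plan.
  key(query, operationName = '') {
    return createHash('sha256')
      .update(operationName)
      .update('\0')
      .update(query)
      .digest('hex');
  }

  get(query, operationName) {
    const k = this.key(query, operationName);
    if (this.plans.has(k)) {
      const plan = this.plans.get(k);
      this.plans.delete(k); // re-insert to mark as most recently used
      this.plans.set(k, plan);
      return plan;
    }
    this.misses += 1; // a miss means paying the full planning cost
    const plan = this.planQuery(query, operationName);
    this.plans.set(k, plan);
    if (this.plans.size > this.limit) {
      // Evict the least recently used entry (first key in insertion order).
      this.plans.delete(this.plans.keys().next().value);
    }
    return plan;
  }
}

const cache = new PlanCache(2, (query) => ({ fetches: [query.length] }));
cache.get('{ a }');
cache.get('{ a }'); // served from memory, no second planning pass
console.log(cache.misses);
```

A freshly started process begins with an empty map, so every operation misses once. That is the cold-start planning cost a shared external store avoids.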
Extensibility: Coprocessors vs JavaScript Plugins
This is the one axis where the gateway can win. With the gateway you write arbitrary JavaScript in RemoteGraphQLDataSource.willSendRequest, plugins, and lifecycle hooks, all in-process with full access to your Node ecosystem.
// @apollo/gateway: in-process header injection
import { ApolloGateway, RemoteGraphQLDataSource } from '@apollo/gateway';

const gateway = new ApolloGateway({
  buildService({ url }) {
    return new RemoteGraphQLDataSource({
      url,
      willSendRequest({ request, context }) {
        // arbitrary JS runs per subgraph request, in-process
        request.http?.headers.set('x-user-id', context.userId);
      },
    });
  },
});
The router replaces in-process JavaScript with two out-of-process or sandboxed options: lightweight Rhai scripts for simple transforms, and external coprocessors (an HTTP service the router calls at lifecycle stages) for anything heavier. For pure header propagation you do not need either — declarative config covers it:
# router.yaml: declarative equivalent of the willSendRequest above
headers:
  all:
    request:
      - propagate:
          named: authorization
coprocessor:
  url: http://localhost:4010/auth # only when logic exceeds declarative config
  router:
    request:
      headers: true
The trade-off: the router’s model is safer, because coprocessor logic runs out of process and Rhai scripts run in a sandbox, so custom code cannot crash the router or leak memory into it. The cost is a network hop per coprocessor call, and the router waits on that call before continuing the request. If your customisation is mostly header and context manipulation (as in the authorization patterns described in Directive Patterns for Cross-Service Authorization), the router covers it with config and coprocessors. If you depend on deep in-process Node integration, budget for porting that logic to a coprocessor before migrating.
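For a sense of what porting to a coprocessor involves, the sketch below shows a minimal Node service for the router request stage. The payload field names (`stage`, `control`, `headers` as a map of name to array of values, and `{ break: 401 }` to reject) are assumptions based on the router’s coprocessor protocol and should be checked against the documentation for your pinned router version. `handleRouterRequest`, `userIdFromToken`, and `createCoprocessor` are hypothetical names introduced for illustration.

```javascript
const http = require('node:http');

// Hypothetical: derive a user id from the bearer token. A real service
// would verify the token instead of trusting its contents.
function userIdFromToken(authorization) {
  return authorization.replace(/^Bearer\s+/i, '');
}

// Pure handler for the router request stage. Field names are assumed from
// the coprocessor protocol; verify them for your router version.
function handleRouterRequest(payload) {
  const headers = payload.headers ?? {};
  const auth = headers.authorization?.[0];
  if (!auth) {
    // Ask the router to stop and answer the client with a 401.
    return {
      ...payload,
      control: { break: 401 },
      body: { errors: [{ message: 'Missing authorization header' }] },
    };
  }
  // Continue the request with one extra header attached.
  return {
    ...payload,
    control: 'continue',
    headers: { ...headers, 'x-user-id': [userIdFromToken(auth)] },
  };
}

// Thin HTTP wrapper: the router POSTs JSON and expects JSON back.
function createCoprocessor() {
  return http.createServer((req, res) => {
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      let reply;
      try {
        reply = handleRouterRequest(JSON.parse(raw));
      } catch {
        res.writeHead(400).end(); // malformed payload
        return;
      }
      res.writeHead(200, { 'content-type': 'application/json' });
      res.end(JSON.stringify(reply));
    });
  });
}

// To run it on the port used in the router.yaml above:
// createCoprocessor().listen(4010);
```

Keeping the decision logic in a pure function separate from the HTTP wrapper makes the ported behaviour unit-testable without a running router, which helps when auditing hooks one at a time.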
Verification Steps
Confirm the router serves your composed supergraph identically to the gateway before cutting over.
# 1. Compose the same supergraph both runtimes will serve
rover supergraph compose --config supergraph.yaml > supergraph.graphql

# 2. Launch the router against it
./router --config router.yaml --supergraph supergraph.graphql

# 3. Issue a representative federated query and diff the response
curl -s http://localhost:4000/graphql \
  -H 'content-type: application/json' \
  -d '{"query":"{ topProducts { name reviews { body author { name } } } }"}'
The response shape must match what the gateway returned for the same operation. Then load-test both with your real operation mix and compare p99 latency and memory; the router should show lower, flatter numbers. Finally, replay any custom gateway plugin behaviour through the router’s config or coprocessor and verify headers and context arrive at subgraphs unchanged.
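The response comparison can be automated along these lines. This is a sketch under stated assumptions: `sameResponse` and `compare` are hypothetical helpers, the endpoint URLs are whatever you pass in, and comparing only `data` and error messages (while ignoring `extensions`) is a judgement call you may want to tighten. It relies on the global `fetch` available in Node 18+.

```javascript
const { isDeepStrictEqual } = require('node:util');

// Compare two GraphQL responses on `data` and error messages only.
// `extensions` is ignored because it may differ legitimately between runtimes.
function sameResponse(a, b) {
  const messages = (r) => (r.errors ?? []).map((e) => e.message).sort();
  return (
    isDeepStrictEqual(a.data ?? null, b.data ?? null) &&
    isDeepStrictEqual(messages(a), messages(b))
  );
}

// Send the same operation to both runtimes and compare the results.
// Both URLs are supplied by the caller; nothing here is runtime-specific.
async function compare(query, gatewayUrl, routerUrl) {
  const post = (url) =>
    fetch(url, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ query }),
    }).then((res) => res.json());
  const [fromGateway, fromRouter] = await Promise.all([
    post(gatewayUrl),
    post(routerUrl),
  ]);
  return sameResponse(fromGateway, fromRouter);
}
```

Object key order does not affect the comparison, but array order does, so a list field returned in a different order is reported as a mismatch.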
Common Mistakes & Gotchas
- Assuming plugins port one-to-one. In-process gateway plugins do not map directly to router config. Audit every willSendRequest/plugin hook and decide per hook whether it becomes declarative config, a Rhai script, or a coprocessor before you commit to a migration date.
- Forgetting header propagation defaults differ. The gateway often forwards headers via custom data-source code; the router forwards nothing by default. A naive cutover produces sudden 401s from subgraphs until you add explicit propagate rules.
- Comparing latency without a warm plan cache. Benchmarking the router with a cold cache understates it. Warm both runtimes with your operation set before measuring, and enable the router’s Redis plan cache if your production deployment will use it.
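The warm-cache point can be shown with a small latency calculation. This sketch assumes you have already collected per-request latencies in milliseconds from your load-test tool; `percentile` and `steadyStateP99` are hypothetical helpers, the nearest-rank method is one common percentile definition among several, and the sample values are made up.

```javascript
// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  // Integer math keeps the rank exact for whole-number percentiles.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Drop the first `warmup` samples so cold-cache planning does not skew p99.
function steadyStateP99(samples, warmup) {
  return percentile(samples.slice(warmup), 99);
}

// Made-up data: two slow cold-cache requests followed by 100 warm ones.
const warm = Array.from({ length: 100 }, (_, i) => i + 1);
const samples = [900, 850, ...warm];
console.log({
  withColdStart: percentile(samples, 99),
  steadyState: steadyStateP99(samples, 2),
});
```

With so few samples, two cold requests are enough to move the reported p99 from the warm range into the cold range, which is why the warm-up window should be excluded for both runtimes before comparing them.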
Frequently Asked Questions
Is @apollo/gateway deprecated?
It is treated as legacy for production: Apollo recommends the Apollo Router for new and high-throughput deployments. The gateway still works and still receives support, but new performance and caching capabilities land in the router first.
Can I migrate from the gateway to the router without changing subgraphs?
Yes. Both serve the same composed Federation v2 supergraph, so subgraphs are unaffected. The migration work is on the runtime: translating gateway config and plugins into router.yaml, Rhai, or coprocessors, and re-establishing header propagation.
When would I deliberately stay on the gateway?
When you have substantial in-process Node customisation that is expensive to port, or a small/low-throughput graph where the router’s performance edge is irrelevant and the gateway’s in-process JavaScript model is simpler for your team to operate. For anything performance-sensitive, the router wins.
Related
- Apollo Router Configuration and Deployment — parent guide
- Configuring Query Plan Caching in the Apollo Router — the router’s distributed plan cache
- Federated GraphQL Operations in Production — running federation at scale
- Gateway Routing Strategies for Federated APIs — how either runtime plans queries