Schema Validation in CI/CD Pipelines
As distributed GraphQL architectures scale, the moment that breaks production is rarely a bad resolver — it is a subgraph SDL that composed fine in isolation but introduced a breaking change once merged into the supergraph. Because Apollo Federation composition is all-or-nothing, one unchecked field removal or nullability narrowing can fail composition for every team sharing the graph. Effective GraphQL Federation Architecture & Design therefore depends on automated guardrails that intercept breaking changes before they reach the registry. This guide details the checkpoint architecture, the Rover CLI workflows, and the contract-enforcement rules that make schema validation a reliable gate rather than a flaky bottleneck.
The focused companion page, federated schema validation in CI/CD pipelines, drills into the composition-engine mechanics and exact error payloads; managed publishing and approval flow lives in schema registry and managed federation.
Prerequisites
Concept Deep-Dive: Validation Checkpoints
Validation must be spread across the CI/CD lifecycle so it balances developer velocity against production stability. Catching everything in one expensive post-merge step is too slow to act on; catching nothing until deploy is too late. The standard architecture has three checkpoints.
Pre-commit linting validates SDL syntax, directive usage, and naming conventions locally with graphql-schema-linter or an ESLint GraphQL rule set. It is fast and catches typos before they ever reach CI.
PR-triggered composition checks compare the proposed subgraph SDL against the registered production supergraph to detect breaking changes, using rover subgraph check. This is the load-bearing gate: it runs on every pull request that touches a schema and is what branch protection blocks on.
Post-merge staging verification runs a full rover supergraph compose against a staging registry followed by integration queries, confirming the merged supergraph actually serves traffic.
The reason these three checkpoints exist as a sequence rather than a single gate is that each catches a different class of error at a different cost. Linting is essentially free and catches the cheapest mistakes — malformed SDL, a missing directive import — so it belongs in the editor and the pre-commit hook where feedback is instant. The PR check is moderately expensive because it talks to the registry and runs a real composition, but it is the only stage that can answer the question that actually matters: does this change break the supergraph other teams depend on? Post-merge staging compose is the most expensive, end-to-end stage, and it exists as a backstop for the cross-subgraph conflicts an incremental PR check cannot see — a change to subgraph A that only breaks composition in combination with an unrelated, already-merged change to subgraph B. Skipping any one stage does not just lose coverage; it pushes that error class to a later, costlier point in the pipeline. The discipline is to fail as early and as cheaply as the error class allows.
A subtle but important property of federated validation is that “breaking” is defined relative to live client traffic, not to the schema in the abstract. Removing a field that no client queries is, operationally, additive — nobody notices. Removing a field that one mobile client version still queries is an outage for those users. This is why a mature gate integrates production usage metrics from the registry: it lets the pipeline distinguish a theoretical breaking change from a client-impacting one, and reserve hard failures for the latter while soft-warning on the former behind a deprecation window.
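As a minimal sketch of that distinction, the function below classifies a change by whether any client still queries the affected field. The `SchemaChange` shape, the change kinds, and the `usage` map are hypothetical stand-ins for whatever your registry's usage export provides; none of them come from Rover or the registry API.

```typescript
// Hypothetical shapes: neither type is emitted by Rover or the registry.
type SchemaChange = { coordinate: string; kind: "FIELD_REMOVED" | "FIELD_ADDED" };
type Verdict = "pass" | "soft-warn" | "hard-fail";

// `usage` maps a schema coordinate such as "User.email" to the number of
// operations that selected it during the lookback window (assumed input).
function classify(change: SchemaChange, usage: Map<string, number>): Verdict {
  if (change.kind === "FIELD_ADDED") return "pass"; // additive changes never break clients
  const hits = usage.get(change.coordinate) ?? 0;
  // A removal is only client-impacting if something still queries the field.
  return hits > 0 ? "hard-fail" : "soft-warn";
}

const usage = new Map<string, number>([
  ["User.legacyName", 0],
  ["User.email", 1240],
]);
const verdicts: Verdict[] = [
  classify({ coordinate: "User.legacyName", kind: "FIELD_REMOVED" }, usage),
  classify({ coordinate: "User.email", kind: "FIELD_REMOVED" }, usage),
];
console.log(verdicts);
```

Treating an unknown coordinate as zero usage is a design choice in this sketch; a stricter gate might hard-fail when usage data is missing.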
The validation scope should mirror your service topology. Properly defining subgraph boundaries for microservices dictates which pipelines run which checks, so a PR touching one subgraph validates only its dependencies rather than forcing an expensive full rebuild on every unrelated change.
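One way to scope checks is to derive the affected subgraphs from the files a PR touches. The sketch below assumes the `subgraphs/<name>/schema.graphql` layout used in this guide's examples; the function name is illustrative, and it only finds directly changed subgraphs, not their downstream dependents.

```typescript
// Assumes schemas live under subgraphs/<name>/ and end in .graphql.
function affectedSubgraphs(changedFiles: string[]): string[] {
  const names = new Set<string>();
  for (const file of changedFiles) {
    const match = /^subgraphs\/([^/]+)\/.+\.graphql$/.exec(file);
    const name = match ? match[1] : undefined;
    if (name) names.add(name); // ignore non-schema files and paths outside subgraphs/
  }
  return Array.from(names).sort();
}

const scoped = affectedSubgraphs([
  "subgraphs/users/schema.graphql",
  "subgraphs/users/README.md",
  "subgraphs/payments/schema.graphql",
  "docs/overview.md",
]);
console.log(scoped); // ["payments", "users"]
```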
Directive & Config Spec Table
| Key / Flag | Where | Valid values | Composition-time vs runtime |
|---|---|---|---|
| rover subgraph check | PR step | graph ref + --name + --schema | Composition-time: diffs proposed SDL against the registered supergraph |
| rover supergraph compose | post-merge / local | --config supergraph.yaml | Composition-time: produces the merged supergraph SDL |
| federation_version | supergraph.yaml | e.g. =2.9.0 | Composition-time: pins the spec and diagnostic set |
| APOLLO_GRAPH_REF | env | graph-id@variant | Selects the variant the check diffs against |
| --background / --format json | check flags | flag / json,plain | Controls output shape consumed by CI parsing |
| @deprecated(reason:) | SDL | string reason | Composition-time validation; runtime returns the field with a deprecation hint |
Step-by-Step Implementation
1. Install and authenticate Rover
# Install Rover (Linux/macOS)
curl -sSL https://rover.apollo.dev/nix/latest | sh
# Windows PowerShell
iwr 'https://rover.apollo.dev/win/latest' | iex
# Authenticate: run interactively on a workstation, or set APOLLO_KEY in CI instead
rover config auth
2. Add the PR composition check
This GitHub Actions workflow caches the Rover binary, runs a check against the schema registry, and blocks the merge on breaking changes.
name: GraphQL Schema Validation
on:
  pull_request:
    paths:
      - 'subgraphs/**'
      - 'schema.graphql'
jobs:
  validate-schema:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Cache Rover Binary
        uses: actions/cache@v4
        with:
          path: ~/.rover
          key: ${{ runner.os }}-rover-${{ hashFiles('supergraph.yaml') }}
          restore-keys: ${{ runner.os }}-rover-
      - name: Install Rover
        run: curl -sSL https://rover.apollo.dev/nix/latest | sh
      - name: Add Rover to PATH
        run: echo "$HOME/.rover/bin" >> "$GITHUB_PATH"
      - name: Check Subgraph Against Production
        id: check
        # Rover exits non-zero when a check fails; continue so the next step can report and fail.
        continue-on-error: true
        run: |
          rover subgraph check "$APOLLO_GRAPH_REF" \
            --name my-subgraph \
            --schema ./schema.graphql \
            --format json > check_results.json
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_GRAPH_API_KEY }}
          APOLLO_GRAPH_REF: ${{ vars.APOLLO_GRAPH_REF }}
      - name: Fail on Breaking Changes
        run: |
          # Fail closed: block on a non-success check or on any FAILURE-severity change.
          # The .data.changes path and the "FAILURE" value depend on your Rover version's JSON output.
          if [ "${{ steps.check.outcome }}" != "success" ] || \
             jq -e '[.data.changes[]? | select(.severity == "FAILURE")] | length > 0' check_results.json > /dev/null; then
            echo "::error::Schema check failed or breaking changes detected. Review check_results.json for details."
            exit 1
          fi
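Where the gate needs more than an exit code, for example to post each change as a PR comment, the same filter can be written in TypeScript. This is a sketch only: the `data.changes[].severity` shape is inferred from the jq filter in the workflow above rather than from Rover's documentation, and the `code` and `description` fields and the sample values are hypothetical.

```typescript
// Shape inferred from the workflow's jq filter; confirm it against the JSON
// your Rover version actually emits before relying on it.
interface CheckResult {
  data?: { changes?: { severity: string; code?: string; description?: string }[] } | null;
}

function breakingChanges(raw: string) {
  const parsed = JSON.parse(raw) as CheckResult;
  // Missing or null data yields an empty list instead of throwing.
  return (parsed.data?.changes ?? []).filter((c) => c.severity === "FAILURE");
}

// Illustrative payload, not captured Rover output.
const sample = JSON.stringify({
  data: {
    changes: [
      { severity: "FAILURE", code: "FIELD_REMOVED", description: "User.email removed" },
      { severity: "NOTICE", code: "FIELD_ADDED", description: "User.nickname added" },
    ],
  },
});

const failures = breakingChanges(sample);
for (const f of failures) console.error(`${f.code}: ${f.description}`);
```

Because an empty list is also what an errored check would produce here, a real gate should still treat a non-zero Rover exit as a failure.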
3. Enforce contract rules
Federation v2 relies on routing-critical directives (@key, @override, @shareable, @inaccessible) that must be validated during composition. Three rules carry most of the weight. First, ensure every field named in a @key exists on the type and is resolvable by the owning subgraph: composition rejects a key that references an undefined field, but a key the subgraph cannot actually resolve only surfaces as entity-resolution failures at runtime. Second, flag any field resolved by more than one subgraph without @shareable, since that fails composition with INVALID_FIELD_SHARING. Third, require a reason on every @deprecated, and block removal until the deprecation window has expired and usage metrics confirm zero active references. Align these thresholds with your type ownership and shared schema contracts so cross-team dependency violations are caught at the gate rather than in production.
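The deprecation rule can be sketched as a small gate that lists the reasons a field may not be removed yet. The `Deprecation` record and its field names are hypothetical; this guide does not prescribe how deprecation dates or reference counts are tracked.

```typescript
// Hypothetical record of a deprecated field and its observed usage.
interface Deprecation {
  coordinate: string;       // e.g. "User.legacyName"
  reason: string;           // value passed to @deprecated(reason:)
  deprecatedOn: Date;       // when the deprecation was first published
  activeReferences: number; // operations still selecting the field
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns every reason removal is still blocked; an empty list means it may proceed.
function removalBlockers(d: Deprecation, now: Date, windowDays: number): string[] {
  const blockers: string[] = [];
  if (d.reason.trim() === "") blockers.push("missing deprecation reason");
  const elapsedDays = (now.getTime() - d.deprecatedOn.getTime()) / DAY_MS;
  if (elapsedDays < windowDays) blockers.push("deprecation window still open");
  if (d.activeReferences > 0) blockers.push("field still has active references");
  return blockers;
}

const blockers = removalBlockers(
  {
    coordinate: "User.legacyName",
    reason: "Use displayName",
    deprecatedOn: new Date("2024-01-01T00:00:00Z"),
    activeReferences: 3,
  },
  new Date("2024-01-08T00:00:00Z"),
  14,
);
console.log(blockers);
```

Returning all blockers rather than the first one lets the CI annotation tell the author everything that needs to change in a single run.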
4. Add a local SDL diff fallback
For air-gapped or registry-restricted environments, a lightweight diff catches unauthorised field removals before invoking external tools.
import { Kind, parse } from 'graphql';
import { readFileSync } from 'node:fs';

// Map each object type name to the names of the fields it declares.
function extractTypeMap(sdl: string): Record<string, string[]> {
  const map: Record<string, string[]> = {};
  for (const def of parse(sdl).definitions) {
    // Compare against the Kind constant so the check type-checks across graphql-js versions.
    if (def.kind === Kind.OBJECT_TYPE_DEFINITION && def.fields) {
      map[def.name.value] = def.fields.map((f) => f.name.value);
    }
  }
  return map;
}

// Report every field present in the current SDL but absent from the proposed SDL.
// A type removed entirely is reported with all of its fields.
function detectBreakingChanges(currentSDL: string, proposedSDL: string) {
  const current = extractTypeMap(currentSDL);
  const proposed = extractTypeMap(proposedSDL);
  const breaking: { type: string; removedFields: string[] }[] = [];
  for (const [type, fields] of Object.entries(current)) {
    const removed = fields.filter((f) => !(proposed[type] ?? []).includes(f));
    if (removed.length) breaking.push({ type, removedFields: removed });
  }
  return breaking;
}

const violations = detectBreakingChanges(
  readFileSync('./current.graphql', 'utf8'),
  readFileSync('./proposed.graphql', 'utf8'),
);
if (violations.length) {
  console.error('BREAKING CHANGES:', JSON.stringify(violations, null, 2));
  process.exit(1);
}
console.log('Schema diff validation passed.');
Composition Pipeline Integration
For multi-subgraph repositories, parallelise validation so CI throughput scales with the number of services rather than serialising on them.
SUBGRAPHS := auth users inventory payments
VALIDATE_TARGETS := $(SUBGRAPHS:%=validate-%)
.PHONY: validate-all $(VALIDATE_TARGETS)

validate-all:
	@echo "Running parallel subgraph validation..."
	@$(MAKE) -j$(shell nproc) $(VALIDATE_TARGETS)
	@echo "Running supergraph composition..."
	@rover supergraph compose --config supergraph.yaml --output composed.graphql

# Static pattern rule: GNU make skips implicit pattern rules for .PHONY targets.
$(VALIDATE_TARGETS): validate-%:
	@rover subgraph check "$$APOLLO_GRAPH_REF" \
		--name $* \
		--schema subgraphs/$*/schema.graphql \
		--format json | \
	jq -e '[.data.changes[] | select(.severity == "FAILURE")] | length == 0' \
	|| (echo "::error::$* contains breaking changes" && exit 1)
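For pipelines driven from a script rather than make, the same fan-out can be sketched with a bounded worker pool. The `runCheck` callback is an assumption: it stands in for however you invoke `rover subgraph check`, and is stubbed here so the example runs without Rover.

```typescript
// Injected check function; a real one might spawn the Rover CLI (assumption).
type Check = (subgraph: string) => Promise<boolean>;

// Runs checks with at most `limit` in flight and returns the subgraphs that failed.
async function checkAll(subgraphs: string[], runCheck: Check, limit: number): Promise<string[]> {
  const failed: string[] = [];
  const queue = subgraphs.slice();
  async function worker(): Promise<void> {
    for (let name = queue.shift(); name !== undefined; name = queue.shift()) {
      // A thrown error counts as a failure so the gate fails closed.
      const ok = await runCheck(name).catch(() => false);
      if (!ok) failed.push(name);
    }
  }
  const workers = Array.from({ length: Math.max(1, limit) }, () => worker());
  await Promise.all(workers);
  return failed.sort();
}

// Stubbed check for illustration: only "payments" fails.
const resultPromise = checkAll(
  ["auth", "users", "inventory", "payments"],
  async (name) => name !== "payments",
  2,
);
resultPromise.then((failed) => console.log(failed));
```

Capping concurrency keeps a large monorepo from opening one registry connection per subgraph at once.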
Once checks pass, publishing to the registry promotes the schema for managed federation; that publish-and-approve flow is covered in schema registry and managed federation.
5. Promote validated schemas to the registry
A passing check is a gate, not a publish. Once the PR merges, the validated subgraph must be published so the router can pick it up. In managed federation the router polls the registry and hot-reloads the supergraph without a redeploy, which is why the publish step is the actual moment a schema goes live for routing.
name: Publish Subgraph
on:
  push:
    branches: [main]
    paths: ['subgraphs/**']
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Rover
        run: |
          curl -sSL https://rover.apollo.dev/nix/latest | sh
          echo "$HOME/.rover/bin" >> "$GITHUB_PATH"
      - name: Publish to registry
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_GRAPH_API_KEY }}
          APOLLO_GRAPH_REF: ${{ vars.APOLLO_GRAPH_REF }}
        run: |
          rover subgraph publish "$APOLLO_GRAPH_REF" \
            --name my-subgraph \
            --schema ./schema.graphql \
            --routing-url https://my-subgraph.internal/graphql
The full publish-and-approve workflow, including schema proposals and managed federation polling, is covered in schema registry and managed federation.
Performance & Scale Considerations
Full supergraph composition scales poorly in large monorepos, so reserve rover supergraph compose for post-merge or nightly runs and keep PR latency low with incremental rover subgraph check. Cache the Rover binary and the supergraph definition keyed by commit SHA to skip redundant registry calls. Be aware of the trade-off: incremental diffing is fast but can miss a cross-subgraph routing conflict that only a full compose surfaces, which is exactly why staging verification exists as a backstop. Decide where to fail fast and where to soft-warn: type narrowing, @key removal, non-nullable field changes, and directive stripping should hard-fail, while field deprecation, optional-argument removal, and enum-value addition can soft-warn behind a mandatory migration window of, say, 14 days.
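The fail-fast versus soft-warn split can be written down as an explicit policy so it is reviewed like any other code. The change-kind labels below are illustrative names for the categories in the preceding paragraph, not codes emitted by Rover or the registry, and the 14-day window is the example figure from that paragraph.

```typescript
// Illustrative labels for the categories described above.
type ChangeKind =
  | "TYPE_NARROWING" | "KEY_REMOVAL" | "NON_NULL_FIELD_CHANGE" | "DIRECTIVE_STRIPPED"
  | "FIELD_DEPRECATION" | "OPTIONAL_ARG_REMOVAL" | "ENUM_VALUE_ADDITION";

const MIGRATION_WINDOW_DAYS = 14;

// Kinds that block the merge outright; everything else soft-warns.
const HARD_FAIL = new Set<ChangeKind>([
  "TYPE_NARROWING", "KEY_REMOVAL", "NON_NULL_FIELD_CHANGE", "DIRECTIVE_STRIPPED",
]);

function gate(kinds: ChangeKind[]) {
  const fail = kinds.filter((k) => HARD_FAIL.has(k));
  const warn = kinds.filter((k) => !HARD_FAIL.has(k));
  return {
    fail,
    warn,
    blocked: fail.length > 0,
    // Soft warnings carry the mandatory migration window with them.
    migrationWindowDays: warn.length > 0 ? MIGRATION_WINDOW_DAYS : 0,
  };
}

const decision = gate(["FIELD_DEPRECATION", "KEY_REMOVAL", "ENUM_VALUE_ADDITION"]);
console.log(decision);
```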
Failure Modes & Debugging
error[E029]: Breaking changes detected from rover subgraph check. The proposed SDL removes or narrows a field that the registered supergraph still exposes. Parse check_results.json with jq, confirm whether the field has live traffic, and either restore it or schedule a deprecation window before removal.
INVALID_FIELD_SHARING during compose — Field "User.email" is defined in multiple subgraphs but is not marked as @shareable. Two subgraphs contribute the same field. Mark it @shareable in each, or consolidate ownership — see resolving schema conflicts in Apollo Federation.
Check passes locally but fails in CI. Almost always a variant mismatch: the local run diffed against @dev while CI uses @production. Always pass an explicit APOLLO_GRAPH_REF per environment and never rely on a default variant.
Pipeline times out on composition. Either an oversized monorepo composing every subgraph on each PR, or network egress to Apollo Studio is blocked. Switch PRs to incremental checks and confirm the runner can reach the registry endpoint.
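To guard against the variant-mismatch failure described above, a pipeline can refuse to run when the graph ref lacks an explicit variant. This is a minimal sketch; `parseGraphRef` is a hypothetical helper and `my-graph` a placeholder graph ID.

```typescript
// Accepts only the explicit "graph-id@variant" form; a bare graph ID is rejected
// so the check never silently diffs against a default variant.
function parseGraphRef(ref: string | undefined): { graph: string; variant: string } {
  const match = /^([^@\s]+)@([^@\s]+)$/.exec(ref ?? "");
  if (!match) {
    throw new Error(`APOLLO_GRAPH_REF must be "graph-id@variant", got: ${ref ?? "(unset)"}`);
  }
  const [, graph = "", variant = ""] = match;
  return { graph, variant };
}

const parsedRef = parseGraphRef("my-graph@production");
console.log(parsedRef.variant); // "production"

let rejected = false;
try {
  parseGraphRef("my-graph"); // no variant supplied
} catch {
  rejected = true;
}
console.log(rejected); // true
```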
Frequently Asked Questions
How do I prevent CI/CD validation from becoming a deployment bottleneck?
Run incremental rover subgraph check on PRs, cache the Rover binary and supergraph definitions, and parallelise per-subgraph checks. Reserve full rover supergraph compose for staging or nightly builds rather than blocking every pull request on it.
Should validation block merges on all breaking changes?
Only block on changes that impact active client queries. Use production traffic metrics to separate theoretical breaks from real ones, soft-warn on deprecations behind an enforced migration window, and hard-fail on type narrowing or @key removal.
How does schema validation interact with Apollo Federation v2 directives?
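The toolchain must parse and verify @key, @override, @shareable, and @inaccessible during composition. Pin federation_version: =2.x.x in supergraph.yaml so rover supergraph compose always builds with a known composition version; rover subgraph check does not read that file and instead composes with the federation version configured for the variant in the registry, so keep the two aligned to catch routing conflicts before deploy.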
Where do schema checks end and managed federation begin?
Checks gate the change at the PR; once merged, publishing the validated subgraph to the registry is what hot-reloads the router. That publish-and-approve handoff is detailed in schema registry and managed federation.