17.4 Hermetic and Reproducible Builds
When researchers at Reproducible Builds tried to rebuild Debian packages, they found that identical source code often produced different binaries—timestamps, build paths, and compiler randomization all introduced variations. These variations aren't just academic concerns; they make it impossible to verify that a published binary came from claimed source code. The SolarWinds attackers exploited exactly this gap, inserting malicious code during compilation that wouldn't appear in source reviews. Hermetic and reproducible builds close this vulnerability by ensuring builds are isolated, deterministic, and independently verifiable.
This section explains hermetic and reproducible builds as foundational supply chain security controls, with practical implementation guidance across ecosystems.
Defining Hermeticity
A hermetic build is isolated from external influences during execution. The build process cannot access the network, cannot read from unspecified file system locations, and produces output based solely on declared inputs.
Hermetic Build Requirements:
| Requirement | Description |
|---|---|
| No network access | Build cannot fetch from internet during execution |
| Declared inputs | All dependencies explicitly listed |
| Isolated file system | Only specified directories accessible |
| Fixed toolchain | Compiler, linker, tools are explicit inputs |
| No host leakage | Environment variables, paths, username don't affect output |
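Outside a full sandbox, the "no host leakage" requirement can be approximated by launching the build command with an explicitly constructed environment instead of the caller's. The sketch below is illustrative: `hermetic_env` and `run_isolated` are hypothetical helper names, and the fixed values for `PATH`, `LC_ALL`, and `TZ` are assumptions rather than a standard. It does not block network or file system access, which still requires a real sandbox.

```python
import subprocess


def hermetic_env(source_date_epoch: int = 0) -> dict[str, str]:
    """Return a fixed environment that ignores the caller's variables.

    The specific values are assumptions chosen for illustration.
    """
    return {
        "PATH": "/usr/bin:/bin",
        "LC_ALL": "C",
        "TZ": "UTC",
        "SOURCE_DATE_EPOCH": str(source_date_epoch),
    }


def run_isolated(cmd: list[str], workdir: str, source_date_epoch: int = 0) -> str:
    """Run cmd with only the fixed environment and return its stdout."""
    result = subprocess.run(
        cmd,
        cwd=workdir,
        env=hermetic_env(source_date_epoch),  # replaces, not extends, os.environ
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing `env=` replaces the inherited environment entirely, so variables such as `HOME` or `USER` never reach the build step unless they are added deliberately.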
Hermetic vs. Non-Hermetic:
Non-Hermetic Build:
┌─────────────────────────────────────────────────────────────────┐
│ BUILD ENVIRONMENT │
│ │
│ Sources ─┐ │
│ │ ┌─────────┐ npm install ┌─────────┐ │
│ Config ─┼──►│ Build │◄─────────────► │ npm │ │
│ │ │ Process │ at runtime │ Registry│ │
│ ??? ─┘ └─────────┘ └─────────┘ │
│ ↑ │
│ └── Host system state, env vars, timestamps, random │
│ │
│ Result: Different output each time, external dependencies │
└─────────────────────────────────────────────────────────────────┘
Hermetic Build:
┌─────────────────────────────────────────────────────────────────┐
│ BUILD SANDBOX │
│ │
│ Sources ─┐ │
│ │ ┌─────────┐ │
│ Config ─┼──►│ Build │ No network ╳───► External │
│ │ │ Process │ │
│ Deps ─┘ └─────────┘ │
│ (pinned) │
│ │
│ All inputs declared, no external access, deterministic │
└─────────────────────────────────────────────────────────────────┘
What Hermeticity Prevents:
| Attack Vector | How Hermeticity Helps |
|---|---|
| Dependency confusion | Cannot fetch unexpected packages at build time |
| Registry compromise | Pre-fetched dependencies, no runtime access |
| Network MITM | No network access to intercept |
| Build-time injection | Only declared inputs can affect output |
| Environment manipulation | Host environment isolated |
Reproducible Build Definition
A reproducible build produces bit-for-bit identical output from the same source input, regardless of when, where, or by whom the build is performed.
Reproducibility Requirements:
- Same source → Same output (deterministic)
- Independent rebuilders can verify (verifiable)
- Build environment is defined and portable (documented)
- No non-determinism in the build process (predictable)
Sources of Non-Reproducibility:
| Source | Example | Fix |
|---|---|---|
| Timestamps | __DATE__, __TIME__ macros | Use SOURCE_DATE_EPOCH |
| File ordering | Non-deterministic directory listing | Sort before processing |
| Random data | UUIDs, random seeds | Seed deterministically |
| Build paths | Absolute paths in binaries | Use relative or normalized paths |
| Parallel builds | Order-dependent outputs | Ensure deterministic ordering |
| Compiler version | Different optimization | Pin exact toolchain version |
| Locale settings | Sorting, string collation | Set consistent locale |
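Several of the fixes in this table can be seen together in a packaging step. The following is a minimal sketch, assuming the build output is a directory of regular files; `deterministic_tar` is a hypothetical helper, not part of any build tool. It sorts file names, fixes modification times, and discards owner metadata so that the archive bytes depend only on file names and contents.

```python
import io
import os
import tarfile


def deterministic_tar(src_dir: str, mtime: int = 0) -> bytes:
    """Pack the regular files under src_dir into a tar with normalized metadata."""
    names = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), src_dir)
            names.append(rel.replace(os.sep, "/"))

    buf = io.BytesIO()
    # Pin the archive format so the default cannot vary across Python versions.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for rel in sorted(names):  # fixed ordering, independent of the file system
            full = os.path.join(src_dir, rel)
            info = tarfile.TarInfo(name=rel)  # uid/gid default to 0, names to ""
            info.size = os.path.getsize(full)
            info.mtime = mtime  # fixed timestamp instead of the file's own
            info.mode = 0o644   # simplification: executable bits are dropped
            with open(full, "rb") as fh:
                tar.addfile(info, fh)
    return buf.getvalue()
```

Two directories with the same contents then hash identically regardless of when or in what order the files were written. Compression is left out on purpose, since compressed formats such as gzip can embed their own timestamp.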
Reproducibility in Practice:
# Build #1: Developer machine
$ bazel build //app:binary
# SHA256: abc123...
# Build #2: CI server, different time, different machine
$ bazel build //app:binary
# SHA256: abc123... (identical)
# Build #3: Independent auditor
$ bazel build //app:binary
# SHA256: abc123... (verifiable)
Reproducible builds transform the question "how do you know this binary came from that source?" from one requiring trust to one with cryptographic proof.
Security Benefits
Hermetic and reproducible builds provide multiple security benefits.
Verification:
┌─────────────────────────────────────────────────────────────────┐
│ VERIFICATION CHAIN │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Source Code ──► Build ──► Binary ──► Deploy │
│ │ │ │
│ │ Reproducible │ │
│ └──────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Independent Rebuilder │ │
│ │ • Fetch same source │ │
│ │ • Run same build │ │
│ │ • Compare output │ │
│ │ • Match? → Source matches binary (verified) │ │
│ │ • No match? → Something was modified (alert) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Security Benefits Table:
| Benefit | Explanation |
|---|---|
| Build tampering detection | Modified build produces different hash |
| Source/binary linkage | Prove binary came from source |
| Supply chain verification | Third parties can verify builds |
| Incident investigation | Rebuild historical versions for comparison |
| SLSA alignment | Complements signed provenance and hardened builders (SLSA v0.1 Level 4 called for hermetic builds) |
| Insider threat defense | Can't hide modifications in build |
SLSA Relevance:
SLSA (Supply-chain Levels for Software Artifacts) v1.0 defines three Build track levels focusing on build integrity:
| SLSA Build Level | Build Requirement |
|---|---|
| Build L1 | Provenance exists documenting the build process |
| Build L2 | Builds run on a hosted build platform with signed provenance |
| Build L3 | Hardened build platform with strong tamper protection |
While SLSA v0.1 included a Level 4 that required hermetic builds and encouraged reproducible ones, the current specification tops out at Build L3, which focuses on hardened builders. Hermetic and reproducible builds remain best practices that complement Build L3.
Tools Comparison
Several build systems support hermetic and reproducible builds.
Bazel:
Google's build system, designed for hermeticity from the start.
# BUILD.bazel
load("@rules_java//java:defs.bzl", "java_binary")
java_binary(
name = "app",
srcs = glob(["src/**/*.java"]),
deps = [
"@maven//:com_google_guava_guava",
"@maven//:org_slf4j_slf4j_api",
],
# All dependencies explicit, no network during build
)
# Hermetic execution with sandbox
bazel build --spawn_strategy=sandboxed //app:app
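# Guard against source files changing while the build runs
# (this flag does not verify reproducibility; rebuild from a clean state and compare hashes for that)
bazel build //app:app --experimental_guard_against_concurrent_changes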
Bazel Characteristics:
- Built-in sandboxing and hermeticity
- Content-addressable caching
- Language rules for most ecosystems
- Steep learning curve
- Strong Google/large company adoption
Nix:
Functional package manager ensuring reproducibility through isolation.
# default.nix
{ pkgs ? import <nixpkgs> {} }:
pkgs.stdenv.mkDerivation {
name = "myapp-1.0.0";
src = ./.;
buildInputs = [ pkgs.nodejs pkgs.yarn ];
buildPhase = ''
yarn install --offline
yarn build
'';
installPhase = ''
mkdir -p $out
cp -r dist/* $out/
'';
# All inputs hashed, build isolated
}
Nix Characteristics:
- Declarative, functional approach
- Complete environment isolation
- Nixpkgs provides huge package set
- Reproducibility by design
- Unique (and initially confusing) paradigm
Buck2:
Meta's next-generation build system.
# BUCK
cxx_binary(
name = "app",
srcs = glob(["src/**/*.cpp"]),
deps = [
"//third-party:boost",
"//third-party:fmt",
],
# Hermetic by default
)
Buck2 Characteristics:
- Designed for extreme scale (Meta)
- Strong isolation and reproducibility
- Remote execution support
- Newer, smaller community
- Excellent performance
Pants:
Modern build system with Python roots, expanding ecosystem support.
# pants.toml
[GLOBAL]
pants_version = "2.18.0"
[python]
interpreter_constraints = ["==3.11.*"]
enable_resolves = true
[python.resolves]
python-default = "3rdparty/python/default.lock"
Tool Comparison Matrix:
| Feature | Bazel | Nix | Buck2 | Pants |
|---|---|---|---|---|
| Hermeticity | Excellent | Excellent | Excellent | Good |
| Reproducibility | Excellent | Excellent | Excellent | Good |
| Learning curve | Steep | Very steep | Steep | Moderate |
| Language support | Broad | Very broad | Focused | Growing |
| Remote execution | Yes | Limited | Yes | Yes |
| Community | Large | Large | Growing | Growing |
| Best for | Large monorepos | System + apps | Massive scale | Python/growing |
Ecosystem-Specific Implementation
Implementing hermeticity varies by language ecosystem.
Java/JVM:
# WORKSPACE.bazel - Pin all Maven dependencies
load("@rules_jvm_external//:defs.bzl", "maven_install")
maven_install(
name = "maven",
artifacts = [
"com.google.guava:guava:32.1.3-jre",
"org.slf4j:slf4j-api:2.0.9",
],
repositories = [
"https://repo1.maven.org/maven2",
],
# Lock file ensures reproducibility
maven_install_json = "//:maven_install.json",
)
# Generate lock file
bazel run @maven//:pin
# Build hermetically
bazel build //app:app --incompatible_strict_action_env
Go:
Go's module system supports reproducibility well:
// go.mod
module example.com/app
go 1.21
require (
github.com/gin-gonic/gin v1.9.1
github.com/lib/pq v1.10.9
)
// go.sum provides checksums for verification
# Verify dependencies haven't changed
go mod verify
# Build with Bazel for full hermeticity
bazel build //:app
Rust:
Cargo supports reproducibility with locked dependencies:
# Cargo.toml
[package]
name = "myapp"
version = "0.1.0"
[dependencies]
tokio = { version = "1.32", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }
# Lock dependencies
cargo generate-lockfile
# Build against Cargo.lock exactly; fail if the lock file would need to change
cargo build --release --locked
# Cargo doesn't guarantee bit-for-bit reproducibility;
# use Bazel or Nix for full hermeticity
Node.js:
Node.js presents challenges due to ecosystem design:
// package.json
{
"name": "myapp",
"version": "1.0.0",
"dependencies": {
"express": "4.18.2",
"lodash": "4.17.21"
}
}
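# Install exactly what package-lock.json specifies
npm ci
# For hermeticity, populate the npm cache ahead of time, then install without network access
# (npm pack only creates a tarball of the current package; it does not pre-fetch dependencies)
npm ci --offline
# Or use Bazel with aspect_rules_js and strict dependencies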
Bazel for Node.js:
# WORKSPACE.bazel
load("@aspect_rules_js//npm:repositories.bzl", "npm_translate_lock")
npm_translate_lock(
name = "npm",
pnpm_lock = "//:pnpm-lock.yaml",
verify_node_modules_ignored = "//:.bazelignore",
)
Caching Strategies
Hermetic builds can be slow without caching. Secure caching accelerates builds while maintaining security properties.
Content-Addressable Caching:
┌─────────────────────────────────────────────────────────────────┐
│ CONTENT-ADDRESSABLE CACHE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Build Action: │
│ inputs = [src/main.go, go.mod, go.sum] │
│ action = "go build" │
│ toolchain = "go-1.21.0" │
│ │
│ Cache Key = hash(inputs + action + toolchain) │
│ = sha256:abc123... │
│ │
│ Cache Lookup: │
│ If sha256:abc123 exists → Return cached output │
│ If not → Execute build, store result at sha256:abc123 │
│ │
│ Security Property: │
│ Same inputs always produce same key │
│ Different inputs produce different key │
│ Cannot poison cache without changing inputs │
│ │
└─────────────────────────────────────────────────────────────────┘
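The cache key computation in the diagram can be sketched in a few lines. This is an illustration of the idea only, not how Bazel or any particular build system serializes its actions; `action_key` and `cached_build` are hypothetical names, and the JSON encoding is an assumption made to get a canonical byte string.

```python
import hashlib
import json
from typing import Callable


def action_key(input_digests: dict[str, str], command: list[str], toolchain: str) -> str:
    """Derive a cache key from everything that may influence the output."""
    payload = json.dumps(
        {"inputs": input_digests, "command": command, "toolchain": toolchain},
        sort_keys=True,             # key order must not affect the hash
        separators=(",", ":"),
    )
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_build(key: str, cache: dict[str, bytes], build: Callable[[], bytes]) -> bytes:
    """Return the cached output for key, running build() only on a miss."""
    if key not in cache:
        cache[key] = build()
    return cache[key]
```

Changing any input digest, the command line, or the toolchain identifier yields a different key, so a stale output is never returned for a changed action. The guarantee only holds if every real influence on the output is included in the key, which is what hermeticity provides.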
Remote Caching:
# Bazel: Use remote cache
bazel build //app:app \
--remote_cache=grpcs://cache.example.com \
--remote_upload_local_results=true
# Cache entries are:
# - Keyed by input hash (different inputs map to different entries)
# - Content-addressed blobs, whose digests can be checked on retrieval
# - Shared across machines and CI, so restrict write access to trusted builders
Cache Security Considerations:
| Consideration | Mitigation |
|---|---|
| Cache poisoning | Content-addressed blobs are verified by digest; limit cache writes to trusted CI builders |
| Data exfiltration | Authenticate cache access, use mTLS |
| Cache pollution | Verify cache entries match expected hash |
| Cache timing attacks | Use consistent cache lookup patterns |
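The "verify cache entries match expected hash" mitigation amounts to recomputing a blob's digest when it is read. The in-memory store below is a hypothetical sketch of that check, assuming SHA-256 digests; a real remote cache would perform the same comparison on bytes received over the network.

```python
import hashlib


class DigestMismatch(Exception):
    """Raised when stored bytes no longer match the digest they are filed under."""


class VerifyingStore:
    """Toy content-addressed store that re-checks digests on every read."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Never trust the storage layer: recompute before returning.
        if hashlib.sha256(data).hexdigest() != digest:
            raise DigestMismatch(digest)
        return data
```

This check protects the blob store only. An entry that maps an action key to output digests can still be written maliciously by anyone with write access, which is why access control on cache writes remains necessary.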
Secure Remote Cache Configuration:
# Remote Build Execution configuration (illustrative; key names vary by service)
remote_execution:
instance_name: "projects/build-system/instances/default"
# Authentication
auth:
type: "service_account"
credential_file: "/var/run/secrets/build-cache-sa.json"
# Security settings
security:
tls_enabled: true
verify_outputs: true
reject_unsigned_cache_entries: true
Independent Verification
Reproducible builds enable independent verification by rebuilders.
Verification Process:
- Obtain published artifact with claimed source reference
- Fetch source at that reference
- Rebuild using documented build process
- Compare output to published artifact
- Report match or discrepancy
Reproducible Builds Verification:
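#!/bin/bash
# verify-build.sh
set -euo pipefail
# 1. Get published artifact and its hash (first field of the .sha256 file)
PUBLISHED_HASH=$(curl -fsSL https://releases.example.com/app-1.0.0.sha256 | cut -d' ' -f1)
curl -fsSLO https://releases.example.com/app-1.0.0.tar.gz
# Confirm the download itself matches the published hash
echo "$PUBLISHED_HASH  app-1.0.0.tar.gz" | sha256sum -c -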
# 2. Fetch source at claimed version
git clone https://github.com/example/app.git
cd app
git checkout v1.0.0
# 3. Rebuild
bazel build //app:release --config=release
# 4. Compare
REBUILT_HASH=$(sha256sum bazel-bin/app/release.tar.gz | cut -d' ' -f1)
# 5. Report
if [ "$PUBLISHED_HASH" == "$REBUILT_HASH" ]; then
echo "✓ Verification passed: Binary matches source"
else
echo "✗ Verification failed: Binary does not match source"
echo " Published: $PUBLISHED_HASH"
echo " Rebuilt: $REBUILT_HASH"
exit 1
fi
Multi-Party Verification:
┌─────────────────────────────────────────────────────────────────┐
│ MULTI-PARTY VERIFICATION │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Publisher │
│ ├── Builds from source │
│ ├── Publishes binary with hash: abc123 │
│ └── Publishes provenance attestation │
│ │
│ Rebuilder 1 (independent) │
│ ├── Fetches source │
│ ├── Rebuilds independently │
│ ├── Gets hash: abc123 ✓ │
│ └── Signs attestation "I rebuilt, matches" │
│ │
│ Rebuilder 2 (different org) │
│ ├── Fetches source │
│ ├── Rebuilds independently │
│ ├── Gets hash: abc123 ✓ │
│ └── Signs attestation "I rebuilt, matches" │
│ │
│ Consumer │
│ ├── Sees: Publisher + 2 independent rebuilders agree │
│ └── High confidence binary matches source │
│ │
└─────────────────────────────────────────────────────────────────┘
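The consumer's decision in this diagram can be expressed as a simple quorum rule. The sketch below is a simplified model under stated assumptions: the `Attestation` record and `independently_confirmed` function are hypothetical, signature verification is assumed to have happened already, and the threshold of two confirming organizations mirrors the diagram rather than any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """A rebuilder's claim about the hash they obtained (signature already checked)."""
    rebuilder: str
    organization: str
    artifact_sha256: str


def independently_confirmed(
    published_sha256: str,
    publisher_org: str,
    attestations: list[Attestation],
    required: int = 2,
) -> bool:
    """True when enough distinct outside organizations reproduced the published hash."""
    confirming_orgs = {
        a.organization
        for a in attestations
        if a.artifact_sha256 == published_sha256 and a.organization != publisher_org
    }
    return len(confirming_orgs) >= required
```

Counting organizations rather than individual attestations keeps a single party from satisfying the quorum by rebuilding several times.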
Diffoscope for Debugging:
When builds don't match, diffoscope helps identify why:
# Compare two builds to find differences
diffoscope build1/app.tar.gz build2/app.tar.gz --html report.html
# Output shows exactly what differs
# - Timestamps
# - Embedded paths
# - Random seeds
# - Build ordering
Gradual Adoption Path
Moving to hermetic builds is a journey, not a single step.
Adoption Phases:
Phase 1: Awareness and Baseline
## Phase 1: Assessment (Weeks 1-4)
### Goals
- Understand current build reproducibility
- Identify largest sources of non-determinism
- Select pilot project
### Actions
1. Build same commit twice, compare outputs
2. Use diffoscope to identify differences
3. Document current build dependencies
4. Choose build system for evaluation
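The first Phase 1 action, building the same commit twice and comparing outputs, can be automated with a small script before reaching for diffoscope. This is a minimal sketch, assuming each build writes its outputs into a separate directory; `tree_digests` and `differing_files` are hypothetical helper names.

```python
import hashlib
import os


def tree_digests(root: str) -> dict[str, str]:
    """Map each file's path relative to root to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                digests[rel] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def differing_files(build_a: str, build_b: str) -> list[str]:
    """List files that are missing from one build or differ between the two."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

An empty result means the two output trees are byte-identical. Any path it does list is a good candidate to hand to diffoscope for a detailed comparison.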
Phase 2: Lock Dependencies
## Phase 2: Dependency Pinning (Weeks 5-8)
### Goals
- All dependencies version-locked
- Lock files committed to repository
- No floating version specifiers
### Actions
1. Generate lock files for all package managers
2. Implement lock file update process
3. CI fails if lock file doesn't match
4. Document dependency update procedure
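The "no floating version specifiers" goal can be enforced with a small CI check. The sketch below targets npm-style `package.json` manifests as one example; `floating_dependencies` is a hypothetical helper, and the rule that only exact `MAJOR.MINOR.PATCH` versions (with optional pre-release or build suffix) count as pinned is an assumption that a team may want to adjust.

```python
import json
import re

# Exact semantic version, e.g. 4.18.2 or 1.0.0-rc.1; ranges like ^1.2.3, ~1.2, * do not match.
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$")


def floating_dependencies(package_json_text: str) -> dict[str, str]:
    """Return the dependencies whose version specifier is not an exact version."""
    manifest = json.loads(package_json_text)
    floating = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                floating[name] = spec
    return floating
```

A CI job could fail whenever the returned dictionary is non-empty. The lock file remains the source of truth for transitive dependencies, so this check complements lock file verification rather than replacing it.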
Phase 3: Eliminate Non-Determinism
## Phase 3: Deterministic Builds (Weeks 9-16)
### Goals
- Same source produces same output
- No timestamps, random values, or paths in output
- Documented build environment
### Actions
1. Set SOURCE_DATE_EPOCH for timestamps
2. Sort file lists before processing
3. Use deterministic compiler flags
4. Normalize build paths
5. Verify with rebuild comparison
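For the first Phase 3 action, build scripts that embed a timestamp can read `SOURCE_DATE_EPOCH` instead of the wall clock. The following sketch follows the pattern suggested by the SOURCE_DATE_EPOCH specification; the function name `build_timestamp` and the optional `environ` parameter are illustrative assumptions.

```python
import os
import time
from datetime import datetime, timezone
from typing import Mapping, Optional


def build_timestamp(environ: Optional[Mapping[str, str]] = None) -> datetime:
    """Return the timestamp to embed in build outputs, in UTC.

    Uses SOURCE_DATE_EPOCH when set and falls back to the current time otherwise.
    """
    env = os.environ if environ is None else environ
    epoch = int(env.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, tz=timezone.utc)
```

Setting `SOURCE_DATE_EPOCH` to the commit time of the source revision keeps the embedded date meaningful while still making it identical across rebuilds.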
Phase 4: Implement Hermeticity
## Phase 4: Hermetic Builds (Weeks 17-24)
### Goals
- Builds execute in isolation
- No network access during build
- All inputs explicitly declared
### Actions
1. Pre-fetch all dependencies
2. Implement sandbox execution
3. Disable network in build environment
4. Verify no host environment leakage
5. Enable reproducibility verification
Adoption Checklist:
## Hermetic Build Adoption Checklist
### Dependencies
- [ ] All dependencies version-locked
- [ ] Lock files in version control
- [ ] No floating versions (^, ~, *)
- [ ] Pre-fetch mechanism implemented
### Determinism
- [ ] Timestamps eliminated or fixed
- [ ] File ordering deterministic
- [ ] No random values in output
- [ ] Build paths normalized
### Isolation
- [ ] Network disabled during build
- [ ] File system access restricted
- [ ] Environment variables controlled
- [ ] Toolchain versions pinned
### Verification
- [ ] Rebuild produces identical output
- [ ] Diffoscope finds no differences
- [ ] CI verifies reproducibility
- [ ] Independent verification possible
Recommendations
For Build Engineers:
- Start with dependency locking. Before pursuing full hermeticity, ensure all dependencies are pinned and locked. This provides immediate security benefit with lower effort.
- Use diffoscope to find non-determinism. Build twice, compare with diffoscope, fix differences one by one. Iterate until builds match.
- Select appropriate tooling. Bazel and Nix provide the strongest guarantees but have steep learning curves. Evaluate trade-offs for your context.
For Platform Engineers:
- Invest in remote caching. Hermetic builds can be slower without caching. Content-addressable remote caches provide security and performance.
- Build verification into pipelines. Automatically rebuild and compare as part of release process. Fail if builds don't reproduce.
- Document the build environment. Even without full reproducibility, documenting exact toolchain versions enables forensic reconstruction.
For Security Architects:
- Use SLSA as a roadmap. Target Build L3 under SLSA v1.0, and treat hermetic, reproducible builds as complementary practices; they belonged to the former Level 4 in SLSA v0.1 and are not required by the current Build track.
- Enable independent verification. Publish enough information (source refs, build instructions, environment specs) that anyone can verify builds.
- Plan for gradual adoption. Full hermeticity takes time. Define phases, measure progress, and celebrate incremental wins.
Hermetic and reproducible builds transform software supply chain security from "trust us" to "verify this." When any party can rebuild from source and verify the output matches published binaries, supply chain attacks become detectable. This doesn't make attacks impossible, but it makes them visible—and visibility is the foundation of defense.