17.4 Hermetic and Reproducible Builds

When researchers at the Reproducible Builds project tried to rebuild Debian packages, they found that identical source code often produced different binaries: timestamps, build paths, and nondeterministic compiler output all introduced variations. These variations are not just academic concerns; they make it impossible to verify that a published binary came from the claimed source code. The SolarWinds attackers exploited exactly this gap, inserting malicious code during the build so that it never appeared in source reviews. Hermetic and reproducible builds narrow this gap by making builds isolated, deterministic, and independently verifiable.

This section explains hermetic and reproducible builds as foundational supply chain security controls, with practical implementation guidance across ecosystems.

Defining Hermeticity

A hermetic build is isolated from external influences during execution. The build process cannot access the network, cannot read from unspecified file system locations, and produces output based solely on declared inputs.

Hermetic Build Requirements:

| Requirement | Description |
|---|---|
| No network access | Build cannot fetch from internet during execution |
| Declared inputs | All dependencies explicitly listed |
| Isolated file system | Only specified directories accessible |
| Fixed toolchain | Compiler, linker, tools are explicit inputs |
| No host leakage | Environment variables, paths, username don't affect output |
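
One inexpensive step toward the "no host leakage" requirement is to launch build commands with an explicit environment instead of the inherited one. The sketch below illustrates only that one requirement and is not a sandbox: it does nothing to restrict network or file system access. The helper name `run_with_clean_env` and the fixed values in `BASE_ENV` are assumptions chosen for this example.

```python
import subprocess
import sys

# Fixed values every build sees, regardless of the host (illustrative choices).
BASE_ENV = {
    "PATH": "/usr/bin:/bin",
    "LANG": "C.UTF-8",
    "LC_ALL": "C.UTF-8",
    "TZ": "UTC",
    "HOME": "/nonexistent",
}


def run_with_clean_env(cmd, declared_env=None):
    """Run cmd with only BASE_ENV plus explicitly declared variables."""
    env = dict(BASE_ENV)
    env.update(declared_env or {})
    # Passing env= replaces the inherited environment instead of extending it.
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    probe = [sys.executable, "-c", "import os; print(sorted(os.environ))"]
    print(run_with_clean_env(probe, {"SOURCE_DATE_EPOCH": "1700000000"}).stdout)
```

Any variable the build genuinely needs then has to be passed through `declared_env`, which turns an implicit host dependency into a declared input.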

Hermetic vs. Non-Hermetic:

Non-Hermetic Build:
┌─────────────────────────────────────────────────────────────────┐
│                     BUILD ENVIRONMENT                            │
│                                                                  │
│  Sources ─┐                                                      │
│           │   ┌─────────┐   npm install   ┌─────────┐           │
│  Config  ─┼──►│  Build  │◄─────────────► │ npm     │           │
│           │   │ Process │    at runtime   │ Registry│           │
│  ???    ─┘   └─────────┘                 └─────────┘           │
│     ↑                                                            │
│     └── Host system state, env vars, timestamps, random         │
│                                                                  │
│  Result: Different output each time, external dependencies      │
└─────────────────────────────────────────────────────────────────┘

Hermetic Build:
┌─────────────────────────────────────────────────────────────────┐
│                     BUILD SANDBOX                                │
│                                                                  │
│  Sources ─┐                                                      │
│           │   ┌─────────┐                                       │
│  Config  ─┼──►│  Build  │   No network   ╳───► External        │
│           │   │ Process │                                        │
│  Deps    ─┘   └─────────┘                                       │
│  (pinned)                                                        │
│                                                                  │
│  All inputs declared, no external access, deterministic         │
└─────────────────────────────────────────────────────────────────┘

What Hermeticity Prevents:

| Attack Vector | How Hermeticity Helps |
|---|---|
| Dependency confusion | Cannot fetch unexpected packages at build time |
| Registry compromise | Pre-fetched dependencies, no runtime access |
| Network MITM | No network access to intercept |
| Build-time injection | Only declared inputs can affect output |
| Environment manipulation | Host environment isolated |

Reproducible Build Definition

A reproducible build produces bit-for-bit identical output from the same source input, regardless of when, where, or by whom the build is performed.

Reproducibility Requirements:

  1. Same source → Same output (deterministic)
  2. Independent rebuilders can verify (verifiable)
  3. Build environment is defined and portable (documented)
  4. No non-determinism in the build process (predictable)

Sources of Non-Reproducibility:

| Source | Example | Fix |
|---|---|---|
| Timestamps | __DATE__, __TIME__ macros | Use SOURCE_DATE_EPOCH |
| File ordering | Non-deterministic directory listing | Sort before processing |
| Random data | UUIDs, random seeds | Seed deterministically |
| Build paths | Absolute paths in binaries | Use relative or normalized paths |
| Parallel builds | Order-dependent outputs | Ensure deterministic ordering |
| Compiler version | Different optimization | Pin exact toolchain version |
| Locale settings | Sorting, string collation | Set consistent locale |
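
Several of these fixes can be seen together in a packaging step. The following is a minimal sketch, assuming the artifact is a plain tar archive of a directory: it sorts file paths, takes the timestamp from `SOURCE_DATE_EPOCH`, and strips owner and permission details that vary between hosts. The function names are hypothetical, and compression is left out because compressed formats carry their own metadata to normalize.

```python
import hashlib
import os
import tarfile


def _normalize(info, epoch):
    """Strip host-specific metadata from a tar entry."""
    info.mtime = epoch
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    # Keep only the owner-executable bit so umask differences do not leak in.
    info.mode = 0o755 if info.mode & 0o100 else 0o644
    return info


def build_archive(src_dir, out_path):
    """Write a tar archive of the files under src_dir with stable bytes."""
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    rel_paths = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            rel_paths.append(os.path.relpath(os.path.join(root, name), src_dir))
    # An explicit format avoids depending on the interpreter's default.
    with tarfile.open(out_path, "w", format=tarfile.PAX_FORMAT) as tar:
        for rel in sorted(rel_paths):  # fixed order, independent of the file system
            tar.add(os.path.join(src_dir, rel), arcname=rel, recursive=False,
                    filter=lambda info: _normalize(info, epoch))


def sha256_of(path):
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()
```

Calling `build_archive` twice on the same tree, even after file modification times have changed, should yield archives with the same `sha256_of` value as long as the file contents and `SOURCE_DATE_EPOCH` are unchanged.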

Reproducibility in Practice:

# Build #1: Developer machine
$ bazel build //app:binary
# SHA256: abc123...

# Build #2: CI server, different time, different machine
$ bazel build //app:binary
# SHA256: abc123... (identical)

# Build #3: Independent auditor
$ bazel build //app:binary
# SHA256: abc123... (verifiable)

Reproducible builds transform the question "how do you know this binary came from that source?" from one requiring trust to one with cryptographic proof.

Security Benefits

Hermetic and reproducible builds provide multiple security benefits.

Verification:

┌─────────────────────────────────────────────────────────────────┐
│                    VERIFICATION CHAIN                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Source Code ──► Build ──► Binary ──► Deploy                   │
│       │                      │                                   │
│       │ Reproducible        │                                   │
│       └──────────────────────┘                                  │
│              │                                                   │
│              ▼                                                   │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │  Independent Rebuilder                                   │   │
│  │  • Fetch same source                                     │   │
│  │  • Run same build                                        │   │
│  │  • Compare output                                        │   │
│  │  • Match? → Source matches binary (verified)            │   │
│  │  • No match? → Something was modified (alert)           │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Security Benefits Table:

| Benefit | Explanation |
|---|---|
| Build tampering detection | Modified build produces different hash |
| Source/binary linkage | Prove binary came from source |
| Supply chain verification | Third parties can verify builds |
| Incident investigation | Rebuild historical versions for comparison |
| SLSA alignment | Complements SLSA Build L3; hermeticity was a Level 4 requirement in SLSA v0.1 |
| Insider threat defense | Can't hide modifications in build |

SLSA Relevance:

SLSA (Supply-chain Levels for Software Artifacts) v1.0 defines three Build track levels focusing on build integrity:

| SLSA Build Level | Build Requirement |
|---|---|
| Build L1 | Provenance exists documenting the build process |
| Build L2 | Builds run on a hosted build platform with signed provenance |
| Build L3 | Hardened build platform with strong tamper protection |

SLSA v0.1 included a Level 4 that required hermetic builds and treated reproducibility as a best-effort requirement. The current specification has no Level 4; its highest Build track level, Build L3, focuses on hardened build platforms. Hermetic and reproducible builds remain best practices that complement Build L3.
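
To make "provenance" concrete, the sketch below assembles an unsigned in-toto Statement carrying a SLSA v1.0 provenance predicate. It is a simplified illustration: the function name, builder ID, build type URI, and repository URL are hypothetical, only a small subset of fields is populated, and the signing step that Build L2 and above expect from the build platform is omitted.

```python
import hashlib
import json


def make_provenance(artifact_name, artifact_bytes, builder_id, build_type,
                    source_uri, source_commit):
    """Return an unsigned in-toto Statement with a SLSA v1.0 provenance predicate."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # The subject binds the statement to one artifact by digest.
        "subject": [{
            "name": artifact_name,
            "digest": {"sha256": hashlib.sha256(artifact_bytes).hexdigest()},
        }],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                "buildType": build_type,
                "externalParameters": {"source": source_uri},
                "resolvedDependencies": [{
                    "uri": source_uri,
                    "digest": {"gitCommit": source_commit},
                }],
            },
            "runDetails": {"builder": {"id": builder_id}},
        },
    }


if __name__ == "__main__":
    statement = make_provenance(
        "app-1.0.0.tar.gz", b"example artifact bytes",
        "https://ci.example.com/builders/release",        # hypothetical builder ID
        "https://example.com/build-types/bazel/v1",       # hypothetical build type
        "git+https://github.com/example/app",
        "0" * 40,                                          # placeholder commit
    )
    print(json.dumps(statement, indent=2))
```

When the build is also reproducible, an independent rebuilder can recompute the subject digest from the listed source and compare it with the value in the statement.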

Tools Comparison

Several build systems support hermetic and reproducible builds.

Bazel:

Google's build system, designed for hermeticity from the start.

# BUILD.bazel
load("@rules_java//java:defs.bzl", "java_binary")

java_binary(
    name = "app",
    srcs = glob(["src/**/*.java"]),
    deps = [
        "@maven//:com_google_guava_guava",
        "@maven//:org_slf4j_slf4j_api",
    ],
    # All dependencies explicit, no network during build
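)

# Hermetic execution: run actions in the sandbox and deny them network access
bazel build --spawn_strategy=sandboxed --sandbox_default_allow_network=false //app:app

# Check reproducibility: record the output hash, rebuild from scratch, compare
sha256sum bazel-bin/app/app.jar > app.sha256
bazel clean --expunge
bazel build --spawn_strategy=sandboxed --sandbox_default_allow_network=false //app:app
sha256sum --check app.sha256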

Bazel Characteristics:

  • Built-in sandboxing and hermeticity
  • Content-addressable caching
  • Language rules for most ecosystems
  • Steep learning curve
  • Strong Google/large company adoption

Nix:

Functional package manager ensuring reproducibility through isolation.

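# default.nix
# NOTE: <nixpkgs> resolves through NIX_PATH and can differ between machines.
# Pin nixpkgs to a fixed revision so every builder evaluates the same package set.
{ pkgs ? import <nixpkgs> {} }:

pkgs.stdenv.mkDerivation {
  name = "myapp-1.0.0";
  src = ./.;

  # Build-time tools go in nativeBuildInputs
  nativeBuildInputs = [ pkgs.nodejs pkgs.yarn ];

  # The Nix sandbox has no network access, so `yarn install --offline` only
  # succeeds if the packages in yarn.lock are supplied as a hashed input
  # (an offline mirror). That wiring is omitted here.
  buildPhase = ''
    yarn install --offline
    yarn build
  '';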

  installPhase = ''
    mkdir -p $out
    cp -r dist/* $out/
  '';

  # All inputs hashed, build isolated
}

Nix Characteristics:

  • Declarative, functional approach
  • Complete environment isolation
  • Nixpkgs provides huge package set
  • Reproducibility by design
  • Unique (and initially confusing) paradigm

Buck2:

Meta's next-generation build system.

# BUCK
cxx_binary(
    name = "app",
    srcs = glob(["src/**/*.cpp"]),
    deps = [
        "//third-party:boost",
        "//third-party:fmt",
    ],
    # Hermetic by default
)

Buck2 Characteristics:

  • Designed for extreme scale (Meta)
  • Strong isolation and reproducibility
  • Remote execution support
  • Newer, smaller community
  • Excellent performance

Pants:

Modern build system with Python roots, expanding ecosystem support.

# pants.toml
[GLOBAL]
pants_version = "2.18.0"

[python]
interpreter_constraints = ["==3.11.*"]
enable_resolves = true

[python.resolves]
python-default = "3rdparty/python/default.lock"

Tool Comparison Matrix:

| Feature | Bazel | Nix | Buck2 | Pants |
|---|---|---|---|---|
| Hermeticity | Excellent | Excellent | Excellent | Good |
| Reproducibility | Excellent | Excellent | Excellent | Good |
| Learning curve | Steep | Very steep | Steep | Moderate |
| Language support | Broad | Very broad | Focused | Growing |
| Remote execution | Yes | Limited | Yes | Yes |
| Community | Large | Large | Growing | Growing |
| Best for | Large monorepos | System + apps | Massive scale | Python/growing |

Ecosystem-Specific Implementation

Implementing hermeticity varies by language ecosystem.

Java/JVM:

# WORKSPACE.bazel - Pin all Maven dependencies
load("@rules_jvm_external//:defs.bzl", "maven_install")

maven_install(
    name = "maven",
    artifacts = [
        "com.google.guava:guava:32.1.3-jre",
        "org.slf4j:slf4j-api:2.0.9",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
    # Lock file ensures reproducibility
    maven_install_json = "//:maven_install.json",
)

# Generate lock file
bazel run @maven//:pin

# Build hermetically (strict action env keeps the host PATH and other variables out of actions)
bazel build //app:app --incompatible_strict_action_env

Go:

Go's module system supports reproducibility well:

// go.mod
module example.com/app

go 1.21

require (
    github.com/gin-gonic/gin v1.9.1
    github.com/lib/pq v1.10.9
)

// go.sum provides checksums for verification

# Verify dependencies haven't changed
go mod verify

# Build with Bazel for full hermeticity
bazel build //:app

Rust:

Cargo supports reproducibility with locked dependencies:

# Cargo.toml
[package]
name = "myapp"
version = "0.1.0"

[dependencies]
tokio = { version = "1.32", features = ["full"] }
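serde = { version = "1.0", features = ["derive"] }

# Lock dependencies
cargo generate-lockfile

# Build against Cargo.lock exactly; --locked fails if the lock file would need to change.
# Cargo alone doesn't guarantee bit-for-bit reproducibility;
# use Bazel or Nix for full hermeticity.
cargo build --release --locked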

Node.js:

Node.js presents challenges due to ecosystem design:

// package.json
{
  "name": "myapp",
  "version": "1.0.0",
  "dependencies": {
    "express": "4.18.2",
    "lodash": "4.17.21"
  }
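}

# Install exactly what package-lock.json records (fails if it disagrees with package.json)
npm ci

# For hermeticity, populate the npm cache in a separate networked fetch step,
# then install inside the isolated build without network access
npm ci --offline
# Or use Bazel rules_js with strict dependencies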

Bazel for Node.js:

# WORKSPACE.bazel
load("@aspect_rules_js//npm:repositories.bzl", "npm_translate_lock")

npm_translate_lock(
    name = "npm",
    pnpm_lock = "//:pnpm-lock.yaml",
    verify_node_modules_ignored = "//:.bazelignore",
)

Caching Strategies

Hermetic builds can be slow without caching. Secure caching accelerates builds while maintaining security properties.

Content-Addressable Caching:

┌─────────────────────────────────────────────────────────────────┐
│                 CONTENT-ADDRESSABLE CACHE                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Build Action:                                                   │
│  inputs = [src/main.go, go.mod, go.sum]                         │
│  action = "go build"                                            │
│  toolchain = "go-1.21.0"                                        │
│                                                                  │
│  Cache Key = hash(inputs + action + toolchain)                  │
│            = sha256:abc123...                                    │
│                                                                  │
│  Cache Lookup:                                                   │
│  If sha256:abc123 exists → Return cached output                 │
│  If not → Execute build, store result at sha256:abc123          │
│                                                                  │
│  Security Property:                                              │
│  Same inputs always produce same key                            │
│  Different inputs produce different key                         │
│  Assumes only trusted builders can write entries                │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
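
The key derivation in the diagram can be sketched in a few lines. This is a simplified illustration with hypothetical names (`cache_key`, `ContentStore`), not how any particular build system computes its keys: it hashes the action, the toolchain, and each input's path and content, and it shows a blob store that re-checks digests on read.

```python
import hashlib


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def cache_key(inputs: dict, action: str, toolchain: str) -> str:
    """Hash the action, the toolchain, and every input's path and content."""
    h = hashlib.sha256()
    # Hashing fixed-length digests avoids ambiguity from concatenating raw strings.
    for part in (action, toolchain):
        h.update(_digest(part.encode()).encode())
    # Sort by path so dictionary order cannot change the key.
    for path in sorted(inputs):
        h.update(_digest(path.encode()).encode())
        h.update(_digest(inputs[path]).encode())
    return "sha256:" + h.hexdigest()


class ContentStore:
    """Blob store addressed by content digest; reads re-check the digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = _digest(data)
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        if _digest(data) != digest:
            raise ValueError("blob does not match its digest: " + digest)
        return data
```

Note the asymmetry: a blob can be checked against its own digest, but the mapping from a cache key to an output digest cannot be verified without re-running the action, which is why write access to that mapping deserves tight control.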

Remote Caching:

# Bazel: Use remote cache
bazel build //app:app \
    --remote_cache=grpcs://cache.example.com \
    --remote_upload_local_results=true

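# Cache entries are:
# - Keyed by a hash of the action and its inputs (different inputs map to different keys)
# - Stored as content-addressed blobs whose digests can be checked on retrieval
# - Shared across machines and CI, so restrict write access to trusted builders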

Cache Security Considerations:

| Consideration | Mitigation |
|---|---|
| Cache poisoning | Content-addressable keys prevent wrong-content attacks on blobs; restrict cache write access to trusted CI builders |
| Data exfiltration | Authenticate cache access, use mTLS |
| Cache pollution | Verify cache entries match expected hash |
| Cache timing attacks | Use consistent cache lookup patterns |

Secure Remote Cache Configuration:

# Remote Build Execution configuration
# (illustrative schema; field names vary by remote cache or execution service)
remote_execution:
  instance_name: "projects/build-system/instances/default"

  # Authentication
  auth:
    type: "service_account"
    credential_file: "/var/run/secrets/build-cache-sa.json"

  # Security settings
  security:
    tls_enabled: true
    verify_outputs: true
    reject_unsigned_cache_entries: true

Independent Verification

Reproducible builds enable independent verification by rebuilders.

Verification Process:

  1. Obtain published artifact with claimed source reference
  2. Fetch source at that reference
  3. Rebuild using documented build process
  4. Compare output to published artifact
  5. Report match or discrepancy

Reproducible Builds Verification:

#!/bin/bash
# verify-build.sh
set -euo pipefail

# 1. Get published hash and artifact
#    (assumes the .sha256 file starts with the hex digest)
PUBLISHED_HASH=$(curl -fsSL https://releases.example.com/app-1.0.0.sha256 | cut -d' ' -f1)
curl -fsSLO https://releases.example.com/app-1.0.0.tar.gz
# Confirm the download itself matches the published hash
echo "$PUBLISHED_HASH  app-1.0.0.tar.gz" | sha256sum --check -

# 2. Fetch source at claimed version
git clone https://github.com/example/app.git
cd app
git checkout v1.0.0

# 3. Rebuild
bazel build //app:release --config=release

# 4. Compare
REBUILT_HASH=$(sha256sum bazel-bin/app/release.tar.gz | cut -d' ' -f1)

# 5. Report
if [[ "$PUBLISHED_HASH" == "$REBUILT_HASH" ]]; then
    echo "✓ Verification passed: Binary matches source"
else
    echo "✗ Verification failed: Binary does not match source"
    echo "  Published: $PUBLISHED_HASH"
    echo "  Rebuilt:   $REBUILT_HASH"
    exit 1
fi

Multi-Party Verification:

┌─────────────────────────────────────────────────────────────────┐
│               MULTI-PARTY VERIFICATION                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Publisher                                                       │
│  ├── Builds from source                                         │
│  ├── Publishes binary with hash: abc123                         │
│  └── Publishes provenance attestation                           │
│                                                                  │
│  Rebuilder 1 (independent)                                      │
│  ├── Fetches source                                             │
│  ├── Rebuilds independently                                     │
│  ├── Gets hash: abc123 ✓                                        │
│  └── Signs attestation "I rebuilt, matches"                     │
│                                                                  │
│  Rebuilder 2 (different org)                                    │
│  ├── Fetches source                                             │
│  ├── Rebuilds independently                                     │
│  ├── Gets hash: abc123 ✓                                        │
│  └── Signs attestation "I rebuilt, matches"                     │
│                                                                  │
│  Consumer                                                        │
│  ├── Sees: Publisher + 2 independent rebuilders agree           │
│  └── High confidence binary matches source                      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
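
The consumer's decision in the diagram reduces to a small policy check. The sketch below is one possible policy, assuming each attestation has already had its signature verified (that step is omitted) and that the type and function names are hypothetical: it requires a minimum number of distinct rebuilders to match the published hash and treats any mismatch as a failure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    rebuilder: str        # identity of the party that rebuilt
    artifact_sha256: str  # hash that party obtained


def verify_quorum(published_sha256, attestations, required=2):
    """Return (ok, agreeing, dissenting) for a set of rebuild attestations."""
    latest = {}
    for att in attestations:
        # One vote per rebuilder, so repeated attestations cannot inflate the count.
        latest[att.rebuilder] = att.artifact_sha256
    agreeing = sorted(r for r, h in latest.items() if h == published_sha256)
    dissenting = sorted(r for r, h in latest.items() if h != published_sha256)
    ok = len(agreeing) >= required and not dissenting
    return ok, agreeing, dissenting
```

Failing on any dissent is a deliberately strict choice: a single mismatch may only indicate residual non-determinism, but it is worth investigating before trusting the artifact.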

Diffoscope for Debugging:

When builds don't match, diffoscope helps identify why:

# Compare two builds to find differences
diffoscope build1/app.tar.gz build2/app.tar.gz --html report.html

# Output shows exactly what differs
# - Timestamps
# - Embedded paths
# - Random seeds
# - Build ordering
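
Before running diffoscope on a large output tree, it can help to narrow down which files differ at all. The following sketch, with hypothetical function names, hashes every file in two build output directories and lists the relative paths whose contents differ or that exist on only one side.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                digests[os.path.relpath(full, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def differing_files(build_a, build_b):
    """Return sorted relative paths that differ or exist in only one tree."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

An empty result means the two trees are byte-identical file by file; otherwise the returned paths are the ones worth passing to diffoscope.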

Gradual Adoption Path

Moving to hermetic builds is a journey, not a single step.

Adoption Phases:

Phase 1: Awareness and Baseline

## Phase 1: Assessment (Weeks 1-4)

### Goals
- Understand current build reproducibility
- Identify largest sources of non-determinism
- Select pilot project

### Actions
1. Build same commit twice, compare outputs
2. Use diffoscope to identify differences
3. Document current build dependencies
4. Choose build system for evaluation

Phase 2: Lock Dependencies

## Phase 2: Dependency Pinning (Weeks 5-8)

### Goals
- All dependencies version-locked
- Lock files committed to repository
- No floating version specifiers

### Actions
1. Generate lock files for all package managers
2. Implement lock file update process
3. CI fails if lock file doesn't match
4. Document dependency update procedure
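
The "no floating version specifiers" goal can be checked mechanically in CI. As a minimal sketch for npm manifests, the hypothetical helper below flags every dependency in a package.json whose specifier is not an exact version, which covers ranges such as `^` and `~`, wildcards, tags, and URLs.

```python
import json
import re

# Exact versions only: MAJOR.MINOR.PATCH with an optional pre-release or build suffix.
EXACT = re.compile(r"^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.+-]+)?$")


def floating_dependencies(package_json_text):
    """Return (section, name, specifier) for every non-exact dependency."""
    manifest = json.loads(package_json_text)
    findings = []
    for section in ("dependencies", "devDependencies", "optionalDependencies"):
        for name, spec in sorted(manifest.get(section, {}).items()):
            if not EXACT.match(spec):
                findings.append((section, name, spec))
    return findings
```

Exact versions in the manifest cover direct dependencies only; transitive dependencies are pinned by the lock file, so this check complements a committed lock file and does not replace it.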

Phase 3: Eliminate Non-Determinism

## Phase 3: Deterministic Builds (Weeks 9-16)

### Goals
- Same source produces same output
- No timestamps, random values, or paths in output
- Documented build environment

### Actions
1. Set SOURCE_DATE_EPOCH for timestamps
2. Sort file lists before processing
3. Use deterministic compiler flags
4. Normalize build paths
5. Verify with rebuild comparison

Phase 4: Implement Hermeticity

## Phase 4: Hermetic Builds (Weeks 17-24)

### Goals
- Builds execute in isolation
- No network access during build
- All inputs explicitly declared

### Actions
1. Pre-fetch all dependencies
2. Implement sandbox execution
3. Disable network in build environment
4. Verify no host environment leakage
5. Enable reproducibility verification

Adoption Checklist:

## Hermetic Build Adoption Checklist

### Dependencies
- [ ] All dependencies version-locked
- [ ] Lock files in version control
- [ ] No floating versions (^, ~, *)
- [ ] Pre-fetch mechanism implemented

### Determinism
- [ ] Timestamps eliminated or fixed
- [ ] File ordering deterministic
- [ ] No random values in output
- [ ] Build paths normalized

### Isolation
- [ ] Network disabled during build
- [ ] File system access restricted
- [ ] Environment variables controlled
- [ ] Toolchain versions pinned

### Verification
- [ ] Rebuild produces identical output
- [ ] Diffoscope finds no differences
- [ ] CI verifies reproducibility
- [ ] Independent verification possible

Recommendations

For Build Engineers:

  1. Start with dependency locking. Before pursuing full hermeticity, ensure all dependencies are pinned and locked. This provides immediate security benefit with lower effort.

  2. Use diffoscope to find non-determinism. Build twice, compare with diffoscope, fix differences one by one. Iterate until builds match.

  3. Select appropriate tooling. Bazel and Nix provide the strongest guarantees but have steep learning curves. Evaluate trade-offs for your context.

For Platform Engineers:

  1. Invest in remote caching. Hermetic builds can be slower without caching. Content-addressable remote caches provide security and performance.

  2. Build verification into pipelines. Automatically rebuild and compare as part of release process. Fail if builds don't reproduce.

  3. Document the build environment. Even without full reproducibility, documenting exact toolchain versions enables forensic reconstruction.

For Security Architects:

  1. Target SLSA Build L3, then go further. Use SLSA as a framework for improving build security. SLSA v1.0 tops out at Build L3 and does not require hermetic or reproducible builds, so treat them as additional hardening on top of it.

  2. Enable independent verification. Publish enough information (source refs, build instructions, environment specs) that anyone can verify builds.

  3. Plan for gradual adoption. Full hermeticity takes time. Define phases, measure progress, and celebrate incremental wins.

Hermetic and reproducible builds transform software supply chain security from "trust us" to "verify this." When any party can rebuild from source and verify that the output matches published binaries, attacks that tamper with the build become detectable. This doesn't make attacks impossible, but it makes them visible, and visibility is the foundation of defense.