9.6 Infrastructure-as-Code Supply Chains¶

Infrastructure-as-Code (IaC) has transformed how organizations provision and manage infrastructure. Terraform, Ansible, Helm, and similar tools enable version-controlled, repeatable infrastructure deployments. But these tools operate through modules, providers, roles, and charts—dependencies that carry supply chain risk just like application packages. The difference is stakes: a compromised npm package might steal credentials; a compromised Terraform module can provision backdoored infrastructure, create unauthorized access, or exfiltrate secrets from the provisioning process itself.

IaC supply chains deserve the same scrutiny as application dependencies, yet often receive less attention because infrastructure teams may not think of modules as "software dependencies" in the traditional sense.

Terraform Modules and Providers¶

Terraform dominates the IaC landscape and is widely adopted by organizations practicing infrastructure automation. Its supply chain consists of two primary components:

Providers:

Terraform providers are plugins that interface with APIs—AWS, Azure, GCP, Kubernetes, and hundreds of other services. Providers translate Terraform configuration into API calls.

Official providers: Maintained by HashiCorp (AWS, Azure, GCP, Kubernetes)
Partner providers: Maintained by technology partners with HashiCorp verification
Community providers: Maintained by community members without formal verification

Providers execute during terraform apply with access to:

Cloud credentials in environment variables or configuration
State files containing sensitive infrastructure details
The network context of wherever Terraform runs

A malicious provider could exfiltrate credentials, modify infrastructure beyond declared configuration, or establish persistent access.

Modules:

Terraform modules package reusable infrastructure configurations. Rather than writing VPC configuration from scratch, teams reference community or internal modules:

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"

  # configuration...
}

The Terraform Registry hosts thousands of modules:

Thousands of modules are available in the Terraform Registry
The most popular modules have millions of downloads
No formal security review process exists for public modules

Registry Trust Model:

The Terraform Registry operates similarly to npm or PyPI:

Anyone can publish modules with a GitHub account
No mandatory security review
Namespace is tied to GitHub username/organization
Verification badges exist for HashiCorp partners but don't indicate security review

Supply Chain Risks:

Module source substitution: If a module reference uses flexible versioning or unverified sources, attackers can inject malicious versions
Provider impersonation: Typosquatting on provider names (aws vs awss)
Embedded credentials: Modules may contain hardcoded credentials or secrets
Excessive permissions: Modules may create resources with overly permissive IAM policies
Remote execution: Terraform's local-exec and remote-exec provisioners can run arbitrary commands

Version Pinning:

Unlike application dependencies, Terraform historically had weak version pinning:

# DANGEROUS: Uses latest version
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
}

# BETTER: Pins to specific version
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"
}

# BEST: Also verifies checksums via lock file
# terraform.lock.hcl maintains hashes

Terraform 0.14 introduced .terraform.lock.hcl for provider checksums, providing npm-lockfile-equivalent verification.

Ansible Roles and Galaxy¶

Ansible automates configuration management through playbooks that reference roles—reusable configuration packages. Ansible Galaxy serves as the public repository for community roles.

Galaxy Security Model:

Ansible Galaxy has historically offered minimal security assurances:

No code review for published roles
Limited verification of publisher identity
Namespace based on Galaxy accounts, not strongly tied to identity
No mandatory signing or checksum verification

Role Execution Context:

Ansible roles execute on target systems, typically with elevated privileges:

SSH access to target machines
Often running as root or with sudo
Access to inventory variables containing credentials
Network access from target systems

A malicious role can:

Install backdoors on managed systems
Exfiltrate secrets from inventory or variables
Modify configurations beyond declared scope
Establish persistence mechanisms

Ansible Collections:

Red Hat has promoted Ansible Collections as the modern distribution format, with improved namespacing and the possibility of signature verification. Collections from ansible.builtin and certified collections receive more scrutiny than Galaxy-sourced community content.

Version Pinning:

# requirements.yml
roles:
  - name: geerlingguy.docker
    version: 6.1.0  # Pin specific version

collections:
  - name: community.general
    version: 7.0.0

Without explicit version pinning, ansible-galaxy install fetches the latest version—creating the same update-as-attack-vector risk seen in other ecosystems.

Helm Charts and Kubernetes Operators¶

Kubernetes deployments commonly use Helm charts to package applications and operators to manage complex stateful workloads. Both introduce supply chain considerations.

Helm Charts:

Helm charts contain:

Kubernetes manifests (Deployments, Services, ConfigMaps)
Values files with configuration defaults
Templates with Go templating logic
Hooks that execute during install/upgrade

What to Verify:

Container images: Charts reference images that may come from untrusted registries
Security contexts: Whether pods run as root, with privileged access
RBAC definitions: What permissions the chart requests
NetworkPolicies: Whether the chart restricts network access appropriately
Hooks and jobs: What executes during chart lifecycle events

Chart Repositories:

Charts are distributed through repositories:

Artifact Hub: Aggregates charts from many sources
Bitnami: Popular chart repository with regular updates
Organization-specific registries: Internal Helm repositories

No central authority reviews charts for security. Artifact Hub provides metadata but not security verification.

Helm Chart Signing:

Helm supports provenance files (.prov) for signature verification:

helm verify mychart-1.0.0.tgz

In practice, few charts are signed, and few organizations verify signatures.

Kubernetes Operators:

Operators extend Kubernetes with custom controllers. They run with cluster-level permissions and can:

Create, modify, and delete any Kubernetes resources
Access secrets across namespaces
Execute arbitrary code in the cluster

Operators are typically installed via:

Helm charts (inheriting Helm supply chain risks)
Operator Lifecycle Manager (OLM) with OperatorHub
Direct manifest application

OperatorHub provides some curation for "certified" operators, but community operators have minimal review.

Policy-as-Code Supply Chains¶

Policy-as-code tools like Open Policy Agent (OPA), Kyverno, and Conftest enforce security and compliance policies. These tools have their own supply chains:

OPA and Rego Policies:

OPA policies written in Rego can be:

Bundled and distributed as policy bundles
Referenced from remote locations
Included from shared libraries

A compromised policy could:

Approve resources that should be denied
Deny legitimate resources (availability impact)
Leak information through policy evaluation side effects

Kyverno Policies:

Kyverno policies are Kubernetes resources that can be packaged and distributed. The Kyverno Policy Library provides community policies—useful but unverified.

Policy Supply Chain Risks:

Malicious policy approval: Policies that permit insecure configurations
Policy bypass: Subtle logic errors (intentional or not) that allow circumvention
Policy denial of service: Policies that block legitimate operations

Verification Approaches:

Test policies against known-good and known-bad configurations
Peer review policy changes like code changes
Use policy testing frameworks (OPA's testing, Kyverno's policy testing)

GitOps Security Considerations¶

GitOps uses Git repositories as the source of truth for infrastructure. Tools like Argo CD and Flux continuously reconcile cluster state with repository contents. This creates a direct path from repository to infrastructure.

The Repository as Attack Vector:

In GitOps:

Repository content directly determines infrastructure state
Push access to the repository equals infrastructure access
Repository compromise equals infrastructure compromise

This is simultaneously a security benefit (all changes are version-controlled and auditable) and a concentration of risk (the repository becomes a critical target).

GitOps Supply Chain Risks:

Repository access control: Over-permissive access grants infrastructure control to unintended parties
Branch protection: Without proper protection, unauthorized changes can reach production
External references: Charts, modules, or images referenced in GitOps repos have all their native supply chain risks
Secrets management: GitOps repos may contain or reference secrets (encrypted or via external secret operators)
Sync automation: Continuous sync means malicious changes deploy automatically

GitOps Security Practices:

Strict branch protection on deployment branches
Required reviews for infrastructure changes
Signed commits for audit trail
Separation between application and infrastructure repos
Secret management through external systems (HashiCorp Vault, AWS Secrets Manager)

Parallels to Application Dependencies¶

IaC supply chains mirror application package manager risks:

Aspect	Application (npm/PyPI)	IaC (Terraform/Ansible)
Central registry	npm, PyPI	Terraform Registry, Galaxy
Dependency declaration	package.json, requirements.txt	modules blocks, requirements.yml
Version pinning	Lockfiles	.terraform.lock.hcl, versions
Account compromise	Publish malicious packages	Publish malicious modules
Typosquatting	Misspelled packages	Misspelled modules/providers
Transitive deps	Deep dependency trees	Nested modules
Execution context	Application runtime	Infrastructure provisioning

Key Differences:

Impact scope: IaC dependencies affect infrastructure, not just applications
Privilege level: IaC typically runs with cloud admin or root access
Execution frequency: IaC runs during provisioning, not continuously
Visibility: IaC supply chains may be less monitored than application deps

Vetting and Pinning IaC Dependencies¶

Vetting Practices:

Before adopting IaC modules:

Review source code: Examine what the module actually does, not just what it claims
Check maintenance status: Active maintenance, responsive to issues, recent updates
Assess publisher reputation: Is this from a known, trusted source?
Evaluate permissions: What access does the module require? Is it justified?
Scan for secrets: Check for hardcoded credentials or secrets
Test in isolation: Run in a sandbox environment before production

Version Pinning:

Always pin IaC dependencies to specific versions:

# Terraform
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.31.0"  # Exact version
    }
  }
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"  # Exact version
}

# Ansible requirements.yml
collections:
  - name: amazon.aws
    version: 6.5.0  # Exact version

Checksum Verification:

Where supported, verify checksums:

Terraform lock files (.terraform.lock.hcl) contain provider checksums
Helm chart provenance files enable signature verification
Container image digests (rather than tags) ensure specific content

Recommendations¶

For Platform Engineers:

Treat IaC modules as code dependencies. Apply the same scrutiny to Terraform modules as to npm packages. They're both software dependencies.
Pin all versions explicitly. Never rely on implicit "latest" versions. Use exact version constraints and lock files.
Maintain an internal module registry. For critical infrastructure, curate approved modules in an internal registry rather than pulling directly from public sources.
Scan IaC for security issues. Use tools like Checkov, tfsec, or Terrascan to analyze Terraform; ansible-lint and molecule for Ansible.
Review module updates before applying. When updating modules, review changes rather than blindly accepting new versions.

For Security Teams:

Audit IaC supply chains. Include Terraform modules, Ansible roles, and Helm charts in dependency inventories.
Monitor for suspicious module activity. Track module publication dates, ownership changes, and unusual update patterns.
Implement policy-as-code carefully. Policies themselves are code dependencies. Test and review policies before deployment.
Secure GitOps repositories. Treat deployment repos as infrastructure admin access. Implement strict access controls and branch protection.
Limit provisioning credentials. Apply least privilege to credentials used during IaC execution. Segment by environment.

For Organizations Adopting GitOps:

Implement strong repository access controls. Push access to GitOps repos equals infrastructure access.
Require signed commits. Ensure audit trails for all infrastructure changes.
Separate concerns. Use different repositories or branches for different environments and criticality levels.
Externalize secrets. Don't store secrets in GitOps repos, even encrypted. Use external secret management.
Monitor sync operations. Alert on unexpected or unauthorized sync events.

Infrastructure-as-Code supply chains represent a convergence of software dependency risks with infrastructure-level impact. The same patterns—untrusted registries, version flexibility, transitive dependencies—that create risk in npm or PyPI create amplified risk when the "dependency" provisions cloud infrastructure or configures production servers. Organizations that have matured their application supply chain practices should extend those practices to their IaC ecosystems before supply chain attackers recognize the opportunity that infrastructure modules present.

Application vs Infrastructure-as-Code dependency risks