12.1 Inventory: Knowing What You Have

You cannot secure what you cannot see. This fundamental principle underlies all supply chain security efforts, yet most organizations struggle with basic visibility into what software components actually run in their environments. Organizations significantly underestimate their open source usage: open source makes up roughly 76-78% of the code in a typical commercial codebase, often far more than management expects. The software they don't know about represents their greatest vulnerability.

This section establishes software inventory as the foundation for supply chain security. Before you can assess risk, patch vulnerabilities, or respond to incidents, you must first answer: what do we have?

The Challenge of Visibility

Modern software environments are extraordinarily complex. A typical enterprise has thousands of applications, each with hundreds of dependencies, running across diverse infrastructure—on-premises servers, multiple cloud providers, container orchestration platforms, serverless functions, and edge deployments.

The Visibility Gap:

Research reveals significant gaps between perceived and actual software inventory:

Organizations typically track only 40-50% of their open source components¹
The average Java application contains 148 open source dependencies², far exceeding most organizations' estimates
96% of commercial codebases contain open source, with open source comprising 78% of code on average³
91% of codebases contain components that are either more than four years out of date or have had no development activity in the last two years⁴

Why Visibility Is Difficult:

Several factors contribute to visibility challenges:

Transitive dependencies: Direct dependencies are trackable; their dependencies (and their dependencies) often aren't
Multiple ecosystems: A single application may include npm, Maven, pip, and container dependencies
Development velocity: New components are added faster than inventory processes can track
Decentralized decisions: Individual developers choose dependencies without central oversight
Infrastructure diversity: Different deployment targets (containers, VMs, serverless) require different discovery methods
Legacy systems: Older applications may lack modern dependency management

Organizations consistently underestimate their software inventory. First-time comprehensive scans typically reveal thousands of components with no formal records—dependencies that have been in production for years without visibility or governance.

Discovery Methods

Different discovery approaches provide different perspectives on software inventory, each with strengths and limitations.

Source Code Scanning:

Analyzes source repositories to identify declared dependencies.

How It Works:

- Parses manifest files (package.json, pom.xml, requirements.txt)
- Analyzes lockfiles for exact versions
- Scans for vendored dependencies copied into repositories
- Identifies imports and includes in source code

Strengths:

- Comprehensive view of declared dependencies
- Works before deployment
- Can identify dependencies not yet installed
- Provides version information

Limitations:

- Only sees what's declared, not what's actually running
- May miss dynamically loaded dependencies
- Requires repository access
- Point-in-time snapshot
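To make manifest parsing concrete, here is a minimal sketch in Python of reading declared dependencies from two of the manifest formats named above. The function names are illustrative, not taken from any scanner, and the requirements.txt handling only covers simple `name` plus version-specifier lines. Real scanners also handle lockfiles, includes, URLs, and environment markers.

```python
import json
import re


def parse_requirements(text: str) -> dict[str, str]:
    """Return {name: version_spec} for simple requirements.txt lines."""
    deps: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip options such as -r or -e
            continue
        match = re.match(r"^([A-Za-z0-9._-]+)\s*(.*)$", line)
        if match:
            spec = match.group(2).strip() or "unpinned"
            deps[match.group(1).lower()] = spec
    return deps


def parse_package_json(text: str) -> dict[str, str]:
    """Return {name: version_range} from a package.json document."""
    data = json.loads(text)
    deps: dict[str, str] = {}
    for section in ("dependencies", "devDependencies"):
        deps.update(data.get(section, {}))
    return deps


if __name__ == "__main__":
    print(parse_requirements("requests==2.31.0\nflask>=2.0  # web\nnumpy\n"))
    print(parse_package_json('{"dependencies": {"express": "^4.18.2"}}'))
```

Note that the output reflects only what the manifest declares. An entry reported as "unpinned" or as a range is resolved to a concrete version later, which is one reason source scanning alone gives only medium accuracy.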

Build-Time Analysis:

Examines the build process to identify what's actually compiled or bundled.

How It Works:

- Hooks into build systems (Maven, Gradle, webpack)
- Captures dependency resolution at build time
- Records what's actually included in build artifacts
- Can generate SBOMs as build output

Strengths:

- Reflects what's actually built
- Captures build-time dependency resolution
- More accurate than source-only scanning
- Integrates with CI/CD pipelines

Limitations:

- Requires build system integration
- Different for each build technology
- May miss runtime-downloaded content
- Not applicable to interpreted languages without a build step
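When a build emits an SBOM, the inventory step is mostly a matter of reading it. The sketch below assumes the build produced a CycloneDX JSON document and pulls out the top-level `components` array with its `name`, `version`, and `purl` fields. The function name and sample data are illustrative, and nested components are ignored for brevity.

```python
import json


def components_from_cyclonedx(sbom_json: str) -> list[tuple[str, str, str]]:
    """Return sorted (name, version, purl) rows from a CycloneDX JSON SBOM."""
    sbom = json.loads(sbom_json)
    rows = []
    for comp in sbom.get("components", []):
        rows.append(
            (comp.get("name", ""), comp.get("version", ""), comp.get("purl", ""))
        )
    return sorted(rows)


if __name__ == "__main__":
    # Hypothetical SBOM content, trimmed to the fields used here.
    sample = json.dumps({
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "components": [
            {"type": "library", "name": "log4j-core", "version": "2.14.1",
             "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"},
            {"type": "library", "name": "guava", "version": "32.1.2-jre",
             "purl": "pkg:maven/com.google.guava/guava@32.1.2-jre"},
        ],
    })
    for row in components_from_cyclonedx(sample):
        print(row)
```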

Runtime Detection:

Identifies components actually executing in production environments.

How It Works:

- Agents monitor running processes
- Analyzes loaded libraries and modules
- Inspects container contents at runtime
- Monitors network calls to package registries

Strengths:

- Shows what's actually running (ground truth)
- Captures dynamically loaded components
- Detects shadow IT and ungoverned software
- Reflects production reality

Limitations:

- Requires production access and agents
- Performance overhead concerns
- May not capture intermittently used components
- Deployment complexity
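As a small, Python-only analogue of what a runtime agent does, the sketch below asks the running interpreter which distributions are installed in its environment and which top-level modules it has actually loaded. It relies only on the standard library (`importlib.metadata` and `sys.modules`). The function names are illustrative, and a production agent would cover many runtimes and report to a central service rather than print.

```python
import sys
from importlib import metadata


def installed_distributions() -> dict[str, str]:
    """Map distribution name to version for the current environment."""
    found: dict[str, str] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with damaged metadata
            found[name.lower()] = dist.version
    return found


def loaded_top_level_modules() -> set[str]:
    """Return the top-level names of modules imported so far in this process."""
    return {name.split(".")[0] for name in sys.modules}


if __name__ == "__main__":
    installed = installed_distributions()
    loaded = loaded_top_level_modules()
    print(f"{len(installed)} distributions installed, {len(loaded)} top-level modules loaded")
```

The difference between the two views illustrates the intermittent-use limitation: a module only appears in `sys.modules` once something has imported it, so a snapshot taken early in a process's life can miss components that load later.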

Comparison Matrix:

Method	Coverage	Accuracy	Timing	Complexity
Source Scanning	Declared deps	Medium	Pre-deployment	Low
Build Analysis	Built deps	High	Build time	Medium
Runtime Detection	Running deps	Highest	Production	High

Recommended Approach:

Use multiple methods for comprehensive coverage:

Source scanning for early visibility and developer feedback
Build analysis for accurate artifact inventory
Runtime detection to verify production state and catch gaps
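Combining methods pays off when their results are compared. A minimal sketch, assuming each method has already been reduced to a set of `name@version` strings (an illustrative convention, not a standard required by any tool):

```python
def reconcile(declared: set[str], built: set[str], running: set[str]) -> dict[str, set[str]]:
    """Compare three inventory views and report where they disagree."""
    return {
        # Declared in source but absent from the artifact (unused or excluded).
        "declared_not_built": declared - built,
        # Pulled in during the build without a declaration (often transitive).
        "built_not_declared": built - declared,
        # Present in production but not in the build record (drift or shadow use).
        "running_not_built": running - built,
        # Confirmed by every method.
        "seen_by_all": declared & built & running,
    }


if __name__ == "__main__":
    report = reconcile(
        declared={"requests@2.31.0", "flask@3.0.0"},
        built={"requests@2.31.0", "flask@3.0.0", "urllib3@2.0.7"},
        running={"requests@2.31.0", "flask@3.0.0", "urllib3@2.0.7", "debugtool@0.1.0"},
    )
    for bucket, items in report.items():
        print(bucket, sorted(items))
```

The "running_not_built" bucket is usually the most interesting one to review first, since it contains components that no earlier stage accounted for.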

Shadow IT and Ungoverned Open Source

Shadow open source refers to open source components used without organizational awareness or approval—the software equivalent of shadow IT.

How Shadow Open Source Accumulates:

Developers add dependencies without formal review
Code copied from Stack Overflow brings library references with it
Proof-of-concept code with ungoverned dependencies reaches production
Acquired companies bring unknown component portfolios
Third-party contractors introduce untracked dependencies

The Scale of the Problem:

Studies suggest 30-40% of open source in enterprise codebases is ungoverned⁵
The average enterprise has thousands of instances of unapproved components
Most organizations have no visibility into development tools and CI/CD dependencies

Risks of Shadow Open Source:

Risk	Description
Unpatched vulnerabilities	Components not in inventory don't get updated
License violations	Unknown components may introduce compliance issues
Unsupported components	Abandoned projects used without awareness
Supply chain attacks	Malicious packages installed without review
Incident response delays	Can't respond to advisories for unknown components

Detection Strategies:

Comprehensive scanning: Scan all repositories, not just "official" ones
Runtime discovery: Identify what's actually running versus what's expected
Network monitoring: Detect package manager traffic from unexpected sources
Build system analysis: Capture dependencies resolved during builds
Developer surveys: Ask teams about tools and libraries they use
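Comprehensive scanning starts with finding every manifest, including those in directories nobody registered as a project. The following sketch walks a directory tree and lists well-known manifest files. The set of file names and the skipped directories are assumptions chosen for illustration and should be adjusted to the ecosystems in use.

```python
import os
from pathlib import Path

# Illustrative selection; extend for other ecosystems.
MANIFEST_NAMES = {"package.json", "pom.xml", "requirements.txt", "go.mod", "Cargo.toml"}
# Installed-package and VCS directories are skipped so only project manifests are reported.
SKIP_DIRS = {".git", "node_modules"}


def find_manifests(root: str) -> list[Path]:
    """Return sorted paths of manifest files found anywhere under root."""
    hits: list[Path] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name in MANIFEST_NAMES:
                hits.append(Path(dirpath) / name)
    return sorted(hits)


if __name__ == "__main__":
    for path in find_manifests("."):
        print(path)
```

Running this against a shared checkout area or a build server's workspace, rather than a curated list of repositories, is one inexpensive way to surface projects that never entered the official inventory.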

Container and Image Inventory

Container environments present distinct inventory challenges. Container images bundle application code with operating system components, creating layered dependencies that require specialized approaches.

Container-Specific Challenges:

Image sprawl: Organizations often have thousands of container images
Layer complexity: Base images contain OS packages; application layers add more
Tag mutability: The same tag can point to different images over time
Registry diversity: Images from Docker Hub, vendor registries, internal registries
Runtime dynamics: Orchestrators schedule containers dynamically
Ephemeral nature: Containers come and go; inventory changes constantly

What Container Inventory Must Track:

Image registry locations: Where images are stored and pulled from
Base image lineage: What base images are used and their contents
Layer contents: What's in each image layer
Running containers: What's actually deployed in each environment
Image digests: Immutable identifiers (not just mutable tags)
Deployment mapping: Which images run in which clusters/namespaces
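The reason to record digests alongside tags can be shown with a small data model. In this sketch the `RunningImage` record and its fields are hypothetical, as are the registry and digest values. The function flags any repository and tag pair that resolves to more than one digest across the deployments observed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningImage:
    """One observed deployment of an image (hypothetical record shape)."""
    cluster: str
    namespace: str
    repository: str
    tag: str
    digest: str  # immutable content identifier, e.g. "sha256:..."


def tags_with_multiple_digests(records: list[RunningImage]) -> dict[tuple[str, str], set[str]]:
    """Return {(repository, tag): digests} for tags that point at different images."""
    digests_by_tag: dict[tuple[str, str], set[str]] = defaultdict(set)
    for rec in records:
        digests_by_tag[(rec.repository, rec.tag)].add(rec.digest)
    return {key: ds for key, ds in digests_by_tag.items() if len(ds) > 1}


if __name__ == "__main__":
    observed = [
        RunningImage("prod", "web", "registry.example.com/shop/api", "v1", "sha256:aaa"),
        RunningImage("staging", "web", "registry.example.com/shop/api", "v1", "sha256:bbb"),
        RunningImage("prod", "db", "registry.example.com/shop/db", "v2", "sha256:ccc"),
    ]
    print(tags_with_multiple_digests(observed))
```

An inventory keyed only by tag would treat the two `api:v1` deployments as identical, even though they run different image contents.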

Container Inventory Approaches:

Registry Scanning:

- Scan images stored in registries
- Provides inventory of available images
- Can scan before deployment

Admission Control:

- Kubernetes admission controllers track what's deployed
- Provides real-time deployment inventory
- Can enforce policy at deployment

Runtime Agents:

- Agents in clusters track running containers
- Shows actual production state
- Monitors for drift and changes

Build Pipeline Integration:

- Generate inventory during image build
- Link images to source code and build provenance
- Create SBOMs as build artifacts

Tools for Container Inventory:

Tool	Type	Features
Trivy	Scanner	Image scanning, SBOM generation
Syft	Scanner	Multi-ecosystem SBOM creation
Anchore	Platform	Registry scanning, policy enforcement
Aqua Security	Platform	Full lifecycle container security
Sysdig	Runtime	Runtime visibility and detection

The Importance of Accurate, Up-to-Date Inventory

Inventory value depends entirely on accuracy and currency. Outdated or incomplete inventory creates false confidence, which can be worse than having no inventory at all.

Characteristics of Effective Inventory:

Complete: Covers all components across all environments
Accurate: Reflects actual state, not just declared state
Current: Updated as changes occur, not periodically
Detailed: Includes versions, locations, and context
Accessible: Available to those who need it for decision-making
Actionable: Linked to vulnerability data and policy

The Cost of Poor Inventory:

When Log4Shell (CVE-2021-44228) was disclosed in December 2021, organizations faced an urgent question: where is Log4j in our environment?⁶ Companies with accurate inventory responded in hours; those without spent days or weeks searching.

Many organizations discovered during the Log4Shell response that their actual Log4j usage exceeded documented inventory by 5-10x—a gap that significantly extended response times and increased risk exposure.⁷
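With an accurate inventory, the "where is Log4j?" question becomes a query. A minimal sketch, assuming inventory rows are dictionaries with `name`, `version`, and `location` keys (a hypothetical shape) and that the first fixed version is supplied by the caller from the advisory. The version comparison is deliberately simple and ignores pre-release ordering.

```python
import re


def version_key(version: str) -> tuple[int, ...]:
    """Turn '2.14.1' into (2, 14, 1); suffixes after '-' are ignored."""
    nums = [int(part) for part in re.findall(r"\d+", version.split("-")[0])]
    return tuple(nums + [0] * (3 - len(nums)))  # pad so '2.15' equals '2.15.0'


def find_affected(inventory: list[dict], component: str, fixed_in: str) -> list[dict]:
    """Return inventory rows for component with a version below fixed_in."""
    threshold = version_key(fixed_in)
    return [
        row for row in inventory
        if row["name"] == component and version_key(row["version"]) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical inventory rows.
    inventory = [
        {"name": "log4j-core", "version": "2.14.1", "location": "payments-service"},
        {"name": "log4j-core", "version": "2.17.1", "location": "search-service"},
        {"name": "slf4j-api", "version": "1.7.32", "location": "payments-service"},
    ]
    for row in find_affected(inventory, "log4j-core", fixed_in="2.15.0"):
        print(row["location"], row["version"])
```

The query is only as good as the rows behind it: a component missing from the inventory is silently absent from the result, which is how undocumented usage extends response time.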

Inventory Accuracy Metrics:

Track inventory quality over time:

Coverage: Percentage of known environments scanned
Currency: Average age of inventory data
Precision: Share of discovered components that are actually present (a low rate of false positives)
Recall: Share of ground-truth components that discovery finds (a low rate of missed components)
Consistency: Agreement between different discovery methods
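Three of these metrics can be computed directly from sets of component identifiers. The sketch below uses the standard definitions of precision and recall and, as one possible choice, measures consistency between two discovery methods as Jaccard similarity. The ground-truth set is assumed to come from a manually verified sample.

```python
def inventory_quality(discovered: set[str], ground_truth: set[str]) -> dict[str, float]:
    """Compute precision and recall of a discovery run against verified truth."""
    true_positives = len(discovered & ground_truth)
    precision = true_positives / len(discovered) if discovered else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return {"precision": precision, "recall": recall}


def consistency(method_a: set[str], method_b: set[str]) -> float:
    """Jaccard similarity: shared findings divided by all findings."""
    union = method_a | method_b
    return len(method_a & method_b) / len(union) if union else 1.0


if __name__ == "__main__":
    discovered = {"a", "b", "c", "d"}          # what the scanner reported
    ground_truth = {"a", "b", "c", "e", "f"}   # what a manual audit confirmed
    print(inventory_quality(discovered, ground_truth))
    print(consistency({"a", "b"}, {"b", "c"}))
```

In this example the scanner reports one component that is not really present and misses two that are, giving a precision of 3/4 and a recall of 3/5.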

Tooling for Continuous Discovery

Effective inventory requires automation—manual processes cannot keep pace with modern development velocity.

Tool Categories:

Software Composition Analysis (SCA):

- Snyk, Sonatype Nexus, Checkmarx SCA, Black Duck
- Primary function: dependency scanning and vulnerability identification
- Inventory is the foundation for security analysis

Asset Discovery:

- ServiceNow Discovery, Qualys Asset Inventory, Rapid7
- Primary function: IT asset management
- Includes software but often less depth on dependencies

Container Security Platforms:

- Aqua Security, Prisma Cloud, Sysdig Secure
- Primary function: container lifecycle security
- Deep container and Kubernetes inventory

SBOM Generation Tools:

- Syft, CycloneDX CLI, SPDX tools
- Primary function: create standardized component inventories
- Output feeds other security and inventory tools

Cloud Security Posture Management (CSPM):

- Wiz, Orca, Lacework
- Primary function: cloud security
- Includes software inventory across cloud workloads

Continuous Discovery Architecture:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Source Repos   │────▶│  CI/CD Pipeline │────▶│   Registries    │
│  (Source Scan)  │     │  (Build Scan)   │     │ (Registry Scan) │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Central Inventory Database                    │
│         (Aggregates from all sources, maintains history)         │
└─────────────────────────────────────────────────────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Vulnerability   │     │    Reporting    │     │     Policy      │
│   Correlation   │     │   & Analytics   │     │   Enforcement   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
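The central database in this architecture can be sketched as an append-only table of observations, one row per component sighting per source. The example below uses an in-memory SQLite database from the Python standard library. The table layout, column names, and sample rows are assumptions made for illustration, not a schema prescribed by any tool.

```python
import sqlite3


def open_inventory() -> sqlite3.Connection:
    """Create an in-memory inventory store; rows are appended to keep history."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE observations (
               component TEXT NOT NULL,
               version   TEXT NOT NULL,
               source    TEXT NOT NULL,   -- e.g. 'source', 'build', 'registry'
               location  TEXT NOT NULL,   -- repository, artifact, or image
               seen_at   TEXT NOT NULL    -- ISO 8601 timestamp
           )"""
    )
    return conn


def record(conn, component, version, source, location, seen_at):
    """Append one observation reported by a scanner."""
    conn.execute(
        "INSERT INTO observations VALUES (?, ?, ?, ?, ?)",
        (component, version, source, location, seen_at),
    )


def sources_per_component(conn) -> dict[tuple[str, str], set[str]]:
    """Return which scan sources have reported each component version."""
    seen: dict[tuple[str, str], set[str]] = {}
    query = "SELECT DISTINCT component, version, source FROM observations"
    for component, version, source in conn.execute(query):
        seen.setdefault((component, version), set()).add(source)
    return seen


if __name__ == "__main__":
    db = open_inventory()
    record(db, "log4j-core", "2.14.1", "source", "repo/payments", "2021-12-01T10:00:00Z")
    record(db, "log4j-core", "2.14.1", "build", "payments.jar", "2021-12-01T10:05:00Z")
    record(db, "guava", "32.1.2-jre", "registry", "shop/api@sha256:aaa", "2021-12-02T08:00:00Z")
    print(sources_per_component(db))
```

Because rows are only ever appended, the same table answers historical questions such as when a component first appeared, and downstream consumers (vulnerability correlation, reporting, policy) can read from it without coordinating with the scanners.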

Integration Points:

Source control hooks: Scan on commit and pull request
Build system plugins: Generate inventory during build
Registry webhooks: Scan images when pushed
Kubernetes operators: Monitor deployments in clusters
Runtime agents: Track what's actually executing
API aggregation: Consolidate data from multiple tools

Recommendations

For Security Practitioners:

Assess current visibility. Run discovery tools against all environments. Compare results to known inventory. The gap represents your blind spots.
Deploy multiple discovery methods. Source scanning alone is insufficient. Add build-time and runtime discovery for comprehensive coverage.
Prioritize container inventory. Container environments often have the largest visibility gaps. Deploy container-specific tooling.
Address shadow open source. Scan beyond official repositories. Discover ungoverned usage before it becomes a problem.

For Operations Teams:

Automate discovery. Manual inventory cannot keep pace. Integrate discovery into pipelines and runtime environments.
Track inventory metrics. Measure coverage, currency, and accuracy. Improve systematically.
Consolidate inventory data. Aggregate from multiple sources into a unified view. Eliminate silos.
Maintain history. Track inventory changes over time. Historical data supports incident response and compliance.

For Organizations:

Establish inventory requirements. Define what must be inventoried and to what level of detail.
Invest in tooling. Budget for comprehensive discovery tools. The cost is trivial compared to the cost of blind spots.
Connect inventory to SBOM. Inventory feeds SBOM generation (Section 12.2). Build the connection deliberately.
Make inventory actionable. Inventory without action is overhead. Connect to vulnerability management, compliance, and incident response.

Knowing what you have is the unglamorous foundation of supply chain security. Organizations that invest in comprehensive, accurate, continuous inventory build the visibility that makes all other security activities possible. Those that don't invest end up meeting every vulnerability advisory with the same question: "Do we even have that?" In a world where minutes matter during incident response, having to ask that question means the response is already late.

1. Various industry surveys suggest organizations maintain visibility into less than half their dependencies. See Sonatype State of the Software Supply Chain reports and Synopsys OSSRA reports.
2. Sonatype, "2024 State of the Software Supply Chain Report," 2024, https://www.sonatype.com/state-of-the-software-supply-chain/introduction
3. Synopsys, "2024 Open Source Security and Risk Analysis Report," 2024, https://www.synopsys.com/software-integrity/resources/analyst-reports/open-source-security-risk-analysis.html
4. Synopsys, "2020 Open Source Security and Risk Analysis Report," 2020, https://www.synopsys.com/software-integrity/resources/analyst-reports/open-source-security-risk-analysis.html
5. Tidelift and The New Stack, "2021 Managed Open Source Survey," 2021, https://tidelift.com/subscription/managed-open-source-survey
6. CISA, "Apache Log4j Vulnerability Guidance," December 2021, https://www.cisa.gov/uscert/apache-log4j-vulnerability-guidance
7. Sonatype, "Log4j Vulnerability Resource Center," 2022, https://www.sonatype.com/resources/log4j-vulnerability-resource-center