12.1 Inventory: Knowing What You Have
You cannot secure what you cannot see. This fundamental principle underlies all supply chain security efforts, yet most organizations struggle with basic visibility into what software components actually run in their environments. Organizations significantly underestimate their open source usage: open source makes up 76-78% of the average codebase, often far more than management expects, and the software they do not know about is their greatest vulnerability.
This section establishes software inventory as the foundation for supply chain security. Before you can assess risk, patch vulnerabilities, or respond to incidents, you must first answer: what do we have?
The Challenge of Visibility
Modern software environments are extraordinarily complex. A typical enterprise has thousands of applications, each with hundreds of dependencies, running across diverse infrastructure—on-premises servers, multiple cloud providers, container orchestration platforms, serverless functions, and edge deployments.
The Visibility Gap:
Research reveals significant gaps between perceived and actual software inventory:
- Organizations typically track only 40-50% of their open source components [1]
- The average Java application contains 148 open source dependencies [2], far exceeding most organizations' estimates
- 96% of commercial codebases contain open source, with open source comprising 78% of code on average [3]
- 91% of codebases contain components that are more than four years out of date or have had no development activity in the last two years [4]
Why Visibility Is Difficult:
Several factors contribute to visibility challenges:
- Transitive dependencies: Direct dependencies are trackable; their dependencies (and their dependencies) often aren't
- Multiple ecosystems: A single application may include npm, Maven, pip, and container dependencies
- Development velocity: New components are added faster than inventory processes can track
- Decentralized decisions: Individual developers choose dependencies without central oversight
- Infrastructure diversity: Different deployment targets (containers, VMs, serverless) require different discovery methods
- Legacy systems: Older applications may lack modern dependency management
Organizations consistently underestimate their software inventory. First-time comprehensive scans typically reveal thousands of components with no formal records—dependencies that have been in production for years without visibility or governance.
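To make the transitive-dependency gap concrete, here is a minimal Python sketch. The dependency graph, the package names, and the `transitive_dependencies` helper are hypothetical illustrations; a real tool would read the graph from a lockfile or from the package manager's resolver.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it depends on directly.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "http-client"],
    "web-framework": ["template-engine", "http-client"],
    "http-client": ["tls-lib", "url-parser"],
    "template-engine": ["url-parser"],
    "tls-lib": [],
    "url-parser": [],
}


def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        package = queue.popleft()
        if package in seen:
            continue  # already visited; also protects against cycles
        seen.add(package)
        queue.extend(graph.get(package, []))
    seen.discard(root)  # a cycle back to the root should not count
    return seen


direct = set(DEPENDENCY_GRAPH["my-app"])
everything = transitive_dependencies(DEPENDENCY_GRAPH, "my-app")
indirect_only = everything - direct

print(f"direct: {len(direct)}, total: {len(everything)}")
print("indirect only:", sorted(indirect_only))
```

Even in this toy graph, more than half of the components never appear in the application's own manifest, which is the pattern that makes manual tracking unreliable.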
Discovery Methods
Different discovery approaches provide different perspectives on software inventory, each with strengths and limitations.
Source Code Scanning:
Analyzes source repositories to identify declared dependencies.
How It Works:
- Parses manifest files (package.json, pom.xml, requirements.txt)
- Analyzes lockfiles for exact versions
- Scans for vendored dependencies copied into repositories
- Identifies imports and includes in source code
Strengths:
- Comprehensive view of declared dependencies
- Works before deployment
- Can identify dependencies not yet installed
- Provides version information
Limitations:
- Only sees what's declared, not what's actually running
- May miss dynamically loaded dependencies
- Requires repository access
- Point-in-time snapshot
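As a rough illustration of manifest parsing, the sketch below reads two of the manifest formats named above. The function names, the sample file contents, and the simplified regular expressions are assumptions made for this example; production scanners handle many more specifier forms and also read lockfiles for exact versions.

```python
import json
import re

# Simplified patterns: a package name, optionally followed by an exact "==" pin.
NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")
PINNED = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;]+)")


def parse_requirements(text):
    """Extract (ecosystem, name, version) tuples from requirements.txt content."""
    found = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        pinned = PINNED.match(line)
        if pinned:
            found.append(("pypi", pinned.group(1).lower(), pinned.group(2)))
            continue
        name = NAME.match(line)
        if name:
            # Declared without an exact pin: the version is unknown at this stage.
            found.append(("pypi", name.group(0).lower(), None))
        # Blank lines and option lines such as "-r other.txt" are skipped.
    return found


def parse_package_json(text):
    """Extract declared npm dependencies; versions are ranges, not resolved."""
    data = json.loads(text)
    found = []
    for section in ("dependencies", "devDependencies"):
        for name, spec in data.get(section, {}).items():
            found.append(("npm", name, spec))
    return found


# Hypothetical manifest contents for demonstration.
REQUIREMENTS_TXT = "requests==2.31.0\nflask>=2.0  # web\n\n-r extra.txt\n"
PACKAGE_JSON = (
    '{"name": "demo", "dependencies": {"express": "^4.18.2"},'
    ' "devDependencies": {"jest": "~29.7.0"}}'
)

declared = parse_requirements(REQUIREMENTS_TXT) + parse_package_json(PACKAGE_JSON)
for entry in declared:
    print(entry)
```

Note that `flask` comes back with no version and `express` with a range rather than a resolved version, which is exactly the "declared, not running" limitation listed above.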
Build-Time Analysis:
Examines the build process to identify what's actually compiled or bundled.
How It Works:
- Hooks into build systems (Maven, Gradle, webpack)
- Captures dependency resolution at build time
- Records what's actually included in build artifacts
- Can generate SBOMs as build output
Strengths:
- Reflects what's actually built
- Captures build-time dependency resolution
- More accurate than source-only scanning
- Integrates with CI/CD pipelines
Limitations:
- Requires build system integration
- Different for each build technology
- May miss runtime-downloaded content
- Not applicable to interpreted languages without a build step
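The "SBOM as build output" idea can be sketched as follows. This is a minimal illustration that assumes the build has already produced a list of resolved `(ecosystem, name, version)` tuples; `build_sbom` and `to_purl` are hypothetical helpers. The top-level field names follow the CycloneDX JSON format, but the output here is not validated against the CycloneDX schema, and the package URL construction skips the percent-encoding that names such as scoped npm packages require.

```python
import json


def to_purl(ecosystem, name, version):
    """Build a simple package URL; assumes the name needs no percent-encoding."""
    return f"pkg:{ecosystem}/{name}@{version}"


def build_sbom(resolved):
    """Turn resolved (ecosystem, name, version) tuples into a CycloneDX-shaped dict."""
    components = [
        {
            "type": "library",
            "name": name,
            "version": version,
            "purl": to_purl(ecosystem, name, version),
        }
        for ecosystem, name, version in sorted(resolved)
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }


# Hypothetical resolution result captured from a build.
resolved = [("pypi", "requests", "2.31.0"), ("npm", "express", "4.18.2")]
sbom = build_sbom(resolved)
print(json.dumps(sbom, indent=2))
```

In practice a build plugin or a tool such as Syft would produce this document; the point of the sketch is only that the build step is where exact versions become known.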
Runtime Detection:
Identifies components actually executing in production environments.
How It Works:
- Agents monitor running processes
- Analyzes loaded libraries and modules
- Inspects container contents at runtime
- Monitors network calls to package registries
Strengths:
- Shows what's actually running (ground truth)
- Captures dynamically loaded components
- Detects shadow IT and ungoverned software
- Reflects production reality
Limitations:
- Requires production access and agents
- Performance overhead concerns
- May not capture intermittently used components
- Deployment complexity
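For a single Python process, the "loaded libraries and modules" step can be approximated with the standard library alone. The sketch below is an assumption-laden illustration, not how commercial agents work: it only sees the current interpreter, and the two helper names are hypothetical.

```python
import sys
from importlib import metadata


def installed_distributions():
    """Map distribution name -> version for the current Python environment."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            result[name.lower()] = dist.version
    return result


def loaded_top_level_modules():
    """Top-level module names that this process has actually imported."""
    return {name.split(".")[0] for name in sys.modules}


installed = installed_distributions()
loaded = loaded_top_level_modules()
print(f"{len(installed)} distributions installed, {len(loaded)} top-level modules loaded")
```

The difference between the two sets mirrors the distinction drawn above: what is installed is not necessarily what is executing, and a module imported only on a rare code path will not appear until that path runs.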
Comparison Matrix:
| Method | Coverage | Accuracy | Timing | Complexity |
|---|---|---|---|---|
| Source Scanning | Declared deps | Medium | Pre-deployment | Low |
| Build Analysis | Built deps | High | Build time | Medium |
| Runtime Detection | Running deps | Highest | Production | High |
Recommended Approach:
Use multiple methods for comprehensive coverage:
- Source scanning for early visibility and developer feedback
- Build analysis for accurate artifact inventory
- Runtime detection to verify production state and catch gaps
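Combining the three methods amounts to set comparison. The following sketch assumes each method yields a set of `(name, version)` pairs; the `reconcile` function, the category names, and the sample data are hypothetical.

```python
def reconcile(source, build, runtime):
    """Classify components by which discovery methods observed them."""
    return {
        "confirmed": source & build & runtime,
        "built_not_declared": build - source,       # often transitive dependencies
        "declared_not_built": source - build,       # often dev or test tooling
        "declared_not_running": (source | build) - runtime,
        "running_not_declared": runtime - (source | build),  # candidates for review
    }


# Hypothetical results from each discovery method.
source = {("requests", "2.31.0"), ("flask", "3.0.0"), ("pytest", "8.0.0")}
build = {("requests", "2.31.0"), ("flask", "3.0.0"), ("werkzeug", "3.0.1")}
runtime = {
    ("requests", "2.31.0"),
    ("flask", "3.0.0"),
    ("werkzeug", "3.0.1"),
    ("debug-helper", "0.1.0"),
}

report = reconcile(source, build, runtime)
for category, components in report.items():
    print(category, sorted(components))
```

The `running_not_declared` bucket is the one that no single pre-deployment method can produce, which is the argument for layering the methods.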
Shadow IT and Ungoverned Open Source
Shadow open source refers to open source components used without organizational awareness or approval—the software equivalent of shadow IT.
How Shadow Open Source Accumulates:
- Developers add dependencies without formal review
- Copy-paste from Stack Overflow includes library references
- Proof-of-concept code with ungoverned dependencies reaches production
- Acquired companies bring unknown component portfolios
- Third-party contractors introduce untracked dependencies
The Scale of the Problem:
- Studies suggest 30-40% of open source in enterprise codebases is ungoverned [5]
- The average enterprise has thousands of instances of unapproved components
- Most organizations have no visibility into development tools and CI/CD dependencies
Risks of Shadow Open Source:
| Risk | Description |
|---|---|
| Unpatched vulnerabilities | Components not in inventory don't get updated |
| License violations | Unknown components may introduce compliance issues |
| Unsupported components | Abandoned projects used without awareness |
| Supply chain attacks | Malicious packages installed without review |
| Incident response delays | Can't respond to advisories for unknown components |
Detection Strategies:
- Comprehensive scanning: Scan all repositories, not just "official" ones
- Runtime discovery: Identify what's actually running versus what's expected
- Network monitoring: Detect package manager traffic from unexpected sources
- Build system analysis: Capture dependencies resolved during builds
- Developer surveys: Ask teams about tools and libraries they use
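A first step toward comprehensive scanning is simply finding every manifest on disk, including those in experimental or unofficial directories. The sketch below is a minimal illustration; the manifest list, the skipped directory names, and the `find_manifests` helper are assumptions chosen for the example.

```python
import os
import tempfile

# Assumed set of file names that signal a dependency manifest.
MANIFEST_NAMES = {
    "package.json", "package-lock.json", "pom.xml",
    "requirements.txt", "go.mod", "Cargo.toml", "Dockerfile",
}
# Directories skipped to keep the walk fast; vendored code needs its own scan.
SKIP_DIRS = {".git", "node_modules", ".venv"}


def find_manifests(root):
    """Return manifest paths under root, relative to root and sorted."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]  # prune in place
        for filename in filenames:
            if filename in MANIFEST_NAMES:
                full_path = os.path.join(dirpath, filename)
                hits.append(os.path.relpath(full_path, root))
    return sorted(hits)


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    for rel in (
        "official/app/package.json",
        "experiments/poc/requirements.txt",
        "experiments/poc/notes.md",
    ):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    manifests = find_manifests(root)

print(manifests)
```

The manifest under `experiments/poc` is the kind of file an "official repositories only" scan would never reach.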
Container and Image Inventory
Container environments present distinct inventory challenges. Container images bundle application code with operating system components, creating layered dependencies that require specialized approaches.
Container-Specific Challenges:
- Image sprawl: Organizations often have thousands of container images
- Layer complexity: Base images contain OS packages; application layers add more
- Tag mutability: The same tag can point to different images over time
- Registry diversity: Images from Docker Hub, vendor registries, internal registries
- Runtime dynamics: Orchestrators schedule containers dynamically
- Ephemeral nature: Containers come and go; inventory changes constantly
What Container Inventory Must Track:
- Image registry locations: Where images are stored and pulled from
- Base image lineage: What base images are used and their contents
- Layer contents: What's in each image layer
- Running containers: What's actually deployed in each environment
- Image digests: Immutable identifiers (not just mutable tags)
- Deployment mapping: Which images run in which clusters/namespaces
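The tracked fields above map naturally onto a small record type, and tracking digests alongside tags makes tag mutability detectable. In this sketch the `ImageObservation` class, the `find_tag_drift` helper, the registry host, and the shortened digests are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageObservation:
    """One sighting of a container image in a cluster."""
    registry: str
    repository: str
    tag: str          # mutable label
    digest: str       # immutable content identifier (shortened placeholder here)
    cluster: str
    namespace: str
    base_image: str

    @property
    def tagged_ref(self):
        return f"{self.registry}/{self.repository}:{self.tag}"

    @property
    def pinned_ref(self):
        return f"{self.registry}/{self.repository}@{self.digest}"


def find_tag_drift(observations):
    """Return tagged references that resolve to more than one digest."""
    digests_by_tag = {}
    for obs in observations:
        digests_by_tag.setdefault(obs.tagged_ref, set()).add(obs.digest)
    return {ref: digests for ref, digests in digests_by_tag.items() if len(digests) > 1}


observations = [
    ImageObservation("registry.example.com", "shop/api", "latest",
                     "sha256:aaa111", "prod-east", "shop", "python:3.12-slim"),
    ImageObservation("registry.example.com", "shop/api", "latest",
                     "sha256:bbb222", "prod-west", "shop", "python:3.12-slim"),
    ImageObservation("registry.example.com", "shop/worker", "1.4.0",
                     "sha256:ccc333", "prod-east", "shop", "python:3.12-slim"),
]

drift = find_tag_drift(observations)
for ref, digests in drift.items():
    print(ref, "->", sorted(digests))
```

Here two clusters run "the same" `latest` tag but different image contents, a discrepancy that a tag-only inventory would report as a single image.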
Container Inventory Approaches:
Registry Scanning:
- Scan images stored in registries
- Provides inventory of available images
- Can scan before deployment
Admission Control:
- Kubernetes admission controllers track what's deployed
- Provides real-time deployment inventory
- Can enforce policy at deployment
Runtime Agents:
- Agents in clusters track running containers
- Shows actual production state
- Monitors for drift and changes
Build Pipeline Integration:
- Generate inventory during image build
- Link images to source code and build provenance
- Create SBOMs as build artifacts
Tools for Container Inventory:
| Tool | Type | Features |
|---|---|---|
| Trivy | Scanner | Image scanning, SBOM generation |
| Syft | Scanner | Multi-ecosystem SBOM creation |
| Anchore | Platform | Registry scanning, policy enforcement |
| Aqua Security | Platform | Full lifecycle container security |
| Sysdig | Runtime | Runtime visibility and detection |
The Importance of Accurate, Up-to-Date Inventory
Inventory value depends entirely on accuracy and currency. Outdated or incomplete inventory creates false confidence—worse than having no inventory at all.
Characteristics of Effective Inventory:
- Complete: Covers all components across all environments
- Accurate: Reflects actual state, not just declared state
- Current: Updated as changes occur, not periodically
- Detailed: Includes versions, locations, and context
- Accessible: Available to those who need it for decision-making
- Actionable: Linked to vulnerability data and policy
The Cost of Poor Inventory:
When Log4Shell (CVE-2021-44228) was disclosed in December 2021, organizations faced an urgent question: where is Log4j in our environment? [6] Companies with accurate inventory responded in hours; those without spent days or weeks searching.
Many organizations discovered during the Log4Shell response that their actual Log4j usage exceeded documented inventory by 5-10x, a gap that significantly extended response times and increased risk exposure. [7]
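With an inventory in place, the Log4Shell question becomes a query. The sketch below assumes a flat list of inventory records with hypothetical application names; `version_key` and `find_affected` are illustrative helpers. The range check is deliberately simplified: it treats pre-release suffixes as the base version and ignores backported fix releases, so it errs toward flagging too much.

```python
import re


def version_key(version):
    """Turn '2.14.1' into (2, 14, 1); anything after a '-' is ignored."""
    return tuple(int(part) for part in re.findall(r"\d+", version.split("-")[0]))


def find_affected(inventory, component, first_bad, last_bad):
    """Return records for component whose version lies in [first_bad, last_bad]."""
    low, high = version_key(first_bad), version_key(last_bad)
    return [
        record for record in inventory
        if record["component"] == component
        and low <= version_key(record["version"]) <= high
    ]


# Hypothetical inventory records.
INVENTORY = [
    {"component": "log4j-core", "version": "2.14.1",
     "application": "billing-api", "environment": "prod"},
    {"component": "log4j-core", "version": "2.17.1",
     "application": "search", "environment": "prod"},
    {"component": "log4j-core", "version": "2.11.0",
     "application": "batch-reports", "environment": "staging"},
    {"component": "logback-classic", "version": "1.2.11",
     "application": "gateway", "environment": "prod"},
]

# CVE-2021-44228 affected log4j-core from 2.0-beta9 through 2.14.1.
affected = find_affected(INVENTORY, "log4j-core", "2.0", "2.14.1")
for record in affected:
    print(record["application"], record["environment"], record["version"])
```

The query itself is trivial; the hours-versus-weeks difference described above comes entirely from whether the records exist and are complete.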
Inventory Accuracy Metrics:
Track inventory quality over time:
- Coverage: Percentage of known environments scanned
- Currency: Average age of inventory data
- Precision: Share of reported components that are actually present (a low false-positive rate)
- Recall: Share of components actually present (ground truth) that the inventory discovered
- Consistency: Agreement between different discovery methods
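These metrics reduce to simple ratios once the inputs are sets. The sketch below is one possible formulation, not a standard: the function names are hypothetical, consistency is computed as Jaccard similarity between two methods' results, and the sample data is invented.

```python
from datetime import datetime, timezone


def precision(discovered, ground_truth):
    """Fraction of reported components that are really present."""
    return len(discovered & ground_truth) / len(discovered) if discovered else 0.0


def recall(discovered, ground_truth):
    """Fraction of really present components that were reported."""
    return len(discovered & ground_truth) / len(ground_truth) if ground_truth else 0.0


def consistency(method_a, method_b):
    """Jaccard similarity between the results of two discovery methods."""
    union = method_a | method_b
    return len(method_a & method_b) / len(union) if union else 1.0


def coverage(scanned_environments, known_environments):
    """Fraction of known environments that were scanned."""
    if not known_environments:
        return 0.0
    return len(scanned_environments & known_environments) / len(known_environments)


def average_age_days(last_scanned, now):
    """Mean age of inventory data in days (the 'currency' metric)."""
    ages = [(now - ts).total_seconds() / 86400 for ts in last_scanned]
    return sum(ages) / len(ages) if ages else 0.0


# Hypothetical sample data.
discovered = {"a", "b", "c", "x"}
ground_truth = {"a", "b", "c", "d", "e"}
now = datetime(2024, 3, 10, tzinfo=timezone.utc)
scans = [datetime(2024, 3, 8, tzinfo=timezone.utc), datetime(2024, 3, 6, tzinfo=timezone.utc)]

print("precision:", precision(discovered, ground_truth))
print("recall:", recall(discovered, ground_truth))
print("consistency:", consistency({"a", "b", "c"}, {"a", "b", "c", "d"}))
print("coverage:", coverage({"prod", "staging"}, {"prod", "staging", "dev", "dr"}))
print("currency (days):", average_age_days(scans, now))
```

Ground truth is rarely available for a whole estate; in practice it is usually established by manually auditing a small sample of systems and extrapolating.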
Tooling for Continuous Discovery
Effective inventory requires automation—manual processes cannot keep pace with modern development velocity.
Tool Categories:
Software Composition Analysis (SCA):
- Snyk, Sonatype Nexus, Checkmarx SCA, Black Duck
- Primary function: dependency scanning and vulnerability identification
- Inventory is the foundation for security analysis
Asset Discovery:
- ServiceNow Discovery, Qualys Asset Inventory, Rapid7
- Primary function: IT asset management
- Includes software but often less depth on dependencies
Container Security Platforms:
- Aqua Security, Prisma Cloud, Sysdig Secure
- Primary function: container lifecycle security
- Deep container and Kubernetes inventory
SBOM Generation Tools:
- Syft, CycloneDX CLI, SPDX tools
- Primary function: create standardized component inventories
- Output feeds other security and inventory tools
Cloud Security Posture Management (CSPM):
- Wiz, Orca, Lacework
- Primary function: cloud security
- Includes software inventory across cloud workloads
Continuous Discovery Architecture:
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Source Repos   │────▶│ CI/CD Pipeline  │────▶│   Registries    │
│  (Source Scan)  │     │  (Build Scan)   │     │ (Registry Scan) │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────────────────────────────────────────────────────┐
│                   Central Inventory Database                    │
│        (Aggregates from all sources, maintains history)         │
└─────────────────────────────────────────────────────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Vulnerability  │     │    Reporting    │     │     Policy      │
│   Correlation   │     │   & Analytics   │     │   Enforcement   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
Integration Points:
- Source control hooks: Scan on commit and pull request
- Build system plugins: Generate inventory during build
- Registry webhooks: Scan images when pushed
- Kubernetes operators: Monitor deployments in clusters
- Runtime agents: Track what's actually executing
- API aggregation: Consolidate data from multiple tools
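The central inventory database can be prototyped with an append-only table, so that history is preserved and the current state is derived by query. This is a minimal sketch using Python's built-in `sqlite3`; the schema, the function names, and the sample observations are assumptions for illustration, and timestamps are assumed to be ISO 8601 strings in a single consistent format so that they sort correctly as text.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS observations (
    component   TEXT NOT NULL,
    version     TEXT NOT NULL,
    location    TEXT NOT NULL,  -- repository, image, or host where it was seen
    source      TEXT NOT NULL,  -- which discovery method reported it
    observed_at TEXT NOT NULL   -- ISO 8601 timestamp
)
"""


def open_inventory(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn, component, version, location, source, observed_at):
    """Append an observation; existing rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO observations VALUES (?, ?, ?, ?, ?)",
        (component, version, location, source, observed_at),
    )


def current_state(conn):
    """Latest observation per (component, location)."""
    return conn.execute(
        """
        SELECT o.component, o.version, o.location, o.source
        FROM observations AS o
        JOIN (
            SELECT component, location, MAX(observed_at) AS latest
            FROM observations
            GROUP BY component, location
        ) AS m
          ON o.component = m.component
         AND o.location = m.location
         AND o.observed_at = m.latest
        ORDER BY o.component, o.location
        """
    ).fetchall()


conn = open_inventory()
record(conn, "openssl", "3.0.8", "registry/app:1.4", "registry-scan", "2024-01-10T00:00:00Z")
record(conn, "openssl", "3.0.13", "registry/app:1.4", "registry-scan", "2024-03-02T00:00:00Z")
record(conn, "requests", "2.31.0", "repo/app", "source-scan", "2024-03-01T00:00:00Z")

state = current_state(conn)
history_rows = conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]
print(state)
print("rows kept for history:", history_rows)
```

Keeping every observation rather than overwriting lets the same table answer both "what do we run now?" and "what were we running on the day the advisory was published?".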
Recommendations
For Security Practitioners:
- Assess current visibility. Run discovery tools against all environments. Compare results to known inventory. The gap represents your blind spots.
- Deploy multiple discovery methods. Source scanning alone is insufficient. Add build-time and runtime discovery for comprehensive coverage.
- Prioritize container inventory. Container environments often have the largest visibility gaps. Deploy container-specific tooling.
- Address shadow open source. Scan beyond official repositories. Discover ungoverned usage before it becomes a problem.
For Operations Teams:
- Automate discovery. Manual inventory cannot keep pace. Integrate discovery into pipelines and runtime environments.
- Track inventory metrics. Measure coverage, currency, and accuracy. Improve systematically.
- Consolidate inventory data. Aggregate from multiple sources into a unified view. Eliminate silos.
- Maintain history. Track inventory changes over time. Historical data supports incident response and compliance.
For Organizations:
- Establish inventory requirements. Define what must be inventoried and to what level of detail.
- Invest in tooling. Budget for comprehensive discovery tools. The cost is trivial compared to the cost of blind spots.
- Connect inventory to SBOM. Inventory feeds SBOM generation (Section 12.2). Build the connection deliberately.
- Make inventory actionable. Inventory without action is overhead. Connect to vulnerability management, compliance, and incident response.
Knowing what you have is the unglamorous foundation of supply chain security. Organizations that invest in comprehensive, accurate, continuous inventory build the visibility that makes all other security activities possible. Organizations that do not are left meeting every vulnerability advisory with the same question: "Do we even have that?" In a world where minutes matter during incident response, having to ask that question means the response is already behind.
1. Various industry surveys suggest organizations maintain visibility into less than half their dependencies. See Sonatype State of the Software Supply Chain reports and Synopsys OSSRA reports.
2. Sonatype, "2024 State of the Software Supply Chain Report," 2024, https://www.sonatype.com/state-of-the-software-supply-chain/introduction
3. Synopsys, "2024 Open Source Security and Risk Analysis Report," 2024, https://www.synopsys.com/software-integrity/resources/analyst-reports/open-source-security-risk-analysis.html
4. Synopsys, "2020 Open Source Security and Risk Analysis Report," 2020, https://www.synopsys.com/software-integrity/resources/analyst-reports/open-source-security-risk-analysis.html
5. Tidelift and The New Stack, "2021 Managed Open Source Survey," 2021, https://tidelift.com/subscription/managed-open-source-survey
6. CISA, "Apache Log4j Vulnerability Guidance," December 2021, https://www.cisa.gov/uscert/apache-log4j-vulnerability-guidance
7. Sonatype, "Log4j Vulnerability Resource Center," 2022, https://www.sonatype.com/resources/log4j-vulnerability-resource-center