
11.4 Scorecards and Risk Metrics

Manual project health assessment (Section 11.3) provides depth but doesn't scale. Organizations with hundreds or thousands of dependencies need automated approaches to evaluate supply chain risk. Several scoring systems have emerged to fill this need—OpenSSF Scorecards, deps.dev, Libraries.io SourceRank, and commercial offerings—each measuring different aspects of project health and security.

Understanding what these scores measure, how they're calculated, and their limitations enables organizations to use them effectively without over-relying on numbers that may not capture their specific risk context.

OpenSSF Scorecards

OpenSSF Scorecards is the most comprehensive open source security scoring system, measuring security practices through automated checks against public repositories.

Installation and Usage:

# Install Scorecards
go install github.com/ossf/scorecard/v4/cmd/scorecard@latest

# Score a repository
scorecard --repo=github.com/expressjs/express

# Score with format options
scorecard --repo=github.com/expressjs/express --format=json

# Score a local repository
scorecard --local=./my-project

Scorecard Checks:

Scorecards evaluates projects across multiple security dimensions:

| Check | What It Measures | Weight |
| --- | --- | --- |
| Code-Review | PRs require review before merge | High |
| Maintained | Project shows recent activity | High |
| Branch-Protection | Branch protection rules enforced | High |
| Vulnerabilities | Known vulnerabilities in OSV database | Critical |
| Dangerous-Workflow | Risky patterns in GitHub Actions | High |
| Token-Permissions | Minimal token permissions in workflows | Medium |
| Dependency-Update-Tool | Uses Dependabot, Renovate, etc. | Medium |
| Fuzzing | Project is fuzzed (OSS-Fuzz, etc.) | Medium |
| SAST | Static analysis in CI/CD | Medium |
| Security-Policy | SECURITY.md exists | Medium |
| Binary-Artifacts | No binaries in repository | Medium |
| License | Has standard license | Low |
| Pinned-Dependencies | Dependencies pinned by hash | Medium |
| Packaging | Published through secure pipeline | Medium |
| Signed-Releases | Releases are cryptographically signed | High |
| Contributors | Multiple organizational contributors | Low |
| CII-Best-Practices | OpenSSF Best Practices badge | Medium |

Scoring Methodology:

Each check produces a score from 0-10:

  • 10: Fully meets criteria
  • 7-9: Mostly meets criteria with minor gaps
  • 4-6: Partial implementation
  • 1-3: Minimal compliance
  • 0: Does not meet criteria
  • -1: Check not applicable or couldn't run

The aggregate score is a weighted average, with critical and high-weight checks contributing more.
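
The aggregation can be sketched as a weighted average over check results that skips not-applicable checks. The weight values below are illustrative assumptions for demonstration, not the exact weights used inside the Scorecards tool:

```python
# Illustrative weighted aggregation of Scorecards-style check results.
# NOTE: the weight values are assumptions, not Scorecards' real internals.
WEIGHTS = {"Critical": 10.0, "High": 7.5, "Medium": 5.0, "Low": 2.5}

def aggregate_score(checks):
    """checks: list of (score, weight_label); score is 0-10, or -1 if not run."""
    total = 0.0
    weight_sum = 0.0
    for score, weight_label in checks:
        if score < 0:  # -1 means the check did not run; exclude it entirely
            continue
        weight = WEIGHTS[weight_label]
        total += score * weight
        weight_sum += weight
    return round(total / weight_sum, 1) if weight_sum else 0.0
```

For example, a perfect High-weight check plus a middling Medium-weight check, with one check skipped, averages to 8.0 under these assumed weights.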

Interpreting Results:

{
  "repo": {
    "name": "github.com/expressjs/express"
  },
  "scorecard": {
    "version": "v4.13.1",
    "commit": "..."
  },
  "score": 7.2,
  "checks": [
    {
      "name": "Code-Review",
      "score": 10,
      "reason": "all changesets reviewed"
    },
    {
      "name": "Maintained",
      "score": 10,
      "reason": "30 commit(s) and 20 issue activity in last 90 days"
    },
    {
      "name": "Signed-Releases",
      "score": -1,
      "reason": "no releases found"
    }
  ]
}

What Scores Mean:

| Score Range | Interpretation |
| --- | --- |
| 8-10 | Strong security practices |
| 6-7.9 | Good practices with gaps |
| 4-5.9 | Moderate concerns |
| 2-3.9 | Significant gaps |
| 0-1.9 | Minimal security practices |

Scorecard API and Integration:

For automated workflows:

# Using the Scorecard API
curl "https://api.securityscorecards.dev/projects/github.com/expressjs/express"
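
The same endpoint can be queried from a script. A standard-library sketch, assuming the response matches the JSON shape shown earlier (`low_scoring_checks` is a helper name of my own, not part of any API):

```python
import json
import urllib.request

def fetch_scorecard(repo):
    """Fetch published Scorecard results, e.g. repo='github.com/expressjs/express'."""
    url = f"https://api.securityscorecards.dev/projects/{repo}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def low_scoring_checks(result, threshold=5):
    """Names of checks scoring below threshold, skipping -1 (not-run) checks."""
    return [c["name"] for c in result.get("checks", [])
            if 0 <= c.get("score", -1) < threshold]
```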

GitHub Actions integration:

- name: Run Scorecards
  uses: ossf/scorecard-action@v2
  with:
    results_file: results.sarif
    publish_results: true

deps.dev: Google's Dependency Insights

deps.dev (Open Source Insights) is Google's service providing dependency information across multiple ecosystems.

Coverage:

  • npm (JavaScript/Node.js)
  • PyPI (Python)
  • Go modules
  • Maven (Java)
  • Cargo (Rust)
  • NuGet (.NET)

Features:

Dependency Graphs:

  • Visualizes complete transitive dependency trees
  • Shows how packages connect across the ecosystem
  • Identifies paths to vulnerable packages

Security Advisories:

  • Aggregates advisories from OSV, NVD, and ecosystem-specific sources
  • Shows which versions are affected
  • Provides advisory timeline

Scorecard Integration:

  • Displays OpenSSF Scorecard results where available
  • Links to detailed Scorecard reports

License Information:

  • Detects package licenses
  • Flags license compatibility issues
  • Shows license distribution across dependencies

Using deps.dev:

Web interface: https://deps.dev

API access:

# Get package info
curl "https://api.deps.dev/v3alpha/systems/npm/packages/express"

# Get version details
curl "https://api.deps.dev/v3alpha/systems/npm/packages/express/versions/4.18.2"

# Get dependencies
curl "https://api.deps.dev/v3alpha/systems/npm/packages/express/versions/4.18.2:dependencies"
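
The dependencies endpoint returns a resolved graph of nodes and edges. A sketch of pulling out the direct dependencies — the field names (`nodes`, `edges`, `fromNode`, `toNode`, `versionKey`) reflect the v3alpha response as I understand it, so verify against the live API before relying on them:

```python
import json
import urllib.request

def get_dependency_graph(package, version):
    """Fetch the resolved dependency graph for an npm package version."""
    url = (f"https://api.deps.dev/v3alpha/systems/npm/packages/"
           f"{package}/versions/{version}:dependencies")
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def direct_dependencies(graph):
    """Node 0 is the queried package; edges leaving it are its direct deps.

    fromNode may be omitted when zero, so default it to 0.
    """
    nodes = graph.get("nodes", [])
    return sorted(nodes[e["toNode"]]["versionKey"]["name"]
                  for e in graph.get("edges", [])
                  if e.get("fromNode", 0) == 0)
```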

deps.dev Strengths:

  • Cross-ecosystem visibility
  • Direct dependency graph exploration
  • Integration of multiple data sources
  • Free and accessible

Limitations:

  • No proprietary scoring (relies on Scorecard)
  • Limited to supported ecosystems
  • No private repository support

Ecosyste.ms: Comprehensive Package Metadata

Ecosyste.ms Packages is an open API service providing comprehensive package, version, and dependency metadata across multiple open source ecosystems and registries.

Coverage:

As of 2025, Ecosyste.ms has indexed:¹

  • 12.1 million packages across 75 package sources
  • 287 million repositories
  • 24,000 security advisories across 12 languages

This makes it one of the most comprehensive package metadata aggregators available.

Metadata Provided:

Ecosyste.ms offers detailed information for each package:

  • Package name, ecosystem, version
  • Description, license, keywords
  • Repository URL, homepage
  • Maintainers and namespaces
  • Registry information
  • Complete dependency graphs (direct and transitive)
  • Version history and release timeline
  • Security advisories affecting the package
  • Download statistics where available

Using Ecosyste.ms:

Web interface: https://packages.ecosyste.ms/

API access:

# Search for a package across all ecosystems
curl "https://packages.ecosyste.ms/api/v1/packages/lookup?name=express"

# Get package details for specific ecosystem
curl "https://packages.ecosyste.ms/api/v1/packages/npm/express"

# Get version information
curl "https://packages.ecosyste.ms/api/v1/packages/npm/express/versions/4.18.2"

# Get dependencies
curl "https://packages.ecosyste.ms/api/v1/packages/npm/express/4.18.2/dependencies"

API rate limits: 5,000 requests per hour by default, keyed to the requesting IP address. Full API documentation is available on the site.
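
When a client exceeds the limit, HTTP APIs of this kind typically respond with status 429; a simple exponential backoff handles that gracefully. A standard-library sketch (the retry counts and delays are arbitrary choices, not Ecosyste.ms recommendations):

```python
import time
import urllib.error
import urllib.request

def backoff_delays(retries=3, base_delay=2.0):
    """Exponential backoff schedule: base_delay, 2x, 4x, ..."""
    return [base_delay * (2 ** i) for i in range(retries)]

def get_with_backoff(url):
    """GET a URL, sleeping through the backoff schedule on HTTP 429."""
    for delay in backoff_delays():
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            if e.code != 429:  # only retry rate-limit responses
                raise
            time.sleep(delay)
    raise RuntimeError(f"still rate limited: {url}")
```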

Ecosystem Coverage:

Ecosyste.ms aggregates from 75 sources including:

  • npm (JavaScript/Node.js)
  • PyPI (Python)
  • RubyGems (Ruby)
  • Maven Central (Java)
  • Cargo (Rust)
  • Go modules
  • NuGet (.NET)
  • Packagist (PHP)
  • Hex (Elixir)
  • CPAN (Perl)
  • And many others

Open Data:

A key differentiator is Ecosyste.ms's commitment to open data. The project provides downloadable datasets for research and analysis:

  • Complete package metadata exports
  • Dependency graph data
  • Security advisory mappings
  • Historical release information

These datasets enable researchers and security practitioners to perform large-scale analysis without API rate limits.

Use Cases:

Dependency research:

  • Cross-reference a package across multiple ecosystems
  • Identify packages with similar names in different registries
  • Research dependency patterns across ecosystems

Security analysis:

  • Map known vulnerabilities to affected packages and versions
  • Trace transitive dependencies to identify exposure
  • Monitor for new security advisories

Package discovery:

  • Find packages by functionality across ecosystems
  • Compare similar packages across different language ecosystems
  • Identify alternative packages

Example workflow:

# You're evaluating a Python package and want comprehensive metadata
PACKAGE="requests"

# Get basic info
curl "https://packages.ecosyste.ms/api/v1/packages/pypi/$PACKAGE" | jq '.'

# Get all versions
curl "https://packages.ecosyste.ms/api/v1/packages/pypi/$PACKAGE/versions" | jq '.[] | .number'

# Get dependencies for latest version
VERSION=$(curl -s "https://packages.ecosyste.ms/api/v1/packages/pypi/$PACKAGE" | jq -r '.latest_stable_release_number')
curl "https://packages.ecosyste.ms/api/v1/packages/pypi/$PACKAGE/$VERSION/dependencies" | jq '.'

# Check for security advisories
curl "https://packages.ecosyste.ms/api/v1/packages/pypi/$PACKAGE/advisories" | jq '.'

Strengths:

  • Extremely comprehensive coverage (75+ sources)
  • Open data downloads for bulk analysis
  • Unified API across all ecosystems
  • Security advisory integration
  • Free and open source
  • Active development by Open Source Collective

Limitations:

  • No proprietary scoring (provides data, not scores)
  • API rate limits for individual queries
  • Metadata quality depends on source registries
  • No private repository support

Comparison with other services:

| Feature | Ecosyste.ms | deps.dev | Libraries.io |
| --- | --- | --- | --- |
| Ecosystems covered | 75+ | 6 | 40+ |
| Open data exports | Yes | No | Limited |
| Security advisories | Yes | Yes | Yes |
| Scoring | No | Scorecard | SourceRank |
| Free tier | Full access | Full access | Limited |

Ecosyste.ms excels at breadth of coverage and data accessibility, making it particularly valuable for research, cross-ecosystem analysis, and building custom tooling.

Integration example:

import requests

def get_package_metadata(ecosystem, package_name):
    """Fetch comprehensive package metadata from Ecosyste.ms."""
    url = f"https://packages.ecosyste.ms/api/v1/packages/{ecosystem}/{package_name}"
    response = requests.get(url, timeout=10)
    if response.status_code == 200:
        return response.json()
    return None

def check_security_advisories(ecosystem, package_name):
    """Return high/critical advisories affecting a package."""
    url = f"https://packages.ecosyste.ms/api/v1/packages/{ecosystem}/{package_name}/advisories"
    response = requests.get(url, timeout=10)
    if response.status_code != 200:
        return []
    advisories = response.json()
    # Use .get() so records missing a severity field don't raise KeyError
    return [a for a in advisories if a.get("severity") in ("high", "critical")]

# Example usage
metadata = get_package_metadata("npm", "express")
advisories = check_security_advisories("npm", "express")

if advisories:
    print(f"⚠️  {len(advisories)} high/critical advisories found")


Ecosyste.ms complements other tools by providing the raw metadata foundation that can feed into scoring, analysis, and decision-making workflows. Use it when you need comprehensive, cross-ecosystem package data without proprietary scoring interpretations.

Libraries.io SourceRank

Libraries.io tracks open source packages across 40+ package managers, providing the SourceRank algorithm for scoring package quality.

SourceRank Factors:

| Factor | Points | Description |
| --- | --- | --- |
| Basic info present | 1 | Description, homepage, keywords |
| Repository present | 1 | Linked source repository |
| Readme present | 1 | Has README documentation |
| License present | 1 | Declared license |
| Has multiple versions | 1 | Published updates |
| Recent release | 1 | Released in last 6 months |
| Not brand new | 1 | > 6 months old |
| 1.0.0 or greater | 1 | Reached stable version |
| Has dependents | 1-5 | Other packages depend on it |
| Has contributors | 1-2 | Multiple contributors |
| Has stars | 1-3 | GitHub stars |
| Has subscribers | 1-2 | GitHub watchers |

Maximum Score: ~30 points (varies by factors achieved)

Using Libraries.io:

Web interface: https://libraries.io

API (requires key):

# Get package info
curl "https://libraries.io/api/npm/express?api_key=YOUR_KEY"

# Search packages
curl "https://libraries.io/api/search?q=express&platforms=npm&api_key=YOUR_KEY"

SourceRank Interpretation:

  • 25+: Well-established, healthy project
  • 20-24: Good project with solid fundamentals
  • 15-19: Reasonable project, some gaps
  • 10-14: Basic project, limited signals
  • <10: Minimal metadata, caution warranted
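
These bands are straightforward to encode for automated screening. In the Libraries.io project response the value is exposed (to the best of my knowledge) as a `rank` field — confirm against the API documentation before depending on it:

```python
import json
import urllib.request

def interpret_sourcerank(rank):
    """Map a SourceRank value to the rough interpretation bands above."""
    if rank >= 25:
        return "well-established"
    if rank >= 20:
        return "good"
    if rank >= 15:
        return "reasonable"
    if rank >= 10:
        return "basic"
    return "caution"

def fetch_sourcerank(platform, name, api_key):
    """Fetch a package's SourceRank from Libraries.io ('rank' field assumed)."""
    url = f"https://libraries.io/api/{platform}/{name}?api_key={api_key}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp).get("rank")
```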

Limitations:

  • Quantity-focused (metadata presence vs. quality)
  • Gaming potential (easy to add stars, keywords)
  • Doesn't evaluate security practices
  • Historical focus (rewards age, established packages)

Commercial Offerings

Several commercial tools provide dependency scoring and risk metrics:

Snyk Advisor:

Snyk Advisor provides package health scores for npm and Python packages.

Scoring Factors:

  • Maintenance: Recent commits, release frequency
  • Security: Known vulnerabilities, security policy
  • Community: Contributors, stars, forks
  • Popularity: Downloads, dependents

Score Calculation: Each factor contributes to an overall "health score" on a 0-100 scale.

Differentiators:

  • Integration with Snyk vulnerability database
  • Actionable security recommendations
  • Commercial support and SLAs

Sonatype OSS Index:

Sonatype OSS Index provides free vulnerability database with risk scoring:

  • Vulnerability scoring: CVSS-based
  • Component age: Freshness metrics
  • Popularity: Download statistics

Socket.dev:

Socket.dev focuses on supply chain-specific risks:

  • Behavioral analysis: Detects suspicious package behavior
  • Typosquatting detection: Identifies impersonation attempts
  • Install script analysis: Flags dangerous install-time behavior

Comparison Matrix:

| Tool | Coverage | Security Focus | Cost | API Access |
| --- | --- | --- | --- | --- |
| OpenSSF Scorecards | GitHub projects | High | Free | Yes |
| deps.dev | 6 ecosystems | Medium | Free | Yes |
| Libraries.io | 40+ ecosystems | Low | Free/Paid | Yes |
| Snyk Advisor | npm, PyPI | Medium | Free | Limited |
| Socket.dev | npm, PyPI | High | Paid | Yes |
| Sonatype | Maven, npm, PyPI | Medium | Free/Paid | Yes |

Limitations of Scoring Systems

All scoring systems have significant limitations that users must understand:

Gaming Potential:

Scores can be artificially inflated:

  • Fake stars: Purchased or automated GitHub stars
  • Superficial compliance: Adding SECURITY.md without actual process
  • Checkbox checking: Implementing checks without substance
  • Social proof: Coordinated reviews and recommendations

Sophisticated attackers can maintain good scores while preparing attacks. For example, the XZ Utils backdoor (CVE-2024-3094) was introduced by a maintainer who had spent years building trust and would have scored well on most automated checks.

False Precision:

A score of 7.2 vs. 7.4 is not meaningful:

  • Underlying data has uncertainty
  • Weighting is somewhat arbitrary
  • Different checks may matter more in different contexts
  • Small changes in scoring algorithms shift numbers

Context Dependence:

The same score means different things in different contexts:

  • A package used in your build pipeline needs different scrutiny than one used at runtime
  • Critical infrastructure has different requirements than experimental tools
  • Regulated industries may require specific practices not captured in generic scores
  • Your threat model determines which checks matter most

What Scores Miss:

No automated system captures:

  • Maintainer trustworthiness and intent
  • Subtle code quality issues
  • Business continuity risks
  • Contextual appropriateness for your use case
  • Emerging threats that don't yet match known patterns

Temporal Limitations:

Scores represent a point in time:

  • Projects change between score calculation and your evaluation
  • Maintainer status can shift rapidly (the XZ Utils backdoor was inserted by a maintainer who had contributed legitimately for nearly three years)
  • Vulnerabilities are discovered after scoring

Building Organization-Specific Scoring

Generic scores may not reflect your specific risk context. Consider building composite scores tailored to your needs.

Step 1: Identify Your Risk Priorities

What matters most to your organization?

  • Regulated industry: Compliance-relevant factors weighted higher
  • High-security application: Security practices weighted higher
  • Rapid development: Maintenance and community weighted higher
  • Long-term investment: Maturity and stability weighted higher

Step 2: Select Relevant Inputs

Combine data sources:

# Conceptual composite scoring: the get_* and calculate_* helpers are
# placeholders for your own data-source integrations, and every input
# is assumed to be normalized to a 0-10 scale
def calculate_risk_score(package):
    scorecard = get_scorecard(package)             # OpenSSF Scorecard results
    deps_info = get_deps_dev(package)              # deps.dev metadata
    internal_data = get_internal_metrics(package)  # your own telemetry

    # Weight according to your priorities (weights sum to 1.0)
    security_score = scorecard['score'] * 0.4
    vulnerability_score = calculate_vuln_score(deps_info) * 0.3
    maintenance_score = calculate_maintenance(deps_info) * 0.2
    internal_score = internal_data['usage_risk'] * 0.1

    return security_score + vulnerability_score + maintenance_score + internal_score

Step 3: Incorporate Internal Context

Add organization-specific factors:

  • Usage criticality: How critical is this dependency to your systems?
  • Exposure level: Is it in security-sensitive code paths?
  • Replacement difficulty: How hard would it be to replace?
  • Historical issues: Have you had problems with this dependency?

Step 4: Define Thresholds

Establish decision thresholds:

| Composite Score | Policy |
| --- | --- |
| 8-10 | Approved for any use |
| 6-7.9 | Approved with monitoring |
| 4-5.9 | Requires security review |
| <4 | Not approved without exception |
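
A threshold table like this reduces to a small, testable function. A sketch using the example thresholds above (the policy labels are illustrative names, not a standard):

```python
def dependency_policy(score):
    """Map a composite score (0-10) to an example policy decision."""
    if score >= 8:
        return "approved"
    if score >= 6:
        return "approved-with-monitoring"
    if score >= 4:
        return "security-review-required"
    return "exception-required"  # below 4: not approved without exception
```

Keeping the mapping in code (rather than in reviewers' heads) makes the policy auditable and easy to tighten later.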

Step 5: Automate and Integrate

Integrate scoring into development workflow:

# Example CI integration (list-dependencies and get-composite-score
# are placeholders for your own tooling)
- name: Check dependency scores
  run: |
    for dep in $(list-dependencies); do
      score=$(get-composite-score "$dep")
      # Composite scores may be fractional, so compare numerically via awk
      if awk -v s="$score" 'BEGIN { exit !(s + 0 < 4) }'; then
        echo "Dependency $dep score too low: $score"
        exit 1
      fi
    done

Recommendations

For Developers:

  1. Use Scorecards as starting point. Run Scorecards on dependencies you're evaluating. Low scores warrant investigation; high scores don't guarantee safety.

  2. Check multiple sources. Cross-reference Scorecards, deps.dev, and ecosystem-specific tools. Different tools catch different issues.

  3. Understand what scores measure. A high score means good practices are implemented, not that the package is secure. Read the check details, not just the number.

  4. Watch for red flags beyond scores. Scores miss behavioral issues, malicious intent, and context-specific risks.

For Security Practitioners:

  1. Build context-aware scoring. Generic scores don't reflect your specific risk tolerance. Weight factors according to your threat model.

  2. Don't treat scores as absolutes. Use scores for prioritization and screening, not final decisions. Review borderline cases manually.

  3. Monitor score changes. Sudden score drops may indicate problems. Subscribe to alerts for critical dependencies.

  4. Understand gaming potential. Sophisticated attackers can maintain good scores while preparing attacks. Scores are one input, not the answer.

For Organizations:

  1. Define acceptable thresholds. Establish policies for minimum scores by dependency criticality. Document exceptions and their rationale.

  2. Integrate scoring into pipelines. Automate score checking in CI/CD. Block deployments with unreviewed low-scoring dependencies.

  3. Complement automation with review. Automated scoring screens; human review decides. Critical dependencies deserve manual assessment regardless of score.

  4. Track effectiveness. When incidents occur, evaluate whether scoring would have flagged them. Adjust weights and thresholds based on outcomes.

Scoring systems provide valuable automation for supply chain risk assessment, transforming qualitative health indicators into comparable metrics. But scores are proxies for security, not security itself. Organizations that use scores effectively understand their construction, acknowledge their limitations, and integrate them into broader risk management processes rather than treating them as definitive answers.


  1. Ecosyste.ms, "Packages API Statistics," 2025, https://packages.ecosyste.ms/. Statistics reflect platform data as of early 2025.