14.4 Security Regression Testing
When Lodash released version 4.17.12 in July 2019 to fix a prototype pollution vulnerability (CVE-2019-10744), some applications that updated found that their security tests now failed—not because Lodash was broken, but because their code relied on the previously vulnerable behavior. Conversely, other applications updated without running security tests, only to discover weeks later that the update had broken authentication middleware that depended on Lodash's object-manipulation behavior. Security regression testing ensures that dependency updates don't reintroduce vulnerabilities, break security controls, or change security-relevant behavior unexpectedly.
This section covers designing and implementing security regression tests for dependency updates, integrating them with automated update tools, and establishing criteria for when to roll back problematic changes.
Security Regression Test Design Principles
Security regression tests verify that security properties remain intact as code changes—including when dependencies update.
Design Principles:
1. Test Security Properties, Not Implementations:
Test that authentication works, not how a library implements it:
```javascript
// Good: tests the security property
test('invalid tokens are rejected', async () => {
  const response = await request(app)
    .get('/protected')
    .set('Authorization', 'Bearer invalid-token');
  expect(response.status).toBe(401);
});

// Fragile: tests an implementation detail
test('jwt.verify is called with correct options', () => {
  // Breaks if library internals change
});
```
2. Cover Security Boundaries:
Focus tests on where your code meets dependencies:
```javascript
// Test at the boundary between your code and the dependency
describe('Input sanitization boundary', () => {
  test('XSS payloads are neutralized', () => {
    const input = '<script>alert("xss")</script>';
    const output = sanitize(input);
    expect(output).not.toContain('<script>');
  });

  test('SQL injection payloads are escaped', () => {
    const input = "'; DROP TABLE users; --";
    const query = buildQuery(input);
    expect(query).not.toMatch(/DROP TABLE/i);
  });
});
```
3. Include Known Vulnerability Tests:
When a vulnerability is discovered and patched, add a test preventing reintroduction:
```javascript
// Prevent regression to CVE-2019-10744 (Lodash prototype pollution)
test('prototype pollution is prevented', () => {
  const payload = JSON.parse('{"__proto__": {"polluted": true}}');
  processUserData(payload);
  expect({}.polluted).toBeUndefined();
});
```
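What makes a test like this pass is a merge routine that refuses to copy prototype-reaching keys. The sketch below is a hypothetical guard (`safeMerge` and `BLOCKED_KEYS` are illustrative names, not from any particular library) showing one way `processUserData` might defend itself:

```javascript
// Hypothetical guard: keys that can reach Object.prototype are skipped
// instead of copied, so '{"__proto__": {...}}' payloads are neutralized.
const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

function safeMerge(target, source) {
  // Object.keys sees "__proto__" as an own property on JSON.parse output
  for (const key of Object.keys(source)) {
    if (BLOCKED_KEYS.has(key)) continue; // never copy prototype-reaching keys
    const value = source[key];
    if (value && typeof value === 'object' && !Array.isArray(value)) {
      // Recurse; a bracket assignment of "__proto__" deeper down is
      // likewise skipped by the same check
      target[key] = safeMerge(target[key] || {}, value);
    } else {
      target[key] = value;
    }
  }
  return target;
}
```

The key detail: `JSON.parse` creates `__proto__` as an ordinary own property, but assigning it with `target['__proto__'] = …` would trigger the prototype setter, which is exactly what the block list prevents.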
4. Test Negative Cases:
Verify that invalid inputs are rejected:
```javascript
describe('Authentication rejection', () => {
  test.each([
    ['empty token', ''],
    ['malformed token', 'not.a.jwt'],
    ['expired token', EXPIRED_TOKEN],
    ['invalid signature', TAMPERED_TOKEN],
  ])('rejects %s', async (name, token) => {
    const result = await authenticate(token);
    expect(result.authenticated).toBe(false);
  });
});
```
Test Case Categories for Dependencies
Structure security regression tests around dependency interaction patterns.
Category 1: Input Handling
Tests for libraries that process external input:
| Test Type | Example |
|---|---|
| Boundary values | Max lengths, unicode, null bytes |
| Malicious payloads | XSS, SQLi, command injection |
| Malformed input | Invalid JSON, truncated data |
| Encoding variations | Different character encodings |
```javascript
describe('JSON parser security', () => {
  test('handles deeply nested objects', () => {
    const deep = buildNestedObject(1000);
    expect(() => parse(deep)).toThrow(/depth/);
  });

  test('handles large arrays', () => {
    const large = '[' + '1,'.repeat(1000000) + '1]';
    expect(() => parse(large)).toThrow(/size/);
  });
});
```
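The depth test above assumes a `buildNestedObject` helper and a `parse` that enforces a depth limit; neither is shown. A minimal sketch of both (names hypothetical, and the depth check is deliberately naive — it counts brackets without skipping string contents, so it illustrates the idea rather than a production parser; a size limit would be analogous):

```javascript
// Hypothetical helper assumed by the test above: builds a JSON string with
// `depth` levels of nesting, e.g. buildNestedObject(2) => '{"a":{"a":1}}'
function buildNestedObject(depth) {
  return '{"a":'.repeat(depth) + '1' + '}'.repeat(depth);
}

// Naive depth-limited parse wrapper: reject before handing the input to
// JSON.parse if nesting exceeds the limit
function parse(text, maxDepth = 100) {
  let depth = 0;
  let max = 0;
  for (const ch of text) {
    if (ch === '{' || ch === '[') max = Math.max(max, ++depth);
    else if (ch === '}' || ch === ']') depth--;
  }
  if (max > maxDepth) throw new Error(`maximum depth exceeded: ${max}`);
  return JSON.parse(text);
}
```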
Category 2: Authentication/Authorization
Tests for security decision libraries:
```javascript
describe('Authorization library behavior', () => {
  test('denies by default', () => {
    const result = authorize(unknownUser, unknownResource);
    expect(result).toBe(false);
  });

  test('role hierarchy is enforced', () => {
    expect(authorize(admin, adminResource)).toBe(true);
    expect(authorize(user, adminResource)).toBe(false);
  });
});
```
Category 3: Cryptographic Operations
Tests for crypto libraries:
```javascript
describe('Encryption library behavior', () => {
  test('encrypted output differs from input', () => {
    const plaintext = 'sensitive data';
    const encrypted = encrypt(plaintext, key);
    expect(encrypted).not.toContain(plaintext);
  });

  test('decryption reverses encryption', () => {
    const plaintext = 'sensitive data';
    const encrypted = encrypt(plaintext, key);
    const decrypted = decrypt(encrypted, key);
    expect(decrypted).toBe(plaintext);
  });

  test('wrong key fails decryption', () => {
    const encrypted = encrypt('data', key1);
    expect(() => decrypt(encrypted, key2)).toThrow();
  });
});
```
Category 4: Output Encoding
Tests for rendering and output libraries:
```javascript
describe('Template engine output encoding', () => {
  test('HTML entities are escaped by default', () => {
    const output = render('{{ userInput }}', { userInput: '<script>' });
    expect(output).toContain('&lt;script&gt;');
    expect(output).not.toContain('<script>');
  });
});
```
Integration with Automated Update Tools
Connect security regression tests to Dependabot, Renovate, and similar tools.
Dependabot Integration:
```yaml
# .github/workflows/dependency-update.yml
name: Dependency Update Validation

on:
  pull_request:
    branches: [main]
    paths:
      - 'package-lock.json'
      - 'package.json'

jobs:
  security-regression:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run security regression tests
        run: npm run test:security
      - name: Run security scanning
        run: npm audit --audit-level=high
      - name: Check for breaking changes
        run: npm run test:integration
```
Renovate Integration:
```json
// renovate.json
{
  "extends": ["config:base"],
  "packageRules": [
    {
      "matchDepTypes": ["dependencies"],
      "requiredStatusChecks": [
        "security-regression",
        "integration-tests",
        "security-scan"
      ]
    }
  ],
  "postUpdateOptions": ["npmDedupe"],
  "prCreation": "not-pending"
}
```
Status Check Configuration:
Require security tests to pass before auto-merge:
```yaml
# Branch protection rules (GitHub)
# Settings > Branches > Branch protection rules
required_status_checks:
  strict: true
  contexts:
    - "security-regression"
    - "integration-tests"
    - "npm-audit"
```
Test Execution Strategy:
| Update Type | Test Suite | Gate |
|---|---|---|
| Patch (x.x.1) | Unit + Security regression | Auto-merge if pass |
| Minor (x.1.0) | Full test suite | Auto-merge if pass |
| Major (2.0.0) | Full suite + manual review | Never auto-merge |
| Security fix | Security-focused suite | Expedited review |
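The gating table above can be wired into CI with a small helper that classifies the version bump and looks up the policy. A sketch (names like `classifyBump` and the `POLICY` entries are illustrative, and pre-release tags and version ranges are deliberately ignored):

```javascript
// Classify a dependency bump as major/minor/patch so CI can pick the test
// suite and merge policy from the table above. Minimal by design: assumes
// plain "x.y.z" versions with no pre-release tags or ranges.
function classifyBump(oldVersion, newVersion) {
  const [oMaj, oMin] = oldVersion.split('.').map(Number);
  const [nMaj, nMin] = newVersion.split('.').map(Number);
  if (nMaj !== oMaj) return 'major';
  if (nMin !== oMin) return 'minor';
  return 'patch';
}

// Illustrative policy mirroring the table; adjust suite names to your
// project's npm scripts.
const POLICY = {
  patch: { suites: ['test', 'test:security'], autoMerge: true },
  minor: { suites: ['test', 'test:security', 'test:integration'], autoMerge: true },
  major: { suites: ['test', 'test:security', 'test:integration'], autoMerge: false },
};

function updatePolicy(oldVersion, newVersion) {
  return POLICY[classifyBump(oldVersion, newVersion)];
}
```

A CI step could call `updatePolicy` on the versions in the lockfile diff and skip auto-merge whenever `autoMerge` is false.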
CI/CD Pipeline Configuration
Structure pipelines to catch security regressions at multiple stages.
Multi-Stage Pipeline:
```yaml
# .github/workflows/security-pipeline.yml
name: Security Regression Pipeline

on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test

  security-regression:
    runs-on: ubuntu-latest
    needs: unit-tests
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - name: Security regression tests
        run: npm run test:security
        env:
          TEST_TIMEOUT: 30000

  integration-security:
    runs-on: ubuntu-latest
    needs: security-regression
    services:
      database:
        image: postgres:15
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - name: Integration security tests
        run: npm run test:integration:security

  dependency-audit:
    runs-on: ubuntu-latest
    needs: unit-tests
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm audit --audit-level=moderate
      - name: Check for known vulnerable patterns
        run: npx semgrep --config p/security-audit

  gate:
    runs-on: ubuntu-latest
    needs: [security-regression, integration-security, dependency-audit]
    steps:
      - name: All security checks passed
        run: echo "Security regression pipeline complete"
```
Parallel vs. Sequential Execution:
```yaml
# Run fast checks first, expensive checks after
jobs:
  # Fast: run immediately
  lint-and-audit:
    # ~30 seconds

  # Medium: run after fast checks
  security-unit-tests:
    needs: lint-and-audit
    # ~2 minutes

  # Slow: run only if medium checks pass
  integration-security:
    needs: security-unit-tests
    # ~10 minutes
```
Breaking Change Detection Heuristics
Identify when updates break security-relevant behavior.
Heuristic Categories:
| Heuristic | Detection Method |
|---|---|
| API signature changes | TypeScript compilation, type tests |
| Behavioral changes | Output comparison tests |
| Configuration changes | Default value tests |
| Error handling changes | Exception type tests |
| Timing changes | Performance regression tests |
API Signature Detection:
```typescript
// Type tests detect signature changes
import { expectType } from 'tsd';
import { authenticate } from 'auth-library';

// If the function signature changes, this fails to compile
expectType<Promise<AuthResult>>(authenticate(token, options));
```
Behavioral Comparison:
```javascript
// Golden-file testing for consistent output
test('encryption output format unchanged', () => {
  const result = encrypt('test', knownKey, knownIV);
  expect(result).toMatchSnapshot();
});

// Or explicit comparison
test('hash algorithm unchanged', () => {
  const hash = computeHash('known-input');
  expect(hash).toBe('expected-hash-value-abc123');
});
```
Default Value Tests:
```javascript
// Ensure secure defaults remain
test('secure defaults preserved', () => {
  const config = createDefaultConfig();
  expect(config.httpsOnly).toBe(true);
  expect(config.secureCookies).toBe(true);
  expect(config.csrfProtection).toBe(true);
  expect(config.minTLSVersion).toBe('1.2');
});
```
Exception Behavior Tests:
```javascript
test('invalid input throws SecurityError', () => {
  expect(() => validate(maliciousInput)).toThrow(SecurityError);
  // Ensure failure isn't silently ignored or turned into a null return
});
```
Rollback Triggers and Criteria
Define when to automatically or manually roll back dependency updates.
Automatic Rollback Triggers:
| Trigger | Threshold | Action |
|---|---|---|
| Security test failure | Any | Block merge, alert |
| Critical vulnerability introduced | CVSS ≥ 9.0 | Block, revert if merged |
| Authentication tests fail | Any | Block, investigate |
| Error rate spike post-deploy | > 2x baseline | Auto-rollback |
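The "error rate spike" trigger from the table can be expressed as a small predicate that a post-deploy check calls against monitoring data. A sketch (the function name, parameter names, and the zero-baseline fallback threshold are all assumptions):

```javascript
// Decide whether to auto-rollback based on the post-deploy error rate,
// per the "> 2x baseline" trigger in the table above.
function shouldAutoRollback({ baselineErrorRate, currentErrorRate, multiplier = 2 }) {
  if (baselineErrorRate === 0) {
    // A zero baseline makes any multiple trivially exceeded; fall back to
    // a small absolute threshold (illustrative value)
    return currentErrorRate > 0.01;
  }
  return currentErrorRate > baselineErrorRate * multiplier;
}
```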
Rollback Automation:
```yaml
# Post-deployment validation with rollback
name: Canary Deployment

on:
  push:
    branches: [main]

jobs:
  deploy-canary:
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to canary
        run: ./deploy.sh canary
      - name: Run smoke tests
        run: ./smoke-tests.sh
        continue-on-error: true
        id: smoke
      - name: Run security validation
        run: ./security-validation.sh
        continue-on-error: true
        id: security
      - name: Rollback on failure
        if: steps.smoke.outcome == 'failure' || steps.security.outcome == 'failure'
        run: |
          ./rollback.sh canary
          echo "::error::Deployment failed validation, rolled back"
          exit 1
```
Manual Rollback Criteria:
Document criteria requiring human decision:
```markdown
# Rollback Decision Matrix

## Automatic Rollback
- [ ] Security regression tests fail
- [ ] Error rate > 5% for 5 minutes
- [ ] Critical security scan findings

## Requires Manual Decision
- [ ] Performance degradation > 20%
- [ ] Non-security test failures
- [ ] Deprecation warnings

## Continue with Monitoring
- [ ] Minor test flakiness
- [ ] Low-severity vulnerability introduced
- [ ] Non-critical deprecation warnings
```
Metrics for Regression Test Effectiveness
Measure whether your security regression tests provide value.
Key Metrics:
| Metric | Target | Meaning |
|---|---|---|
| Security test coverage | > 80% of security code paths | Coverage of security-relevant code |
| Regression detection rate | Track over time | Regressions caught by tests vs. production |
| False positive rate | < 5% | Tests failing without real issues |
| Time to detection | < 1 hour | Time from regression introduction to detection |
| Mean time to fix | < 24 hours | Time from detection to fix deployed |
Tracking Dashboard:
```sql
-- Security regression test metrics
SELECT
  date_trunc('week', created_at) AS week,
  COUNT(*) FILTER (WHERE caught_by = 'security_tests') AS caught_by_tests,
  COUNT(*) FILTER (WHERE caught_by = 'production') AS caught_in_prod,
  COUNT(*) FILTER (WHERE false_positive = true) AS false_positives
FROM security_regressions
GROUP BY 1
ORDER BY 1 DESC;
```
Test Quality Indicators:
- Mutation testing score: Do tests catch intentional bugs?
- Boundary coverage: Are edge cases tested?
- Negative test ratio: Ratio of "should fail" to "should pass" tests
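The mutation-testing idea can be made concrete: a useful security test should fail ("kill") a mutant in which the control is weakened. A toy sketch, with all names hypothetical:

```javascript
// The control under test: strips <script> blocks from input
function sanitize(input) {
  return input.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');
}

// A mutant a mutation-testing tool might generate: the control is
// silently disabled (body replaced with a pass-through)
function sanitizeMutant(input) {
  return input;
}

// The XSS regression test from earlier in the section, as a plain predicate
function xssTestPasses(fn) {
  return !fn('<script>alert(1)</script>').includes('<script>');
}
```

Here `xssTestPasses(sanitize)` holds while `xssTestPasses(sanitizeMutant)` does not, so the test kills the mutant; a test that passed for both would contribute nothing to the mutation score.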
Mature testing programs track "escapes"—security issues that reach production despite testing—and treat each as a learning opportunity. Escape-driven retrospectives that add new tests to prevent recurrence can substantially reduce production security issues over time.
Recommendations
For Developers:
- Write security tests for boundaries. Test where your code meets dependencies; that's where regressions are most likely to affect you.
- Add regression tests for every vulnerability. When you fix or work around a vulnerability, add a test ensuring it doesn't return.
- Test secure defaults. Verify that library defaults remain secure across updates. Changing defaults is a common source of silent security regressions.
For QA Engineers:
- Integrate with update automation. Security regression tests must run automatically on dependency update PRs. Manual testing doesn't scale.
- Define clear pass/fail criteria. Ambiguous results lead to ignored failures. Be explicit about what blocks merges.
- Track test effectiveness. Measure whether tests catch real issues. Tests that never fail and never catch anything have low value.
For Organizations:
- Require security tests for merges. Make security regression tests required status checks. Don't allow bypassing them.
- Automate rollback capability. When security tests fail post-deployment, automatic rollback minimizes the exposure window.
- Invest in test quality. Poor tests give false confidence. Invest in mutation testing and boundary analysis to ensure tests actually catch issues.
Security regression testing bridges the gap between "this dependency is currently secure" and "this dependency remains secure as it evolves." Without regression tests, each update is a gamble—hoping nothing security-relevant changed. With comprehensive regression tests, updates become routine rather than risky.