14.4 Security Regression Testing
When Lodash released version 4.17.12 in July 2019 to fix a prototype pollution vulnerability (CVE-2019-10744), some applications that updated found that their security tests now failed—not because Lodash was broken, but because their code relied on the previously vulnerable behavior. Conversely, other applications updated without running security tests, only to discover weeks later that the update had broken authentication middleware that depended on Lodash's object-manipulation behavior. Security regression testing ensures that dependency updates don't reintroduce vulnerabilities, break security controls, or change security-relevant behavior unexpectedly.
This section covers designing and implementing security regression tests for dependency updates, integrating them with automated update tools, and establishing criteria for when to roll back problematic changes.
Security Regression Test Design Principles
Security regression tests verify that security properties remain intact as code changes—including when dependencies update.
Design Principles:
1. Test Security Properties, Not Implementations:
Test that authentication works, not how a library implements it:
```javascript
// Good: tests the security property
test('invalid tokens are rejected', async () => {
  const response = await request(app)
    .get('/protected')
    .set('Authorization', 'Bearer invalid-token');
  expect(response.status).toBe(401);
});

// Fragile: tests an implementation detail
test('jwt.verify is called with correct options', () => {
  // Breaks if library internals change
});
```
2. Cover Security Boundaries:
Focus tests on where your code meets dependencies:
```javascript
// Test at the boundary between your code and the dependency
describe('Input sanitization boundary', () => {
  test('XSS payloads are neutralized', () => {
    const input = '<script>alert("xss")</script>';
    const output = sanitize(input);
    expect(output).not.toContain('<script>');
  });

  test('SQL injection payloads are escaped', () => {
    const input = "'; DROP TABLE users; --";
    const query = buildQuery(input);
    expect(query).not.toMatch(/DROP TABLE/i);
  });
});
```
3. Include Known Vulnerability Tests:
When a vulnerability is discovered and patched, add a test preventing reintroduction:
```javascript
// Prevent regression to CVE-2019-10744 (Lodash prototype pollution)
test('prototype pollution is prevented', () => {
  const payload = JSON.parse('{"__proto__": {"polluted": true}}');
  processUserData(payload);
  expect({}.polluted).toBeUndefined();
});
```
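What makes a test like this pass is a merge routine that refuses to copy prototype-reaching keys. The sketch below is a hypothetical guard (`safeMerge` and `BLOCKED_KEYS` are illustrative names, not from any particular library) showing one way `processUserData` might defend itself:

```javascript
// Hypothetical guard: keys that can reach Object.prototype are skipped
// instead of copied, so '{"__proto__": {...}}' payloads are neutralized.
const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

function safeMerge(target, source) {
  // Object.keys sees "__proto__" as an own property on JSON.parse output
  for (const key of Object.keys(source)) {
    if (BLOCKED_KEYS.has(key)) continue; // never copy prototype-reaching keys
    const value = source[key];
    if (value && typeof value === 'object' && !Array.isArray(value)) {
      // Recurse; a bracket assignment of "__proto__" deeper down is
      // likewise skipped by the same check
      target[key] = safeMerge(target[key] || {}, value);
    } else {
      target[key] = value;
    }
  }
  return target;
}
```

The key detail: `JSON.parse` creates `__proto__` as an ordinary own property, but assigning it with `target['__proto__'] = …` would trigger the prototype setter, which is exactly what the block list prevents.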
4. Test Negative Cases:
Verify that invalid inputs are rejected:
```javascript
describe('Authentication rejection', () => {
  test.each([
    ['empty token', ''],
    ['malformed token', 'not.a.jwt'],
    ['expired token', EXPIRED_TOKEN],
    ['invalid signature', TAMPERED_TOKEN],
  ])('rejects %s', async (name, token) => {
    const result = await authenticate(token);
    expect(result.authenticated).toBe(false);
  });
});
```
Test Case Categories for Dependencies
Structure security regression tests around dependency interaction patterns.
Category 1: Input Handling
Tests for libraries that process external input:
| Test Type | Example |
|---|---|
| Boundary values | Max lengths, unicode, null bytes |
| Malicious payloads | XSS, SQLi, command injection |
| Malformed input | Invalid JSON, truncated data |
| Encoding variations | Different character encodings |
```javascript
describe('JSON parser security', () => {
  test('handles deeply nested objects', () => {
    const deep = buildNestedObject(1000);
    expect(() => parse(deep)).toThrow(/depth/);
  });

  test('handles large arrays', () => {
    const large = '[' + '1,'.repeat(1000000) + '1]';
    expect(() => parse(large)).toThrow(/size/);
  });
});
```
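The depth test above assumes a `buildNestedObject` helper and a `parse` that enforces a depth limit; neither is shown. A minimal sketch of both (names hypothetical, and the depth check is deliberately naive — it counts brackets without skipping string contents, so it illustrates the idea rather than a production parser; a size limit would be analogous):

```javascript
// Hypothetical helper assumed by the test above: builds a JSON string with
// `depth` levels of nesting, e.g. buildNestedObject(2) => '{"a":{"a":1}}'
function buildNestedObject(depth) {
  return '{"a":'.repeat(depth) + '1' + '}'.repeat(depth);
}

// Naive depth-limited parse wrapper: reject before handing the input to
// JSON.parse if nesting exceeds the limit
function parse(text, maxDepth = 100) {
  let depth = 0;
  let max = 0;
  for (const ch of text) {
    if (ch === '{' || ch === '[') max = Math.max(max, ++depth);
    else if (ch === '}' || ch === ']') depth--;
  }
  if (max > maxDepth) throw new Error(`maximum depth exceeded: ${max}`);
  return JSON.parse(text);
}
```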
Category 2: Authentication/Authorization
Tests for security decision libraries:
```javascript
describe('Authorization library behavior', () => {
  test('denies by default', () => {
    const result = authorize(unknownUser, unknownResource);
    expect(result).toBe(false);
  });

  test('role hierarchy is enforced', () => {
    expect(authorize(admin, adminResource)).toBe(true);
    expect(authorize(user, adminResource)).toBe(false);
  });
});
```
Category 3: Cryptographic Operations
Tests for crypto libraries:
```javascript
describe('Encryption library behavior', () => {
  test('encrypted output differs from input', () => {
    const plaintext = 'sensitive data';
    const encrypted = encrypt(plaintext, key);
    expect(encrypted).not.toContain(plaintext);
  });

  test('decryption reverses encryption', () => {
    const plaintext = 'sensitive data';
    const encrypted = encrypt(plaintext, key);
    const decrypted = decrypt(encrypted, key);
    expect(decrypted).toBe(plaintext);
  });

  test('wrong key fails decryption', () => {
    const encrypted = encrypt('data', key1);
    expect(() => decrypt(encrypted, key2)).toThrow();
  });
});
```
Category 4: Output Encoding
Tests for rendering and output libraries:
```javascript
describe('Template engine output encoding', () => {
  test('HTML entities are escaped by default', () => {
    const output = render('{{ userInput }}', { userInput: '<script>' });
    expect(output).toContain('&lt;script&gt;');
    expect(output).not.toContain('<script>');
  });
});
```
Integration with Automated Update Tools
Connect security regression tests to Dependabot, Renovate, and similar tools.
Dependabot Integration:
```yaml
# .github/workflows/dependency-update.yml
name: Dependency Update Validation

on:
  pull_request:
    branches: [main]
    paths:
      - 'package-lock.json'
      - 'package.json'

jobs:
  security-regression:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run security regression tests
        run: npm run test:security
      - name: Run security scanning
        run: npm audit --audit-level=high
      - name: Check for breaking changes
        run: npm run test:integration
```
Renovate Integration:
```json
// renovate.json
{
  "extends": ["config:base"],
  "packageRules": [
    {
      "matchDepTypes": ["dependencies"],
      "requiredStatusChecks": [
        "security-regression",
        "integration-tests",
        "security-scan"
      ]
    }
  ],
  "postUpdateOptions": ["npmDedupe"],
  "prCreation": "not-pending"
}
```
Status Check Configuration:
Require security tests to pass before auto-merge:
```yaml
# Branch protection rules (GitHub)
# Settings > Branches > Branch protection rules
required_status_checks:
  strict: true
  contexts:
    - "security-regression"
    - "integration-tests"
    - "npm-audit"
```
Test Execution Strategy:
| Update Type | Test Suite | Gate |
|---|---|---|
| Patch (x.x.1) | Unit + Security regression | Auto-merge if pass |
| Minor (x.1.0) | Full test suite | Auto-merge if pass |
| Major (2.0.0) | Full suite + manual review | Never auto-merge |
| Security fix | Security-focused suite | Expedited review |
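The gating table above can be wired into CI with a small helper that classifies the version bump and looks up the policy. A sketch (names like `classifyBump` and the `POLICY` entries are illustrative, and pre-release tags and version ranges are deliberately ignored):

```javascript
// Classify a dependency bump as major/minor/patch so CI can pick the test
// suite and merge policy from the table above. Minimal by design: assumes
// plain "x.y.z" versions with no pre-release tags or ranges.
function classifyBump(oldVersion, newVersion) {
  const [oMaj, oMin] = oldVersion.split('.').map(Number);
  const [nMaj, nMin] = newVersion.split('.').map(Number);
  if (nMaj !== oMaj) return 'major';
  if (nMin !== oMin) return 'minor';
  return 'patch';
}

// Illustrative policy mirroring the table; adjust suite names to your
// project's npm scripts.
const POLICY = {
  patch: { suites: ['test', 'test:security'], autoMerge: true },
  minor: { suites: ['test', 'test:security', 'test:integration'], autoMerge: true },
  major: { suites: ['test', 'test:security', 'test:integration'], autoMerge: false },
};

function updatePolicy(oldVersion, newVersion) {
  return POLICY[classifyBump(oldVersion, newVersion)];
}
```

A CI step could call `updatePolicy` on the versions in the lockfile diff and skip auto-merge whenever `autoMerge` is false.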
CI/CD Pipeline Configuration
Structure pipelines to catch security regressions at multiple stages.
Multi-Stage Pipeline:
```yaml
# .github/workflows/security-pipeline.yml
name: Security Regression Pipeline

on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test

  security-regression:
    runs-on: ubuntu-latest
    needs: unit-tests
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - name: Security regression tests
        run: npm run test:security
        env:
          TEST_TIMEOUT: 30000

  integration-security:
    runs-on: ubuntu-latest
    needs: security-regression
    services:
      database:
        image: postgres:15
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - name: Integration security tests
        run: npm run test:integration:security

  dependency-audit:
    runs-on: ubuntu-latest
    needs: unit-tests
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm audit --audit-level=moderate
      - name: Check for known vulnerable patterns
        run: npx semgrep --config p/security-audit

  gate:
    runs-on: ubuntu-latest
    needs: [security-regression, integration-security, dependency-audit]
    steps:
      - name: All security checks passed
        run: echo "Security regression pipeline complete"
```
Parallel vs. Sequential Execution:
```yaml
# Run fast checks first, expensive checks after
jobs:
  # Fast: run immediately
  lint-and-audit:
    # ~30 seconds

  # Medium: run after fast checks
  security-unit-tests:
    needs: lint-and-audit
    # ~2 minutes

  # Slow: run only if medium checks pass
  integration-security:
    needs: security-unit-tests
    # ~10 minutes
```
Breaking Change Detection Heuristics
Identify when updates break security-relevant behavior.
Heuristic Categories:
| Heuristic | Detection Method |
|---|---|
| API signature changes | TypeScript compilation, type tests |
| Behavioral changes | Output comparison tests |
| Configuration changes | Default value tests |
| Error handling changes | Exception type tests |
| Timing changes | Performance regression tests |
API Signature Detection:
```typescript
// Type tests detect signature changes
import { expectType } from 'tsd';
import { authenticate } from 'auth-library';

// If the function signature changes, this fails to compile
expectType<Promise<AuthResult>>(authenticate(token, options));
```
Behavioral Comparison:
```javascript
// Golden-file testing for consistent output
test('encryption output format unchanged', () => {
  const result = encrypt('test', knownKey, knownIV);
  expect(result).toMatchSnapshot();
});

// Or explicit comparison
test('hash algorithm unchanged', () => {
  const hash = computeHash('known-input');
  expect(hash).toBe('expected-hash-value-abc123');
});
```
Default Value Tests:
```javascript
// Ensure secure defaults remain
test('secure defaults preserved', () => {
  const config = createDefaultConfig();
  expect(config.httpsOnly).toBe(true);
  expect(config.secureCookies).toBe(true);
  expect(config.csrfProtection).toBe(true);
  expect(config.minTLSVersion).toBe('1.2');
});
```
Exception Behavior Tests:
```javascript
test('invalid input throws SecurityError', () => {
  expect(() => validate(maliciousInput)).toThrow(SecurityError);
  // Ensure failure isn't silently ignored or turned into a null return
});
```
Rollback Triggers and Criteria
Define when to automatically or manually roll back dependency updates.
Automatic Rollback Triggers:
| Trigger | Threshold | Action |
|---|---|---|
| Security test failure | Any | Block merge, alert |
| Critical vulnerability introduced | CVSS ≥ 9.0 | Block, revert if merged |
| Authentication tests fail | Any | Block, investigate |
| Error rate spike post-deploy | > 2x baseline | Auto-rollback |
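The "error rate spike" trigger from the table can be expressed as a small predicate that a post-deploy check calls against monitoring data. A sketch (the function name, parameter names, and the zero-baseline fallback threshold are all assumptions):

```javascript
// Decide whether to auto-rollback based on the post-deploy error rate,
// per the "> 2x baseline" trigger in the table above.
function shouldAutoRollback({ baselineErrorRate, currentErrorRate, multiplier = 2 }) {
  if (baselineErrorRate === 0) {
    // A zero baseline makes any multiple trivially exceeded; fall back to
    // a small absolute threshold (illustrative value)
    return currentErrorRate > 0.01;
  }
  return currentErrorRate > baselineErrorRate * multiplier;
}
```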
Rollback Automation:
```yaml
# Post-deployment validation with rollback
name: Canary Deployment

on:
  push:
    branches: [main]

jobs:
  deploy-canary:
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to canary
        run: ./deploy.sh canary
      - name: Run smoke tests
        run: ./smoke-tests.sh
        continue-on-error: true
        id: smoke
      - name: Run security validation
        run: ./security-validation.sh
        continue-on-error: true
        id: security
      - name: Rollback on failure
        if: steps.smoke.outcome == 'failure' || steps.security.outcome == 'failure'
        run: |
          ./rollback.sh canary
          echo "::error::Deployment failed validation, rolled back"
          exit 1
```
Manual Rollback Criteria:
Document criteria requiring human decision:
```markdown
# Rollback Decision Matrix

## Automatic Rollback
- [ ] Security regression tests fail
- [ ] Error rate > 5% for 5 minutes
- [ ] Critical security scan findings

## Requires Manual Decision
- [ ] Performance degradation > 20%
- [ ] Non-security test failures
- [ ] Deprecation warnings

## Continue with Monitoring
- [ ] Minor test flakiness
- [ ] Low-severity vulnerability introduced
- [ ] Non-critical deprecation warnings
```
Metrics for Regression Test Effectiveness
Measure whether your security regression tests provide value.
Key Metrics:
| Metric | Target | Meaning |
|---|---|---|
| Security test coverage | > 80% of security code paths | Coverage of security-relevant code |
| Regression detection rate | Track over time | Regressions caught by tests vs. production |
| False positive rate | < 5% | Tests failing without real issues |
| Time to detection | < 1 hour | Time from regression introduction to detection |
| Mean time to fix | < 24 hours | Time from detection to fix deployed |
Tracking Dashboard:
```sql
-- Security regression test metrics
SELECT
  date_trunc('week', created_at) AS week,
  COUNT(*) FILTER (WHERE caught_by = 'security_tests') AS caught_by_tests,
  COUNT(*) FILTER (WHERE caught_by = 'production') AS caught_in_prod,
  COUNT(*) FILTER (WHERE false_positive = true) AS false_positives
FROM security_regressions
GROUP BY 1
ORDER BY 1 DESC;
```
Test Quality Indicators:
- Mutation testing score: Do tests catch intentional bugs?
- Boundary coverage: Are edge cases tested?
- Negative test ratio: Ratio of "should fail" to "should pass" tests
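The mutation-testing idea can be made concrete: a useful security test should fail ("kill") a mutant in which the control is weakened. A toy sketch, with all names hypothetical:

```javascript
// The control under test: strips <script> blocks from input
function sanitize(input) {
  return input.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');
}

// A mutant a mutation-testing tool might generate: the control is
// silently disabled (body replaced with a pass-through)
function sanitizeMutant(input) {
  return input;
}

// The XSS regression test from earlier in the section, as a plain predicate
function xssTestPasses(fn) {
  return !fn('<script>alert(1)</script>').includes('<script>');
}
```

Here `xssTestPasses(sanitize)` holds while `xssTestPasses(sanitizeMutant)` does not, so the test kills the mutant; a test that passed for both would contribute nothing to the mutation score.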
Mature testing programs track "escapes"—security issues that reach production despite testing—and treat each as a learning opportunity. Escape-driven retrospectives that add new tests to prevent recurrence can substantially reduce production security issues over time.
Recommendations
For Developers:
- Write security tests for boundaries. Test where your code meets dependencies; that's where regressions are most likely to affect you.
- Add regression tests for every vulnerability. When you fix or work around a vulnerability, add a test ensuring it doesn't return.
- Test secure defaults. Verify that library defaults remain secure across updates. Changing defaults is a common source of silent security regressions.
For QA Engineers:
- Integrate with update automation. Security regression tests must run automatically on dependency update PRs. Manual testing doesn't scale.
- Define clear pass/fail criteria. Ambiguous results lead to ignored failures. Be explicit about what blocks merges.
- Track test effectiveness. Measure whether tests catch real issues. Tests that never fail and never catch anything have low value.
For Organizations:
- Require security tests for merges. Make security regression tests required status checks. Don't allow bypassing them.
- Automate rollback capability. When security tests fail post-deployment, automatic rollback minimizes the exposure window.
- Invest in test quality. Poor tests give false confidence. Invest in mutation testing and boundary analysis to ensure tests actually catch issues.
Security regression testing bridges the gap between "this dependency is currently secure" and "this dependency remains secure as it evolves." Without regression tests, each update is a gamble—hoping nothing security-relevant changed. With comprehensive regression tests, updates become routine rather than risky.