The Closure Gap in Offensive Security
How to think about vulnerability scanning, penetration testing, and red teaming
You probably inherited three different approaches that all claim to find security risks: scanners running constantly, annual pen tests that cost $40K, and maybe a red team engagement if the board got nervous. Here's what nobody tells you: none of them answer the question you actually need answered.
A scanner flags 3,000 findings. You can't tell which 50 matter. Your team wastes cycles triaging noise.
A pen test proves something was exploitable six months ago. Your environment has changed twenty times since then. That proof expired weeks ago.
A red team shows you could get breached. It doesn't tell you whether the seventeen "high-priority" findings from last year are actually fixed.
The hard question keeps coming back: "We fixed this last quarter. With this new deployment, how do we know we didn't reopen it?"
This is the closure gap. And it's where many security programs break.
The Trade-offs at a High Level
Vulnerability scanning is broad and cheap. It finds possible exposure fast — but it rarely proves exploitability in your environment or business impact, and it can flood teams with work that doesn't materially reduce real risk.
Penetration testing is deeper and human-led. A good test can prove attacker paths and sharpen prioritization — but it's still point-in-time, quality varies, and closure confidence decays after remediation and change.
Red teaming is the most realistic simulation. It can show what a capable adversary could achieve — but it's expensive, time-consuming, and typically focused on a narrow set of objectives rather than producing comprehensive, durable closure evidence as environments change.
The Hard Problems Show Up Afterward
Regardless of which tool or engagement found the issue, the hard problems show up afterward:
- Prioritization: what's truly exploitable and impactful here (not just "high CVSS")
- Susceptibility: under what conditions the attacker path is actually reachable — identity/permissions, boundaries, compensating controls
- Remediation reality: teams often inherit a research project, not a fix path — so backlogs grow and only a fraction gets closed
- Verification: once it's "fixed," how do you prove the path is closed — and stays closed after deployments, drift, break-glass access, vendor changes, or "accepted risk" that quietly never resurfaces?
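The verification problem above can be made concrete. One pattern is a closure-regression suite: every remediated finding keeps a small probe that re-asserts the attacker condition is still blocked, run after each deployment. A minimal sketch, assuming a mock environment snapshot and illustrative finding IDs — not any real tool's API:

```python
# Hypothetical sketch of a closure-regression check: previously remediated
# findings are re-probed after every deployment. Finding IDs, probes, and
# the mock "environment" dict below are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ClosedFinding:
    finding_id: str
    description: str
    # Probe returns True if the original attacker condition is STILL blocked.
    probe: Callable[[dict], bool]

def verify_closure(findings: list[ClosedFinding], env: dict) -> list[str]:
    """Return IDs of findings whose fix has regressed (path reopened)."""
    return [f.finding_id for f in findings if not f.probe(env)]

# Mock environment state captured after a new release.
env_after_release = {
    "smb_signing_required": True,
    "admin_portal_public": True,   # drift: exposed again by the new release
}

suite = [
    ClosedFinding("F-101", "SMB signing disabled on file servers",
                  probe=lambda e: e["smb_signing_required"]),
    ClosedFinding("F-117", "Admin portal reachable from the internet",
                  probe=lambda e: not e["admin_portal_public"]),
]

reopened = verify_closure(suite, env_after_release)
print(reopened)  # a nonempty list means a "fixed" path has reopened
```

The point is less the code than the contract: a fix isn't closed until it ships with a repeatable check that can fail loudly when drift, break-glass access, or a new deployment reopens the path.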
What Vulnerability Scanning Actually Does
Scanning answers: "What might be exposed?"
What it actually produces: mostly noise.
Scanner output is a flood of findings — thousands of them, and only a small portion tends to be truly actionable. The rest? Not reachable because of segmentation. Not exploitable in your environment. Not impactful even if someone did exploit it. False positives. Neutralized by compensating controls. Findings in dev environments nobody can reach.
You're drowning in a list trying to figure out which handful are actually dangerous. A CVSS 9.8 might be completely isolated. A medium-severity API flaw might be the front door to your database when chained with a role issue in AD.
Scanners generate coverage. Without validation in context, the signal stays buried in the noise.
That said, scanners are good at:
- Finding known CVEs and common misconfigurations at scale
- Running frequently to flag new potential exposure and obvious configuration drift
- Supporting baseline hygiene and some compliance requirements
Where scanners typically fall short is the step leadership actually cares about: does this translate into a real attacker path with meaningful business impact? Scanner output is usually detection + generic severity (often CVSS) — not proof of:
- Reachability: can an attacker actually reach the vulnerable service from a realistic entry point?
- Exploitability: can the weakness be reliably exploited under real constraints?
- Chaining: can it be combined with other weaknesses into an end-to-end attacker path?
- Impact: if exploited, what would the attacker actually gain, and what does that mean for the business?
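Those four questions can be turned into a triage rule that outranks raw CVSS. A minimal sketch, where the field names, weights, and example findings are illustrative assumptions rather than a standard scoring model:

```python
# Hypothetical sketch: ranking scanner findings by validated context
# (reachability, exploitability, chaining, impact) instead of raw CVSS.
# Field names, weights, and example findings are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Finding:
    name: str
    cvss: float
    reachable: bool         # attacker can reach it from a realistic entry point
    exploitable: bool       # exploits reliably under real-world constraints
    chains_to_impact: bool  # links into an end-to-end attacker path
    business_impact: int    # 0-3: what the attacker would actually gain

def contextual_priority(f: Finding) -> float:
    # Unreachable or unexploitable findings drop out regardless of CVSS.
    if not (f.reachable and f.exploitable):
        return 0.0
    score = float(f.business_impact)
    if f.chains_to_impact:
        score += 2  # chained paths outrank isolated weaknesses
    return score

findings = [
    Finding("Isolated CVSS 9.8 on segmented host", 9.8, False, True, False, 3),
    Finding("Medium API flaw chaining to AD role", 5.4, True, True, True, 3),
]

ranked = sorted(findings, key=contextual_priority, reverse=True)
print([f.name for f in ranked])
```

Run against the two examples from the text, the medium-severity API flaw that chains into AD outranks the isolated 9.8, which scores zero because it is unreachable in context.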
What Penetration Testing Is Designed to Prove
Penetration testing answers: "Can an attacker actually exploit this here — and what can they reach next?"
A good pen test uses human judgment to:
- Prove exploitability, not just the "presence" of a CVE or misconfiguration
- Chain weaknesses across systems into an end-to-end attacker path
- Provide context on real business impact
- Prioritize what matters most in your specific environment
The constraint is time. Pen tests are point-in-time engagements. The tester moves on. Many firms offer a short re-test window (typically 15–60 days). Remediation often takes longer. Even with notes and reproduction steps, post-fix validation becomes your internal problem if you don't retain the testers.
Where Red Teaming Fits
Red teaming answers: "If a capable adversary targeted us, what could they actually achieve?"
Red teams simulate realistic attack scenarios end-to-end. They test not just technical controls but also detection, response, and organizational resilience.
The limitation is scope and sustainability. Red team engagements are expensive, time-consuming, and typically focused on specific objectives. They're excellent for pressure-testing defenses around crown-jewel assets — but they don't produce comprehensive, durable closure evidence across the estate.
The Closure Gap
The operational question keeps coming back: "We fixed this late last year. With this new release, how do we know we didn't reintroduce the risk?"
None of these approaches — scanning, pen testing, or red teaming — is designed to answer that question continuously. Each has value. None delivers verified closure that persists as environments change.
That's the closure gap: the space between "we found it" and "we can prove it's closed — and stays closed."