How I Stopped Reporting Noise
The 5-gate validation framework that killed 17 of my 31 reports — and why that's the best thing I ever built
The Problem
I wrote 40+ reports in my first two weeks of bug bounty. Most of them were garbage.
Not "rough around the edges" garbage. Not "needs a little polish" garbage. Actively wasting triagers' time garbage. Reports that claimed HIGH severity for findings I hadn't tested. Reports targeting assets that were explicitly excluded from scope. Reports that breathlessly announced "I can see your JavaScript" as if that constituted a security vulnerability.
My estimated validation rate was 35%. That means 65% of my work product was noise — reports that would be rejected, ignored, or, worst of all, dinged against my reputation on the platform. I was training algorithms to deprioritize my submissions. I was the boy who cried wolf, except the wolf was a source map and the villagers were exhausted triagers with 50 other reports in their queue.
The root causes were clear once I looked honestly: breadth addiction (test everything, report everything, move to the next target), a fundamental confusion between intelligence and vulnerability (seeing something interesting and assuming it's exploitable), and zero validation framework (no process between "I found something" and "I submitted a report").
I needed a system that would physically prevent me from writing bad reports. Not guidelines. Not "best practices I'll follow when I remember." Enforcement.
Intelligence vs. Vulnerability
This is the distinction that changed everything.
Source maps, debug endpoints, stack traces, verbose error messages, exposed configuration files — these are intelligence. They're discovery tools. They help you see further into a target's architecture, identify technology stacks, find hidden endpoints, and map internal data flows. They are the telescope.
They are not the threat.
Finding source maps is like finding a telescope on a rooftop. The question isn't "look, a telescope!" The question is: "What did you see through it?" If the answer is "I could read the JavaScript in a more readable format" — congratulations, you have a telescope. That's not a vulnerability. That's reconnaissance output.
The test is one sentence:
"As an attacker, I could ___."
Fill in the blank. If you can't finish that sentence with a real action — something beyond "read source code" or "see debug information" — you don't have a vulnerability. You have intelligence that might lead you to one.
Here's what "real actions" look like:
- "As an attacker, I could access any user's order history by changing the order ID parameter."
- "As an attacker, I could authenticate as any user by manipulating the OAuth token exchange."
- "As an attacker, I could read the production database using credentials exposed in a public repository."
- "As an attacker, I could execute arbitrary JavaScript in another user's browser session."
And here's what intelligence looks like, dressed up as a finding:
- "As an attacker, I could read minified JavaScript in a more readable format." (Source maps)
- "As an attacker, I could see the application framework and version number." (Stack trace)
- "As an attacker, I could identify internal API endpoints." (Debug page)
The second list describes steps in an attack, not completed attacks. Use them to find real bugs. Report the real bugs. Cite the intelligence as your discovery method. Don't report the telescope.
I went back through all my reports and applied this sentence test. At least 15 failed outright. Fifteen reports where the "finding" was that something was visible, not that something was exploitable. That's a 37.5% false positive rate from a single mental model failure.
The telescope principle
Intelligence helps you find vulnerabilities. It is not itself a vulnerability. Report the stars, not the telescope.
The 5-Gate Validation Framework
The sentence test catches the worst offenders, but it's not enough on its own. I needed a systematic framework that catches every category of bad report — not just intelligence-as-vulnerability, but scope failures, unsupported severity claims, untested kill chain links, and predictable triager objections.
Every finding must pass through five gates before a report can be written. No exceptions. No "but this one is obviously valid." Five gates, in order:
Gate 1: Scope
Is the exact hostname among the program's named in-scope assets? Is the vulnerability type explicitly excluded by the program policy?
This sounds trivially obvious. It isn't. Wildcard exclusions are sneaky. A program might list *.example.com as in scope but then exclude *.internal.example.com. If you're testing api.internal.example.com, you're out of scope — even though it matches the wildcard. I learned this the hard way when 75% of one engagement's findings pointed at excluded assets.
Failure at Gate 1 means: stop. Do not write. Do not pass go.
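The wildcard trap above is mechanical enough to encode. Here's a minimal sketch of the precedence rule — exclusions beat inclusions — using Python's `fnmatch` for glob matching. The scope lists and hostnames are illustrative, not from any real program:

```python
from fnmatch import fnmatch

# Illustrative scope lists -- real programs publish these in their policy.
IN_SCOPE = ["*.example.com"]
EXCLUDED = ["*.internal.example.com"]  # exclusions win over wildcard inclusions

def in_scope(hostname: str) -> bool:
    """Check exclusions first: a host matching any excluded pattern is
    out of scope even if it also matches an in-scope wildcard."""
    if any(fnmatch(hostname, pattern) for pattern in EXCLUDED):
        return False
    return any(fnmatch(hostname, pattern) for pattern in IN_SCOPE)

print(in_scope("api.example.com"))           # True
print(in_scope("api.internal.example.com"))  # False: matches the exclusion
```

The order of the two checks is the whole point: `api.internal.example.com` matches `*.example.com`, so a naive inclusion-only check would wave it through.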
Gate 2: Classification
"As an attacker, I could ___" — finish the sentence with a real action. Assign the correct vulnerability category (CWE, VRT, or whatever taxonomy the platform uses).
This gate catches intelligence-as-vulnerability. If you can't articulate a real attack action, you either need to find the real bug that the intelligence points to, or you don't have a reportable finding.
Gate 3: Evidence Tier
This is the gate that hurts the most and helps the most. Your severity claim is capped by the quality of your evidence:
Tier 1: End-to-end proof of concept
→ You connected to the database. You exfiltrated the session. You took over the account. You demonstrated the full attack.
→ Eligible for: P1 (Critical) or P2 (High)

Tier 2: Code analysis + partial live confirmation
→ The code shows a vulnerable pattern. You confirmed part of it works against the live target. But you didn't complete the full attack chain.
→ Maximum severity: P3 (Medium)

Tier 3: Code or configuration observation only
→ You see the code. You see the misconfiguration. You haven't tested anything against the live target.
→ Maximum severity: P4 (Low)
The cap rule is absolute: never claim severity above what your evidence tier supports.
Found a potential IDOR pattern in client-side JavaScript but didn't test it against the live API? That's Tier 2 at best. Your maximum severity is P3/Medium — even if the IDOR would theoretically expose every user's data. Get the PoC or accept the cap. No more speculative fiction.
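The cap rule reduces to a one-line function. A sketch under my own naming — the P1–P4 numbers follow the priority convention above (lower number = more severe), and `capped_severity` is an illustrative helper, not any platform's API:

```python
# Maximum claimable priority per evidence tier (lower number = more severe).
# Tier 1: end-to-end PoC, Tier 2: partial live confirmation, Tier 3: observation only.
TIER_CAP = {1: 1, 2: 3, 3: 4}

def capped_severity(evidence_tier: int, claimed_priority: int) -> int:
    """Never claim severity above what the evidence tier supports:
    return the weaker (numerically larger) of the claim and the cap."""
    return max(claimed_priority, TIER_CAP[evidence_tier])

# An untested IDOR pattern (Tier 2) claimed as P1 gets knocked down to P3.
print(f"P{capped_severity(2, 1)}")  # → P3
```

Note the direction of `max`: because priorities count down toward Critical, taking the larger number always weakens the claim, never strengthens it.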
Gate 4: Kill Chain
Every exploit has a kill chain: Source → Transport → Execution → Impact.
- Source: where does the attacker input enter the system?
- Transport: how does it reach the vulnerable component?
- Execution: how does the payload trigger?
- Impact: what damage results?
Every link in the chain must be tested, or your severity is capped at the weakest tested link. If you proved source, transport, and execution but only assumed impact, your finding's strength is limited by that assumption.
Missing a link? Either test it or accept the downgrade. Don't paper over gaps with "could potentially" language.
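One way to make the weakest-link rule concrete is to track which links you actually tested and let the untested ones flag the downgrade. This is a sketch under my own naming, not a standard taxonomy or any existing tool:

```python
from dataclasses import dataclass

@dataclass
class KillChain:
    # True = tested against the live target, False = assumed
    source: bool
    transport: bool
    execution: bool
    impact: bool

    def untested_links(self) -> list[str]:
        """Return the names of links that were assumed rather than tested."""
        return [name for name, tested in vars(self).items() if not tested]

    def fully_proven(self) -> bool:
        """Only a chain with every link tested supports Tier 1 severity."""
        return not self.untested_links()

chain = KillChain(source=True, transport=True, execution=True, impact=False)
print(chain.fully_proven())    # False
print(chain.untested_links())  # ['impact']
```

An assumed impact link is exactly the "could potentially" language the gate forbids: the object makes the gap explicit instead of letting prose paper over it.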
Gate 5: Pre-Mortem
This is the gate that separates decent reports from great ones. Before writing, ask yourself:
"What will the triager's first objection be?"
Then answer it. With evidence.
Common triager objections:
- "This is by design." — Show why the design creates exploitable risk.
- "This requires unlikely user interaction." — Demonstrate a realistic attack scenario.
- "The impact is theoretical." — Prove it with a working PoC.
- "This is a duplicate." — Check previous disclosures and differentiate your finding.
- "This doesn't affect user data." — Show what data is actually accessible.
If you can't answer the most likely objection with evidence, strengthen your evidence before writing. The goal isn't to eliminate all possible objections — it's to address the obvious one so the triager sees you've done your homework.
Chain-First Thinking
A primitive alone is rarely a report.
A CORS misconfiguration alone is P4 at best — you've shown that the Access-Control-Allow-Origin header is permissive, but you haven't demonstrated cross-origin data theft. An open redirect alone is P4 — you've shown a redirect, but you haven't stolen a token through it. Information disclosure alone is P4 — you've shown data leaks, but you haven't demonstrated what an attacker does with that data.
But chain them together and the story changes entirely.
A CORS misconfiguration on an authenticated endpoint that returns sensitive user data? Now you have cross-origin data theft — that's P2. An open redirect near an OAuth callback endpoint? Now you have a token theft chain — that's potentially P1. Source maps that reveal a hardcoded API key that accesses a production database? Now the source maps were intelligence, the API key was the bug, and the database access is the impact — that's CRITICAL.
The doctrine is simple: before reporting any primitive, attempt to chain it.
- Find a primitive (CORS, redirect, info disclosure, misconfiguration).
- Ask: "What adjacent components could this combine with?"
- Test the chain. Spend real time on it — at least 30 minutes.
- If the chain works: report at the chain's end impact.
- If the chain fails after genuine effort: report the primitive at Tier 3 severity max (P4/Low), and document the chain attempt. It shows thoroughness.
The chain attempt itself is valuable even when it fails. A report that says "I found X, attempted to chain it with Y and Z, and here's why the chain didn't complete" is vastly more credible than one that just says "I found X." It shows you understand impact. It shows you tested beyond the surface. And sometimes it reveals to the triager a risk they hadn't considered.
Chain-first doctrine
Never report a primitive without first attempting to chain it. The primitive alone is a P4. The chain determines real severity. Document the attempt either way.
Evidence Tiers in Practice
Abstract tier definitions are useful. Concrete examples are better. Here's what each tier looks like in the field:
Tier 1 — End-to-End PoC
You connected to the database with the exposed credentials and listed the collections. You used the IDOR to access another user's data and showed the response. You injected XSS, captured a session cookie, and replayed it to access the victim's account. You demonstrated every link in the kill chain against the live target.
Tier 1 evidence is unambiguous. The triager can reproduce it. There's no "could potentially" — there's "here's the response, here's the data, here's the impact."
Tier 2 — Code Analysis + Partial Live Test
The JavaScript reveals an API endpoint that accepts a user ID parameter without authorization checks. You confirmed the endpoint exists and responds. But you didn't (or couldn't) test with a second account to prove cross-user access. The vulnerability pattern is clear in the code, and partial live testing supports it, but the full attack chain is incomplete.
Tier 2 is reasonable for medium-severity reports. It shows you've done real work and the risk is credible. But it's not proof — it's informed inference. Cap at P3.
Tier 3 — Observation Only
You see a misconfigured header. You see an exposed service. You see debug information. You haven't tested exploitation at all. The finding is real, but the impact is assumed.
Tier 3 findings are configuration issues, information disclosures, and defense-in-depth concerns. They're valid reports at P4, but claiming anything higher is unsupported speculation.
The cardinal rule: you can believe something is critical all you want. If you haven't proved it end-to-end, you don't get to claim it. Get the PoC or accept the cap.
Enforcement: Making It Stick
Frameworks are worthless if you can bypass them when you're excited about a finding. And you will be excited. You'll find something that looks critical, your pulse will spike, and every instinct will scream "write the report NOW before someone else finds it."
That instinct is the enemy. It produced 65% of my garbage reports.
So I built enforcement into the tooling itself. Two automated agents that hook into my workflow and physically block bad behavior:
The Scope Guardian intercepts every outbound request command. It extracts the target hostname, checks it against a locked scope file, and warns (or blocks) if the target isn't in scope. It's not perfect — it can't catch every edge case — but it catches the obvious ones, which is where most of my scope failures lived.
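A minimal version of that hook might look like the following. This is a sketch of the idea, not my actual agent; the locked scope file format, the `guard` entry point, and the hostnames are all assumptions for illustration:

```python
import json
from fnmatch import fnmatch
from urllib.parse import urlparse

def load_scope(path: str) -> dict:
    # Locked scope file, e.g.:
    # {"in_scope": ["*.example.com"], "excluded": ["*.internal.example.com"]}
    with open(path) as f:
        return json.load(f)

def guard(url: str, scope: dict) -> None:
    """Raise before the request ever leaves the machine if the target
    hostname is excluded or is not a named in-scope asset."""
    host = urlparse(url).hostname or ""
    if any(fnmatch(host, p) for p in scope.get("excluded", [])):
        raise PermissionError(f"{host} matches an exclusion -- blocked")
    if not any(fnmatch(host, p) for p in scope.get("in_scope", [])):
        raise PermissionError(f"{host} is not a named in-scope asset -- blocked")

scope = {"in_scope": ["*.example.com"], "excluded": ["*.internal.example.com"]}
guard("https://api.example.com/v1", scope)  # in scope: returns silently
```

The design choice that matters is failing closed: an unknown host raises rather than passing through, so a typo in the scope file blocks requests instead of silently permitting them.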
The Report Gate intercepts every attempt to write a report file. Before the write is allowed, it checks three things: Does a threat model exist for this engagement? Does the validation checklist show a PASS verdict? Is the claimed severity within the evidence tier cap? If any check fails, the write is blocked. You literally cannot create a report file until the prerequisites are met.
The distinction matters: some parts of my workflow are advisory (skills that suggest best practices) and some are blocking (agents that prevent action). The advisory parts are helpful but optional. The blocking parts are the framework's teeth. You can ignore a suggestion. You can't ignore a gate that refuses to open.
The framework only works if you can't bypass it when you're excited about a finding. Excitement is not evidence. Adrenaline is not a severity rating. Build the gates, then trust them.
The excitement trap
"This looks critical, I should report it immediately before someone else does." That sentence has produced more rejected reports than any scanner ever has. Slow down. Validate. The 20 minutes you spend on Gate 5 saves the 3 days you'd spend arguing with a triager.
Results
I applied this framework retroactively to 31 existing reports. The results were brutal and necessary.
17 reports were killed outright. They failed Gate 1 (scope), Gate 2 (intelligence-not-vulnerability), or Gate 3 (severity above tier cap). Seventeen reports that I had been proud of, that I had spent hours writing, that were — by any honest assessment — noise.
The surviving 14 had higher acceptance probability. Not because the framework made them better — it didn't change a single finding. It just forced me to be honest about which findings were real and which were wishful thinking.
Two reports on a large VDP passed all 5 gates and were submitted with CRITICAL and HIGH severity. Full Tier 1 evidence: live database access confirmed, live API session manipulation demonstrated, end-to-end kill chain tested. No speculation. No "could potentially." Just evidence.
The framework didn't make me a better researcher overnight. It made me a more honest one. And in bug bounty, honesty about your own evidence is the single highest-leverage skill you can develop.
Takeaways for Other Researchers
Build your validation framework before you need it, not after. I built mine after 40 reports of mostly garbage. You don't have to repeat that mistake. Even a simple checklist — scope check, sentence test, evidence tier assignment — would have caught 80% of my bad reports before I wrote them.
Report the stars, not the telescope. Source maps, debug info, stack traces, verbose errors — these are discovery tools. Use them to find real bugs. Report the real bugs. Cite the intelligence as your method. Don't confuse seeing further with having found something.
Depth beats breadth. One feature, two hours, no scanners. Understand how the feature works before trying to break it. The best findings come from comprehension, not from running nuclei against every subdomain you can enumerate. Scanners find low-hanging fruit. Understanding finds the bugs that pay.
Read the program policy before writing a single word. Not skim. Read. Every exclusion, every rule, every required header. The number of reports killed by Gate 1 (scope) is embarrassing. All of them were preventable by spending 10 minutes reading what the program actually wanted.
The best report is the one you don't write. This sounds counterintuitive, but it's the most important lesson. Every garbage report you don't submit is a triager who doesn't learn to ignore you, a reputation point you don't lose, and time you can spend on findings that actually matter. The framework's highest value isn't in the reports it produces — it's in the reports it prevents.
Final thought
The framework isn't about being cautious. It's about being honest. Honest about your evidence, honest about your impact claims, honest about whether you've actually tested the thing you're about to assert is broken. In a field where reputation compounds, honesty isn't just ethical — it's strategic.