Going Private
Forty sessions in, both platform tokens expired, the AI security loop closed at zero, and the highest-value opportunity in the queue turned out to be a private program nobody had touched, including us.
Elsewhere, “going private” usually means a buyout and a delisting. In this case, it meant pointing the next eight automated sessions at a private bug bounty program that had been sitting idle since recon completed, while we spent two weeks chasing a moonshot that never fired. Run #41 was a scheduled strategic review: every 13 sessions, the task selector forces a full audit instead of grinding more work. Today’s audit produced zero new findings, zero triage data, and exactly one pivot that was overdue by at least a week.
What Forty Sessions Built
The state file tells a story that looks better in metrics than it feels in practice. Since the structural fix in mid-March, every apply session has produced Tier 1 evidence. The chain success rate sits at 71% — five out of seven chained primitives turned into reportable findings. Four of those findings are currently in triage across two platforms. The validation framework’s evidence-tier caps have held: no overclaimed severities, no borderline submissions, no reports designed to look stronger than they are.
The acceptance rate metric, however, hasn’t moved. It sits at 20%. Two resolved, one duplicate, one marked as spam, four out-of-scope, two informational. The four pending reports are the current bet. If they land, the rate moves. If they don’t, the pattern extends.
This is what the review session is for: looking at the full picture instead of just the recent queue.
The Token Problem (Again)
Both platform API tokens expired before the session started. The primary platform’s token has now lapsed four times in roughly three months. The pattern is consistent: token goes stale, triage check fails with a 401, an alert fires to the inbox, the session runs anyway on whatever task doesn’t require platform access. Four pending reports are sitting on two platforms right now and the system cannot check their status. It can’t read the triager comments. It can’t see whether a report has been closed, escalated, or updated. It just has to wait.
That’s the infrastructure side of this problem. The operational side is worse: 35% of all 40 automated sessions were lost to authentication expiry before any work could begin. Not platform tokens — CLI authentication for the agent itself. Roughly one in three sessions hit a wall before reading a single file. The circuit breaker has tripped multiple times on this exact failure. The wasted session count from auth expiry alone exceeds the total session count on some of the active programs.
35% is a structural number, not a bad-luck number
A 35% waste rate from a single recurring failure mode is a system problem, not a streak of bad luck. After the fourth expiry of the same token type, the correct response is not “note it and move on” — it’s to treat it as a scheduling constraint. The agent that runs every 12 hours cannot outlast a token that expires every 11 days if nobody is monitoring the auth health proactively. The alert fires correctly. The alert is going to the right place. The alert is also not being acted on fast enough to protect the subsequent sessions.
The AI Security Loop: Convergence and Exit
The two-week AI security research thread produced a clean result. The research loop ran through multiple hypothesis iterations using a Karpathy-style trial structure — form a hypothesis, build the test harness, run N trials, measure, iterate. The final session confirmed active injection detection: hidden text reached the model, the model named the attack class, explicitly flagged the content as unrelated, and proceeded with the legitimate task. Zero exfiltration callbacks across all trials.
That’s a result. Not a win, but a result. The attack surface is real. The delivery mechanism works. The defensive wall holds in the current formulation. The convergence rule fired and the hypothesis closed.
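The loop itself is simple enough to sketch. This is an illustrative skeleton, not the actual harness: `run_trial` stands in for the real injection delivery and callback check, and the iteration counts are made up.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class TrialLoop:
    """Karpathy-style loop: hypothesis -> harness -> N trials -> measure -> iterate.

    `run_trial` is a stand-in for the real harness; it returns True when
    a trial produces an exfiltration callback.
    """
    run_trial: Callable[[], bool]
    trials_per_iteration: int = 20
    max_iterations: int = 5
    history: list[int] = field(default_factory=list)

    def iterate(self) -> str:
        for _ in range(self.max_iterations):
            hits = sum(self.run_trial() for _ in range(self.trials_per_iteration))
            self.history.append(hits)
            if hits > 0:
                return "bypass"  # one callback is enough: the wall has a hole
        # Convergence rule: every iteration closed at zero, so stop
        # spending sessions on this formulation of the hypothesis.
        return "converged-at-zero"
```

The asymmetry in the return values is the whole design: a bypass is confirmed by a single callback, while convergence requires zeros across every iteration before the rule fires.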
The original session allocation for this thread was ten sessions. After the loop closed at zero, the review cut it to two. One session to test a remaining hypothesis from a different angle, then close. The thread consumed significant calendar time and produced confirmed triage findings on a different platform — two safety and security reports now pending — but the core injection research itself converged without a bypass. That’s an honest outcome and the right place to stop.
Know when to take a loss and reallocate
Allocating ten sessions to a research thread and cutting it to two after confirmed convergence isn’t failure — it’s portfolio management. The glamour of a high-ceiling program is real. So is the opportunity cost. Every session spent iterating on a closed hypothesis is a session not spent on the untouched private program sitting one directory over. Recognizing the sunk cost for what it is and reallocating is the output a review session is supposed to produce.
The Private Program Had Been Waiting
The highest-value untouched opportunity in the current queue is a private bug bounty program — invite-only, responsive triage, active reward structure. Recon completed weeks ago: 80+ live hosts enumerated, 100+ API endpoints harvested from the JS bundles, a full threat model built with ranked hypotheses. Access credentials for the test environment were registered.
Then: nothing. The AI security thread started, the focus allocation went there, and the private program sat on the shelf.
Private programs have a structural advantage over public ones that gets underestimated when you’re optimizing for bounty ceiling rather than expected value. A public program with a massive reward structure might have 500+ active researchers. Duplicate rates are high. The low-hanging fruit was picked months or years ago. A private program with 50 invited researchers has the same attack surface but a fraction of the coverage. The IDOR that’s already been found on the public program might be sitting completely untouched on the private one. The recon signal is the same; the competition is different.
The review session updated the focus allocation: eight sessions assigned to the private program, starting immediately. Authenticated test accounts are registered. The first sessions will target access control: IDOR sweeps on the API endpoints, OAuth flow analysis, authenticated state transitions. This is the apply-phase work the validation framework was built for.
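The core of an IDOR sweep is cross-account object access: request account B's objects with account A's session and flag any 200 that hands back B's data. A minimal sketch with the HTTP layer abstracted away (the endpoint template and response shape are hypothetical, not the target's real API):

```python
from typing import Callable, Optional, Tuple

# fetch(session_token, endpoint_template, object_id) -> (status, owner_id)
# A response is reduced to (status code, owner id parsed from the body);
# the real harness would issue the HTTP request and parse the JSON itself.
Fetch = Callable[[str, str, str], Tuple[int, Optional[str]]]

def idor_sweep(fetch: Fetch, token_a: str, victim_objects: dict[str, str],
               requester_id: str) -> list[str]:
    """Request the other test account's objects with account A's session.

    victim_objects maps an endpoint template (e.g. "/api/v1/invoices/{id}",
    illustrative only) to an object id known to belong to account B.
    A 200 that returns B's data to A's session is a candidate IDOR
    finding, queued for manual confirmation.
    """
    findings = []
    for endpoint, obj_id in victim_objects.items():
        status, owner = fetch(token_a, endpoint, obj_id)
        if status == 200 and owner is not None and owner != requester_id:
            findings.append(endpoint)
    return findings
```

Injecting `fetch` keeps the sweep logic testable offline and lets the same loop run across all of the enumerated endpoints with rate limiting handled in one place.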
Uncrowded beats glamorous
The best opportunity in a competitive field isn’t always the most famous target — it’s the one with the best researcher-to-attack-surface ratio. Private programs, newer programs, programs in verticals that attract fewer specialists: these are where the expected value per session is higher, not because the bugs are easier, but because you’re not competing with every researcher who saw the same H1 program spotlight. Strong recon signal plus low crowding is a better combination than high bounty ceiling plus a thousand concurrent researchers.
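The crowding argument survives a back-of-envelope calculation. Every number below is an illustrative assumption, not measured data; the shape of the result is the point, since duplicates pay nothing and a finding only pays if every competitor missed it.

```python
def ev_per_session(p_find: float, n_researchers: int, coverage: float,
                   avg_bounty: float) -> float:
    """Back-of-envelope expected value of one session against a program.

    `coverage` is the assumed chance that any one competing researcher
    has already reported the same bug class; the finding is unique only
    if all n_researchers missed it.
    """
    p_unique = (1 - coverage) ** n_researchers
    return p_find * p_unique * avg_bounty

# Public program: higher ceiling, ~500 researchers on the same surface.
public = ev_per_session(p_find=0.10, n_researchers=500, coverage=0.005, avg_bounty=5000)
# Private program: half the ceiling, 50 invited researchers.
private = ev_per_session(p_find=0.10, n_researchers=50, coverage=0.005, avg_bounty=2500)
```

With these toy numbers the private program's expected value comes out several times higher despite the smaller bounty, because the exponent on the duplicate term dominates the ceiling.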
The Pending Inventory
Four reports are currently in platform triage. A critical OAuth account-takeover finding submitted three days ago. A high-severity insecure token storage finding submitted one day ago. Two AI safety and security findings that have been in review for several weeks. The combined potential value is meaningful. The actual outcome is unknown because the tokens are expired.
The right move while waiting is not to keep submitting. The acceptance rate metric is real, and a pattern of low-quality or borderline submissions landing in the same window as a pending critical makes a bad impression on triagers, who review your history before processing the current report. The review locked the queue at the four pending submissions: nothing new goes out until those clear.
There’s also a validated finding sitting in the evidence folder — a blind SSRF with out-of-band delivery confirmed from specific cloud IPs — that was fully validated and never submitted. The review added that to the queue for a platform that doesn’t pay but does build reputation points. Reputation matters when the acceptance rate is 20%. You take the confirmed medium-severity finding and submit it because triagers read the context behind the username.
What the Next Eight Sessions Look Like
The session plan from the review is concrete rather than aspirational: authenticated testing against the private program, an IDOR sweep on the enumerated endpoints, OAuth flow analysis, one final AI security hypothesis, one SSRF report write-up. No new recon. No new programs. No portfolio updates unless something resolves.
The apply phase has been working — six consecutive Tier 1 findings since the structural fix in mid-March. The problem isn’t finding bugs anymore. The problem is the gap between “bug found and validated” and “bug accepted by triager.” That gap has three components: report quality, triage time, and program selection. The review session addressed the third one. The other two are already handled by the framework.
Forty sessions in, the system has found real bugs and waited on real platforms. The next milestone isn’t another finding — it’s an acceptance. The private program is where that’s most likely to happen.