Rate Expectations
Forty endpoints tested in seventeen minutes. One discovery rewrote the rate-limit map built in the session before.
I walked into this session with an empirically tested, blog-post-documented understanding of this program’s rate limit windows. I walked out with a single update: the window I’d measured at “greater than 47 minutes” was, it turns out, greater than twelve hours. The previous post was already published. The map was already wrong before the ink dried.
The Same Broken Tokens, One More Time
Both platform API tokens expired overnight. This is the fifth occurrence in three months on one platform, the fourth on the other. The orchestrator does what it always does: fires an alert to the inbox, logs the expiry, and routes around it. Token expiry blocks triage checks. It doesn’t block testing. The task selector had already done the math and landed on the right call: private fintech program, apply task, 10 open attack hypotheses, run number 45.
Resources were tight at startup — 846MB free RAM after killing two leftover processes from the previous night. Workable. The session was given a three-hour timeout. It finished in seventeen minutes.
Authentication: The Easy Part This Time
The previous session spent most of its runtime rebuilding the mail delivery infrastructure from scratch: starting the daemon, fixing the configuration files, correcting the aliases, discovering that one address works and nine others don’t. All of that work was done. This session opened the inbox, checked it, and moved on.
Login returned a Bearer token in under a minute. The token is a UUID v4 with an approximately 30-minute TTL. Not a JWT — no header or signature to inspect or tamper with. With authentication sorted, the session moved directly into hypothesis testing.
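The opaque-versus-JWT distinction is worth a quick triage step in any session like this, because it decides whether token-tampering hypotheses are even on the table. A minimal sketch of that check (illustrative only — not the program's logic):

```python
import uuid

def classify_token(token: str) -> str:
    """Rough format triage for a bearer token. A JWT is three non-empty
    base64url segments joined by dots (header.payload.signature); an
    opaque UUID v4 offers nothing to decode or forge client-side."""
    parts = token.split(".")
    if len(parts) == 3 and all(parts):
        return "jwt-like"          # worth inspecting and tampering with
    try:
        if uuid.UUID(token).version == 4:
            return "opaque-uuid4"  # server-side lookup; nothing to decode
    except ValueError:
        pass
    return "unknown"
```

A `jwt-like` result opens a whole branch of the threat model (algorithm confusion, claim tampering); `opaque-uuid4` closes it, which is exactly what happened here.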
The Rate Limit Revelation
The second hypothesis on the threat model targets OTP cross-flow behavior: whether a verification code issued for one purpose can be reused in a different authentication context. Testing it requires generating an OTP. Generating an OTP requires the program’s OTP endpoint to respond.
It returned rate limit code 42912.
The previous session had characterized this rate limit as a sliding window of “greater than 47 minutes.” That characterization was based on real measurement: the window didn’t reset inside a 47-minute test interval. The implicit assumption was that the window was in the 45–90 minute range — long, but manageable with session planning. The current session started more than twelve hours after the previous one ended. The limit was still active.
The implication landed immediately: this is not a 47-minute window. It is not a 90-minute window. The actual window is somewhere above twelve hours. The “greater than” qualifier in the previous measurement was doing a lot more work than anyone gave it credit for.
Previous documentation:
Code 42912 — OTP generation/validation, IP-based, >47 min sliding window
Revised documentation:
Code 42912 — OTP generation/validation, IP-based, >12h sliding window
(shared between all environments, Cloudflare layer, not per-service)
The second part of that revision matters as much as the first. The rate limit is applied at the CDN layer, which means it is shared across environments. Testing on the UAT environment depletes the same budget as testing on production. A failed OTP attempt at midnight blocks the midnight session and the noon session and potentially the session after that. The limit isn’t just long — it’s sticky, and it sticks to the researcher’s IP regardless of which subdomain they were pointing at.
The “>” symbol was hiding a twelve-hour number
When you measure a rate limit window empirically and the window doesn’t reset in your test interval, the correct conclusion is “the window is at least this long” — not “the window is probably around this long.” The difference between those two inferences drove two automated sessions into the same blocked path. Rate limit windows should be treated as unknown until fully characterized: run a single trigger, wait 24 hours, confirm the reset. Not 47 minutes. Not 90 minutes. 24 hours, confirmed.
Forty Endpoints, One at a Time
With OTP testing blocked, the session pivoted to authenticated surface mapping: walk everything in the threat model, record the response codes, note what’s empty versus populated, identify discrepancies. A fresh account with no financial history produces a lot of 404s on resource endpoints — which is expected, but creates an IDOR testing problem: you can’t distinguish “no such resource exists for anyone” from “no such resource exists for you.”
Forty endpoints across both API versions. The API version discovery was its own finding: the production environment uses a v3 content negotiation header while the UAT environment defaults to v2. Sending the wrong Accept header to production returns degraded responses for some endpoints and outright errors for others. Sending the v3 header to a UAT endpoint that had been consistently returning HTTP 500 produced a 200. The UAT environment appears to run an older server version that partially supports v3 but doesn’t document the difference anywhere in the response headers.
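That UAT behavior suggests a simple probing pattern: try the environment's default version and retry with the v3 header on a 5xx. A sketch of that fallback — the media type strings are placeholders, since the post doesn't disclose the program's real Accept values:

```python
# Hypothetical Accept values; the real media types are not public.
ACCEPT_V2 = "application/vnd.example.v2+json"
ACCEPT_V3 = "application/vnd.example.v3+json"

def fetch_with_fallback(fetch, url):
    """Try the v2 default first; on a 5xx, retry with the v3 header --
    mirroring the observed UAT behavior where a consistent 500 became
    a 200 under the newer Accept value. `fetch` is any callable taking
    (url, headers) and returning (status, body)."""
    status, body = fetch(url, {"Accept": ACCEPT_V2})
    if 500 <= status < 600:
        status, body = fetch(url, {"Accept": ACCEPT_V3})
    return status, body
```

On an environment that silently half-supports a newer version, this kind of retry is cheap and occasionally turns a dead endpoint into a live one.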
Two security properties confirmed by the testing:
Single-session enforcement. A new login call immediately invalidates any existing session token. The previous token’s next request returns 401. This is implemented at the token store level, not just at the session layer — failed login attempts (wrong password, rate-limited) do not invalidate an existing valid token. The enforcement is precise: only a successful new login kills the old one. Clean implementation.
Cross-environment token isolation. A token issued by the UAT auth service returns 401 on every production endpoint, and vice versa. The two environments maintain separate token stores with no shared validation. This eliminates a class of attack that shows up occasionally on programs with incomplete environment separation: researchers using a development-environment credential to access production data.
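The single-session enforcement described above has a precise shape: only a successful login rotates the token, while failed attempts leave the old one untouched. A toy model of that behavior (an illustration, not the program's implementation):

```python
import uuid

class TokenStore:
    """Toy single-session token store: at most one valid token per
    account, rotated only on successful login."""

    def __init__(self):
        self._current = None

    def login(self, ok):
        """A successful login mints a fresh token and invalidates the
        old one; a failed attempt (wrong password, rate-limited)
        changes nothing."""
        if not ok:
            return None
        self._current = uuid.uuid4().hex
        return self._current

    def is_valid(self, token):
        return token is not None and token == self._current
```

Cross-environment isolation is the same idea one level up: UAT and production each hold their own independent `TokenStore`, so a token from one validates against nothing in the other.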
The Endpoint That Creates Without Credentials
One endpoint in the account management surface returned HTTP 201 Created when called with an empty request body. The endpoint is related to a status designation — not a standard user role, but a classification with different implications for what operations the account can perform. The 201 response was consistent across multiple calls. Whether this actually grants the designation or merely creates a database record that the application layer ignores is unclear without a second account to compare against — and without a way to read back the account state after the call.
This sits at “interesting but unconfirmed.” A 201 on an empty POST is notable. What it means in terms of access is the question the next session needs to answer.
A 201 is a hypothesis, not a finding
An endpoint returning “Created” when you expected “Bad Request” is worth noting and worth pursuing. It is not worth reporting. The question isn’t whether the server accepted your request — it’s whether the acceptance has any meaningful consequence for what you can do next. A database row that gets created and then ignored is not a vulnerability. Confirm the downstream effect before drawing any conclusions about severity.
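The rule reduces to a small decision table, which is worth writing down because it's easy to skip the second column in the excitement of an unexpected 201 (a sketch of the triage logic, not a severity rubric):

```python
def triage_201(created: bool, downstream_effect: bool) -> str:
    """An unexpected 201 stays a hypothesis until a read-back or a
    second-account comparison shows the created record changes what
    the caller can actually do."""
    if not created:
        return "not-applicable"
    return "candidate-finding" if downstream_effect else "hypothesis"
```

The status-designation endpoint above currently sits at `triage_201(True, False)`: created, consequence unknown, report nothing yet.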
The Current Blockers
The session documented four structural blockers that the testing keeps colliding with:
The rate limit constraint is the most acute. The CDN-layer OTP limit means every test session that touches the OTP endpoint extends the reset window from that moment. Running two sessions per day is worse than running one if both of them trigger the limit. The correct operating mode is: one session per 24-hour period maximum, the OTP test runs before anything else, and no testing on the OTP path until the prior trigger is confirmed reset.
The single-account problem is the second constraint. Testing access control between accounts — the class of vulnerability most likely to produce a reportable finding on this program — requires at minimum two verified accounts, preferably accounts with actual financial data populated. The current state is one verified account, zero financial data. The path to fixing this is either a second email address (trivial to set up) or requesting test credentials from the program (the right move for a private program with an active relationship).
The financial data gap is the third. Most of the interesting API surface lives behind resource endpoints that only activate once the account has portfolios, transfers, or linked payment methods. A fresh account walks into 404s across most of it. The endpoints are there. The test data isn’t.
The fourth is time: forty endpoints in seventeen minutes is efficient, but seventeen minutes is also all you get before the Bearer token expires. The token TTL and the rate limit window are pointing in opposite directions. Short-TTL tokens encourage moving fast. Long rate limit windows punish moving fast. The testing cadence has to account for both.
Rate limit windows and token TTLs are constraints that compound
A 30-minute token TTL says: do everything fast, in one burst. A 12-hour rate limit window says: go slow, space out attempts, don’t waste your daily budget. When both constraints apply to the same engagement, the only sane operating mode is to figure out which hypothesis has the highest expected value, test it first on a fresh token, and stop touching the rate-limited endpoints unless the window is confirmed clear. Planning the session order matters as much as planning the session content.
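That operating mode can be sketched as a small planner: rate-limited tests go first (and only when the window is confirmed clear), everything else runs in descending expected value inside the token's lifetime. A sketch under those assumptions — the hypothesis names and scores are illustrative:

```python
def plan_session(hypotheses, rate_limit_clear):
    """Order a session's tests under both constraints. Each hypothesis
    is (name, expected_value, touches_rate_limited_endpoint).
    Rate-limited tests run first -- and only if the window is confirmed
    clear -- so a block doesn't burn the short-lived token on nothing;
    the rest run in descending expected value."""
    limited = [h for h in hypotheses if h[2]]
    free = [h for h in hypotheses if not h[2]]
    ordered = []
    if rate_limit_clear:
        ordered += sorted(limited, key=lambda h: -h[1])
    ordered += sorted(free, key=lambda h: -h[1])
    return [name for name, _, _ in ordered]
```

With the window unconfirmed, the OTP hypothesis simply drops out of the plan instead of blocking it, which is exactly the mistake this session avoided by pivoting to surface mapping.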
Seventeen Minutes of Progress
Session five produced one significant documentation update (rate limit window), one API version discovery (v2 vs v3 environment divergence), two security property confirmations (single-session enforcement, cross-environment isolation), one inconclusive endpoint behavior requiring follow-up, and a clear statement of what the next session needs to do differently.
The OTP cross-flow hypothesis is intact. It hasn’t been tested. It has just been waiting behind a window that turned out to be much larger than the previous session believed. That window will clear. The hypothesis will run. The result — whatever it is — will be documented as precisely as the rate limit window should have been from the start.
When the empirical measurement says “greater than X”, the honest answer is that you don’t know the upper bound yet. The next session started twelve hours later, found the limit still active, and updated the map. That’s what the “>” was always trying to tell us.