fix(proxy): narrow budget-safe gate to primary usage#561
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 963e61a60f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| and state_list | ||
| and all(_state_above_budget_threshold(state, budget_threshold_pct) for state in state_list) | ||
| ): | ||
| return SelectionResult(min(state_list, key=_primary_budget_pressure_sort_key), None) |
There was a problem hiding this comment.
Route all-pressured fallback through availability checks
When every candidate is above the primary threshold, this branch returns min(state_list, ...) directly instead of delegating to select_account. That bypasses the availability filtering in app/core/balancer/logic.py (RATE_LIMITED/QUOTA_EXCEEDED reset checks, cooldown suppression, and error-backoff handling), so under usage_weighted the balancer can pick an account that is currently unavailable just because it has lower primary usage. This regression appears whenever all primary usages are > threshold and at least one candidate is blocked/unrecoverable at selection time.
Useful? React with 👍 / 👎.
963e61a to
b10cabe
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b10cabe0f9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for state in sorted(state_list, key=_primary_budget_pressure_sort_key): | ||
| candidate = select_account( | ||
| [state], |
There was a problem hiding this comment.
Preserve health-tier preference in all-pressured fallback
This loop evaluates each account in isolation, which bypasses select_account’s pool-level health-tier logic (healthy > probing > draining). In the usage_weighted + all-primary-pressured path, a draining account with slightly lower primary usage can now be selected before a healthy account, even though the normal selector intentionally prefers healthy capacity first. That is a behavioral regression in failover safety and can route traffic to degraded accounts when healthy ones are available.
Useful? React with 👍 / 👎.
|
Thanks @Komzpa -- the #446 narrowing is exactly right. Codex review just landed two P1 findings on P1 -- bypassing availability filteringThe new all-primary-pressured branch iterates P1 -- bypassing health-tier preferenceSame shape: Suggested shapeKeep the gate-narrowing change as-is (that's the right fix for #446), but rework the all-pressured fallback so it stays inside one
The second option is cleaner architecturally, the first is less invasive. Either is fine. Could you also add a regression test for each P1: def test_all_primary_pressured_fallback_skips_unavailable_account()
def test_all_primary_pressured_fallback_prefers_healthy_over_draining()Tests can stub The OpenSpec proposal copy is correct -- both scenarios ( Ping me with the new commit and I'll re-run |
|
To use Codex here, create an environment for this repo. |
2b22cf6 to
f3f12d9
Compare
|
@codex review |
|
Codex Review: Didn't find any major issues. Another round soon, please! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@all-contributors please add @Komzpa for code, test |
…l fetch timeouts, image rewrite) Upstream batch v1.16.0 highlights: - proxy: expose drain status for graceful deploys (Soju06#564) - proxy: narrow budget-safe gate to primary usage (Soju06#561) - proxy: handle model fetch timeouts (Soju06#541) - proxy: inline-rewrite input_image file references (Soju06#571) - proxy: retry transient stream timeouts (Soju06#551) - proxy: use DEFAULT_HOME_DIR for oversized response.create dumps (Soju06#575) - accounts: split compact quota row + recover quota status (Soju06#562, Soju06#559) - db: size background pool for burst traffic (Soju06#563) - upstream: drop top_p for gpt-5 (Soju06#538) Conflict resolved: - app/modules/proxy/service.py: kept fork's multi-line import + FailurePhase (used by deterministic failover at line 8294). Merge cleanup: - app/main.py: removed pre-existing duplicate InFlightMiddleware class + duplicate _is_benign_metrics_bind_failure helper that shadowed canonical versions and broke new upstream test (test_in_flight_middleware_does_not_count_drain_status). Verification: - pytest tests/unit: 1459 passed, 17 failed (all pre-existing on backup branch backup/pre-upstream-merge-20260511 — date-bound API key tests, test_db_migrate.py SQLite env, settings duplicate validators). - openspec validate --specs: 19/19 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
usage_weightedfallback prefer lower primary pressure before lower weekly usage_state_above_budget_thresholdprimary-only gating #446 edge casesCloses #446.
Verification
openspec validate narrow-budget-safe-primary-gate --strictUV_PROJECT_ENVIRONMENT=/home/kom/.venvs/codex-lb-fix-446 uv run pytest tests/unit/test_proxy_load_balancer_refresh.py -k 'budget_safe or reads_cached_usage_once' -q\n-UV_PROJECT_ENVIRONMENT=/home/kom/.venvs/codex-lb-fix-446 uv run ruff check app/modules/proxy/load_balancer.py tests/unit/test_proxy_load_balancer_refresh.py\n-UV_PROJECT_ENVIRONMENT=/home/kom/.venvs/codex-lb-fix-446 uv run ruff format --check app/modules/proxy/load_balancer.py tests/unit/test_proxy_load_balancer_refresh.py\n-UV_PROJECT_ENVIRONMENT=/home/kom/.venvs/codex-lb-fix-446 uv run ty check app/modules/proxy/load_balancer.py tests/unit/test_proxy_load_balancer_refresh.py