fix(proxy): slim oversized response.create history #560
Conversation
@codex review |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2b3479d4f3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```python
candidate_payload["input"] = [
    _response_create_history_omission_notice_item(omitted_count),
    *remaining_historical,
    *recent,
]
```
Retry fit without notice when all history is dropped
When _omit_historical_response_input_items_to_fit removes every historical item, it still always prepends _response_create_history_omission_notice_item, and if that final payload is over max_bytes the function returns it anyway. This creates avoidable 413 payload_too_large failures for near-limit requests where recent alone would fit but notice + recent does not (for example, tiny historical items that are smaller than the notice text). Please add a final fallback that checks the recent-only payload before returning an oversized result.
Useful? React with 👍 / 👎.
|
Thanks @Komzpa -- the historical-item omission pass looks right. P2: add a recent-only fallback for when notice + recent is still over budget. Suggested addition right before the function returns the final payload:

```python
if _serialized_json_size(candidate_payload) > max_bytes:
    # Final fallback: notice + recent itself is oversized; try recent
    # alone. Drops the omission disclosure but lets the request go
    # through instead of hard-failing the user-visible turn.
    recent_only = dict(payload)
    recent_only["input"] = list(recent)
    if _serialized_json_size(recent_only) <= max_bytes:
        return recent_only, omitted_count
```

The trade-off is that the user no longer sees the "omitted N items" notice, but they also do not get a 413 in a case where the request could have succeeded. If you'd rather keep the notice always and just log a warning + still 413, that's a defensible call too -- the current behavior is silently returning an oversized payload. Quick regression to lock the contract:

```python
def test_slim_response_create_drops_notice_when_notice_plus_recent_oversized():
    # tiny historical items; recent + notice text combined > max_bytes,
    # but recent alone <= max_bytes
    # assert summary['historical_items_omitted'] == len(history)
    # assert _serialized_json_size(slimmed_payload) <= max_bytes
    # assert no notice item in slimmed_payload['input']
```

Not a hard blocker -- if you'd rather keep this PR scoped to the original ask and ship as-is, fine; the current behavior is no worse than today's pre-PR 413. Up to you. The dump-dir move and the omission-pass core both look right. If you push the fallback I'll re-run. |
|
@codex review |
|
Codex Review: Didn't find any major issues. Hooray! |
|
@all-contributors please add @Komzpa for code, test |
|
@codex review |
|
Codex Review: Didn't find any major issues. 🎉 |
This reverts merge commit e42af5e. The slimming approach rewrites conversation history on the codex-lb side (a stateless forwarding proxy) and conflicts with several stateful surfaces that have to stay consistent with the upstream view:

- Prompt-cache invalidation: every slimmed turn changes the input prefix and forces a cache miss against the upstream prompt cache, silently re-billing input tokens. This intersects with #515's pin / affinity work and is invisible to the user.
- Assistant-role omission notice (`[codex-lb omitted N items]`): the model treats this as its own prior speech and starts apologising for missing context, hallucinating that earlier history is known, or contradicting the trimmed turns.
- WebSocket turn_state mismatch: the Codex WS path is stateful through `previous_response_id`. After slimming, codex-lb's request view diverges from the upstream session view; replays of the slimmed payload can reference items the upstream copy expected but we already trimmed.

These are not knobs we can tune: history compaction belongs in the Codex client (or the upstream Responses API), not in a forwarding proxy. The 413 -> slim transition is strictly better for the immediate symptom, but it shifts the failure mode from a clear hard fail into silent degradation of cache cost, model trust, and WS continuity. Reverting before this lands in a release. Tracking the broader follow-up in #568. The original 413 report (#556) reopens for now while we route this through Codex CLI / file-upload paths instead.
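To illustrate the prompt-cache point: prefix-keyed caches are invalidated by any change to the earliest input items, so dropping even one old item re-keys every later turn. A hypothetical sketch (the hashing scheme is illustrative only, not the upstream cache's actual implementation):

```python
import hashlib
import json


def prompt_cache_key(input_items: list[dict]) -> str:
    # Illustrative only: a prefix-keyed cache keys on the exact
    # serialized input, so any edit to early items yields a new key.
    blob = json.dumps(input_items, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


history = [
    {"role": "user", "content": "hello"},
    {"role": "assistant", "content": "hi"},
    {"role": "user", "content": "next question"},
]
slimmed = history[1:]  # proxy drops the oldest item

# The slimmed turn no longer shares a key with the original request,
# so upstream treats it as a cache miss and re-bills the input tokens.
assert prompt_cache_key(history) != prompt_cache_key(slimmed)
```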
…575) The oversized response.create payload dump dir was hardcoded to `/var/lib/codex-lb/debug/response-create-dumps`. That path is correct inside the container image (where DEFAULT_HOME_DIR resolves to /var/lib/codex-lb), but on macOS `uv tool` / LaunchAgent installs the process has no writable access to /var/lib/codex-lb, and the dump path silently fails when the proxy tries to write a guarded oversized payload (originally reported in #556).

Resolve the dump dir from DEFAULT_HOME_DIR so the path tracks the deploy's data directory:

- inside the container image: `/var/lib/codex-lb/debug/response-create-dumps` (unchanged)
- macOS `uv tool` install: `~/.codex-lb/debug/response-create-dumps`

This is the dump-path slice of the reverted #560 (#569). It does not include any of the history-slimming changes from #560 -- only the portable dump path, which has no transport-layer interaction and only fires when the proxy is *guarding* against an oversized request that already exceeds the upstream WebSocket frame budget.

Verified against the dev container: DEFAULT_HOME_DIR resolves to `/var/lib/codex-lb`, dump_dir resolves to `/var/lib/codex-lb/debug/response-create-dumps`, smoke test on /v1/responses returns 'OK', and the existing test suite passes (1493 passed, 3 skipped, 4 warnings).
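The shape of the fix can be sketched as below; `DEFAULT_HOME_DIR` and the helper name are assumptions for illustration (the actual constant lives in the proxy module):

```python
from pathlib import Path

# Assumption: DEFAULT_HOME_DIR is the deploy's data directory. The
# container image resolves it to /var/lib/codex-lb; a local `uv tool`
# install resolves it to ~/.codex-lb.
DEFAULT_HOME_DIR = Path.home() / ".codex-lb"


def response_create_dump_dir(home_dir: Path = DEFAULT_HOME_DIR) -> Path:
    # Derive the dump dir from the home dir instead of hardcoding
    # /var/lib/codex-lb/..., so both deploy styles get a writable path.
    return home_dir / "debug" / "response-create-dumps"
```

With this, the container path is unchanged while user installs land under a directory the process can actually write to.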
Summary
- Adds `response.create` slimming by omitting the oldest historical input items when image/tool-output slimming is still over the websocket budget.
- Moves `response.create` debug dumps under the default app home directory so local user installs use `~/.codex-lb` while containers keep `/var/lib/codex-lb`.

Fixes #556.
Validation
- `openspec validate --specs`
- `uv run pytest tests/unit/test_proxy_utils.py -k 'slim_response_create or oversized_response_create_dump_dir' -q`
- `uv run ruff check app/modules/proxy/service.py tests/unit/test_proxy_utils.py`
- `uv run ty check app/modules/proxy/service.py tests/unit/test_proxy_utils.py`