Skip to content

[fix](scan) Respect byte budget when merging scan blocks#63296

Draft
mrhhsg wants to merge 1 commit into
branch-4.1from
codex/fix-scanner-merge-byte-budget-4.1
Draft

[fix](scan) Respect byte budget when merging scan blocks#63296
mrhhsg wants to merge 1 commit into
branch-4.1from
codex/fix-scanner-merge-byte-budget-4.1

Conversation

@mrhhsg
Copy link
Copy Markdown
Member

@mrhhsg mrhhsg commented May 15, 2026

Summary

Fix scanner scheduler block merging so the adaptive batch size byte budget is respected when multiple scanned blocks are stitched into a cached block.

Root Cause

The scheduler merge path only checked the row count against batch_size(). When adaptive batch size produced multiple blocks that were individually acceptable, the scheduler could still merge them into a much larger block because it ignored preferred_block_size_bytes().

Changes

  • Capture preferred_block_size_bytes() for the scan task.
  • Merge into the last cached block only when both the row budget and byte budget are satisfied.
  • Keep empty-block merge behavior unchanged so eos/filtered-empty blocks are not emitted separately.
  • Preserve allocated_bytes() for memory accounting while using bytes() for the adaptive data-size budget.

Validation

  • git diff --check -- be/src/exec/scan/scanner_scheduler.cpp
  • ninja -C be/ut_build_ASAN src/exec/CMakeFiles/Exec.dir/scan/scanner_scheduler.cpp.o

Note: ./run-be-ut.sh --run --filter=ScannerContextTest.* was started earlier but stopped after it triggered a broad ASAN UT build; the changed object had already compiled successfully.

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg mrhhsg changed the title [codex] Respect byte budget when merging scan blocks [fix](scan) Respect byte budget when merging scan blocks May 15, 2026
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 15, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the full PR diff and existing review context. No existing inline comments were present, and there was no additional user focus.

Critical checkpoints: correctness has a blocking issue because disabled adaptive batch sizing now still enforces a 512MB merge cap; compatibility/regression risk is therefore not acceptable until fixed. I did not identify concurrency, memory-accounting, security, or API-surface issues beyond this behavior regression. Tests/build were not run by me in this review environment; the PR description reports object compilation only.


const auto free_block_data_bytes = free_block->bytes();
const bool within_byte_budget = preferred_block_size_bytes == 0 ||
last_block->bytes() + free_block_data_bytes <=
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This treats preferred_block_size_bytes == 0 as the disabled case, but RuntimeState::preferred_block_size_bytes() never returns 0 when adaptive batch sizing is disabled; it returns kMax (512MB) and the accessor comment says callers that need the active/disabled state must check config::enable_adaptive_batch_size explicitly. As a result, with enable_adaptive_batch_size=false, this PR still stops merging once the combined data bytes exceed 512MB, whereas the old behavior was row-budget-only. That is a behavior regression for disabled adaptive mode and can produce extra cached blocks / earlier scan-task yielding for wide rows. Please gate the byte-budget check on config::enable_adaptive_batch_size (or otherwise pass a disabled byte budget) instead of relying on 0 here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants