[feature](file-cache) Disable cache writes after remote scan threshold #65058
Draft
bobhan1 wants to merge 9 commits into
…file cache write index only (apache#9450)

pick selectdb/selectdb-core#9096

- Cherry-pick `a37184e9f097e789c4f4e0b40725a98fc49f2851` from `tag-selectdb-cloud-26.0.3-minimax`.
- Support file cache write index-only behavior and related segment index/footer preload plumbing.
- Adapt conflicts for this branch's local writer, packed file, and vertical compaction interfaces.
- Preserved this branch's existing index writer interface instead of importing source-branch-only `IndexFileWriterPtr` / `create_index_file_writer` plumbing.
- Preserved packed-file behavior and added branch-local `PackedAppendContext::write_file_cache` compatibility with legacy default behavior.
- Adapted the new vertical compaction test to this branch's 4-argument `RowsetWriter::add_columns` API.
- Did not introduce `enable_file_cache_write_cumu_compaction_index_only` or `enable_file_cache_write_base_compaction_index_only`, because this branch did not originally have those configs.
- `git diff HEAD^ HEAD --check`
- `rg -n "\\bIndexFileWriterPtr\\b|has_ann_index|create_index_file_writer|enable_file_cache_write_(base|cumu)_compaction_index_only|compaction_output_write_index_only|should_enable_compaction_cache_index_only" be/src be/test regression-test docker/runtime/doris-compose/command.py -S`
- `./run-be-ut.sh --run --filter=CloudFileCacheWriteIndexOnlyConfigTest.* -j120`
- `./run-be-ut.sh --run --filter=CloudFileCacheWriteIndexOnly* -j120`
- `./build.sh --be --fe --cloud -j120`
- Rebuilt `foundationdb/foundationdb:7.1.26-single-layer` from remote registry layers to avoid the local containerd overlay mount failure. Verified the imported image has one rootfs layer.
- Rebuilt `bh-cluster-2` with the cloud FE enterprise guard jar present in `output/fe/lib/fe-enterprise.jar`, so FE can load `org.apache.doris.cluster.ClusterGuard` in cloud docker mode.
- `env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy DORIS_FDB_IMAGE=foundationdb/foundationdb:7.1.26-single-layer ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/write_index_only -runMode=cloud`
  - Result: `Test 3 suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts`
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: The empty-range file cache loader unit test expected load_segment_index_to_file_cache to skip opening segment data, but it did not enable cloud mode, so the function returned at the cloud-mode config gate before reaching the empty-range guard. Enable cloud mode in the fixture, restore it in teardown, and make the S3 open path fail if an empty range ever tries to open the segment.
### Release note
None
### Check List (For Author)
- Test: Unit Test
- `./run-be-ut.sh --run --filter=CloudFileCacheWriteIndexOnlyConfigTest.* -j100`
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: TopN lazy materialization phase 2 may populate file cache while fetching deferred columns. This can pollute cache when the requested ranges are cache misses. Add a cloud-only session switch for the new PMultiGetRequestV2 path so phase-2 reads use cached blocks only when the full range is already downloaded, and otherwise read remote data directly without writing file cache. The change also exposes phase-2 file-cache counters in the MaterializeNode profile and covers row-store and column-store fetch paths.
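The "use cached blocks only when the full range is already downloaded" rule can be illustrated with a minimal sketch. The types and function names below (`CachedBlock`, `fully_covered_by_downloaded`, `choose_phase2_read_source`) are hypothetical and are not the Doris `BlockFileCache` API; the sketch only assumes that cached blocks are sorted, non-overlapping byte ranges with a downloaded flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of one cached block: a half-open byte range
// [offset, offset + size) plus whether its data is already on local disk.
struct CachedBlock {
    size_t offset;
    size_t size;
    bool downloaded;
};

// Returns true only if [offset, offset + size) is entirely covered by
// downloaded blocks. `blocks` must be sorted by offset and non-overlapping.
// The check is read-only: it never creates or mutates a block.
bool fully_covered_by_downloaded(const std::vector<CachedBlock>& blocks,
                                 size_t offset, size_t size) {
    if (size == 0) return true;
    const size_t end = offset + size;
    size_t cursor = offset;  // first byte not yet proven to be cached
    for (const CachedBlock& b : blocks) {
        const size_t block_end = b.offset + b.size;
        if (block_end <= cursor) continue;    // block lies before the cursor
        if (b.offset > cursor) return false;  // hole before this block
        if (!b.downloaded) return false;      // overlapping block has no data
        cursor = block_end;
        if (cursor >= end) return true;
    }
    return false;  // ran out of blocks before reaching the end of the range
}

enum class ReadSource { kLocalCache, kRemoteNoCacheWrite };

// Phase-2 decision in this sketch: serve from cache on a full hit, otherwise
// read remote data directly and skip the cache write.
ReadSource choose_phase2_read_source(const std::vector<CachedBlock>& blocks,
                                     size_t offset, size_t size) {
    return fully_covered_by_downloaded(blocks, offset, size)
                   ? ReadSource::kLocalCache
                   : ReadSource::kRemoteNoCacheWrite;
}
```

A partial hit is treated the same as a miss here, which matches the summary's wording that cached blocks are used only when the full range is already downloaded.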
### Release note
Added session variable `enable_topn_lazy_mat_phase2_no_write_file_cache` to avoid file-cache writes on TopN lazy materialization phase-2 cache misses.
### Check List (For Author)
- Test:
- Unit Test: ./run-be-ut.sh --run --filter=BlockFileCacheTest.get_downloaded_blocks_if_fully_covered_is_read_only:BlockFileCacheTest.cached_remote_file_reader_remote_only_on_miss -j20
- Build: ./build.sh --be --fe --cloud -j100
- Format: build-support/check-format.sh
- Regression test: env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy ./run-regression-test.sh --run -d cloud_p0/cache/topn_lazy_file_cache -s test_topn_lazy_mat_phase2_no_write_file_cache -g docker -runMode=cloud -dockerSuiteParallel 1
- Behavior changed: Yes. When the new session variable is enabled in cloud mode, TopN lazy materialization phase-2 cache misses read remote data without writing file cache.
- Does this need documentation: No
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: TopN lazy materialization phase 2 exposed only aggregated profile counters, which makes backend-level skew and IO differences hard to identify when phase-2 fetch fans out to multiple backends. Add aggregate rows/segments counters and per-backend rows, segments, and file-cache statistics for the new TopN lazy materialization V2 path. The per-backend values are accumulated in MaterializationSharedState so multiple phase-2 fetch calls in one query are reflected in the final profile.
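As a rough illustration of accumulating per-backend values across several phase-2 fetch calls, the sketch below keeps a map keyed by backend id and derives the aggregate from it. `BackendPhase2Stats` and `Phase2ProfileAccumulator` are hypothetical names, the field set is an assumption, and the sketch is single-threaded; it is not the `MaterializationSharedState` implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical per-backend counters for one query's phase-2 fetches.
struct BackendPhase2Stats {
    int64_t rows = 0;
    int64_t segments = 0;
    int64_t bytes_read_from_cache = 0;
    int64_t bytes_read_from_remote = 0;
};

// Accumulates the result of every fetch call so the final profile reflects
// all calls in the query, not only the last one.
class Phase2ProfileAccumulator {
public:
    void add(int64_t backend_id, const BackendPhase2Stats& delta) {
        BackendPhase2Stats& total = _per_backend[backend_id];
        total.rows += delta.rows;
        total.segments += delta.segments;
        total.bytes_read_from_cache += delta.bytes_read_from_cache;
        total.bytes_read_from_remote += delta.bytes_read_from_remote;
    }

    const std::map<int64_t, BackendPhase2Stats>& per_backend() const {
        return _per_backend;
    }

    // Aggregate view: the sum over all backends.
    BackendPhase2Stats aggregate() const {
        BackendPhase2Stats sum;
        for (const auto& entry : _per_backend) {
            sum.rows += entry.second.rows;
            sum.segments += entry.second.segments;
            sum.bytes_read_from_cache += entry.second.bytes_read_from_cache;
            sum.bytes_read_from_remote += entry.second.bytes_read_from_remote;
        }
        return sum;
    }

private:
    std::map<int64_t, BackendPhase2Stats> _per_backend;
};
```

Keeping the per-backend rows alongside the aggregate is what makes skew visible: two backends with the same total bytes can still differ sharply in how much came from remote storage.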
### Release note
Added per-backend TopN lazy materialization phase-2 profile counters.
### Check List (For Author)
- Test:
- Build: ./build.sh --be --fe --cloud -j100
- Format: build-support/clang-format.sh; build-support/check-format.sh; git diff --check
- Regression test: env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy ./run-regression-test.sh --run -d cloud_p0/cache/topn_lazy_file_cache -s test_topn_lazy_mat_phase2_no_write_file_cache -g docker -runMode=cloud -dockerSuiteParallel 1
- Behavior changed: Yes. TopN lazy materialization phase-2 profiles now include aggregate row/segment counts and per-backend detail counters.
- Does this need documentation: No
…able cache writes after remote scan threshold (apache#9151) (apache#9531)

Pick selectdb/selectdb-core#9151 without the dependency on the FE session variable `enable_file_cache`, and fix the control for inverted index files.

- Backport remote scan cache write limiting for `remote_scan_no_write_file_cache_threshold_bytes`.
- Remove the dependency on FE session variable `enable_file_cache` so the threshold still disables cache writes when session file-cache reads are disabled.
- Add and update BE UT / cloud docker regression coverage for the threshold behavior.
- `git diff --check`
- `./run-be-ut.sh --run --filter=BlockFileCacheTest.file_cache_profile_remote_only_on_miss_state_counters:BlockFileCacheTest.remote_scan_cache_write_limiter_strict_budget:BlockFileCacheTest.remote_scan_cache_write_limiter_threshold_zero_and_negative:BlockFileCacheTest.remote_scan_cache_write_limiter_concurrent_budget:BlockFileCacheTest.get_or_set_remote_scan_cache_write_limiter_admission:BlockFileCacheTest.cached_remote_file_reader_policy_remote_only_with_scan_limiter:BlockFileCacheTest.cached_remote_file_reader_remote_scan_cache_write_limiter -j100`
- `./build.sh --be --cloud -j100`
- `./build.sh --be -j100`
- Verified FE license jar in docker image `bh-cluster-2`.
- `env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy DORIS_FDB_IMAGE=foundationdb/foundationdb:7.1.26-single-layer ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/remote_scan_no_write_file_cache -s test_remote_scan_no_write_file_cache_threshold -runMode=cloud -dockerSuiteParallel 1`

(cherry picked from commit c45f7c05039841bc1c97be497b325cc8b015e116)
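A byte-budget limiter of the kind described above can be sketched with a single atomic counter. `RemoteScanCacheWriteBudget` and `try_admit` are hypothetical names, and the strict "never exceed the threshold" policy is an assumption drawn from the unit-test names rather than from the actual limiter code. How zero or negative thresholds are handled is not described in this page, so the sketch only covers positive thresholds.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical per-query budget for bytes written into file cache by remote
// scans. Assumes threshold_bytes > 0.
class RemoteScanCacheWriteBudget {
public:
    explicit RemoteScanCacheWriteBudget(int64_t threshold_bytes)
            : _threshold(threshold_bytes) {}

    // Tries to reserve `bytes` of cache-write budget. Returns true if the
    // write may go to file cache, false if the caller should read remote
    // data without writing it back.
    bool try_admit(int64_t bytes) {
        int64_t used = _used.load(std::memory_order_relaxed);
        while (true) {
            if (used + bytes > _threshold) return false;  // would exceed budget
            // On failure, compare_exchange_weak reloads `used`, so the loop
            // re-checks the budget against the latest value.
            if (_used.compare_exchange_weak(used, used + bytes,
                                            std::memory_order_relaxed)) {
                return true;
            }
        }
    }

    int64_t used_bytes() const { return _used.load(std::memory_order_relaxed); }

private:
    const int64_t _threshold;
    std::atomic<int64_t> _used{0};
};
```

The compare-and-swap loop keeps the reservation atomic, so concurrent scanners sharing one budget cannot jointly overshoot the threshold. A rejected request leaves the counter unchanged, which lets a later, smaller write still fit.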
…remote scan cache limit session var (apache#9578)

- Rename `remote_scan_no_write_file_cache_threshold_bytes` to `file_cache_query_limit_bytes` across FE session variables, thrift, and BE query options.
- Keep the remote scan cache write limiter behavior unchanged while aligning the byte-based option with `file_cache_query_limit_percent` naming.
- Update the remote scan cache regression suite and make the segment-footer cache-write assertion conditional on actual footer remote IO.
- `./run-fe-ut.sh --run org.apache.doris.qe.SessionVariablesTest`
- `./build.sh --be --fe --cloud -j100`
- `mvn package -pl fe-enterprise/fe-license-cloud -am -DskipTests`
- `docker build -f docker/runtime/doris-compose/Dockerfile -t bh-cluster-2 .`
- `env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy DORIS_FDB_IMAGE=foundationdb/foundationdb:7.1.26-single-layer ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/remote_scan_no_write_file_cache -s test_remote_scan_no_write_file_cache_threshold -runMode=cloud -dockerSuiteParallel 1 -image bh-cluster-2`
Add specialized file cache write metrics for inverted index and segment footer index reads.

The query profile already exposed detailed file cache read counters for `InvertedIndex` and `SegmentFooterIndex`, but write-back into file cache was only visible through the aggregate `BytesWriteIntoCache` and `WriteCacheIOUseTimer` counters. This change adds the corresponding specialized write counters:

- `InvertedIndexWriteCacheIOUseTimer`
- `InvertedIndexBytesWriteIntoCache`
- `SegmentFooterIndexWriteCacheIOUseTimer`
- `SegmentFooterIndexBytesWriteIntoCache`

Changes and tests:

- Extend `FileCacheStatistics` with per-category write-cache bytes and timer fields.
- Register and update the new `RuntimeProfile` counters under `FileCache`.
- Route `CachedRemoteFileReader` write-back bytes and local write timer through the existing `FileCacheReadType` classification.
- Add BE UT coverage for both profile counter reporting and real cached remote reader miss-fill stats.
- `git diff --check`
- `./run-be-ut.sh --run --filter=BlockFileCacheTest.file_cache_profile_specialized_write_cache_counters:BlockFileCacheTest.cached_remote_file_reader_specialized_write_cache_stats:BlockFileCacheTest.cached_remote_file_reader_remote_scan_cache_write_limiter:BlockFileCacheTest.cached_remote_file_reader_policy_remote_only_with_scan_limiter -j100`
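Routing write-back stats through a read-type classification can be sketched as follows. The enum and struct are simplified stand-ins, not the real `FileCacheReadType` or `FileCacheStatistics` definitions, and the sketch assumes the aggregate counters keep including the specialized categories.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the read-type classification (hypothetical values).
enum class ReadType { kData, kInvertedIndex, kSegmentFooterIndex };

// Simplified stand-in for the write-cache part of the file cache statistics.
struct WriteCacheStats {
    int64_t bytes_write_into_cache = 0;
    int64_t write_cache_io_timer_ns = 0;
    int64_t inverted_index_bytes_write_into_cache = 0;
    int64_t inverted_index_write_cache_io_timer_ns = 0;
    int64_t segment_footer_index_bytes_write_into_cache = 0;
    int64_t segment_footer_index_write_cache_io_timer_ns = 0;
};

// Records one cache write-back. The aggregate counters are always updated
// (an assumption of this sketch); the specialized ones only for their type.
void record_cache_write(WriteCacheStats& stats, ReadType type, int64_t bytes,
                        int64_t elapsed_ns) {
    stats.bytes_write_into_cache += bytes;
    stats.write_cache_io_timer_ns += elapsed_ns;
    switch (type) {
    case ReadType::kInvertedIndex:
        stats.inverted_index_bytes_write_into_cache += bytes;
        stats.inverted_index_write_cache_io_timer_ns += elapsed_ns;
        break;
    case ReadType::kSegmentFooterIndex:
        stats.segment_footer_index_bytes_write_into_cache += bytes;
        stats.segment_footer_index_write_cache_io_timer_ns += elapsed_ns;
        break;
    case ReadType::kData:
        break;  // plain data has no specialized counter in this sketch
    }
}
```

With this shape, the share of cache writes caused by plain data reads is the aggregate minus the two specialized counters.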
… segment meta query limit accounting (apache#9591)

Add a BE config `file_cache_query_limit_segment_meta` to control whether segment footer and other segment metadata cache writes are counted by `file_cache_query_limit_bytes`. The default remains unchanged: segment metadata cache writes do not consume the per-query write limit unless the new config is enabled. Inverted index cache writes keep their existing behavior.

`file_cache_query_limit_bytes` is enforced through the per-query `RemoteScanCacheWriteLimiter`, but several segment metadata read paths created fresh `IOContext` instances or loaded metadata without receiving the query `IOContext`. As a result, segment footer and related segment metadata cache writes were not consistently visible to the query-level limiter.

- Add mutable BE config `file_cache_query_limit_segment_meta`.
- Update `CacheContext` admission so segment metadata participates in query write-limit accounting only when the config is enabled.
- Propagate query `IOContext` through segment footer, primary-key index, column reader cache, and variant subcolumn reader paths.
- Keep the metadata classification as segment metadata (`is_index_data=true`, `is_inverted_index=false`) so the behavior is not confused with inverted indexes.
- Add BE UT coverage for CacheContext admission and segment footer `IOContext` propagation.
- Add a cloud docker profile case that validates:
  - segment metadata not counted: profile `BytesWriteIntoCache` can exceed the query threshold by metadata writes;
  - segment metadata counted: admitted writes respect the threshold;
  - tiny threshold: config off gives `BytesWriteIntoCache > 0`, config on gives `BytesWriteIntoCache = 0`.
- `./run-be-ut.sh --run --filter=BlockFileCacheTest.get_or_set_remote_scan_cache_write_limiter_segment_meta_config:SegmentFooterCacheTest.GetSegmentFooterPropagatesIoContext:DorisFSDirectoryTest.FSIndexInputSetIoContextPropagatesQueryLimiter -j100`
- `./build.sh --be -j100`
- `cd fe && mvn -pl fe-enterprise/fe-license-cloud -am -DskipTests package`
- Rebuilt docker image `bh-cluster-2` with rebuilt `doris_be` and `fe-license-cloud-1.2-SNAPSHOT.jar` copied into `output/fe/lib/fe-enterprise.jar`.
- Docker validation used `DORIS_FDB_IMAGE=foundationdb/foundationdb:7.1.26-single-layer` and no license URL.
- Existing added file-cache docker cases passed:
  - `remote_scan_no_write_file_cache/test_remote_scan_no_write_file_cache_threshold.groovy`
  - `topn_lazy_file_cache/test_topn_lazy_mat_phase2_no_write_file_cache.groovy`
  - all cases under `write_index_only`
- New docker case passed:
  - `remote_scan_no_write_file_cache/test_file_cache_query_limit_segment_meta_profile.groovy`
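The config-gated accounting rule can be sketched as a small predicate. `CacheContextFlags` is a hypothetical reduction of `CacheContext` to the two flags named above, and the boolean parameter stands in for the BE config `file_cache_query_limit_segment_meta`. The treatment of everything other than segment metadata is an assumption of the sketch: the text only says inverted index writes keep their existing behavior, without describing it.

```cpp
#include <cassert>

// Hypothetical reduction of the cache context to the two classification flags.
struct CacheContextFlags {
    bool is_index_data;
    bool is_inverted_index;
};

// Segment metadata is index data that is not an inverted index.
bool is_segment_meta(const CacheContextFlags& ctx) {
    return ctx.is_index_data && !ctx.is_inverted_index;
}

// Decides whether a cache write should consume the per-query write budget.
// `limit_segment_meta` mirrors the config value (default: false).
bool counts_toward_query_limit(const CacheContextFlags& ctx,
                               bool limit_segment_meta) {
    if (is_segment_meta(ctx)) {
        return limit_segment_meta;
    }
    // Assumption: every other write kind is counted. The real behavior for
    // inverted index writes is not described in this page.
    return true;
}
```

This also shows why the tiny-threshold docker case behaves as listed: with the config off, metadata writes bypass the budget and `BytesWriteIntoCache` stays above zero; with it on, they are rejected like any other write.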
…ery IO context to preload segment meta (apache#9601)

fix selectdb/selectdb-core#9591

Fix query file-cache limit accounting for segment footer/meta reads done by the parallel scanner preload path.

`file_cache_query_limit_segment_meta` only worked on paths that already carried the query `IOContext`. `ParallelScannerBuilder::_load()` preloads segment row counts by opening segment footers before scanner construction, but that path did not pass the query IO context, so footer/meta reads did not see the query id, file-cache profile stats, or the query-wide `RemoteScanCacheWriteLimiter`.

This change threads an optional `IOContext` through `SegmentLoader`, `BetaRowset::get_segment_num_rows`, and `Segment::open`, and builds a query IO context for the parallel preload step. It also extends coverage for `Segment::open` IOContext propagation and adds docker regression coverage for the parallel preload behavior without `dry_run_query`.

- `git diff --check`
- `./run-be-ut.sh --run --filter=SegmentFooterCacheTest.* -j100`
- `mvn package -pl fe-enterprise/fe-license-cloud -am -DskipTests`
- `cp fe/fe-enterprise/fe-license-cloud/target/fe-license-cloud-1.2-SNAPSHOT.jar output/fe/lib/fe-enterprise.jar`
- `./build.sh --be -j100`
- `docker build -f docker/runtime/doris-compose/Dockerfile -t bh-cluster-2 .`
- `env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy DORIS_FDB_IMAGE=foundationdb/foundationdb:7.1.26-single-layer ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/remote_scan_no_write_file_cache -s test_remote_scan_no_write_file_cache_threshold -runMode=cloud -dockerSuiteParallel 1 -image bh-cluster-2`
- `env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy DORIS_FDB_IMAGE=foundationdb/foundationdb:7.1.26-single-layer ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/remote_scan_no_write_file_cache -s test_file_cache_query_limit_segment_meta_profile -runMode=cloud -dockerSuiteParallel 1 -image bh-cluster-2`
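Threading an optional IO context through a call chain can be illustrated with a toy example. All types and functions below are hypothetical stand-ins that borrow names from the text (`IOContext`, `open_segment`, `get_segment_num_rows`); they are not the Doris signatures. The sketch assumes segment metadata accounting is enabled, so a footer write is charged to the query whenever a context reaches the lowest layer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical query-wide limiter, reduced to a remaining byte count.
struct QueryWriteLimiter {
    int64_t remaining_bytes;
};

// Hypothetical query IO context carrying the query id and its limiter.
struct IOContext {
    std::string query_id;
    QueryWriteLimiter* limiter = nullptr;
};

struct FooterReadResult {
    bool wrote_cache;              // whether the footer was written to cache
    std::string charged_query_id;  // empty if no query was charged
};

// Lowest layer: reads the footer and decides whether to write it to cache.
FooterReadResult read_footer(int64_t footer_bytes, const IOContext* io_ctx) {
    FooterReadResult result{true, std::string()};
    if (io_ctx == nullptr || io_ctx->limiter == nullptr) {
        // No query context: the write is invisible to any query limiter.
        return result;
    }
    result.charged_query_id = io_ctx->query_id;
    if (io_ctx->limiter->remaining_bytes < footer_bytes) {
        result.wrote_cache = false;  // budget exhausted: remote read only
        return result;
    }
    io_ctx->limiter->remaining_bytes -= footer_bytes;
    return result;
}

// Intermediate layers only forward the optional pointer. Defaulting it to
// nullptr keeps existing callers that have no query context compiling.
FooterReadResult open_segment(int64_t footer_bytes,
                              const IOContext* io_ctx = nullptr) {
    return read_footer(footer_bytes, io_ctx);
}

FooterReadResult get_segment_num_rows(int64_t footer_bytes,
                                      const IOContext* io_ctx = nullptr) {
    return open_segment(footer_bytes, io_ctx);
}
```

The bug class described above corresponds to a caller invoking `get_segment_num_rows(footer_bytes)` without the context: the footer is still cached, but no query is charged and the limiter never sees the write.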
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary:
Remote scans can read a large amount of data from object storage. Before this change, a query that had already exceeded its useful cache-write budget could still keep writing later remote cache misses into file cache, increasing cache churn and IO overhead.
This PR adds a query-wide remote-scan file-cache write limiter. After the configured per-query threshold is reached, later cache misses continue reading from remote storage but skip additional file-cache writes.
Segment footer and metadata reads can also write file-cache blocks. This PR adds optional accounting for those metadata reads and propagates the query IO context through segment preload, segment loading, lazy segment iterator initialization, column-reader metadata, and index metadata paths so the same query-wide limiter is applied consistently.
### Release note
Add query-level remote-scan file-cache write limiting and optional segment metadata accounting.
### Check List (For Author)
- `git diff --check`
- `./build.sh --be --fe --cloud -j100`
- `./run-be-ut.sh --run --filter=BlockFileCacheTest.get_or_set_remote_scan_cache_write_limiter_segment_meta_config:BlockFileCacheTest.cached_remote_file_reader_specialized_write_cache_stats:SegmentFooterCacheTest.GetSegmentFooterPropagatesIoContext:SegmentFooterCacheTest.OpenPropagatesIoContextToFooter:DorisFSDirectoryTest.FSIndexInputSetIoContextPropagatesQueryLimiter -j100`
- `env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy ./run-regression-test.sh --run -d cloud_p0/cache/remote_scan_no_write_file_cache -s test_file_cache_query_limit_segment_meta_profile -g docker -runMode=cloud -dockerSuiteParallel 1`