LAC: fallback to legacy group when pd not support keyspace #10929

Open
yongman wants to merge 5 commits into pingcap:master from yongman:fix-resource-control-compatible

@yongman yongman commented Jun 25, 2026

Member

What problem does this PR solve?

Issue Number: close #10939

Problem Summary:
There is a compatibility issue when TiFlash runs against a PD that does not support keyspace resource groups.
Ref: #10218

  • This change teaches LocalAdmissionController to keep working when TiFlash runs with keyspace support enabled but PD still serves only legacy resource groups. It does that by:
    • falling back from a keyspace-scoped GetResourceGroup lookup to the legacy nullspace lookup,
    • synthesizing a local "default" group when PD does not persist one,
    • watching both the keyspace and legacy etcd paths so cached legacy groups still receive updates.
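
The fallback order in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `GroupInfo`, `Lookup`, and `resolveGroup` are hypothetical names, the PD client is modeled as a plain callback, and caching the synthesized default under `NullspaceID` is an assumption. The sketch also does not distinguish a not-found error from other PD errors.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>

using KeyspaceID = std::uint32_t;
// Assumed sentinel for "no keyspace"; it matches keyspace=4294967295 in the linked issue title.
constexpr KeyspaceID NullspaceID = 4294967295U;

// Hypothetical stand-in for a cached group; the real code works on PD protobuf messages.
struct GroupInfo
{
    KeyspaceID keyspace_id;
    std::string name;
    bool enable_gac; // false only for the locally synthesized default
};

// Models a PD lookup. A std::nullopt keyspace means "legacy request";
// a std::nullopt result means "PD returned an error".
using Lookup = std::function<std::optional<GroupInfo>(std::optional<KeyspaceID>, const std::string &)>;

std::optional<GroupInfo> resolveGroup(const Lookup & lookup, KeyspaceID keyspace_id, const std::string & name)
{
    // 1. Keyspace-scoped lookup.
    if (auto group = lookup(keyspace_id, name))
        return group;
    // 2. Legacy lookup without a keyspace.
    if (auto group = lookup(std::nullopt, name))
        return group;
    // 3. Both failed: only "default" gets a local placeholder, with GAC traffic disabled.
    if (name == "default")
        return GroupInfo{NullspaceID, name, /*enable_gac=*/false};
    return std::nullopt;
}
```

With a callback that only answers legacy requests, `resolveGroup(lookup, 13, "rg1")` would return the legacy entry, while an unknown name other than `"default"` would still come back empty.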

What is changed and how it works?

  • ResourceGroup now has an enable_gac flag. Reserved local defaults are created with enable_gac=false, so normal token acquisition/reporting skips them. buildRequestInfoIfNecessary() returns no request when that flag is disabled. (dbms/src/Flash/ResourceControl/LocalAdmissionController.h:80-90, dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp:77-117)
  • LocalAdmissionController now resolves lookups through findResourceGroupWithCompatWithoutLock(): it first checks the exact (keyspace_id, name) cache entry, then falls back to (NullspaceID, name) for keyspace requests. All normal runtime callers (getPriority, consumeResource, estWaitDuraMS) now use that compatibility lookup. (dbms/src/Flash/ResourceControl/LocalAdmissionController.h:566-599)
  • warmupResourceGroupInfoCache() changed from a single keyspace-scoped PD lookup into a compatibility sequence:
    • try GetResourceGroup with keyspace_id,
    • on resp.has_error(), retry without keyspace_id,
    • if that legacy lookup succeeds, cache the result under the resolved keyspace from the returned protobuf,
    • if both fail and the name is "default", synthesize a reserved local default group.
      Returned protobufs are normalized so missing keyspace_id becomes NullspaceID. (dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp:334-420)
  • GAC request building now omits keyspace_id entirely for nullspace groups, which keeps fallback traffic compatible with legacy PD semantics. Response handling and metrics also use the cached group’s actual keyspace instead of the original request keyspace. (dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp:503-587, dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp:694-840)
  • When with_keyspace=true, the constructor now starts two watch loops, one for resource_group/keyspace/settings and one for resource_group/settings. To support two concurrent watchers, the code replaced the single stored grpc::ClientContext with an active_watch_gac_grpc_contexts set so stop() can cancel every active watch stream. (dbms/src/Flash/ResourceControl/LocalAdmissionController.h:367-388, dbms/src/Flash/ResourceControl/LocalAdmissionController.h:724-728, dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp:843-936, dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp:1059-1099)
  • The new gtests cover:
    • reserved default fallback,
    • exact keyspace hit,
    • fallback to a legacy cached group,
    • normalization of legacy watch updates to nullspace.
      (dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp:59-238)
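
As a rough illustration of the compatibility lookup described for `findResourceGroupWithCompatWithoutLock()`, the sketch below checks the exact `(keyspace_id, name)` entry first and then `(NullspaceID, name)`. The types `GroupCache` and `CompatLookupResult` and the function `findWithCompat` are hypothetical simplifications, and locking is left out on the assumption that the caller already holds the cache mutex.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

using KeyspaceID = std::uint32_t;
constexpr KeyspaceID NullspaceID = 4294967295U; // assumed sentinel for "no keyspace"

// Hypothetical, heavily reduced stand-in for the real ResourceGroup class.
struct ResourceGroup
{
    std::string name;
    bool enable_gac = true;
};
using ResourceGroupPtr = std::shared_ptr<ResourceGroup>;
using GroupCache = std::map<std::pair<KeyspaceID, std::string>, ResourceGroupPtr>;

struct CompatLookupResult
{
    ResourceGroupPtr group;   // nullptr when nothing matched
    bool is_fallback = false; // true when the legacy (nullspace) entry was used
};

// Caller is assumed to hold the lock that protects `cache`.
CompatLookupResult findWithCompat(const GroupCache & cache, KeyspaceID keyspace_id, const std::string & name)
{
    if (auto it = cache.find({keyspace_id, name}); it != cache.end())
        return {it->second, false};
    // Only keyspace requests fall back; a nullspace request has already been tried above.
    if (keyspace_id != NullspaceID)
    {
        if (auto it = cache.find({NullspaceID, name}); it != cache.end())
            return {it->second, true};
    }
    return {nullptr, false};
}
```

Returning the `is_fallback` bit alongside the pointer is one way to let callers use the resolved group's own keyspace for metrics and bookkeeping rather than the keyspace from the request.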

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Manual Test

  1. Deploy a cluster with legacy PD/TiDB from pd-cse/tidb-cse.
  2. Run a TPC-C workload with tiup bench.
  3. No exception occurs with this compatibility fix applied.

Summary by CodeRabbit

  • New Features
    • Enhanced resource-group cache warmup with keyspace normalization and reserved-default group fallback/promotion.
    • Added smarter handling for keyspace-scoped updates, including restoring refill behavior when upgrading from a compatibility entry.
    • Improved GAC request/metric behavior when keyspace info is missing.
  • Bug Fixes
    • Prevented duplicate cache warmups and improved RU-per-second handling for fallback scenarios.
    • Reworked background watch context handling to make updates safer and shutdown more reliable.
  • Tests
    • Added gtests covering fallback-to-exact promotion, non-not-found error behavior, and reserved-default fallback.

Signed-off-by: yongman <yming0221@gmail.com>
@ti-chi-bot

ti-chi-bot Bot commented Jun 25, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels Jun 25, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 25, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign wshwsh12 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

📝 Walkthrough

LocalAdmissionController now supports compatibility lookup for keyspace-less resource groups, reserved-default fallback behavior, keyspace-normalized GAC handling, multi-context etcd watch management, and updated tests for these flows.
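
The keyspace normalization mentioned here can be pictured as a pair of small conversions. This is a sketch under assumptions: the optional protobuf `keyspace_id` field is modeled as `std::optional`, and the helper names `normalizeKeyspace` and `keyspaceForRequest` are hypothetical.

```cpp
#include <cstdint>
#include <optional>

using KeyspaceID = std::uint32_t;
constexpr KeyspaceID NullspaceID = 4294967295U; // assumed sentinel for "no keyspace"

// Incoming direction: a message without keyspace_id is treated as a nullspace group.
KeyspaceID normalizeKeyspace(std::optional<KeyspaceID> wire_keyspace)
{
    return wire_keyspace.value_or(NullspaceID);
}

// Outgoing direction: nullspace groups are sent with the field left unset,
// which is what a legacy PD expects.
std::optional<KeyspaceID> keyspaceForRequest(KeyspaceID cached_keyspace)
{
    if (cached_keyspace == NullspaceID)
        return std::nullopt;
    return cached_keyspace;
}
```

The two helpers are inverses of each other for every value except an explicitly set `NullspaceID`, which is folded into the unset form on the way out.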

Changes

LAC Old-PD Compatibility and Watch Hardening

| Layer | File(s) | Summary |
|---|---|---|
| ResourceGroup and controller contracts | dbms/src/Flash/ResourceControl/LocalAdmissionController.h, dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp | Adds ResourceGroupCompatLookupResult, the enable_gac constructor/member, the reserved-default RU constant, test-only cache helpers, and the start_background_threads constructor option. |
| Compat lookup and insertion | dbms/src/Flash/ResourceControl/LocalAdmissionController.h, dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp | Adds NullspaceID fallback lookup, routes resource-group finders through it, extends addResourceGroup with enable_gac handling, and records low-token bookkeeping with the resolved group's own keyspace and name. |
| Warmup fallback and reserved default | dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp | Adds reserved-default construction and fallback gating, skips duplicate warmups on exact cache hits, and updates warmup to handle PD errors, legacy fallback, reserved-default insertion, and normalized non-error inserts. |
| GAC request and response normalization | dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp | Builds GAC requests without keyspace_id for NullspaceID, maps missing request keyspaces back to NullspaceID, and derives handled tuples and metrics from the resolved in-memory resource-group keyspace. |
| Watch lifecycle and PUT updates | dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp | Manages per-watch grpc contexts in a set, rewires doWatch to take a grpc context, normalizes PUT keyspace ids, promotes compat fallback entries to exact groups on keyspace-scoped updates, and cancels all active contexts on stop(). |
| Compatibility tests | dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp | Adds a mock PD client and tests for exact-group promotion through watch updates, non-not-found PD errors, and reserved-default fallback when legacy lookup throws. |
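
The watch lifecycle change (a set of per-watch contexts that stop() cancels) could look roughly like the sketch below. It is an assumption-laden illustration: `WatchContext` stands in for `grpc::ClientContext` and models only cancellation, and `WatchRegistry` with its method names is hypothetical rather than taken from the PR.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_set>

// Stand-in for grpc::ClientContext; only the cancel hook is modeled here.
struct WatchContext
{
    std::atomic<bool> cancelled{false};
    void tryCancel() { cancelled.store(true); }
};

class WatchRegistry
{
public:
    // Called by a watch loop before it opens its stream.
    std::shared_ptr<WatchContext> open()
    {
        auto ctx = std::make_shared<WatchContext>();
        std::lock_guard<std::mutex> lock(mu);
        active.insert(ctx);
        return ctx;
    }

    // Called by a watch loop when its stream ends, so stale contexts do not pile up.
    void close(const std::shared_ptr<WatchContext> & ctx)
    {
        std::lock_guard<std::mutex> lock(mu);
        active.erase(ctx);
    }

    // Called from stop(): every live stream is cancelled, not just the most recent one.
    void cancelAll()
    {
        std::lock_guard<std::mutex> lock(mu);
        for (const auto & ctx : active)
            ctx->tryCancel();
    }

    std::size_t activeCount() const
    {
        std::lock_guard<std::mutex> lock(mu);
        return active.size();
    }

private:
    mutable std::mutex mu;
    std::unordered_set<std::shared_ptr<WatchContext>> active;
};
```

Keeping shared ownership in the set means a context stays valid for `cancelAll()` even if its watch loop is in the middle of tearing down.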

Estimated code review effort: 4 (Complex) | ~60 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant LocalAdmissionController
  participant PDClient
  participant EtcdWatch

  Client->>LocalAdmissionController: warmupResourceGroupInfoCache(keyspace, name)
  LocalAdmissionController->>LocalAdmissionController: check exact cache hit
  alt cache miss
    LocalAdmissionController->>PDClient: getResourceGroup(keyspace, name)
    alt PD error
      LocalAdmissionController->>PDClient: legacy getResourceGroup(name)
      alt legacy succeeds
        LocalAdmissionController->>LocalAdmissionController: insert legacy or reserved-default group
      else legacy fails
        LocalAdmissionController->>LocalAdmissionController: return error
      end
    else PD success
      LocalAdmissionController->>LocalAdmissionController: normalize and insert group
    end
  end
  EtcdWatch->>LocalAdmissionController: PUT event
  LocalAdmissionController->>LocalAdmissionController: normalize keyspace id
  LocalAdmissionController->>LocalAdmissionController: promote compat fallback to exact group

Poem

A rabbit hops where keys run thin,
And finds the default tucked within.
With watches swapped and buckets right,
The cache stays warm through day and night.
🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 8.57% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title is concise and accurately summarizes the main legacy-fallback compatibility change. |
| Linked Issues check | ✅ Passed | The changes implement legacy resource-group fallback for keyspace-unsupported PD/TiDB, matching #10939 and the reported default(keyspace=4294967295) issue. |
| Out of Scope Changes check | ✅ Passed | The watch, cache, and test changes all support the compatibility fix and do not appear unrelated to the PR goal. |
| Description check | ✅ Passed | The PR description matches the required template and includes the problem summary, change summary, checklist, manual test, and release note. |

@ti-chi-bot ti-chi-bot Bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jun 25, 2026
Signed-off-by: yongman <yming0221@gmail.com>
@yongman yongman marked this pull request as ready for review June 29, 2026 13:06
@ti-chi-bot ti-chi-bot Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 29, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp (2)

27-33: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Align the new test variables with TiFlash camelCase naming.

This file introduces several snake_case variables in new C++ code (get_resource_group, acquire_token_buckets, keyspace_id, group_request, etc.). Renaming them now keeps the tests consistent with the rest of the codebase.

Suggested rename pattern
-        std::function<resource_manager::GetResourceGroupResponse(const resource_manager::GetResourceGroupRequest &)>
-            get_resource_group_,
-        std::function<resource_manager::TokenBucketsResponse(const resource_manager::TokenBucketsRequest &)>
-            acquire_token_buckets_ = {})
-        : get_resource_group(std::move(get_resource_group_))
-        , acquire_token_buckets(std::move(acquire_token_buckets_))
+        std::function<resource_manager::GetResourceGroupResponse(const resource_manager::GetResourceGroupRequest &)>
+            getResourceGroupFn,
+        std::function<resource_manager::TokenBucketsResponse(const resource_manager::TokenBucketsRequest &)>
+            acquireTokenBucketsFn = {})
+        : getResourceGroupCb(std::move(getResourceGroupFn))
+        , acquireTokenBucketsCb(std::move(acquireTokenBucketsFn))
@@
-        return get_resource_group(req);
+        return getResourceGroupCb(req);
@@
-        if (acquire_token_buckets)
-            return acquire_token_buckets(req);
+        if (acquireTokenBucketsCb)
+            return acquireTokenBucketsCb(req);
@@
-        get_resource_group;
+        getResourceGroupCb;
@@
-        acquire_token_buckets;
+        acquireTokenBucketsCb;

As per coding guidelines, **/*.{cpp,h,hpp}: Method and variable names should use camelCase.

Also applies to: 49-53, 59-223

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp`
around lines 27 - 33, The new test code in TestPDClient and the related gtest
helpers uses snake_case names, which conflicts with TiFlash camelCase
conventions. Rename the affected variables and any associated helper locals in
gtest_local_admission_controller.cpp, such as the fields in TestPDClient and the
request/keyspace-related test variables, to camelCase so the new tests match the
surrounding codebase style and naming guidelines.

Source: Coding guidelines


155-223: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Exercise the keyspace-less matching path with duplicate group names.

Line 155 only drives a single "default" request, so this still passes even if handleTokenBucketsResp ignores the new request-order fallback entirely. Add at least one more request for "default" under a different keyspace and assert the handled pairs come back in request order.

Minimal way to harden this test
-    constexpr KeyspaceID keyspace_id = 13;
+    constexpr KeyspaceID keyspaceId = 13;
+    constexpr KeyspaceID otherKeyspaceId = 17;
@@
-            EXPECT_EQ(req.requests_size(), 1);
-            if (req.requests_size() != 1)
+            EXPECT_EQ(req.requests_size(), 2);
+            if (req.requests_size() != 2)
                 return resp;
-            EXPECT_EQ(req.requests(0).resource_group_name(), "default");
-            EXPECT_TRUE(req.requests(0).has_keyspace_id());
-            if (!req.requests(0).has_keyspace_id())
-                return resp;
-            EXPECT_EQ(req.requests(0).keyspace_id().value(), keyspace_id);
+            EXPECT_EQ(req.requests(0).resource_group_name(), "default");
+            EXPECT_EQ(req.requests(1).resource_group_name(), "default");
+            EXPECT_EQ(req.requests(0).keyspace_id().value(), keyspaceId);
+            EXPECT_EQ(req.requests(1).keyspace_id().value(), otherKeyspaceId);
 
-            auto * one_resp = resp.add_responses();
-            one_resp->set_resource_group_name("default");
-            auto * granted = one_resp->add_granted_r_u_tokens();
-            granted->set_type(resource_manager::RequestUnitType::RU);
-            granted->set_trickle_time_ms(0);
-            granted->mutable_granted_tokens()->set_tokens(512);
-            granted->mutable_granted_tokens()->mutable_settings()->set_burst_limit(2048);
+            for (size_t i = 0; i < 2; ++i)
+            {
+                auto * oneResp = resp.add_responses();
+                oneResp->set_resource_group_name("default");
+                auto * granted = oneResp->add_granted_r_u_tokens();
+                granted->set_type(resource_manager::RequestUnitType::RU);
+                granted->set_trickle_time_ms(0);
+                granted->mutable_granted_tokens()->set_tokens(512);
+                granted->mutable_granted_tokens()->mutable_settings()->set_burst_limit(2048);
+            }
             return resp;
         });
@@
-    auto * group_request = req.add_requests();
-    group_request->set_resource_group_name("default");
-    group_request->mutable_keyspace_id()->set_value(keyspace_id);
-    auto * ru_items = group_request->mutable_ru_items();
-    auto * request_ru = ru_items->add_request_r_u();
-    request_ru->set_type(resource_manager::RequestUnitType::RU);
-    request_ru->set_value(512);
-
-    const auto req_rg_names = std::vector<std::pair<KeyspaceID, std::string>>{{keyspace_id, "default"}};
-    auto handled = lac.handleTokenBucketsResp(cluster.pd_client->acquireTokenBuckets(req), req_rg_names);
-    ASSERT_EQ(handled.size(), 1);
-    EXPECT_EQ(handled[0], std::make_pair(keyspace_id, std::string("default")));
+    for (const auto currentKeyspaceId : {keyspaceId, otherKeyspaceId})
+    {
+        auto * groupRequest = req.add_requests();
+        groupRequest->set_resource_group_name("default");
+        groupRequest->mutable_keyspace_id()->set_value(currentKeyspaceId);
+        auto * ruItems = groupRequest->mutable_ru_items();
+        auto * requestRu = ruItems->add_request_r_u();
+        requestRu->set_type(resource_manager::RequestUnitType::RU);
+        requestRu->set_value(512);
+    }
+
+    const auto reqRgNames = std::vector<std::pair<KeyspaceID, std::string>>{
+        {keyspaceId, "default"},
+        {otherKeyspaceId, "default"},
+    };
+    auto handled = lac.handleTokenBucketsResp(cluster.pd_client->acquireTokenBuckets(req), reqRgNames);
+    ASSERT_EQ(handled.size(), 2);
+    EXPECT_EQ(handled[0], std::make_pair(keyspaceId, std::string("default")));
+    EXPECT_EQ(handled[1], std::make_pair(otherKeyspaceId, std::string("default")));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp`
around lines 155 - 223, This test only covers a single "default" token-bucket
request, so it can pass even if the request-order fallback in
handleTokenBucketsResp is broken. Update
TokenBucketResponseWithoutKeyspaceMatchesKeyspaceRequest to add a second
"default" request for a different KeyspaceID, then verify the returned handled
pairs preserve the original request order. Keep the existing
warmupResourceGroupInfoCache, acquireTokenBuckets, and handleTokenBucketsResp
flow, but extend the req_rg_names expectations and assertions to cover both
requests.
dbms/src/Flash/ResourceControl/LocalAdmissionController.h (1)

513-513: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use TiFlash width aliases for the new constant.

uint64_t/int32_t bypass the project aliases; use UInt64/Int32 here. As per coding guidelines, "Use explicit width types from dbms/src/Core/Types.h: UInt8, UInt32, Int64, Float64, String."

Proposed change
-    static constexpr uint64_t RESERVED_DEFAULT_RESOURCE_GROUP_RU_PER_SEC = std::numeric_limits<int32_t>::max();
+    static constexpr UInt64 RESERVED_DEFAULT_RESOURCE_GROUP_RU_PER_SEC = std::numeric_limits<Int32>::max();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Flash/ResourceControl/LocalAdmissionController.h` at line 513, The
new reservation constant in LocalAdmissionController should use the project’s
TiFlash width aliases instead of raw fixed-width types. Update
RESERVED_DEFAULT_RESOURCE_GROUP_RU_PER_SEC to use UInt64 and Int32 (from
Core/Types.h) so it matches the codebase typing guidelines and keeps the
declaration consistent with other aliases in this class.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp`:
- Around line 62-63: `LocalAdmissionController::buildGACRequest` currently
bypasses `enable_gac` only for normal request building, but the final-report
path still emits GAC requests and `stop()` can send them even when GAC is
disabled. Update the `buildGACRequest(true)` flow and the `stop()` send path to
honor `enable_gac` consistently, so fallback groups with `enable_gac=false` do
not report or transmit unsupported GAC data. Use the existing `enable_gac`,
`buildGACRequest`, and `stop` symbols to gate both cached-group reporting and
final request submission.
- Around line 349-367: Track the actual success of the legacy PD lookup in
LocalAdmissionController::warmupResourceGroupInfoCache instead of relying only
on legacy_resp.has_error(), because a thrown getResourceGroup(legacy_req) leaves
legacy_resp default-constructed and can incorrectly pass validation. Add an
explicit success flag around the
cluster->pd_client->getResourceGroup(legacy_req) call, set it only when the
request returns normally, and require that flag before validating or accepting
the legacy response; otherwise fall through to the reserved-default fallback
path.

In `@dbms/src/Flash/ResourceControl/LocalAdmissionController.h`:
- Around line 618-624: The removal path in LocalAdmissionController is
recomputing max RU while the old reserved group is still present, so the stale
entry can still influence the global maximum. In the cleanup logic that checks
iter->second->enable_gac and calls updateMaxRUPerSecAfterDeleteWithoutLock(),
erase the old keyspace_resource_groups entry first, then recompute
max_ru_per_sec using the removed group’s user_ru_per_sec so the replacement
GAC-enabled group is reflected correctly. Use the existing
keyspace_resource_groups, iter, and updateMaxRUPerSecAfterDeleteWithoutLock flow
to keep the change localized.
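
The ordering issue in this finding can be shown with a reduced model: if the maximum is recomputed while the old entry is still in the map, the old entry wins again. The sketch below erases first and then recomputes. `GroupTable` and its members are hypothetical names, and locking is omitted on the assumption that the caller holds it.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>

struct Group
{
    std::uint64_t user_ru_per_sec = 0;
};

class GroupTable
{
public:
    void add(const std::string & name, std::uint64_t ru_per_sec)
    {
        groups[name] = Group{ru_per_sec};
        recomputeMax();
    }

    void remove(const std::string & name)
    {
        auto it = groups.find(name);
        if (it == groups.end())
            return;
        const std::uint64_t removed_ru = it->second.user_ru_per_sec;
        groups.erase(it); // erase first, so the stale entry cannot influence the scan
        if (removed_ru == max_ru_per_sec)
            recomputeMax();
    }

    std::uint64_t maxRU() const { return max_ru_per_sec; }

private:
    void recomputeMax()
    {
        max_ru_per_sec = 0;
        for (const auto & entry : groups)
            max_ru_per_sec = std::max(max_ru_per_sec, entry.second.user_ru_per_sec);
    }

    std::map<std::string, Group> groups;
    std::uint64_t max_ru_per_sec = 0;
};
```

This matters most for the reserved default, whose RU per second is set to the Int32 maximum and would otherwise keep dominating the global maximum after it is replaced.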


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f50ddec2-458c-4f53-ba69-95104e7293f2

📥 Commits

Reviewing files that changed from the base of the PR and between 63bb1ab and 04608f6.

📒 Files selected for processing (3)
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.h
  • dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp

Comment thread dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp
Comment thread dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp Outdated
Comment thread dbms/src/Flash/ResourceControl/LocalAdmissionController.h
@yongman yongman marked this pull request as draft July 1, 2026 05:20
@ti-chi-bot ti-chi-bot Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 1, 2026
yongman added 2 commits July 1, 2026 13:21
Signed-off-by: yongman <yming0221@gmail.com>
Signed-off-by: yongman <yming0221@gmail.com>
@yongman yongman force-pushed the fix-resource-control-compatible branch 2 times, most recently from 7f0b351 to cd382a5 Compare July 1, 2026 09:53
@yongman yongman marked this pull request as ready for review July 1, 2026 10:11
@ti-chi-bot ti-chi-bot Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 1, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp (1)

51-155: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Consider extracting shared test setup.

The three tests repeat the same pattern: construct pingcap::kv::Cluster, wrap a lambda in TestPDClient, and instantiate LocalAdmissionController with identical fixed args. A small helper (or a TEST_F fixture) building the cluster/LAC pair, parameterized by the response callback, would reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp`
around lines 51 - 155, The three LocalAdmissionController tests duplicate the
same cluster and controller setup, so extract that shared setup into a helper or
a TEST_F fixture. Factor out the repeated pingcap::kv::Cluster construction,
TestPDClient lambda wiring, and LocalAdmissionController instantiation in
gtest_local_admission_controller.cpp, then have
KeyspaceWatchUpdatePromotesLegacyFallbackToExactGroup,
KeyspaceLookupDoesNotFallbackForNonNotFoundError, and
ReservedDefaultGroupStillFallbacksWhenLegacyLookupThrows reuse it with only
their response callback differing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp`:
- Around line 1025-1067: The refill callback in
LocalAdmissionController::modifyResourceGroup is being invoked after releasing
mu, which can race with a concurrent clear/destroy of the RCQ. Move the
refill_token_callback invocation so it happens inside the same locked scope that
checks need_refill, matching the existing locked callback path used elsewhere in
LocalAdmissionController to keep the resource control queue alive during the
call.
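
A reduced model of the suggested fix is sketched below: the callback is invoked while the mutex is still held, so a concurrent clear cannot run between the `need_refill` check and the call. `Controller` and its method names are hypothetical, and the sketch assumes the callback never re-enters the controller (that would deadlock on a non-recursive mutex).

```cpp
#include <functional>
#include <mutex>
#include <utility>

class Controller
{
public:
    void setRefillCallback(std::function<void()> cb)
    {
        std::lock_guard<std::mutex> lock(mu);
        refill_token_callback = std::move(cb);
    }

    void clearRefillCallback()
    {
        std::lock_guard<std::mutex> lock(mu);
        refill_token_callback = nullptr;
    }

    // Returns true when the refill callback was invoked.
    bool modifyGroup(bool need_refill)
    {
        std::lock_guard<std::mutex> lock(mu);
        // ... group fields would be updated here ...
        if (need_refill && refill_token_callback)
        {
            refill_token_callback(); // still under mu: clearRefillCallback() has to wait
            return true;
        }
        return false;
    }

private:
    std::mutex mu;
    std::function<void()> refill_token_callback;
};
```

The trade-off is that the callback now runs inside the critical section, so it should stay short.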


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 88ad5ffb-979d-4ff7-8eef-483d44116e44

📥 Commits

Reviewing files that changed from the base of the PR and between 04608f6 and cd382a5.

📒 Files selected for processing (3)
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.h
  • dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.h

Comment thread dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp Outdated
@yongman

yongman commented Jul 1, 2026

Member Author

/retest

Signed-off-by: yongman <yming0221@gmail.com>
@yongman yongman force-pushed the fix-resource-control-compatible branch from cd382a5 to 82147c7 Compare July 1, 2026 13:52

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp (1)

55-71: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Consider extracting the shared response-building lambda.

The success-path GetResourceGroupResponse construction (group name, mode, priority, fill/burst rate) is duplicated almost verbatim between KeyspaceWatchUpdatePromotesLegacyFallbackToExactGroup and KeyspaceLookupDoesNotFallbackForNonNotFoundError. A small helper (e.g., makeSuccessResponse(priority)) would reduce duplication.

Also applies to: 110-126

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp`
around lines 55 - 71, The success-path GetResourceGroupResponse construction is
duplicated in the TestPDClient lambda used by
KeyspaceWatchUpdatePromotesLegacyFallbackToExactGroup and
KeyspaceLookupDoesNotFallbackForNonNotFoundError. Extract the shared
response-building logic into a small helper, such as a local
makeSuccessResponse(priority) function or lambda, and have both tests call it so
the group name, mode, priority, and RU settings are built in one place. Keep the
error branch for the has_keyspace_id() case separate, and reuse the helper in
the duplicated test setup code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 55ae1bb6-35a3-486d-b118-0c83fb95ae70

📥 Commits

Reviewing files that changed from the base of the PR and between cd382a5 and 82147c7.

📒 Files selected for processing (3)
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.h
  • dbms/src/Flash/ResourceControl/tests/gtest_local_admission_controller.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.h
  • dbms/src/Flash/ResourceControl/LocalAdmissionController.cpp

@ti-chi-bot

ti-chi-bot Bot commented Jul 1, 2026

Contributor

@yongman: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-integration-next-gen-columnar 82147c7 link true /test pull-integration-next-gen-columnar
pull-unit-next-gen 82147c7 link true /test pull-unit-next-gen

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

columnar cannot find resource group: default(keyspace=4294967295)

1 participant