Skip to content

[BugFix] Scope SQL cursor continuation to original query indices under FGAC#5399

Open
mengweieric wants to merge 1 commit intoopensearch-project:mainfrom
mengweieric:fix/sql-cursor-fgac-indices
Open

[BugFix] Scope SQL cursor continuation to original query indices under FGAC#5399
mengweieric wants to merge 1 commit intoopensearch-project:mainfrom
mengweieric:fix/sql-cursor-fgac-indices

Conversation

@mengweieric
Copy link
Copy Markdown
Collaborator

Summary

  • Legacy (V1) SQL cursor continuation built its SearchRequest with new SearchRequest() (no indices). Under OpenSearch Security Fine-Grained Access Control, an indices-less SearchRequest resolves to a wildcard during authorization, so users who only have permissions on the queried index get 403 no permissions for [indices:data/read/search] on page 2 and later. Page 1 worked because the initial SearchRequest was built via SearchRequestBuilder and carried the concrete indices.
  • Persist the resolved indices into DefaultCursor at page-1 time and scope the continuation SearchRequest with them. Security now authorizes against the same concrete indices across every page.

Root cause

  1. PrettyFormatRestExecutor.buildProtocolWithPagination creates the page-1 SearchRequest via queryAction.getRequestBuilder(), which carries queryAction.getSelect().getIndexArr(). Security sees concrete indices and grants.
  2. On continuation, CursorResultExecutor.handleDefaultCursorRequest built new SearchRequest() with no indices and only a SearchSourceBuilder + PIT id. Security resolves the empty indices array to the cluster-wide wildcard, evaluates the user's role against *, and denies.

The PIT is correctly scoped on both page 1 and continuation; the denial happens in Security's pre-PIT authorization of the outer SearchRequest.

Fix

  • DefaultCursor: new String[] indices field, serialized under a new short cursor-payload key "x". Deserialization uses optJSONArray and defaults to new String[0] when the key is absent, so cursor tokens issued by pre-fix nodes continue to deserialize cleanly after a rolling upgrade.
  • PrettyFormatRestExecutor.createCursorWithPit: populates cursor.setIndices(queryAction.getSelect().getIndexArr()) at page-1 time.
  • CursorResultExecutor.handleDefaultCursorRequest: continuation SearchRequest now new SearchRequest(cursor.getIndices()).

Scope: only the legacy V1 cursor path changes. The V2 cursor path (buildProtocolWithPagination going through SearchRequestBuilder.get()) already carried indices end-to-end and is untouched; PaginationIT stays green.

Back-compat

  • Old in-flight cursor tokens (no "x" field) deserialize with indices=[]. Those continuations still hit the original bug for FGAC users but behave exactly as they did before the fix, i.e. nothing gets worse during a rolling upgrade.
  • New cursor tokens written by upgraded nodes contain "x" and authorize correctly on every page.
  • Non-FGAC clusters are unaffected in either direction.

Test plan

  • ./gradlew :legacy:spotlessCheck :integ-test:spotlessCheck
  • ./gradlew :legacy:test (all pass, including 4 new DefaultCursorTest cases: serialization of indices, null-indices default, round-trip, legacy-cursor-without-x back-compat)
  • ./gradlew build -x integTest -x integTestWithSecurity -x yamlRestTest (all module unit tests)
  • ./gradlew :integ-test:integTestWithSecurity --tests org.opensearch.sql.security.SQLCursorPermissionsIT (both tests pass)
  • Verified the new IT catches the bug: with the three source edits stashed, cursorPaginationUnderFgacSucceedsAcrossPages fails with the exact customer symptom (403 no permissions for [indices:data/read/search]), then passes once the fix is restored.
  • ./gradlew :integ-test:integTest --tests org.opensearch.sql.legacy.CursorIT (54/54 pass; serialization change is back-compat with existing V1 cursor flows)
  • ./gradlew :integ-test:integTest --tests org.opensearch.sql.sql.PaginationIT (24/24 pass; V2 cursor path unaffected)

Files changed

  • legacy/src/main/java/org/opensearch/sql/legacy/cursor/DefaultCursor.java
  • legacy/src/main/java/org/opensearch/sql/legacy/executor/cursor/CursorResultExecutor.java
  • legacy/src/main/java/org/opensearch/sql/legacy/executor/format/PrettyFormatRestExecutor.java
  • legacy/src/test/java/org/opensearch/sql/legacy/unittest/cursor/DefaultCursorTest.java
  • integ-test/src/test/java/org/opensearch/sql/security/SQLCursorPermissionsIT.java (new)

…r FGAC

The legacy (V1) SQL cursor continuation path built its SearchRequest with
`new SearchRequest()` (no indices). Under OpenSearch Security Fine-Grained
Access Control, an indices-less SearchRequest resolves to a wildcard during
authorization, so users who only have permissions on the queried index are
denied `indices:data/read/search` on the second and subsequent pages. Page 1
worked because the initial SearchRequest was built via SearchRequestBuilder
and carried the concrete indices.

Persist the resolved indices into DefaultCursor at page-1 time and pass them
to the continuation SearchRequest so Security authorizes against the same
concrete indices across all pages. The cursor payload uses a new short key
'x' and falls back to an empty array when absent, so cursor tokens issued by
pre-fix nodes continue to deserialize cleanly.

Regression coverage:
- Unit test for DefaultCursor round-trip including the legacy-without-x case.
- Integration test SQLCursorPermissionsIT that exercises V1 cursor paging
  under an FGAC role scoped to a single index. The new test fails on HEAD
  with 403 on page 2 and passes after the fix.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Core FGAC cursor fix with unit tests

Relevant files:

  • legacy/src/main/java/org/opensearch/sql/legacy/cursor/DefaultCursor.java
  • legacy/src/main/java/org/opensearch/sql/legacy/executor/cursor/CursorResultExecutor.java
  • legacy/src/main/java/org/opensearch/sql/legacy/executor/format/PrettyFormatRestExecutor.java
  • legacy/src/test/java/org/opensearch/sql/legacy/unittest/cursor/DefaultCursorTest.java

Sub-PR theme: Integration test for SQL cursor FGAC regression

Relevant files:

  • integ-test/src/test/java/org/opensearch/sql/security/SQLCursorPermissionsIT.java

⚡ Recommended focus areas for review

Test Reliability

The initialized flag is an instance field, but SQLIntegTestCase may create a new instance per test method. If so, createSecurityRolesAndUsers() will be called multiple times, and the guard if (initialized) return; will never fire. This could cause redundant role/user creation requests or failures if the security API returns an error on duplicate creation (rather than 200/201). Consider using a @BeforeClass static flag or assumeTrue to guard idempotent setup.

private boolean initialized = false;

@Override
protected void init() throws Exception {
  loadIndex(Index.ACCOUNT);
  createSecurityRolesAndUsers();
}

private void createSecurityRolesAndUsers() throws IOException {
  if (initialized) {
    return;
  }
  createRole(ACCOUNT_ROLE, TEST_INDEX_ACCOUNT);
  createUser(ACCOUNT_USER, ACCOUNT_ROLE);
  initialized = true;
}
Hardcoded Row Count

The test asserts exactly 5 pages based on 234 rows / 50 per page. This assumes the TEST_INDEX_ACCOUNT index always contains exactly 1000 rows (or at least 234). If the index has fewer rows, the cursor may exhaust earlier and the assertion will fail. The test should either assert pages >= 2 (to confirm at least one continuation page succeeded) or document the exact expected row count more explicitly.

  // 234 rows / 50 per page = 5 pages
  assertEquals("expected 5 V1 cursor pages under FGAC", 5, pages);
}
Charset Assumption

In the legacy-cursor backward-compatibility test, json.toString().getBytes() and new String(Base64.getDecoder().decode(payload)) use the platform default charset. If the platform charset is not UTF-8, encoding/decoding may be inconsistent. Explicit StandardCharsets.UTF_8 should be used. The same concern applies to decodePayload in the test.

}

public static DefaultCursor from(String cursorId) {
Empty Indices Fallback

When cursor.getIndices() returns an empty array (e.g., for cursors issued by pre-fix nodes after a rolling upgrade), the SearchRequest is constructed with new SearchRequest(new String[0]). Depending on the OpenSearch version, an empty indices array may still resolve to a wildcard, meaning the FGAC issue could persist for in-flight cursors from pre-fix nodes. This edge case should be documented or handled explicitly.

String[] indices = cursor.getIndices();
SearchRequest searchRequest = new SearchRequest(indices == null ? new String[0] : indices);

@mengweieric mengweieric added bug Something isn't working SQL skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. skip-diff-reviewer Maintainer to skip code-diff-reviewer check, after reviewing issues in AI analysis. labels May 1, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fallback to index pattern for legacy cursors

When indices is an empty array (e.g., deserialized from a legacy cursor without the
"x" field), the SearchRequest will still be constructed with no indices, which
resolves to a wildcard under FGAC and causes the same 403 denial the fix intends to
prevent. Consider falling back to cursor.getIndexPattern() when the indices array is
empty, so legacy cursors still work correctly after upgrade.

legacy/src/main/java/org/opensearch/sql/legacy/executor/cursor/CursorResultExecutor.java [120-121]

 String[] indices = cursor.getIndices();
-SearchRequest searchRequest = new SearchRequest(indices == null ? new String[0] : indices);
+SearchRequest searchRequest;
+if (indices != null && indices.length > 0) {
+  searchRequest = new SearchRequest(indices);
+} else {
+  // Fallback for legacy cursors that predate the indices field
+  searchRequest = new SearchRequest(cursor.getIndexPattern().split("\\|"));
+}
Suggestion importance[1-10]: 7

__

Why: This is a valid concern: when indices is empty (legacy cursor without "x" field), the fix still creates a SearchRequest with no indices, which would still fail under FGAC. Falling back to indexPattern would handle the upgrade scenario correctly. However, the test deserializeLegacyCursorWithoutIndicesDefaultsToEmptyArray explicitly tests that empty array is returned for legacy cursors, suggesting the author may have intended a different fallback strategy.

Medium
Safely encode cursor value in JSON body

The cursor string may contain characters that are not safe to embed directly in a
JSON string literal (e.g., +, /, = from Base64, or special characters). Using
String.format to inject the cursor value without escaping can produce malformed JSON
and cause unexpected failures. Use a JSONObject to build the request body so the
cursor value is properly escaped.

integ-test/src/test/java/org/opensearch/sql/security/SQLCursorPermissionsIT.java [181-190]

 int pages = 1;
 while (!cursor.isEmpty()) {
-  JSONObject next =
-      executeSqlAsUser(
-          String.format(Locale.ROOT, "{\"cursor\": \"%s\"}", cursor), ACCOUNT_USER);
+  JSONObject body = new JSONObject();
+  body.put("cursor", cursor);
+  JSONObject next = executeSqlAsUser(body.toString(), ACCOUNT_USER);
   cursor = next.optString("cursor", "");
   pages++;
 }
 // 234 rows / 50 per page = 5 pages
 assertEquals("expected 5 V1 cursor pages under FGAC", 5, pages);
Suggestion importance[1-10]: 6

__

Why: The cursor value (Base64-encoded) can contain characters like +, /, and = that are safe in JSON strings but the suggestion is valid for robustness. Using JSONObject to build the request body ensures proper escaping and avoids potential malformed JSON issues.

Low
General
Use explicit charset for Base64 encode/decode

json.toString().getBytes() uses the platform's default charset, which may differ
across environments and cause decoding failures. Explicitly specify
StandardCharsets.UTF_8 for both encoding and decoding to ensure consistent behavior.

legacy/src/test/java/org/opensearch/sql/legacy/unittest/cursor/DefaultCursorTest.java [170-172]

-JSONObject json = new JSONObject(new String(Base64.getDecoder().decode(payload)));
+JSONObject json = new JSONObject(new String(Base64.getDecoder().decode(payload), java.nio.charset.StandardCharsets.UTF_8));
 json.remove("x");
-String legacyPayload = Base64.getEncoder().encodeToString(json.toString().getBytes());
+String legacyPayload = Base64.getEncoder().encodeToString(json.toString().getBytes(java.nio.charset.StandardCharsets.UTF_8));
Suggestion importance[1-10]: 4

__

Why: Using explicit StandardCharsets.UTF_8 is a good practice to avoid platform-dependent behavior, but this is a test file and the risk of charset mismatch causing failures in practice is low. It's a minor improvement to code robustness.

Low

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. skip-diff-reviewer Maintainer to skip code-diff-reviewer check, after reviewing issues in AI analysis. SQL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant