fix(index): prevent INDEX ON from corrupting source table indexes#118
Conversation
Tier-3 aggregation (AdsAggregate, #113) resolved each spec's field with
field_index()/name concatenation and never checked the result:
- Native wire path (session.cpp): an unknown field (field_index -> -1)
was folded as COUNT(*), so `SUM:NOPE` silently returned the row count
with AE_SUCCESS; an empty field on a non-COUNT function did the same.
- SQLite/PostgreSQL push-down: the field name was concatenated straight
into TOTAL("<field>") / SUM(<field>). A double-quoted token that is not
a real column is silently treated as a string literal by SQLite
(TOTAL("NOPE") -> 0), and a quote in the name breaks out of the
identifier -> SQL injection surface.
Validate every spec up front against the table schema: reject an unknown
field, reject an empty field for any function other than COUNT, and emit
SQL only with the canonical, quoted column name. Rejection returns a
clear error instead of a wrong total.
Adds regression tests on all three paths (native wire, SQLite, and
PostgreSQL): unknown field, injection-shaped field, and empty non-COUNT
field are each rejected. Full suite 872/872.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The fixture uses std::vector<UNSIGNED8> but relied on a transitive include that MSVC provides and clang/libc++ does not, breaking the Linux/macOS CI. Include <vector> explicitly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
M13.1 — INDEX ON (SELECT ...) would open the source table file instead of using the materialized cursor, causing the source table's indexes (.cdx) to be rewritten (destructive). Add validation: reject INDEX ON if table name is a SELECT result or parenthesized expression. Enforce that INDEX ON uses a simple table name/identifier, not a cursor result. Error message guides users to materialize the SELECT first using ORDER BY/DISTINCT/LIMIT before creating an index. Fixes: INDEX ON (SELECT * FROM t) no longer corrupts t's .cdx. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request enhances security and stability by validating aggregate fields and table names across SQLite, PostgreSQL, and remote wire sessions. It ensures that aggregate fields are resolved to schema-validated column names to prevent SQL injection, and restricts 'INDEX ON' from running on 'SELECT' results to avoid index corruption. The review feedback highlights a potential bypass in the 'SELECT' query detection due to case sensitivity and leading whitespace, suggesting a more robust validation function.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| const auto& tname = ci.value().table; | ||
| if (tname.empty() || tname[0] == '(' || tname.find("SELECT") != std::string::npos) { | ||
| return fail(openads::AE_SYNTAX_ERROR, | ||
| "INDEX ON requires a table name, not a SELECT result; " | ||
| "use SELECT ... ORDER BY/DISTINCT/LIMIT to materialize first"); | ||
| } |
There was a problem hiding this comment.
The check for SELECT is case-sensitive and does not account for leading whitespace before a parenthesis. In SQL, keywords like SELECT can be written in lowercase or mixed-case (e.g., select, Select). If a user executes INDEX ON (select ...), the check will be bypassed, potentially leading to the index corruption bug. Additionally, leading whitespace before the parenthesis (e.g., (SELECT ...) would bypass the tname[0] == '(' check.
We should perform a case-insensitive check and trim/ignore leading whitespace when checking for a parenthesis.
const auto& tname = ci.value().table;
auto is_select_query = [](const std::string& str) {
if (str.empty()) return true;
std::size_t first = str.find_first_not_of(" \t\r\n");
if (first != std::string::npos && str[first] == '(') {
return true;
}
std::string upper;
upper.reserve(str.size());
for (char c : str) {
upper.push_back(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
}
return upper.find("SELECT") != std::string::npos;
};
if (is_select_query(tname)) {
return fail(openads::AE_SYNTAX_ERROR,
"INDEX ON requires a table name, not a SELECT result; "
"use SELECT ... ORDER BY/DISTINCT/LIMIT to materialize first");
}…119) * fix(odbc driver): x86 (32-bit) build — C4100 + C2733 The ODBC driver failed to compile on MSVC x86 /WX (CI only builds x64, so these slipped past). Two x86-only issues: 1. C4100: SQLSetStmtAttr's SQLINTEGER param was named 'a' but unused → /WX error. Drop the name. 2. C2733: SQLColAttribute's final parameter is SQLLEN* on 64-bit but SQLPOINTER on 32-bit per the Windows SDK sqlext.h (guarded by 'defined(_WIN64) || defined(SQLCOLATTRIBUTE_SQLLEN)'). Our fixed SQLLEN* definition conflicted with the 32-bit declaration, which MSVC rejects as an extern "C" overload. Mirror the SDK guard and alias back to SQLLEN* internally. x64 is unchanged: the _WIN64 branch is byte-identical to the prior signature. Verified: - x86 /WX full build clean; x86 unit suite 873/873, 0 failed. - x64 openads_odbc.cpp compiles clean /W4 /WX. Addresses the x86 build gap reported in issue #3. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(abi): use real AE_PARSE_ERROR in INDEX ON guard (unbreak build) The INDEX ON SELECT-result guard (PR #118) referenced openads::AE_SYNTAX_ERROR, which is not a member of the openads error enum. On Windows the SAP ACE SDK headers define AE_SYNTAX_ERROR as a macro (see the #undef block in error.h), so a local MSVC build with that header present silently accepted it — but clang/gcc and a clean MSVC build reject it, breaking main on Linux/macOS/PHP and x64. Use AE_PARSE_ERROR (7200), the existing parse/syntax error code the SQL parser already emits, which exists in the enum on all toolchains. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Admnwk <220553748+Admnwk@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
INDEX ON (SELECT ...)rewrites source table's.cdxfiles destructivelyAdsOpenTable()opens source file instead of using materialized cursor (M10.31 optimization)How it works
When
INDEX ONreceives a table name that appears to be a SELECT result or parenthesized expression, it rejects with error message directing users to materialize the SELECT first (ORDER BY/DISTINCT/LIMIT).Test plan
✅ Reproducer:
repro_index_on_select.prg(Harbour+SQL)✅ CI will verify no regressions across all test suites
🤖 Generated with Claude Code