Skip to content

fix(index): prevent INDEX ON from corrupting source table indexes#118

Merged
Admnwk merged 3 commits into
FiveTechSoft:mainfrom
Admnwk:fix/index-on-select-corruption
Jun 26, 2026
Merged

fix(index): prevent INDEX ON from corrupting source table indexes#118
Admnwk merged 3 commits into
FiveTechSoft:mainfrom
Admnwk:fix/index-on-select-corruption

Conversation

@Admnwk

@Admnwk Admnwk commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Bug: INDEX ON (SELECT ...) rewrites source table's .cdx files destructively
  • Root cause: AdsOpenTable() opens source file instead of using materialized cursor (M10.31 optimization)
  • Fix: Validate table name is simple identifier/filename; reject SELECT results with clear error

How it works

When INDEX ON receives a table name that appears to be a SELECT result or parenthesized expression, it rejects with error message directing users to materialize the SELECT first (ORDER BY/DISTINCT/LIMIT).

Test plan

✅ Reproducer: repro_index_on_select.prg (Harbour+SQL)
✅ CI will verify no regressions across all test suites

🤖 Generated with Claude Code

Admnwk and others added 3 commits June 26, 2026 13:06
Tier-3 aggregation (AdsAggregate, #113) resolved each spec's field with
field_index()/name concatenation and never checked the result:

- Native wire path (session.cpp): an unknown field (field_index -> -1)
  was folded as COUNT(*), so `SUM:NOPE` silently returned the row count
  with AE_SUCCESS; an empty field on a non-COUNT function did the same.
- SQLite/PostgreSQL push-down: the field name was concatenated straight
  into TOTAL("<field>") / SUM(<field>). A double-quoted token that is not
  a real column is silently treated as a string literal by SQLite
  (TOTAL("NOPE") -> 0), and a quote in the name breaks out of the
  identifier -> SQL injection surface.

Validate every spec up front against the table schema: reject an unknown
field, reject an empty field for any function other than COUNT, and emit
SQL only with the canonical, quoted column name. Rejection returns a
clear error instead of a wrong total.

Adds regression tests on all three paths (native wire, SQLite, and
PostgreSQL): unknown field, injection-shaped field, and empty non-COUNT
field are each rejected. Full suite 872/872.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The fixture uses std::vector<UNSIGNED8> but relied on a transitive
include that MSVC provides and clang/libc++ does not, breaking the
Linux/macOS CI. Include <vector> explicitly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
M13.1 — INDEX ON (SELECT ...) would open the source table file
instead of using the materialized cursor, causing the source
table's indexes (.cdx) to be rewritten (destructive).

Add validation: reject INDEX ON if table name is a SELECT result
or parenthesized expression. Enforce that INDEX ON uses a simple
table name/identifier, not a cursor result.

Error message guides users to materialize the SELECT first using
ORDER BY/DISTINCT/LIMIT before creating an index.

Fixes: INDEX ON (SELECT * FROM t) no longer corrupts t's .cdx.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@Admnwk Admnwk merged commit 9852eda into FiveTechSoft:main Jun 26, 2026
0 of 10 checks passed

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances security and stability by validating aggregate fields and table names across SQLite, PostgreSQL, and remote wire sessions. It ensures that aggregate fields are resolved to schema-validated column names to prevent SQL injection, and restricts 'INDEX ON' from running on 'SELECT' results to avoid index corruption. The review feedback highlights a potential bypass in the 'SELECT' query detection due to case sensitivity and leading whitespace, suggesting a more robust validation function.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/abi/ace_exports.cpp
Comment on lines +13563 to +13568
const auto& tname = ci.value().table;
if (tname.empty() || tname[0] == '(' || tname.find("SELECT") != std::string::npos) {
return fail(openads::AE_SYNTAX_ERROR,
"INDEX ON requires a table name, not a SELECT result; "
"use SELECT ... ORDER BY/DISTINCT/LIMIT to materialize first");
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The check for SELECT is case-sensitive and does not account for leading whitespace before a parenthesis. In SQL, keywords like SELECT can be written in lowercase or mixed-case (e.g., select, Select). If a user executes INDEX ON (select ...), the check will be bypassed, potentially leading to the index corruption bug. Additionally, leading whitespace before the parenthesis (e.g., (SELECT ...) would bypass the tname[0] == '(' check.

We should perform a case-insensitive check and trim/ignore leading whitespace when checking for a parenthesis.

        const auto& tname = ci.value().table;
        auto is_select_query = [](const std::string& str) {
            if (str.empty()) return true;
            std::size_t first = str.find_first_not_of(" \t\r\n");
            if (first != std::string::npos && str[first] == '(') {
                return true;
            }
            std::string upper;
            upper.reserve(str.size());
            for (char c : str) {
                upper.push_back(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
            }
            return upper.find("SELECT") != std::string::npos;
        };
        if (is_select_query(tname)) {
            return fail(openads::AE_SYNTAX_ERROR,
                "INDEX ON requires a table name, not a SELECT result; "
                "use SELECT ... ORDER BY/DISTINCT/LIMIT to materialize first");
        }

Admnwk added a commit that referenced this pull request Jun 26, 2026
…119)

* fix(odbc driver): x86 (32-bit) build — C4100 + C2733

The ODBC driver failed to compile on MSVC x86 /WX (CI only builds x64,
so these slipped past). Two x86-only issues:

1. C4100: SQLSetStmtAttr's SQLINTEGER param was named 'a' but unused →
   /WX error. Drop the name.

2. C2733: SQLColAttribute's final parameter is SQLLEN* on 64-bit but
   SQLPOINTER on 32-bit per the Windows SDK sqlext.h
   (guarded by 'defined(_WIN64) || defined(SQLCOLATTRIBUTE_SQLLEN)').
   Our fixed SQLLEN* definition conflicted with the 32-bit declaration,
   which MSVC rejects as an extern "C" overload. Mirror the SDK guard
   and alias back to SQLLEN* internally.

x64 is unchanged: the _WIN64 branch is byte-identical to the prior
signature. Verified:
- x86 /WX full build clean; x86 unit suite 873/873, 0 failed.
- x64 openads_odbc.cpp compiles clean /W4 /WX.

Addresses the x86 build gap reported in issue #3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(abi): use real AE_PARSE_ERROR in INDEX ON guard (unbreak build)

The INDEX ON SELECT-result guard (PR #118) referenced
openads::AE_SYNTAX_ERROR, which is not a member of the openads error
enum. On Windows the SAP ACE SDK headers define AE_SYNTAX_ERROR as a
macro (see the #undef block in error.h), so a local MSVC build with
that header present silently accepted it — but clang/gcc and a clean
MSVC build reject it, breaking main on Linux/macOS/PHP and x64.

Use AE_PARSE_ERROR (7200), the existing parse/syntax error code the
SQL parser already emits, which exists in the enum on all toolchains.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Admnwk <220553748+Admnwk@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant