Skip to content

Harden DuckDB extension persistence#185

Open
nicosuave wants to merge 5 commits into
review/python-native-contractfrom
review/duckdb-extension-hardening
Open

Harden DuckDB extension persistence#185
nicosuave wants to merge 5 commits into
review/python-native-contractfrom
review/duckdb-extension-hardening

Conversation

@nicosuave
Copy link
Copy Markdown
Member

Base: #184

  • Make DuckDB extension persistence atomic and context-scoped across database connections.
  • Persist metrics, dimensions, and segments under their owning model block so autoload rebuilds the same graph.
  • Tighten CREATE MODEL parsing, sidecar validation, lock handling, and extension build linking across native targets.
  • Update DuckDB extension SQL tests and docs for the supported load/install path.

@nicosuave nicosuave force-pushed the review/python-native-contract branch from cd7a08e to 4b9c638 Compare May 31, 2026 18:49
@nicosuave nicosuave force-pushed the review/duckdb-extension-hardening branch from 66c6541 to c712342 Compare May 31, 2026 18:49
@nicosuave nicosuave marked this pull request as ready for review May 31, 2026 21:47
@nicosuave nicosuave force-pushed the review/duckdb-extension-hardening branch 3 times, most recently from 9a6d2d2 to 67dd8ce Compare May 31, 2026 22:35
@nicosuave
Copy link
Copy Markdown
Member Author

@codex review

@nicosuave nicosuave force-pushed the review/python-native-contract branch from b3ee67c to 1b91738 Compare May 31, 2026 23:24
@nicosuave nicosuave force-pushed the review/duckdb-extension-hardening branch from 67dd8ce to 6cdc4af Compare May 31, 2026 23:24
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6cdc4afeab

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic-rs/src/ffi.rs Outdated
impl Drop for DefinitionsFileLock {
fn drop(&mut self) {
self.file.take();
let _ = fs::remove_file(&self.path);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Guard lock cleanup against stolen locks

If a writer is paused or slow for longer than DEFINITIONS_STALE_LOCK_AFTER, another process can classify its lock as stale, remove and recreate the same .lock file, and then this original guard will unconditionally remove that newer lock when it drops. That opens a window for two writers to enter the critical section and lose or corrupt persisted definitions; write an owner token into the lock and only unlink when it still matches, or avoid stealing locks that may still be live.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2baf6794d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic-rs/src/ffi.rs
idx += 1;
}

if let Some(statement_start) = start {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Split persisted model blocks without semicolons

When definitions are created through clients that omit trailing semicolons (for example con.execute("SEMANTIC CREATE MODEL ...")), append_definition_to_content preserves the semicolonless text. This fallback then makes statement_ranges return one statement spanning the rest of the sidecar, so model_definition_ranges sees all subsequent MODEL definitions as part of the first block; replacing the first model can drop later models, and adding a metric to a later model reports that it is not present in the persisted file. Split on top-level definition keywords as well as semicolons, or normalize persisted definitions to include semicolons.

Useful? React with 👍 / 👎.

@nicosuave nicosuave force-pushed the review/duckdb-extension-hardening branch 3 times, most recently from e598752 to 9a62d99 Compare June 1, 2026 01:01
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a62d99e9a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic-rs/src/ffi.rs Outdated
Comment on lines +820 to +821
if is_stale_definitions_lock(&lock_path) {
let _ = fs::remove_file(&lock_path);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Verify stale lock before unlinking

When two processes concurrently encounter the same stale sidecar lock, both can pass is_stale_definitions_lock before either recreates it; one process can then remove the old file and create its lock, only for the other process to execute this unconditional remove_file and delete that new lock before acquiring its own. That leaves two writers in the critical section despite the owner-token cleanup on drop; the fresh evidence in the current code is that the stale-removal path still unlinks by path without verifying the token or metadata it classified as stale.

Useful? React with 👍 / 👎.

@nicosuave nicosuave force-pushed the review/duckdb-extension-hardening branch from 9a62d99 to 17f7749 Compare June 1, 2026 01:11
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 17f77491f3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +400 to +401
} else if (ch == ',' && depth == 0) {
break;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Track braces when scanning model properties

When the new CREATE MODEL name (...) shorthand is used with a structured property before the top-level name, this scanner treats commas inside {} or [] values as top-level separators because it only tracks parentheses. For example SEMANTIC CREATE MODEL orders (metadata {owner analytics, name internal}, table orders) is a valid body shape for the Rust parser's metadata object support, but this code resumes at the nested name key and reports a spurious name mismatch against orders, blocking the model from being created. Track bracket/brace depth here (as the Rust property parser does) before breaking on commas.

Useful? React with 👍 / 👎.

@nicosuave nicosuave force-pushed the review/duckdb-extension-hardening branch 2 times, most recently from 9620ed3 to 04b06fe Compare June 1, 2026 02:26
@nicosuave nicosuave force-pushed the review/duckdb-extension-hardening branch from 04b06fe to 3791e32 Compare June 1, 2026 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant