feat: add support for postgres schema selection #475

Open
gkorland wants to merge 9 commits into staging from fix/select-postgres-schema

@gkorland gkorland commented Mar 9, 2026

Summary

Adds support for selecting a PostgreSQL schema instead of always using public. Based on PR #373 by @sirudog, rebased onto current staging with bug fixes.

What changed

  • Backend: New _parse_schema_from_url() extracts schema from the connection URL's options parameter (search_path), following PostgreSQL's native libpq format
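  • SQL queries: All information_schema queries now accept a parameterized schema argument (default: public). The schema is passed as a bound parameter rather than interpolated into the SQL text, which avoids SQL injection through the schema name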
  • UI: Schema input field in DatabaseModal, shown only for PostgreSQL connections
  • Tests: 14 new unit tests for URL schema parsing (edge cases, encoding, $user handling)
  • Docs: Custom schema configuration guide with examples and troubleshooting
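
As a rough illustration of the backend change described above, the following is a minimal sketch of extracting the first usable schema from a connection URL. It is not the PR's actual code: the function body, the `_SEARCH_PATH_RE` name, and the quote-stripping rule are assumptions. Only the overall behavior (read `search_path` from the `options` parameter, skip `$user`, default to `public`) comes from the description.

```python
import re
from urllib.parse import parse_qs, urlparse

# Hypothetical pattern: requires "=" after search_path and captures a
# comma-separated list of schema tokens.
_SEARCH_PATH_RE = re.compile(r"search_path\s*=\s*([^\s,]+(?:\s*,\s*[^\s,]+)*)")


def parse_schema_from_url(url: str) -> str:
    """Return the first concrete schema named in options=-csearch_path=..."""
    # parse_qs percent-decodes values, so "%3D" and a literal "=" both work.
    options = parse_qs(urlparse(url).query).get("options", [""])[0]
    match = _SEARCH_PATH_RE.search(options)
    if not match:
        return "public"
    for token in match.group(1).split(","):
        name = token.strip().strip('"')
        # "$user" resolves server-side to the role name, so it is skipped here.
        if name and name != "$user":
            return name
    return "public"
```

Under these assumptions, a URL without `options` yields `public`, and a `search_path` of `$user,$user,sales` yields `sales`.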

Fixes over original PR #373

  1. pg_namespace JOIN order in extract_columns_info — the original used LEFT JOIN pg_class then LEFT JOIN pg_namespace, which doesn't filter duplicate pg_class rows when same-named tables exist across schemas. Fixed by joining pg_namespace first and folding the namespace filter into the pg_class join condition
  2. Regex pattern — changed [=\s]+ to \s*=\s* to require = as separator, preventing mis-capture when search_path= is followed by a space and another option
  3. $user handling — replaced two-position check with a loop through all schemas, correctly handling edge cases like $user,$user,public
  4. Pylint compliance — fixed line-too-long, added protected-access disable for test class
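
The JOIN-order fix in item 1 can be reproduced without a PostgreSQL server. The sketch below uses SQLite with toy stand-ins for `information_schema.columns`, `pg_class`, and `pg_namespace`. The table layouts and both queries are simplified assumptions, not the loader's real SQL. They only show why joining `pg_class` on the table name alone keeps one row per same-named table, while joining `pg_namespace` first does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Toy stand-ins for the PostgreSQL catalogs (hypothetical, simplified).
    CREATE TABLE info_columns (table_schema TEXT, table_name TEXT, column_name TEXT);
    CREATE TABLE pg_namespace (oid INTEGER, nspname TEXT);
    CREATE TABLE pg_class (oid INTEGER, relname TEXT, relnamespace INTEGER);
    INSERT INTO pg_namespace VALUES (1, 'public'), (2, 'sales');
    -- The same table name exists in both schemas.
    INSERT INTO pg_class VALUES (10, 'orders', 1), (11, 'orders', 2);
    INSERT INTO info_columns VALUES ('sales', 'orders', 'id');
    """
)

# Original order: pg_class matched by name only, namespace checked afterwards.
CLASS_FIRST = """
    SELECT c.column_name, pgc.oid
    FROM info_columns c
    LEFT JOIN pg_class pgc ON pgc.relname = c.table_name
    LEFT JOIN pg_namespace pgn
           ON pgn.oid = pgc.relnamespace AND pgn.nspname = c.table_schema
    WHERE c.table_schema = ? AND c.table_name = ?
    ORDER BY pgc.oid
"""

# Fixed order: resolve the namespace first, then fold it into the pg_class join.
NAMESPACE_FIRST = """
    SELECT c.column_name, pgc.oid
    FROM info_columns c
    LEFT JOIN pg_namespace pgn ON pgn.nspname = c.table_schema
    LEFT JOIN pg_class pgc
           ON pgc.relname = c.table_name AND pgc.relnamespace = pgn.oid
    WHERE c.table_schema = ? AND c.table_name = ?
    ORDER BY pgc.oid
"""

class_first_rows = conn.execute(CLASS_FIRST, ("sales", "orders")).fetchall()
namespace_first_rows = conn.execute(NAMESPACE_FIRST, ("sales", "orders")).fetchall()
print(len(class_first_rows), len(namespace_first_rows))
```

In the first query the second LEFT JOIN cannot discard the `public.orders` row, because a LEFT JOIN never removes rows from its left side. That is why the column appears twice.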

Backward compatibility

All methods default to public schema, so existing connections without search_path work identically.

Closes #373 (supersedes)

Add support for selecting a PostgreSQL schema instead of always using
'public'. The schema is extracted from the connection URL's options
parameter (search_path), following PostgreSQL's native libpq format.

Changes:
- Add _parse_schema_from_url() to extract schema from connection URL
- Thread schema parameter through all extraction methods with 'public' default
- Add pg_namespace JOINs for correct cross-schema disambiguation
- Add schema input field in DatabaseModal (PostgreSQL only)
- Add comprehensive unit tests for URL schema parsing
- Update documentation with custom schema configuration guide

Based on PR #373 by sirudog with the following fixes:
- Fix pg_namespace JOIN order in extract_columns_info to prevent
  duplicate rows when same-named tables exist across schemas
- Fix regex to require '=' separator (prevents mis-capture edge cases)
- Improve $user handling to loop through all schemas instead of only
  checking first two positions
- Fix pylint line-too-long in test file

Co-authored-by: sirudog <1550561+sirudog@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 08:32 Destroyed

railway-app bot commented Mar 9, 2026

🚅 Deployed to the QueryWeaver-pr-475 environment in queryweaver

Service      Status                   Web  Updated (UTC)
QueryWeaver  ✅ Success (View Logs)   Web  Mar 9, 2026 at 10:05 am

github-actions bot commented Mar 9, 2026

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 5 package(s) with unknown licenses.
See the Details below.

License Issues

uv.lock

Package     Version  License  Issue Type
falkordb    1.6.0    Null     Unknown License
litellm     1.82.0   Null     Unknown License
playwright  1.58.0   Null     Unknown License
redis       7.3.0    Null     Unknown License
uvicorn     0.41.0   Null     Unknown License

OpenSSF Scorecard

Package         Version  Score    Details
pip/falkordb    1.6.0    Unknown  Unknown
pip/litellm     1.82.0   Unknown  Unknown
pip/playwright  1.58.0   Unknown  Unknown
pip/redis       7.3.0    Unknown  Unknown
pip/uvicorn     0.41.0   Unknown  Unknown

Scanned Files

  • uv.lock

coderabbitai bot commented Mar 9, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@overcut-ai overcut-ai bot left a comment

Thanks for the schema-selection enhancement—there are 6 MAJOR findings in total (no BLOCKER/CRITICAL/MINOR/SUGGESTION/PRAISE items).

Summary of findings

  • Importance counts: 6 MAJOR
  • Affected files: 4 (api/loaders/postgres_loader.py, app/src/components/modals/DatabaseModal.tsx, tests/test_postgres_loader.py, docs/postgres_loader.md)

Key themes observed

  1. Schema parsing robustness gaps in search_path handling (quoted commas and edge-token handling) that can select the wrong schema.
  2. Schema-scoping correctness issues in metadata joins where non-unique constraint names across schemas can lead to incorrect PK/FK attribution.
  3. Coverage and usability gaps: missing regression tests for documented edge cases and a docs snippet using an incorrect class name.

Actionable next steps

  1. Harden search_path parsing/tokenization (or delegate to server-side schema resolution) and safely encode/validate schema input from UI.
  2. Fix constraint joins to include schema-qualified keys (constraint_schema/table_schema) to prevent cross-schema collisions.
  3. Add regression tests for repeated $user and empty-token-only search_path values, then correct the documentation import/example to match implementation.

Rename _parse_schema_from_url to parse_schema_from_url since the
method is already documented for external use and tested directly.
This eliminates W0212 (protected-access) warnings that cause CI
pylint to fail with exit code 4.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 08:43 Destroyed
@gkorland gkorland requested a review from Copilot March 9, 2026 08:59
- Add constraint_schema qualifier to key_column_usage JOINs in
  extract_columns_info to prevent cross-schema constraint name
  collisions
- Sanitize schema input in DatabaseModal to strip non-identifier
  characters before building the URL options
- Add edge case tests: empty tokens, blank quoted tokens, repeated
  $user entries

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 09:02 Destroyed
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 09:02 Destroyed

Copilot AI left a comment

Pull request overview

Adds support for targeting a specific PostgreSQL schema (instead of always public) by parsing search_path from the connection URL’s options parameter, threading the schema through metadata queries, and exposing an optional schema field in the UI for PostgreSQL manual connections.

Changes:

  • Backend: added PostgresLoader.parse_schema_from_url() and parameterized information_schema queries by schema (default public).
  • UI: added an optional “Schema” input (PostgreSQL-only) that encodes options=-csearch_path=<schema> into the built connection URL.
  • Tests/Docs: added unit tests for URL schema parsing and documentation/examples for custom schema configuration.
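
The UI change lives in DatabaseModal.tsx, but the same URL-building idea can be sketched in Python to match the backend. Everything here is illustrative: `build_postgres_url`, its parameters, and the identifier rule (PostgreSQL's unquoted-identifier syntax) are assumptions, not the component's actual validation.

```python
import re
from urllib.parse import quote, urlencode

# Unquoted PostgreSQL identifiers: a letter or underscore, then letters,
# digits, underscores, or dollar signs. The modal's real rule may differ.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def build_postgres_url(host, port, database, user, password, schema=""):
    """Build a connection URL, adding options=-csearch_path=<schema> if given."""
    base = (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )
    schema = schema.strip()
    if not schema:
        return base  # no schema given: behaves exactly as before
    if not _IDENTIFIER_RE.match(schema):
        # Reject invalid input instead of silently stripping characters.
        raise ValueError(f"invalid schema name: {schema!r}")
    # urlencode percent-encodes the "=" inside the options value.
    return f"{base}?{urlencode({'options': f'-csearch_path={schema}'})}"
```

With these assumptions, a schema of `my_schema` produces a URL ending in `?options=-csearch_path%3Dmy_schema`, and an empty schema leaves the URL unchanged.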

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

File                                         Description
api/loaders/postgres_loader.py               Parses schema from connection URL and uses it to scope table/column/FK extraction queries.
app/src/components/modals/DatabaseModal.tsx  Adds PostgreSQL-only schema input and injects search_path into connection URL options for manual mode.
tests/test_postgres_loader.py                Adds unit tests covering search_path parsing edge cases.
docs/postgres_loader.md                      Documents custom schema configuration via options=-csearch_path=... and troubleshooting guidance.
README.md                                    Fixes markdown code block formatting around the streaming Python example.
.github/wordlist.txt                         Adds PostgresLoader to the spellcheck allowlist.

- Fix regex to capture search_path values with spaces after commas
  (e.g. $user, public) by matching up to next -c option or EOL
- Set session search_path explicitly after connecting so sample
  queries resolve to the correct schema
- Use versionless PostgreSQL docs link (/docs/current/)
- Clarify case-sensitivity note for schema names in troubleshooting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
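
One way to set the session search_path after connecting, as the commit above describes, is PostgreSQL's built-in `set_config` function, which accepts the value as a bound parameter. This is only a sketch: the helper name `apply_search_path` is hypothetical, and the `%s` placeholder assumes a psycopg2-style DB-API cursor, which the PR text does not confirm.

```python
def apply_search_path(cursor, schema: str = "public") -> None:
    """Point the current session at `schema` for unqualified table names."""
    # set_config(setting_name, new_value, is_local): is_local=false keeps the
    # setting for the whole session rather than only the current transaction.
    cursor.execute("SELECT set_config('search_path', %s, false)", (schema,))
```

Because the schema travels as a parameter, no identifier quoting is needed when building the statement. A schema name containing uppercase or special characters would still need to be double-quoted inside the value itself.
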
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 09:06 Destroyed
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 09:06 Destroyed
Replace (.+?)(?=\s+-c|\s*$) with [^\s,]+(?:\s*,\s*[^\s,]+)* to
eliminate polynomial backtracking flagged by CodeQL. The new pattern
uses unambiguous character classes with no overlapping quantifiers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
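
To make the regex swap above concrete, this sketch compares the two capture groups on a few option strings. The `search_path\s*=\s*` prefix is an assumption based on the earlier separator fix. Only the two capture expressions are taken from the commit message.

```python
import re

# Earlier form: lazy ".+?" bounded by a lookahead (flagged by CodeQL).
LAZY = re.compile(r"search_path\s*=\s*(.+?)(?=\s+-c|\s*$)")
# Replacement: each token is [^\s,]+ and tokens are joined only by commas,
# so the character classes cannot overlap with the separators.
TOKENS = re.compile(r"search_path\s*=\s*([^\s,]+(?:\s*,\s*[^\s,]+)*)")

samples = [
    "-csearch_path=my_schema",
    "-c search_path=$user, public -c statement_timeout=5000",
    "-csearch_path=a , b",
]

captures = [
    (LAZY.search(s).group(1), TOKENS.search(s).group(1)) for s in samples
]
for lazy_value, token_value in captures:
    print(repr(lazy_value), repr(token_value))
```

On these inputs both patterns capture the same text, including the space after the comma in `$user, public`, and both stop before the next `-c` option.
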
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 09:27 Destroyed
@gkorland gkorland requested a review from Copilot March 9, 2026 09:44

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 8 changed files in this pull request and generated 4 comments.

…L encoding

- DatabaseModal: Show validation error for invalid schema characters instead
  of silently stripping them. Throw error on submit if invalid chars present.
- docs: URL-encode the example URL to prevent copy/paste connection failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@railway-app railway-app bot temporarily deployed to queryweaver / QueryWeaver-pr-475 March 9, 2026 09:52 Destroyed
The URL-encoded form (-csearch_path%3Dmy_schema) inside the Liquid
capture block triggers spellcheck failures ('csearch', 'Dmy'). Reverted
to readable form since Python's urlparse handles both formats fine.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2 participants