Cache queries on whether they are direct-to-shard + SQL by sgrif · Pull Request #1027 · pgdogdev/pgdog

sgrif · 2026-06-01T15:57:57Z

This change allows the query rewriter to safely modify its behavior for direct-to-shard queries, and skip rewrite steps that are only needed for cross-shard queries.

Though this commit does not take advantage of this, future changes will require this behavior and changing the cache key required enough code to justify being pulled into its own commit.

The signature of Map::get and Borrow means this needs a bit more boilerplate than I would have liked. There is no way to implement Borrow<(str, _)> for (String, _) (or any other owned/ref types). Instead we have to erase the types to a trait object so we can get individual references to the fields when we need them for comparison and hashing.

This trait is implemented for any combination of values/refs, and a special case for (Arc, B). If in the future this trait is needed with additional pointer types, or Arc in the right position, additional manual impls will need to be added.

This change allows the query rewriter to safely modify its behavior for direct-to-shard queries, and skip rewrite steps that are only needed for cross-shard queries. Though this commit does not take advantage of this, future changes will require this behavior and changing the cache key required enough code to justify being pulled into its own commit. The signature of Map::get and Borrow means this needs a bit more boilerplate than I would have liked. There is no way to implement Borrow<(str, _)> for (String, _) (or any other owned/ref types). Instead we have to erase the types to a trait object so we can get individual references to the fields when we need them for comparison and hashing. This trait is implemented for any combination of values/refs, and a special case for (Arc<A>, B). If in the future this trait is needed with additional pointer types, or Arc in the right position, additional manual impls will need to be added.

codecov · 2026-06-01T16:49:04Z

Codecov Report

❌ Patch coverage is 71.42857% with 18 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
pgdog/src/frontend/router/parser/cache/key_pair.rs	53.84%	18 Missing ⚠️

📢 Thoughts on this report? Let us know!

levkk · 2026-06-01T17:14:43Z

This change allows the query rewriter to safely modify its behavior for direct-to-shard queries, and skip rewrite steps that are only needed for cross-shard queries.

Based on my reading, this change only actually detects if the user passed in /* pgdog_shard: x */ manual routing hint. That's not enough to detect whether the query is direct-to-shard or not, you need to cache the output of the "parser" in parser/query/mod.rs. There is a re-usable struct you can use for some of it, but there are quite a few edge cases, e.g. schema-based sharding which users the value of search_path or the schema on a fully-qualified table, e.g. SELECT * FROM shard_0.users; and also the value of the pgdog.shard connection parameter which is another way to specify manual routing hints.

While we can't completely treat direct-to-shard queries the same as unsharded queries, we can at least skip certain parts of the rewriter. In particular, the aggregate rewrite does some extra work that will never be needed on direct-to-shard, and skipping it allows us to error on aggregate functions we don't support without breaking the very niche case of an unsupported aggregate function being used by a user only in direct-to-shard queries. Actually implementing this was a bit of a PITA. Since we cache the AST after the rewriter mutates it, as opposed to just the results of pg_query::parse, we needed to add whether the query is direct-to-shard to the cache key (#1027) The natural place to exit early is the same place we do the "skip this if the schema only has one shard", but that information was previously lost all the way up at the cache impl, so we needed to pass this information through quite a few levels of indirection. Ultimately I think we need to separate out parsing from rewriting more concretely, and structure rewriting much differently with a better structured context. However, as I work towards being ready to do that larger restructuring, putting this where it's most natural to make the dependency graph clear seems like the right short-term path forward.

sgrif requested a review from levkk June 1, 2026 15:57

sgrif force-pushed the sg-cache-key-pair branch from 59b8d1c to 88f926f Compare June 1, 2026 16:39

This comment has been minimized.

Sign in to view

sgrif marked this pull request as draft June 1, 2026 17:24

sgrif mentioned this pull request Jun 1, 2026

Error when encountering an unsupported aggregate function #1020

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cache queries on whether they are direct-to-shard + SQL#1027

Cache queries on whether they are direct-to-shard + SQL#1027
sgrif wants to merge 1 commit into
mainfrom
sg-cache-key-pair

sgrif commented Jun 1, 2026

Uh oh!

This comment has been minimized.

codecov Bot commented Jun 1, 2026

Uh oh!

levkk commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

sgrif commented Jun 1, 2026

Uh oh!

This comment has been minimized.

codecov Bot commented Jun 1, 2026

Codecov Report

Uh oh!

levkk commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants