feat(starrocks): add full support for partitions #6804

petrikoro · 2026-01-19T19:55:16Z

Description

This PR adds comprehensive support for StarRocks partitioning syntax including:

Expression-based partitioning (PARTITION BY expr1, expr2): https://docs.starrocks.io/docs/table_design/data_distribution/expression_partitioning/
LIST partitioning (PARTITION BY LIST (cols) (...)): https://docs.starrocks.io/docs/table_design/data_distribution/list_partitioning/
RANGE partitioning with explicit values (PARTITION BY RANGE (cols) (PARTITION ... VALUES LESS THAN ...)): https://docs.starrocks.io/docs/table_design/data_distribution/dynamic_partitioning/#example
RANGE partitioning with dynamic ranges (PARTITION BY RANGE (cols) (START ... END ... EVERY ...)): https://docs.starrocks.io/docs/table_design/data_distribution/dynamic_partitioning/#example

Previously, multi-expression partitioning and LIST partitioning were not supported.

Examples

Expression-based partitioning:

-- Single expression
CREATE TABLE t (col DATE) PARTITION BY DATE_TRUNC('DAY', col)

-- Multiple expressions  
CREATE TABLE t (col1 STRING, col2 BIGINT) PARTITION BY FROM_UNIXTIME(col2, '%Y%m%d'), col1

LIST partitioning:

-- Single column
CREATE TABLE t (city STRING) PARTITION BY LIST (city) (
    PARTITION pLA VALUES IN ('Los Angeles'),
    PARTITION pSF VALUES IN ('San Francisco')
)

-- Multi-column
CREATE TABLE t (dt DATE, city STRING) PARTITION BY LIST (dt, city) (
    PARTITION p1 VALUES IN (('2022-04-01', 'LA'), ('2022-04-01', 'SF'))
)

RANGE partitioning with explicit values:

CREATE TABLE t (col DATE) PARTITION BY RANGE (col) (
    PARTITION p1 VALUES LESS THAN ('2020-01-31'),
    PARTITION p2 VALUES LESS THAN ('2020-02-29'),
    PARTITION p_max VALUES LESS THAN (MAXVALUE)
)

-- With expression
CREATE TABLE t (col STRING) PARTITION BY RANGE (STR2DATE(col, '%Y-%m-%d')) (
    PARTITION p1 VALUES LESS THAN ('2021-01-01'),
    PARTITION p2 VALUES LESS THAN ('2021-01-02')
)

RANGE partitioning with START/END/EVERY:

CREATE TABLE t (col DATE) PARTITION BY RANGE (col) (
    START ('2019-01-01') END ('2021-01-01') EVERY (INTERVAL 1 YEAR),
    START ('2021-01-01') END ('2021-05-01') EVERY (INTERVAL 1 MONTH)
)
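To illustrate what the START/END/EVERY form above expands to, here is a minimal Python sketch. It is not part of this PR or of sqlglot. It assumes each clause produces consecutive left-closed, right-open ranges, that the dates fall on the first of a month, and that the interval divides the range evenly. The helper names `add_months` and `expand_range` are hypothetical.

```python
from __future__ import annotations

from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a first-of-month date by a whole number of months (assumes d.day == 1)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


def expand_range(start: date, end: date, every_months: int) -> list[tuple[date, date]]:
    """Return the assumed [lower, upper) bounds implied by one START/END/EVERY clause."""
    bounds = []
    lower = start
    while lower < end:
        upper = add_months(lower, every_months)
        bounds.append((lower, upper))
        lower = upper
    return bounds


# The two clauses from the example above: yearly, then monthly.
yearly = expand_range(date(2019, 1, 1), date(2021, 1, 1), 12)
monthly = expand_range(date(2021, 1, 1), date(2021, 5, 1), 1)
```

Under these assumptions the first clause yields two yearly ranges and the second yields four monthly ranges.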

See more in tests/dialects/test_starrocks.py

Testing

All syntax variations have been validated against a local StarRocks instance (tested on StarRocks 4.0.2 and 3.5.0).

geooo109 · 2026-01-20T16:14:06Z

@petrikoro thank you for the PR, great work.

I have some suggestions.

There is a similar implementation in doris.py that we should check in order to factor out some code (a relevant commit here: 73c2894).
As far as I checked, MySQL has similar PARTITION BY syntax that we currently don't cover (the same cases as the ones you posted, except the dynamic one, which MySQL doesn't support; see e.g. https://dev.mysql.com/doc/refman/8.4/en/partitioning-range.html). We can push some of the implementation into the MySQL dialect and then inherit and implement any extra logic in the derived dialects, thus adding functionality to the MySQL dialect and removing extra code from the derived dialects (Doris, StarRocks).
For common patterns between Doris and StarRocks that don't exist in MySQL, we can factor them out into the Dialect class.

I will add some extra inline comments for help.

sqlglot/dialects/starrocks.py

Depends on tobymao#6804. Please review/merge tobymao#6804 first. This PR only contains changes on top of that PR.
- Import expression partitioning for MV.
- Enabled ALTER TABLE … RENAME for StarRocks.
- Emitted ORDER BY via CLUSTER BY for StarRocks outputs.
- Added MV (REFRESH) properties handling for StarRocks materialized views.
- And, tests updated/added for the new StarRocks behaviors.

Signed-off-by: jaogoy <jaogoy@gmail.com>

georgesittas · 2026-01-21T17:09:34Z

Hey @petrikoro 👋

Are you planning to take this to the finish line?

petrikoro · 2026-01-21T17:23:09Z

Hi 👋

Sure, I plan to get back to PR tomorrow. Thanks for the suggestions @geooo109!

petrikoro · 2026-01-22T17:24:25Z

Hi! Take a look at a477f75, did I get that right?

geooo109 · 2026-01-22T19:33:40Z

@petrikoro thank you very much, will check it soon.

geooo109

Nice work!! left some comments.

geooo109 · 2026-01-23T11:58:51Z

sqlglot/dialects/doris.py

+            if self._match_text_seq("RANGE"):
+                partition_expressions = self._parse_wrapped_csv(self._parse_assignment)
+                self._match_l_paren()
+
+                if self._match_text_seq("FROM", advance=False):
+                    create_expressions = self._parse_csv(
+                        self._parse_partitioning_granularity_dynamic
+                    )
+                elif self._match_text_seq("PARTITION", advance=False):
+                    create_expressions = self._parse_csv(self._parse_partition_definition)
+                else:
+                    create_expressions = None
+
+                self._match_r_paren()
+
+                return self.expression(
+                    exp.PartitionByRangeProperty,
+                    partition_expressions=partition_expressions,
+                    create_expressions=create_expressions,
+                )
+
+            return self._parse_partitioned_by()


Let's use this ^ to factor out the logic here for both Doris and StarRocks.

For Doris, rename _parse_partition_definition to _parse_partition_range_value (the parent class method).

Apply https://github.com/tobymao/sqlglot/pull/6804/changes#r2721199101

And put this method in the base dialect; we can check for "FROM" or "START".

Also use the previous logic

if not self._match_text_seq("RANGE"): return super()._parse_partitioned_by()

to avoid the extra if-nesting.

@petrikoro Let's do 1., 2., and 4., because 3. may be complex, and I will do it in a separate PR.

@geooo109 Thanks for the feedback! Feel free to take another look whenever you have a chance: e424f88
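
For readers following the thread, the branch selection discussed above can be sketched in isolation. This is a toy model, not sqlglot code. The function `classify_partition_clause` and its string return values are hypothetical. It only mirrors the early return for non-RANGE clauses and the "FROM" or "START" check for dynamic ranges.

```python
from __future__ import annotations


def classify_partition_clause(keyword: str, first_body_token: str | None) -> str:
    """Toy stand-in for the parser branches discussed in the review (hypothetical)."""
    # Early return keeps the RANGE handling un-nested, as suggested in the review.
    if keyword.upper() != "RANGE":
        return "fallback"  # stands in for deferring to the parent class parser

    head = (first_body_token or "").upper()
    if head in ("FROM", "START"):
        # Doris-style FROM and StarRocks-style START both open a dynamic range.
        return "dynamic"
    if head == "PARTITION":
        # PARTITION ... VALUES LESS THAN ... style definitions.
        return "explicit"
    return "none"  # no partition definitions were given
```

For example, `classify_partition_clause("RANGE", "START")` selects the dynamic branch, while any keyword other than RANGE falls through to the parent behaviour.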

geooo109 · 2026-01-23T12:04:46Z

sqlglot/dialects/starrocks.py


            return unnest

+        def _parse_partition_property(


follow this: https://github.com/tobymao/sqlglot/pull/6804/changes#r2720919779

geooo109 · 2026-01-23T13:29:17Z

sqlglot/dialects/doris.py

+            if self._match_text_seq("LIST"):
+                return self.expression(
+                    exp.PartitionByListProperty,
+                    partition_expressions=self._parse_wrapped_csv(self._parse_assignment),
+                    create_expressions=self._parse_wrapped_csv(self._parse_partition_list_value),
+                )


Suggested change

-            if self._match_text_seq("LIST"):
-                return self.expression(
-                    exp.PartitionByListProperty,
-                    partition_expressions=self._parse_wrapped_csv(self._parse_assignment),
-                    create_expressions=self._parse_wrapped_csv(self._parse_partition_list_value),
-                )
+            if self._match_text_seq("LIST", advance=False):
+                return super()._parse_partition_property()

geooo109 · 2026-01-23T13:41:18Z

sqlglot/dialects/mysql.py

+            self._match_text_seq("VALUES", "LESS", "THAN")
+            values = self._parse_wrapped_csv(self._parse_expression)
+
+            if (
+                len(values) == 1
+                and isinstance(values[0], exp.Column)
+                and values[0].name.upper() == "MAXVALUE"
+            ):
+                values = [exp.var("MAXVALUE")]


Let's use a parsing helper here so that we can reuse it in the similar Doris function, _parse_partition_range_value.
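
A rough idea of what such a shared helper could look like, written as a standalone sketch rather than actual sqlglot code. The `Column` and `Var` dataclasses and the name `normalize_maxvalue` are hypothetical stand-ins for `exp.Column`, `exp.var(...)`, and whatever helper name the dialects end up using.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a parsed column reference."""
    name: str


@dataclass(frozen=True)
class Var:
    """Hypothetical stand-in for a keyword-like variable node."""
    name: str


def normalize_maxvalue(values: list) -> list:
    """Rewrite a lone MAXVALUE, parsed as a column, into a keyword-like node."""
    if (
        len(values) == 1
        and isinstance(values[0], Column)
        and values[0].name.upper() == "MAXVALUE"
    ):
        return [Var("MAXVALUE")]
    # Anything else is passed through untouched.
    return values
```

Both range-value parsers could then call the one helper instead of repeating the check inline.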

geooo109 · 2026-01-23T14:00:08Z

sqlglot/dialects/starrocks.py

+        def partitionedbyproperty_sql(self, expression: exp.PartitionedByProperty) -> str:
+            this = expression.this
+            partition_cols = this.expressions if isinstance(this, exp.Schema) else [this]
+            is_cols = all(isinstance(col, (exp.Column, exp.Identifier)) for col in partition_cols)


Why do we need this check here?

The check distinguishes between two different StarRocks partitioning syntaxes:

Column-based partitioning - needs parentheses, especially on StarRocks < 3.4, see https://docs.starrocks.io/docs/table_design/data_distribution/expression_partitioning/#parameters-1 (see the note below the parameters):

PARTITION BY (col1, col2)

Expression-based partitioning - no parentheses:

PARTITION BY date_trunc('day', ts), col1

When the partition items are simple column/identifier references, StarRocks expects PARTITION BY (columns) with parens. But when you use expressions like date_trunc() or str2date(), the syntax is PARTITION BY expr without wrapping parens.

I added some comments for this in e424f88
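
The rule described above can be sketched without sqlglot as follows. This is only an illustration under simplified assumptions. `ColumnRef`, `FuncCall`, and `partition_by_sql` are hypothetical names, and the real generator works on sqlglot expression nodes rather than these toy classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRef:
    """Hypothetical plain column reference."""
    name: str

    def sql(self) -> str:
        return self.name


@dataclass(frozen=True)
class FuncCall:
    """Hypothetical pre-rendered function call, e.g. date_trunc('day', ts)."""
    text: str

    def sql(self) -> str:
        return self.text


def partition_by_sql(items: list) -> str:
    """Wrap in parentheses only when every item is a plain column reference."""
    body = ", ".join(item.sql() for item in items)
    if all(isinstance(item, ColumnRef) for item in items):
        return f"PARTITION BY ({body})"
    return f"PARTITION BY {body}"
```

With two plain columns this renders the parenthesised form, and as soon as one item is a function call the parentheses are dropped.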

georgesittas requested a review from geooo109 January 20, 2026 13:31

feat: add full support for starrocks partitions

09aef20

petrikoro force-pushed the feat/add-full-support-for-starrocks-partitions branch from f93c63c to 09aef20 Compare January 20, 2026 13:38

geooo109 reviewed Jan 20, 2026

This was referenced Jan 21, 2026

Feat(starrocks)!: improve some starrocks properties generation #6827

Open

Feat(starrocks)!: improve some StarRocks sql generation #6737

Closed

chore: refactor mysql dialect

a477f75

chore: remove partitionedbyproperty_sql from mysql dialect (unsupported)

931bb02

geooo109 reviewed Jan 23, 2026

chore: small refatoring of doris and starrocks dialects

e424f88
