SQL: struct support by kbatuigas · Pull Request #586 · redpanda-data/cloud-docs

kbatuigas · 2026-05-14T05:30:41Z

Description

This pull request introduces comprehensive documentation improvements for working with nested fields in Redpanda SQL, focusing on mapping nested Protobuf or Avro structures as SQL ROW columns and querying them directly. It clarifies the use of the struct_mapping_policy option, expands the reference for the ROW data type, and adds a dedicated how-to guide for querying nested fields.

New documentation and feature explanations:

Added a new how-to page, query-nested-fields.adoc, detailing how to map topics with nested schemas as SQL tables using struct_mapping_policy = 'COMPOUND', how to query nested fields with ROW syntax, and how to handle recursive (cyclic) schemas.
Updated the navigation (nav.adoc) to include the new "Query Topics with Nested Fields" guide.

Improvements to ROW data type documentation:

Expanded the ROW data type reference to document field access (by position and name), wildcard projection, lexicographic comparison, NULL checks, text conversion, and usage in GROUP BY, ORDER BY, and JOIN clauses.
Enhanced the ROW type summary to mention its support for field access, comparisons, and use in query clauses.

Clarifications to CREATE TABLE options:

Clarified the struct_mapping_policy option in the CREATE TABLE documentation, emphasizing that COMPOUND maps nested structures to SQL ROW columns and noting that cyclic types are only supported in JSON mode.

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 20 May

Page previews

Reference > Redpanda SQL Reference > Data Types > Row
Redpanda SQL > Query Data > Query Topics with Nested Fields

Checks

New feature
Content gap
Support Follow-up
Small fix (typos, links, copyedits, etc)

netlify · 2026-05-14T05:30:46Z

✅ Deploy Preview for rp-cloud ready!

Name	Link
🔨 Latest commit	`f66afba`
🔍 Latest deploy log	https://app.netlify.com/projects/rp-cloud/deploys/6a0e97d47072d7000803b454
😎 Deploy Preview	https://deploy-preview-586--rp-cloud.netlify.app

coderabbitai · 2026-05-14T05:30:48Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


pkonrad1229 · 2026-05-15T07:14:16Z

+:learning-objective-2: Query nested fields using ROW field-access syntax
+:learning-objective-3: Recognize and resolve cyclic-reference errors
+
+When a glossterm:topic[]'s schema includes nested Protobuf or Avro message types, you can map those nested structures as SQL `ROW` columns instead of opaque JSON. This makes nested fields queryable by name, includable in projections, and usable in `WHERE`, `GROUP BY`, and `ORDER BY` clauses, without parsing JSON at query time.


That may be a nitpick, but stating that we are mapping as SQL ROW columns is not entirely true.

In PostgreSQL, a ROW is an anonymous record, in which you cannot explicitly set the sub-field names (they contain some generic f1, f2, f<n>... names that you cannot change).

What we do in the COMPOUND mapping is we actually create a User-Defined type, and set the names of the fields according to the schema.

Not sure if that's something that we want to explicitly state here, or maybe the ROW meaning here is something other than PostgreSQL ROW.

Changed to

When a glossterm:topic[]'s schema includes nested Protobuf, Avro, or JSON message types, you can map those nested structures as user-defined types (UDTs) with named fields, queryable using SQL ROW field-access syntax, instead of opaque JSON. This makes nested fields queryable by name, includable in projections, and usable in WHERE, GROUP BY, and ORDER BY clauses, without parsing JSON at query time.

@mattschumpert do you have a preference on whether we explicitly mention user defined types?

No idea. I defer to @pkonrad1229

I don't see any reason why we shouldn't. Any user can check that for themselves, for example with the pg_typeof function.
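The pg_typeof check could look like the query below. This is only an illustration. It assumes the `orders` table from this PR's example exists, that a `SELECT` can reference it with the same `default_redpanda_catalog=>orders` identifier used in the `CREATE TABLE` statement, and that `customer` is the nested column. The thread does not state the type name that comes back, so no expected output is shown.

```sql
-- Hypothetical check: table and column names are taken from this PR's example.
-- Referencing the table with the catalog=>table form in a SELECT is an assumption.
SELECT pg_typeof(customer) AS customer_type
FROM default_redpanda_catalog=>orders
LIMIT 1;
```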

pkonrad1229 · 2026-05-15T07:16:21Z

+}
+----
+
+Redpanda SQL maps the table with three columns: `order_id` (text), `customer` (a `ROW` with fields `customer_id`, `name`, and `region`), and `amount` (double precision).


ditto here:

a ROW with fields

We may also say something along the lines of:

a structure/UDT with fields

pkonrad1229 · 2026-05-15T07:22:15Z

 (1 row)
 ----

 === Use implicit tuple syntax


I'm late to the party, as this was not modified in this PR :D I believe it's worth noting that the implicit tuple syntax works only when there are two or more expressions

@pkonrad1229 Hm, that may have just surfaced as something related based on the Claude Code research... is it ok to leave on this page? The explanation is currently under the first section: https://deploy-preview-586--rp-cloud.netlify.app/redpanda-cloud/reference/sql/sql-data-types/row/#syntax

yeah sure, it's okay to leave it here. I only meant to say that we follow PostgreSQL rules for the ROW constructor, where the implicit syntax works only when there's more than 1 expression, so:

(col) returns the col expression

(col1,col2) returns a ROW/record of those two columns

Postgres mentions this implicit syntax rule directly in their docs.
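A short illustration of that rule, written against PostgreSQL. The type names in the comments are what PostgreSQL reports. Identical output in Redpanda SQL is an assumption based on the statement above that it follows the PostgreSQL rules for the ROW constructor.

```sql
-- A single parenthesized expression is just that expression.
SELECT pg_typeof((1));      -- integer

-- Two or more expressions in parentheses form a row (implicit syntax).
SELECT pg_typeof((1, 2));   -- record

-- A single-field row needs the explicit ROW keyword.
SELECT pg_typeof(ROW(1));   -- record
```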

mattschumpert · 2026-05-20T21:19:19Z

+
+* Enable Redpanda SQL on your Redpanda Bring Your Own Cloud (BYOC) cluster. See xref:sql:get-started/deploy-sql-cluster.adoc[Enable Redpanda SQL].
+* Connect to Redpanda SQL with `psql` or another PostgreSQL client. See xref:sql:connect-to-sql/index.adoc[Connect to Redpanda SQL].
+* The topic has a schema registered in glossterm:schema-registry[Schema Registry]. The schema includes one or more nested message types.


Shouldn't we specify that the schema must be registered for the topic using the TopicNameStrategy naming convention, @pkonrad1229 @kbatuigas? You have to name it correctly for this to work, right? If people are not already familiar with this in SR, we should educate them (point them to this naming convention).

Yes, the correct schema_subject is required for this to work. Not deeply familiar with the SR side, but from Oxla's perspective, the underlying naming strategy doesn't matter; only that the resolved subject matches a registered one. Two cases:

SR uses Confluent's default TopicNameStrategy → schema_subject can be omitted; Oxla defaults to <topic>-value.

SR uses a different strategy → schema_subject must be set explicitly in the CREATE TABLE options.

Side note: this comment is broad, not specific to nested fields. It's general CREATE TABLE + Kafka topic behavior. The question is whether we should explain it inline here or just link to the CREATE TABLE reference docs, where the schema_subject semantics belong.

mattschumpert · 2026-05-20T21:20:16Z

+----
+CREATE TABLE default_redpanda_catalog=>orders WITH (
+  topic = 'orders',
+  schema_subject = 'orders-value',


Is schema_subject required or optional?

If it's required then maybe the naming convention is not mandatory @pkonrad1229 ?

From what I see, it's optional, and when omitted, Oxla resolves the subject to <topic>-value, so in this example it could be left out, and the outcome would be the same
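As a sketch of the shorter form, here is the PR's example with `schema_subject` left out. Per the comment above, the subject would then resolve to `orders-value`. The quoted diff hunk is truncated, so the closing parenthesis and semicolon are assumed, and any other options the full example sets are not shown.

```sql
-- Based on the PR's example; schema_subject is omitted on purpose.
-- Per this thread, the subject then defaults to <topic>-value ('orders-value').
-- The closing ");" is assumed because the quoted diff excerpt is cut off.
CREATE TABLE default_redpanda_catalog=>orders WITH (
  topic = 'orders',
  struct_mapping_policy = 'COMPOUND'
);
```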

mattschumpert · 2026-05-20T21:22:56Z

+CREATE TABLE default_redpanda_catalog=>orders WITH (
+  topic = 'orders',
+  schema_subject = 'orders-value',
+  struct_mapping_policy = 'COMPOUND'


It says below that this is optional. The comments here should explain the same thing (what is optional and what is not).

mattschumpert · 2026-05-20T21:34:20Z

+
+| `JSON`
+| The topic schema is recursive, or you prefer flexible access through JSON functions.
+| Recursive types supported; fields are untyped until extracted with JSON functions. Queries that span the Redpanda topic and its linked Iceberg table do not align cleanly, because Iceberg always exposes nested structures as typed columns.


@grzebiel this warning implies something very important to alert the user to: that we don't really support querying Iceberg topics with recursive types. However, I don't think this is the correct message here (at least not always), because Iceberg topics have special handling that encodes recursive Protobuf Struct fields as a JSON string in the Iceberg table. So we do have a story for recursive fields, at least in the Protobuf case.

So, how should this be adjusted?

mattschumpert · 2026-05-20T21:37:20Z

+== Next steps
+
+* xref:sql:query-data/query-streaming-topics.adoc[Query streaming topics]: query a topic without Iceberg history.
+* xref:sql:query-data/query-iceberg-topics.adoc[Query Iceberg topics]: query the Iceberg-translated history of a topic. Use `struct_mapping_policy = 'COMPOUND'` so nested fields align between the Redpanda topic and the linked Iceberg table.


@kbatuigas wrong wording IMO. 'Query a topic with Iceberg history' is better.

What's here is technically incorrect because it makes it sound like you're ONLY querying the Iceberg portion (tail). In fact, this link goes to how to run a bridge query that covers both the live streaming data and the Iceberg history.

We should ensure we correct this everywhere.

kbatuigas requested a review from a team as a code owner May 14, 2026 05:30

kbatuigas requested a review from pkonrad1229 May 14, 2026 18:07

pkonrad1229 reviewed May 15, 2026


Comment thread modules/sql/pages/query-data/query-nested-fields.adoc Outdated

kbatuigas force-pushed the rp-sql branch from e051360 to 1b9d587 Compare May 19, 2026 03:26

kbatuigas added 4 commits May 18, 2026 20:27

Draft struct support reference

49edf69

How to query structs/nested fields

5623c8a

Review pass

d35813d

Apply suggestions from SME review

e1f3231

kbatuigas force-pushed the DOC-2019-document-feature-record-structure-type-support branch from 27ea3c1 to e1f3231 Compare May 19, 2026 03:29

Don't use all caps for nav labels

60053ed

kbatuigas requested review from mattschumpert and pkonrad1229 May 20, 2026 00:40

mattschumpert approved these changes May 20, 2026


Apply suggestions from SME review

f66afba

pkonrad1229 approved these changes May 21, 2026



3 participants
