DType is not sufficient to correctly infer and export arrow schemas #8135

@AdamGS

For ParquetVariant (and potentially other types), the DType alone is not enough to produce an accurate Arrow Field: the array's metadata is also required, because the physical layout of the array itself influences the Arrow-side data type.

Parquet solves this in two ways:

  1. Storing the schema as file metadata.
  2. Structuring the file as a tree of types, so Variant columns have children that can be inspected directly to infer the schema correctly.
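
To make the problem concrete, here is a minimal sketch in plain Python (no Arrow or Vortex dependency). Every name in it (`ArrowField`, `VariantArray`, `field_from_dtype`, `field_from_array`) is hypothetical and exists only for illustration; it is not the project's API. It assumes the Parquet Variant layout in which a column always carries `metadata` and `value` children and, when shredded, an additional `typed_value` child.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ArrowField:
    """Hypothetical, simplified stand-in for an Arrow field."""
    name: str
    type: str
    children: Tuple["ArrowField", ...] = ()


@dataclass(frozen=True)
class VariantArray:
    """Hypothetical array whose logical dtype is always 'variant'.

    `typed_value` is the type name of the shredded child, or None
    when the array is unshredded. It is physical-layout information
    that the dtype does not carry.
    """
    typed_value: Optional[str] = None
    dtype: str = "variant"


def field_from_dtype(name: str, dtype: str) -> ArrowField:
    # With only the dtype available, the best we can do is assume
    # the unshredded layout (an assumption of this sketch).
    assert dtype == "variant"
    return ArrowField(name, "struct", (
        ArrowField("metadata", "binary"),
        ArrowField("value", "binary"),
    ))


def field_from_array(name: str, array: VariantArray) -> ArrowField:
    # Inspecting the array itself reveals whether a shredded child exists.
    children = [
        ArrowField("metadata", "binary"),
        ArrowField("value", "binary"),
    ]
    if array.typed_value is not None:
        children.append(ArrowField("typed_value", array.typed_value))
    return ArrowField(name, "struct", tuple(children))


unshredded = VariantArray()
shredded = VariantArray(typed_value="int64")
```

Both arrays report the same `dtype`, so `field_from_dtype` returns the same field for each, while `field_from_array` returns a struct with an extra `typed_value` child for the shredded one. That gap is what the issue title describes.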
