GH-41017: [C++] Preserve ordered flag in DictionaryBuilder #49797

Open
tinezivic wants to merge 1 commit into apache:main from tinezivic:fix-dictionary-builder-ordered

@tinezivic tinezivic commented Apr 19, 2026

Rationale for this change

DictionaryBuilderBase::type() does not pass the ordered parameter to ::arrow::dictionary(), so the flag always takes its default value of false. As a result, any DictionaryArray built through MakeBuilder() or MakeDictionaryBuilder() with an ordered DictionaryType produces an array where type().ordered() == false.

This affects PyArrow users: pa.array(data, type=pa.dictionary(pa.int8(), pa.string(), ordered=True)) returns an array with ordered=False.

Reported in: #41017
Also caused: pandas-dev/pandas#58152

What changes are included in this PR?

  • Add bool ordered_ = false member and set_ordered() method to DictionaryBuilderBase (both the primary template and the NullType specialization)
  • Pass ordered_ to ::arrow::dictionary() in type() and FinishInternal()
  • Add bool ordered field to DictionaryBuilderCase in builder.cc
  • Propagate dict_type.ordered() through MakeBuilderImpl::Visit() and MakeDictionaryBuilder()
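
The pattern behind these changes can be illustrated with a small standalone model. This is only a sketch and not the Arrow source: `ToyDictType`, `toy_dictionary`, `ToyDictBuilder`, `type_before_fix` and `MakeToyBuilder` are hypothetical stand-ins for `DictionaryType`, `::arrow::dictionary()`, `DictionaryBuilderBase` and the factory path in builder.cc. It assumes, as described above, that the factory's `ordered` argument defaults to false.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for arrow::DictionaryType (only the relevant fields).
struct ToyDictType {
  std::string index_type;
  std::string value_type;
  bool ordered;
};

// Hypothetical stand-in for ::arrow::dictionary(): `ordered` defaults to
// false when the caller leaves it out.
inline std::shared_ptr<ToyDictType> toy_dictionary(std::string index_type,
                                                   std::string value_type,
                                                   bool ordered = false) {
  return std::make_shared<ToyDictType>(
      ToyDictType{std::move(index_type), std::move(value_type), ordered});
}

// Hypothetical stand-in for DictionaryBuilderBase.
class ToyDictBuilder {
 public:
  ToyDictBuilder(std::string index_type, std::string value_type)
      : index_type_(std::move(index_type)),
        value_type_(std::move(value_type)) {}

  // Added by the fix: lets the factory record the flag after construction.
  void set_ordered(bool ordered) { ordered_ = ordered; }

  // Models the bug: the flag is omitted, so the factory default (false) wins.
  std::shared_ptr<ToyDictType> type_before_fix() const {
    return toy_dictionary(index_type_, value_type_);
  }

  // Models the fix: the stored flag is forwarded to the type factory.
  std::shared_ptr<ToyDictType> type() const {
    return toy_dictionary(index_type_, value_type_, ordered_);
  }

 private:
  std::string index_type_;
  std::string value_type_;
  bool ordered_ = false;  // unordered unless the factory says otherwise
};

// Models the factory path: construct the builder, then propagate the flag
// from the requested dictionary type.
inline std::unique_ptr<ToyDictBuilder> MakeToyBuilder(const ToyDictType& type) {
  std::unique_ptr<ToyDictBuilder> builder(
      new ToyDictBuilder(type.index_type, type.value_type));
  builder->set_ordered(type.ordered);
  return builder;
}
```

In this model, a builder created from `toy_dictionary("int8", "string", true)` reports an ordered type through `type()`, while `type_before_fix()` still reports unordered. Keeping `ordered_ = false` as the member default means builders constructed without going through the factory behave as before.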

Are these changes tested?

Yes. Added 3 C++ tests in array_dict_test.cc:

  • MakeBuilderPreservesOrdered — verifies MakeBuilder with ordered dict type produces ordered array
  • MakeBuilderUnorderedByDefault — verifies unordered stays unordered
  • MakeDictionaryBuilderPreservesOrdered — verifies MakeDictionaryBuilder preserves ordered

Are there any user-facing changes?

No API changes. DictionaryBuilder now correctly preserves the ordered flag from the input DictionaryType, fixing the silent loss of that flag.

@github-actions

⚠️ GitHub issue #41017 has been automatically assigned in GitHub to PR creator.

DictionaryBuilderBase::type() did not pass the ordered parameter
to ::arrow::dictionary(), causing it to always default to false.
This meant that building a DictionaryArray via MakeBuilder or
MakeDictionaryBuilder with an ordered DictionaryType would produce
an array with ordered=false.

Fix: Add ordered_ member and set_ordered() to DictionaryBuilderBase
(both the primary template and the NullType specialization). The
DictionaryBuilderCase in builder.cc now propagates the ordered flag
from the input DictionaryType to the builder after construction.

Generated-by: GitHub Copilot
@tinezivic tinezivic force-pushed the fix-dictionary-builder-ordered branch from 7960ab8 to d3323b0 Compare April 19, 2026 06:11