[SPARK-47998][PS] Support native pandas inputs in pandas-on-Spark concat by hvph-uyen · Pull Request #55561 · apache/spark

hvph-uyen · 2026-04-27T10:31:32Z

What changes were proposed in this pull request?

This PR updates pandas-on-Spark concat so that native pandas DataFrame and Series objects are accepted when they are passed inside the input iterable.

It also fixes the error message for unsupported inputs so that it reports the actual invalid element type instead of the outer container type.

The existing behavior for bare inputs such as ps.concat(pdf) and ps.concat(pser) is preserved.

Why are the changes needed?

Currently, ps.concat rejects native pandas objects passed inside an iterable, even though they could be converted in that case.

Also, the current error message is misleading because it can report list instead of the actual invalid input type.

Does this PR introduce any user-facing change?

Yes.

Before this change, ps.concat([pdf, pdf]) rejected native pandas inputs, and unsupported input errors could report list instead of the actual invalid element type.

After this change, native pandas DataFrame and Series inputs are accepted in the iterable case, and unsupported input errors report the actual invalid element type.
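The before/after behavior described above can be sketched in plain Python. This is a minimal illustration, not the code from the patch: `normalize_concat_inputs`, its parameters, the stand-in classes `PsFrame` and `PdFrame`, and the exact error wording are all hypothetical. In real use, the `convert` argument would presumably be something like `ps.from_pandas`.

```python
from typing import Any, Callable, Iterable, List, Tuple


def normalize_concat_inputs(
    objs: Iterable[Any],
    supported: Tuple[type, ...],
    native: Tuple[type, ...],
    convert: Callable[[Any], Any],
) -> List[Any]:
    """Hypothetical helper; NOT the actual pyspark.pandas implementation."""
    normalized = []
    for obj in objs:
        if isinstance(obj, supported):
            # Already a supported (pandas-on-Spark style) object: keep as-is.
            normalized.append(obj)
        elif isinstance(obj, native):
            # Native pandas element inside the iterable: convert, don't reject.
            normalized.append(convert(obj))
        else:
            # Report the offending element's type, not the container's type.
            # The message wording here is illustrative only.
            raise TypeError(
                f"Cannot concatenate object of type {type(obj).__name__!r}"
            )
    return normalized


# Stand-in classes so the sketch runs without Spark or pandas installed.
class PsFrame:
    """Stand-in for a pandas-on-Spark DataFrame."""


class PdFrame:
    """Stand-in for a native pandas DataFrame."""


def to_ps(obj: PdFrame) -> PsFrame:
    """Stand-in for a conversion function such as ps.from_pandas."""
    return PsFrame()


mixed = normalize_concat_inputs([PsFrame(), PdFrame()], (PsFrame,), (PdFrame,), to_ps)
```

With this shape, a call such as `normalize_concat_inputs([PsFrame(), 42], ...)` raises a `TypeError` that names `int`, the invalid element, and never `list`, the outer container.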

How was this patch tested?

Added regression coverage in pyspark.pandas.tests.test_namespace.
Ran:
python/run-tests --python-executables python3 --testnames pyspark.pandas.tests.test_namespace

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Codex (GPT-5)

Generative AI tooling was used to help inspect the issue and understand the relevant Spark codebase. The final patch was manually reviewed and tested before submission.

HyukjinKwon · 2026-04-27T23:16:49Z

I think we should not do this. Otherwise, we will have to fix the entire API surface to support pandas instances.

hvph-uyen added 3 commits April 27, 2026 17:06

[SPARK-47998][PS] Support native pandas inputs in pandas-on-Spark concat

91f30b5

[SPARK-47998][PS] Trigger CI

46a9f6f

[SPARK-47998][PS] Fix concat mypy typing

63e2561

hvph-uyen wants to merge 3 commits into apache:master from hvph-uyen:spark-47998-ps-concat
