Single-node Python unit tests fail #52

@edmondop

After enabling the Python unit tests on my fork here, this simple test fails:

from datafusion import SessionContext
from datafusion_ray import DatafusionRayContext

def test_basic_query_succeed():
    df_ctx = SessionContext()
    ctx = DatafusionRayContext(df_ctx)
    df_ctx.register_csv("tips", "examples/tips.csv", has_header=True)
    record_batch = ctx.sql("SELECT * FROM tips")
    assert record_batch.num_rows == 244

As one can see from this run
https://github.com/edmondop/datafusion-ray/actions/runs/12268595231/job/34230694956:

the cause is (execute_query_stage pid=8895) index out of bounds: the len is 0 but the index is 0

The panic originates in the query stage code

    pub fn get_input_partition_count(&self) -> usize {
        self.plan.children()[0]
            .properties()
            .output_partitioning()
            .partition_count()
    }

which panics because the CsvExec is a leaf node with no children, so indexing children()[0] goes out of bounds.
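
One possible guard is sketched below. It is not the project's actual fix. `PlanNode` is a hypothetical stand-in for the real physical plan type, and falling back to the node's own output partition count for a leaf stage is an assumption about the intended behavior. The point is only that `first()` returns an `Option` instead of panicking on an empty child list.

```rust
// Hypothetical stand-in for a physical plan node (not DataFusion's real type).
struct PlanNode {
    output_partitions: usize,
    children: Vec<PlanNode>,
}

impl PlanNode {
    // Returns the partition count of the first child, if there is one.
    // Assumption: a leaf node (e.g. a file scan) reports its own output
    // partition count, since it has no input plan to ask.
    fn input_partition_count(&self) -> usize {
        match self.children.first() {
            Some(child) => child.output_partitions,
            None => self.output_partitions,
        }
    }
}
```

With this shape, a stage consisting only of a scan no longer hits the index-out-of-bounds panic. Whether the fallback value is the right one for the scheduler is a separate question.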

Query stage #0:
CsvExec: file_groups={1 group: [[home/runner/work/datafusion-ray/datafusion-ray/examples/tips.csv]]}, projection=[total_bill, tip, sex, smoker, day, time, size], has_header=true

It might be related to the fact that the unit tests run against a local Ray instance.
