2 changes: 1 addition & 1 deletion dev/requirements.txt
@@ -3,7 +3,7 @@ py4j>=0.10.9.9
 
 # PySpark dependencies (optional)
 numpy>=1.22
-pyarrow>=15.0.0
+pyarrow>=18.0.0
 six==1.16.0
 pandas>=2.2.0
 scipy
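Note: the raised floor can be checked at runtime with the same `LooseVersion` helper this PR retargets below. A minimal sketch, not part of the change itself:

    import pyarrow as pa
    from pyspark.loose_version import LooseVersion  # ships with PySpark

    # Mirrors the new dev/requirements.txt floor: pyarrow>=18.0.0
    if LooseVersion(pa.__version__) < LooseVersion("18.0.0"):
        raise RuntimeError(f"pyarrow {pa.__version__} is below the required 18.0.0")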
4 changes: 2 additions & 2 deletions dev/spark-test-image/python-minimum/Dockerfile
@@ -24,7 +24,7 @@ LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For PySpark wi
 # Overwrite this label to avoid exposing the underlying Ubuntu OS version label
 LABEL org.opencontainers.image.version=""
 
-ENV FULL_REFRESH_DATE=20250703
+ENV FULL_REFRESH_DATE=20251225
 
 ENV DEBIAN_FRONTEND=noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN=true
@@ -62,7 +62,7 @@ RUN apt-get update && apt-get install -y \
 wget \
 zlib1g-dev
 
-ARG BASIC_PIP_PKGS="numpy==1.22.4 pyarrow==15.0.0 pandas==2.2.0 six==1.16.0 scipy scikit-learn coverage unittest-xml-reporting"
+ARG BASIC_PIP_PKGS="numpy==1.22.4 pyarrow==18.0.0 pandas==2.2.0 six==1.16.0 scipy scikit-learn coverage unittest-xml-reporting"
 # Python deps for Spark Connect
 ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20 protobuf"
 
4 changes: 2 additions & 2 deletions dev/spark-test-image/python-ps-minimum/Dockerfile
@@ -24,7 +24,7 @@ LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For Pandas API
 # Overwrite this label to avoid exposing the underlying Ubuntu OS version label
 LABEL org.opencontainers.image.version=""
 
-ENV FULL_REFRESH_DATE=20250708
+ENV FULL_REFRESH_DATE=20251225
 
 ENV DEBIAN_FRONTEND=noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN=true
@@ -63,7 +63,7 @@ RUN apt-get update && apt-get install -y \
 zlib1g-dev
 
 
-ARG BASIC_PIP_PKGS="pyarrow==15.0.0 pandas==2.2.0 six==1.16.0 numpy scipy coverage unittest-xml-reporting"
+ARG BASIC_PIP_PKGS="pyarrow==18.0.0 pandas==2.2.0 six==1.16.0 numpy scipy coverage unittest-xml-reporting"
 # Python deps for Spark Connect
 ARG CONNECT_PIP_PKGS="grpcio==1.76.0 grpcio-status==1.76.0 googleapis-common-protos==1.71.0 zstandard==0.25.0 graphviz==0.20 protobuf"
 
8 changes: 4 additions & 4 deletions python/docs/source/getting_started/install.rst
@@ -226,7 +226,7 @@ Installable with ``pip install "pyspark[connect]"``.
 Package                    Supported version Note
 ========================== ================= ==========================
 `pandas`                   >=2.2.0           Required for Spark Connect
-`pyarrow`                  >=15.0.0          Required for Spark Connect
+`pyarrow`                  >=18.0.0          Required for Spark Connect
 `grpcio`                   >=1.76.0          Required for Spark Connect
 `grpcio-status`            >=1.76.0          Required for Spark Connect
 `googleapis-common-protos` >=1.71.0          Required for Spark Connect
@@ -243,7 +243,7 @@ Installable with ``pip install "pyspark[sql]"``.
 Package   Supported version Note
 ========= ================= ======================
 `pandas`  >=2.2.0           Required for Spark SQL
-`pyarrow` >=15.0.0          Required for Spark SQL
+`pyarrow` >=18.0.0          Required for Spark SQL
 ========= ================= ======================
 
 Additional libraries that enhance functionality but are not included in the installation packages:
@@ -260,7 +260,7 @@ Installable with ``pip install "pyspark[pandas_on_spark]"``.
 Package   Supported version Note
 ========= ================= ================================
 `pandas`  >=2.2.0           Required for Pandas API on Spark
-`pyarrow` >=15.0.0          Required for Pandas API on Spark
+`pyarrow` >=18.0.0          Required for Pandas API on Spark
 ========= ================= ================================
 
 Additional libraries that enhance functionality but are not included in the installation packages:
@@ -310,7 +310,7 @@ Installable with ``pip install "pyspark[pipelines]"``. Includes all dependencies
 Package                    Supported version Note
 ========================== ================= ===================================================
 `pandas`                   >=2.2.0           Required for Spark Connect and Spark SQL
-`pyarrow`                  >=15.0.0          Required for Spark Connect and Spark SQL
+`pyarrow`                  >=18.0.0          Required for Spark Connect and Spark SQL
 `grpcio`                   >=1.76.0          Required for Spark Connect
 `grpcio-status`            >=1.76.0          Required for Spark Connect
 `googleapis-common-protos` >=1.71.0          Required for Spark Connect
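The tables above only document the floors; to verify an existing environment against them without importing each package, `importlib.metadata` can be combined with the same version comparison. A minimal sketch (the `FLOORS` mapping is an illustrative subset copied from the tables above, not a PySpark API):

    from importlib.metadata import PackageNotFoundError, version
    from pyspark.loose_version import LooseVersion

    # Minimums documented above; illustrative subset only.
    FLOORS = {"pandas": "2.2.0", "pyarrow": "18.0.0", "grpcio": "1.76.0"}

    for pkg, floor in FLOORS.items():
        try:
            if LooseVersion(version(pkg)) < LooseVersion(floor):
                print(f"{pkg} {version(pkg)} is below the documented minimum {floor}")
        except PackageNotFoundError:
            print(f"{pkg} is not installed")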
2 changes: 1 addition & 1 deletion python/pyspark/sql/classic/dataframe.py
@@ -2011,7 +2011,7 @@ def _test() -> None:
     import pyarrow as pa
     from pyspark.loose_version import LooseVersion
 
-    if LooseVersion(pa.__version__) < LooseVersion("17.0.0"):
+    if LooseVersion(pa.__version__) < LooseVersion("21.0.0"):
         del pyspark.sql.dataframe.DataFrame.mapInArrow.__doc__
 
     spark = (
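For context on the pattern being retargeted here: `_test()` runs this module's doctests, and deleting a callable's `__doc__` removes its examples from what `doctest` collects, so the `mapInArrow` examples are simply skipped on pyarrow builds older than the gate. A self-contained sketch of the same technique, with a hypothetical `needs_new_pyarrow` stand-in:

    import doctest

    def needs_new_pyarrow():
        """Hypothetical doctest that only passes on a new enough pyarrow.

        >>> needs_new_pyarrow()
        'ok'
        """
        return "ok"

    PYARROW_OK = False  # e.g. LooseVersion(pa.__version__) >= LooseVersion("21.0.0")
    if not PYARROW_OK:
        del needs_new_pyarrow.__doc__  # doctest now finds no examples here

    doctest.testmod()  # runs zero examples for the gated function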
2 changes: 1 addition & 1 deletion python/pyspark/sql/connect/dataframe.py
@@ -2376,7 +2376,7 @@ def _test() -> None:
     import pyarrow as pa
     from pyspark.loose_version import LooseVersion
 
-    if LooseVersion(pa.__version__) < LooseVersion("17.0.0"):
+    if LooseVersion(pa.__version__) < LooseVersion("21.0.0"):
         del pyspark.sql.dataframe.DataFrame.mapInArrow.__doc__
 
     globs["spark"] = (
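Same gate as in the classic module, applied to the Spark Connect doctest run. For reference, `DataFrame.mapInArrow` (the API whose docstring is being gated) maps an iterator of `pyarrow.RecordBatch` to another iterator of batches; a hedged usage sketch, assuming a running `SparkSession` named `spark`:

    import pyarrow as pa
    import pyarrow.compute as pc

    def double_ids(batches):
        # Each element is a pyarrow.RecordBatch with the input schema.
        for batch in batches:
            doubled = pc.multiply(batch.column("id"), 2)
            yield pa.RecordBatch.from_arrays([doubled], names=["id"])

    df = spark.range(5)  # single long column "id"
    df.mapInArrow(double_ids, schema="id long").show()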