Commit 6128d91

Merge branch 'pandas-dev:main' into test/ensure-clean-append
2 parents 0049d66 + 2d73d62 commit 6128d91

35 files changed, +228 -52 lines changed
.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 1 addition & 0 deletions
@@ -3,3 +3,4 @@
 - [ ] All [code checks passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#pre-commit).
 - [ ] Added [type annotations](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#type-hints) to new arguments/methods/functions.
 - [ ] Added an entry in the latest `doc/source/whatsnew/vX.X.X.rst` file if fixing a bug or adding a new feature.
+- [ ] If I used AI to develop this pull request, I prompted it to follow `AGENTS.md`.

AGENTS.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+# pandas Agent Instructions
+
+## Project Overview
+`pandas` is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
+
+## Purpose
+- Assist contributors by suggesting code changes, tests, and documentation edits for the pandas repository while preserving stability and compatibility.
+
+## Persona & Tone
+- Concise, neutral, code-focused. Prioritize correctness, readability, and tests.
+
+## Project Guidelines
+- Be sure to follow all guidelines for contributing to the codebase specified at https://pandas.pydata.org/docs/development/contributing_codebase.html
+- These guidelines are also available in the following local files, which should be loaded into context and adhered to
+  - doc/source/development/contributing_codebase.rst
+  - doc/source/development/contributing_docstring.rst
+  - doc/source/development/contributing_documentation.rst
+  - doc/source/development/contributing.rst
+
+## Decision heuristics
+- Favor small, backward-compatible changes with tests.
+- If a change would be breaking, propose it behind a deprecation path and document the rationale.
+- Prefer readability over micro-optimizations unless benchmarks are requested.
+- Add tests for behavioral changes; update docs only after code change is final.
+
+## Type hints guidance (summary)
+- Prefer PEP 484 style and types in pandas._typing when appropriate.
+- Avoid unnecessary use of typing.cast; prefer refactors that convey types to type-checkers.
+- Use builtin generics (list, dict) when possible.
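The type-hint bullets above can be illustrated with a small sketch. It is not taken from the pandas codebase: `column_widths` and `shout` are hypothetical helpers, chosen only to show builtin generics (`list`, `dict`) in annotations and an `isinstance` check that narrows a type so that `typing.cast` is not needed.

```python
def column_widths(rows: list[dict[str, str]]) -> dict[str, int]:
    """Return the longest value length seen for each key (hypothetical helper)."""
    widths: dict[str, int] = {}
    for row in rows:
        for key, value in row.items():
            widths[key] = max(widths.get(key, 0), len(value))
    return widths


def shout(obj: object) -> str:
    """Upper-case strings; fall back to str() for everything else."""
    # isinstance narrows `obj` to str for the type checker,
    # so no typing.cast call is required on the next line.
    if isinstance(obj, str):
        return obj.upper()
    return str(obj)
```

The narrowing in `shout` is one way to read "prefer refactors that convey types to type-checkers"; other refactors may fit a given call site better.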
+
+## Docstring guidance (summary)
+- Follow NumPy / numpydoc conventions used across the repo: short summary, extended summary, Parameters, Returns/Yields, See Also, Notes, Examples.
+- Ensure examples are deterministic, import numpy/pandas as documented, and pass doctest rules used by docs validation.
+- Preserve formatting rules: triple double-quotes, no blank line before/after docstring, parameter formatting ("name : type, default ..."), types and examples conventions.
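As a rough illustration of the section order listed above, here is a numpydoc-style docstring on a hypothetical function, `clip_to_range`, which does not exist in pandas. It is a sketch of the layout only; the pandas docstring guide remains the authority on details such as which sections are mandatory.

```python
def clip_to_range(value: float, lower: float = 0.0, upper: float = 1.0) -> float:
    """
    Clip a number to a closed interval.

    Values below ``lower`` are raised to ``lower`` and values above
    ``upper`` are lowered to ``upper``.

    Parameters
    ----------
    value : float
        The number to clip.
    lower : float, default 0.0
        Smallest value that can be returned.
    upper : float, default 1.0
        Largest value that can be returned.

    Returns
    -------
    float
        ``value`` limited to the interval ``[lower, upper]``.

    See Also
    --------
    min : Return the smallest of its arguments.
    max : Return the largest of its arguments.

    Examples
    --------
    >>> clip_to_range(1.5)
    1.0
    >>> clip_to_range(-2.0, lower=-1.0)
    -1.0
    """
    return max(lower, min(value, upper))
```

The examples are deterministic and need no imports, so they can be checked with the standard `doctest` module.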
+
+## Pull Requests (summary)
+- Pull request titles should be descriptive and include one of the following prefixes:
+  - ENH: Enhancement, new functionality
+  - BUG: Bug fix
+  - DOC: Additions/updates to documentation
+  - TST: Additions/updates to tests
+  - BLD: Updates to the build process/scripts
+  - PERF: Performance improvement
+  - TYP: Type annotations
+  - CLN: Code cleanup
+- Pull request descriptions should follow the template, and **succinctly** describe the change being made. Usually a few sentences is sufficient.
+- Pull requests which are resolving an existing Github Issue should include a link to the issue in the PR Description.
+- Do not add summaries or additional comments to individual commit messages. The single PR description is sufficient.
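The title-prefix rule above could be checked mechanically. The sketch below is an assumption about how such a check might look, not tooling that exists in the pandas repository: `has_valid_prefix` is a hypothetical helper, and the strict `PREFIX: description` shape (colon, one space, then text) is stricter than the wording in `AGENTS.md`.

```python
import re

# Prefixes copied from the list in AGENTS.md.
PREFIXES = ("ENH", "BUG", "DOC", "TST", "BLD", "PERF", "TYP", "CLN")

# Assumed shape: an upper-case prefix, a colon, one space, then some text.
TITLE_PATTERN = re.compile(r"^(?:%s): \S" % "|".join(PREFIXES))


def has_valid_prefix(title: str) -> bool:
    """Return True if the PR title starts with one of the known prefixes."""
    return TITLE_PATTERN.match(title) is not None


for candidate in ("BUG: fix Categorical repr quoting", "Fix categorical repr"):
    print(candidate, "->", has_valid_prefix(candidate))
```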

doc/source/user_guide/io.rst

Lines changed: 1 addition & 1 deletion
@@ -343,7 +343,7 @@ on_bad_lines : {{'error', 'warn', 'skip'}}, default 'error'
 Specifies what to do upon encountering a bad line (a line with too many fields).
 Allowed values are :

-- 'error', raise an ParserError when a bad line is encountered.
+- 'error', raise a ParserError when a bad line is encountered.
 - 'warn', print a warning when a bad line is encountered and skip that line.
 - 'skip', skip bad lines without raising or warning when they are encountered.
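The three `on_bad_lines` values documented in this hunk can be tried on a small in-memory CSV. This is a minimal sketch that assumes a pandas version supporting `on_bad_lines` (1.3 or later); `CSV_TEXT` and `read_with` are names invented for the example. How the `'warn'` notice is delivered has varied between pandas versions, so the sketch only inspects the returned frame.

```python
import io
import warnings

import pandas as pd

# Two-column CSV whose second data row has one field too many.
CSV_TEXT = "a,b\n1,2\n3,4,5\n6,7\n"


def read_with(policy: str) -> pd.DataFrame:
    """Parse CSV_TEXT with the given on_bad_lines policy."""
    return pd.read_csv(io.StringIO(CSV_TEXT), on_bad_lines=policy)


# 'error' (the default) raises ParserError on the malformed row.
try:
    read_with("error")
    raised = False
except pd.errors.ParserError:
    raised = True

# 'warn' drops the malformed row and reports it.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    warned = read_with("warn")

# 'skip' drops the malformed row silently.
skipped = read_with("skip")

print(raised, len(warned), len(skipped))
```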

doc/source/whatsnew/v3.0.0.rst

Lines changed: 2 additions & 1 deletion
@@ -1032,13 +1032,13 @@ Bug fixes
 Categorical
 ^^^^^^^^^^^
 - Bug in :class:`Categorical` where constructing from a pandas :class:`Series` or :class:`Index` with ``dtype='object'`` did not preserve the categories' dtype as ``object``; now the ``categories.dtype`` is preserved as ``object`` for these cases, while numpy arrays and Python sequences with ``dtype='object'`` continue to infer the most specific dtype (for example, ``str`` if all elements are strings) (:issue:`61778`)
+- Bug in :class:`pandas.Categorical` displaying string categories without quotes when using "string" dtype (:issue:`63045`)
 - Bug in :func:`Series.apply` where ``nan`` was ignored for :class:`CategoricalDtype` (:issue:`59938`)
 - Bug in :func:`bdate_range` raising ``ValueError`` with frequency ``freq="cbh"`` (:issue:`62849`)
 - Bug in :func:`testing.assert_index_equal` raising ``TypeError`` instead of ``AssertionError`` for incomparable ``CategoricalIndex`` when ``check_categorical=True`` and ``exact=False`` (:issue:`61935`)
 - Bug in :meth:`Categorical.astype` where ``copy=False`` would still trigger a copy of the codes (:issue:`62000`)
 - Bug in :meth:`DataFrame.pivot` and :meth:`DataFrame.set_index` raising an ``ArrowNotImplementedError`` for columns with pyarrow dictionary dtype (:issue:`53051`)
 - Bug in :meth:`Series.convert_dtypes` with ``dtype_backend="pyarrow"`` where empty :class:`CategoricalDtype` :class:`Series` raised an error or got converted to ``null[pyarrow]`` (:issue:`59934`)
--

 Datetimelike
 ^^^^^^^^^^^^
@@ -1299,6 +1299,7 @@ ExtensionArray
 - Bug in :class:`Categorical` when constructing with an :class:`Index` with :class:`ArrowDtype` (:issue:`60563`)
 - Bug in :meth:`.arrays.ArrowExtensionArray.__setitem__` which caused wrong behavior when using an integer array with repeated values as a key (:issue:`58530`)
 - Bug in :meth:`ArrowExtensionArray.factorize` where NA values were dropped when input was dictionary-encoded even when dropna was set to False(:issue:`60567`)
+- Bug in :meth:`NDArrayBackedExtensionArray.take` which produced arrays whose dtypes didn't match their underlying data, when called with integer arrays (:issue:`62448`)
 - Bug in :meth:`api.types.is_datetime64_any_dtype` where a custom :class:`ExtensionDtype` would return ``False`` for array-likes (:issue:`57055`)
 - Bug in comparison between object with :class:`ArrowDtype` and incompatible-dtyped (e.g. string vs bool) incorrectly raising instead of returning all-``False`` (for ``==``) or all-``True`` (for ``!=``) (:issue:`59505`)
 - Bug in constructing pandas data structures when passing into ``dtype`` a string of the type followed by ``[pyarrow]`` while PyArrow is not installed would raise ``NameError`` rather than ``ImportError`` (:issue:`57928`)

pandas/_libs/lib.pyx

Lines changed: 2 additions & 2 deletions
@@ -322,7 +322,7 @@ def item_from_zerodim(val: object) -> object:
     >>> item_from_zerodim(np.array([1]))
     array([1])
     """
-    if cnp.PyArray_IsZeroDim(val):
+    if cnp.PyArray_IsZeroDim(val) and cnp.PyArray_CheckExact(val):
         return cnp.PyArray_ToScalar(cnp.PyArray_DATA(val), val)
     return val
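The added `PyArray_CheckExact` condition means that only exact `numpy.ndarray` instances are unboxed, while zero-dimensional instances of ndarray subclasses are returned unchanged. The pure-Python sketch below approximates that logic for illustration. It is not the Cython implementation: `item_from_zerodim_sketch` and `TaggedArray` are hypothetical names, and `val[()]` stands in for `PyArray_ToScalar`.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical ndarray subclass used only for this illustration."""


def item_from_zerodim_sketch(val):
    """Unbox zero-dim arrays, but only when val is exactly an ndarray."""
    is_zero_dim = isinstance(val, np.ndarray) and val.ndim == 0
    is_exact = type(val) is np.ndarray  # rough analogue of PyArray_CheckExact
    if is_zero_dim and is_exact:
        return val[()]  # indexing with an empty tuple yields the scalar
    return val


plain = np.array(1)
tagged = np.array(1).view(TaggedArray)
one_dim = np.array([1])

print(type(item_from_zerodim_sketch(plain)).__name__)
print(type(item_from_zerodim_sketch(tagged)).__name__)
print(type(item_from_zerodim_sketch(one_dim)).__name__)
```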

@@ -2593,7 +2593,7 @@ def maybe_convert_objects(ndarray[object] objects,
         Whether to convert numeric entries.
     convert_to_nullable_dtype : bool, default False
         If an array-like object contains only integer or boolean values (and NaN) is
-        encountered, whether to convert and return an Boolean/IntegerArray.
+        encountered, whether to convert and return a Boolean/IntegerArray.
     convert_non_numeric : bool, default False
         Whether to convert datetime, timedelta, period, interval types.
     dtype_if_all_nat : np.dtype, ExtensionDtype, or None, default None

pandas/_libs/tslibs/fields.pyx

Lines changed: 5 additions & 5 deletions
@@ -146,7 +146,7 @@ def get_date_name_field(
     NPY_DATETIMEUNIT reso=NPY_FR_ns,
 ):
     """
-    Given a int64-based datetime index, return array of strings of date
+    Given an int64-based datetime index, return array of strings of date
     name based on requested field (e.g. day_name)
     """
     cdef:
@@ -335,7 +335,7 @@ def get_date_field(
     NPY_DATETIMEUNIT reso=NPY_FR_ns,
 ):
     """
-    Given a int64-based datetime index, extract the year, month, etc.,
+    Given an int64-based datetime index, extract the year, month, etc.,
     field and return an array of these values.
     """
     cdef:
@@ -502,7 +502,7 @@ def get_timedelta_field(
     NPY_DATETIMEUNIT reso=NPY_FR_ns,
 ):
     """
-    Given a int64-based timedelta index, extract the days, hrs, sec.,
+    Given an int64-based timedelta index, extract the days, hrs, sec.,
     field and return an array of these values.
     """
     cdef:
@@ -555,7 +555,7 @@ def get_timedelta_days(
     NPY_DATETIMEUNIT reso=NPY_FR_ns,
 ):
     """
-    Given a int64-based timedelta index, extract the days,
+    Given an int64-based timedelta index, extract the days,
     field and return an array of these values.
     """
     cdef:
@@ -592,7 +592,7 @@ cpdef isleapyear_arr(ndarray years):
 @cython.boundscheck(False)
 def build_isocalendar_sarray(const int64_t[:] dtindex, NPY_DATETIMEUNIT reso):
     """
-    Given a int64-based datetime array, return the ISO 8601 year, week, and day
+    Given an int64-based datetime array, return the ISO 8601 year, week, and day
     as a structured array.
     """
     cdef:

pandas/_libs/tslibs/offsets.pyx

Lines changed: 1 addition & 1 deletion
@@ -827,7 +827,7 @@ cdef class BaseOffset:
     @property
     def nanos(self):
         """
-        Returns a integer of the total number of nanoseconds for fixed frequencies.
+        Returns an integer of the total number of nanoseconds for fixed frequencies.

         Raises
         ------
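The `nanos` property whose docstring is corrected here can be exercised through the public offsets API. The sketch below assumes that `pd.offsets.Hour` is a fixed frequency and that `pd.offsets.MonthEnd` is not; for the latter, accessing `nanos` is expected to raise `ValueError`, in line with the `Raises` section that begins at the end of this hunk.

```python
import pandas as pd

# A fixed frequency: one hour always spans the same number of nanoseconds.
one_hour = pd.offsets.Hour(1)
hour_nanos = one_hour.nanos

# A non-fixed frequency: months differ in length, so nanos is not defined.
try:
    pd.offsets.MonthEnd(1).nanos
    month_end_is_fixed = True
except ValueError:
    month_end_is_fixed = False

print(hour_nanos, month_end_is_fixed)
```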

pandas/_libs/tslibs/timedeltas.pyx

Lines changed: 1 addition & 1 deletion
@@ -334,7 +334,7 @@ cdef convert_to_timedelta64(object ts, str unit):
     Handle these types of objects:
     - timedelta/Timedelta

-    Return an timedelta64[ns] object
+    Return a timedelta64[ns] object
     """
     # Caller is responsible for checking unit not in ["Y", "y", "M"]
     if isinstance(ts, _Timedelta):

pandas/_libs/tslibs/timestamps.pyx

Lines changed: 1 addition & 1 deletion
@@ -1717,7 +1717,7 @@ cdef class _Timestamp(ABCTimestamp):

     def to_period(self, freq=None):
         """
-        Return an period of which this timestamp is an observation.
+        Return a period of which this timestamp is an observation.

         This method converts the given Timestamp to a Period object,
         which represents a span of time,such as a year, month, etc.,
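A short usage sketch of `Timestamp.to_period`, matching the docstring's description of a timestamp as one observation inside a span of time. The date and the monthly frequency `"M"` are arbitrary choices for illustration.

```python
import pandas as pd

ts = pd.Timestamp("2020-03-14 15:32:52")

# Convert the instant into the calendar month that contains it.
month = ts.to_period(freq="M")

# The timestamp falls between the first and last instants of that month.
inside_span = month.start_time <= ts <= month.end_time

print(month, inside_span)
```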

pandas/core/arrays/boolean.py

Lines changed: 2 additions & 2 deletions
@@ -273,7 +273,7 @@ class BooleanArray(BaseMaskedArray):
     BooleanArray implements Kleene logic (sometimes called three-value
     logic) for logical operations. See :ref:`boolean.kleene` for more.

-    To construct an BooleanArray from generic array-like input, use
+    To construct a BooleanArray from generic array-like input, use
     :func:`pandas.array` specifying ``dtype="boolean"`` (see examples
     below).

@@ -313,7 +313,7 @@ class BooleanArray(BaseMaskedArray):

     Examples
     --------
-    Create an BooleanArray with :func:`pandas.array`:
+    Create a BooleanArray with :func:`pandas.array`:

     >>> pd.array([True, False, None], dtype="boolean")
     <BooleanArray>
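The Kleene logic mentioned in the `BooleanArray` docstring can be seen by combining an array that contains a missing value with scalar booleans. This is a minimal sketch using `pandas.array` as the docstring recommends; the variable names are illustrative.

```python
import pandas as pd

arr = pd.array([True, False, None], dtype="boolean")

# OR with True is True regardless of the other operand, so NA resolves to True.
or_true = arr | True

# AND with False is False regardless of the other operand, so NA resolves to False.
and_false = arr & False

# AND with True depends on the other operand, so the missing value stays NA.
and_true = arr & True

print(or_true.tolist(), and_false.tolist(), pd.isna(and_true[2]))
```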
