API: microsecond resolution for Timedelta strings #63196
base: main
Changes from all commits: f656e6b, 887ded1, c2eac39, 85a1745, d6e9464
@@ -147,7 +147,7 @@ def test_len_nan_group():
 
 def test_groupby_timedelta_median():
     # issue 57926
-    expected = Series(data=Timedelta("1D"), index=["foo"])
+    expected = Series(data=Timedelta("1D"), index=["foo"], dtype="m8[ns]")
Member
I was wondering why ns. Is it the other PR that will preserve the unit of the Timedelta object when converting to an array?
Member
Author
Adding the dtype here preserves the current dtype for expected. The other PR will change the dtype for
     df = DataFrame({"label": ["foo", "foo"], "timedelta": [pd.NaT, Timedelta("1D")]})
     gb = df.groupby("label")["timedelta"]
     actual = gb.median()
@@ -440,7 +440,7 @@ def test_td_mul_td64_ndarray_invalid(self):
 
         msg = (
             "ufunc '?multiply'? cannot use operands with types "
-            rf"dtype\('{tm.ENDIAN}m8\[ns\]'\) and dtype\('{tm.ENDIAN}m8\[ns\]'\)"
+            rf"dtype\('{tm.ENDIAN}m8\[us\]'\) and dtype\('{tm.ENDIAN}m8\[us\]'\)"
         )
         with pytest.raises(TypeError, match=msg):
             td * other
@@ -1219,6 +1219,7 @@ def test_ops_str_deprecated(box):
             "ufunc 'divide' cannot use operands",
             "Invalid dtype object for __floordiv__",
             r"unsupported operand type\(s\) for /: 'int' and 'str'",
+            r"unsupported operand type\(s\) for /: 'datetime.timedelta' and 'str'",
Member
Just curious, how is this caused by the changes here?
Member
Author
It was only on the dev builds and when box=True so we are dividing by
         ]
     )
     with pytest.raises(TypeError, match=msg):
@@ -271,12 +271,12 @@ def test_construction():
     expected = np.timedelta64(10, "D").astype("m8[ns]").view("i8")
     assert Timedelta(10, unit="D")._value == expected
     assert Timedelta(10.0, unit="D")._value == expected
-    assert Timedelta("10 days")._value == expected
+    assert Timedelta("10 days")._value == expected // 1000
     assert Timedelta(days=10)._value == expected
Member
Just to keep track: is this another code path that should ideally also be updated (in another PR)?
Member
Author
Not sure what you're suggesting. I'd be open to changing .value to no longer always cast to nanos, but _value is correct here.
Member
Sorry, this was about the
Member
Author
Ahh, I see. I will be starting a branch for that shortly.
     assert Timedelta(days=10.0)._value == expected
 
     expected += np.timedelta64(10, "s").astype("m8[ns]").view("i8")
-    assert Timedelta("10 days 00:00:10")._value == expected
+    assert Timedelta("10 days 00:00:10")._value == expected // 1000
     assert Timedelta(days=10, seconds=10)._value == expected
     assert Timedelta(days=10, milliseconds=10 * 1000)._value == expected
     assert Timedelta(days=10, microseconds=10 * 1000 * 1000)._value == expected
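The `expected // 1000` adjustments in test_construction follow from a unit change: `expected` is computed as a nanosecond count, while the string-parsed Timedelta now stores a microsecond count. A minimal sketch of that arithmetic using only the standard library is shown here; the helper names `as_microseconds` and `as_nanoseconds` are illustrative and not part of pandas.

```python
from datetime import timedelta

NS_PER_US = 1_000  # nanoseconds per microsecond


def as_microseconds(td: timedelta) -> int:
    """Exact integer number of microseconds in td (illustrative helper)."""
    return td // timedelta(microseconds=1)


def as_nanoseconds(td: timedelta) -> int:
    """Exact integer number of nanoseconds in td (illustrative helper)."""
    return as_microseconds(td) * NS_PER_US


# "10 days": the nanosecond count divided by 1000 is the microsecond count.
ten_days = timedelta(days=10)
expected_ns = as_nanoseconds(ten_days)
expected_us = as_microseconds(ten_days)
assert expected_ns // 1000 == expected_us

# "10 days 00:00:10": the same relationship holds after adding ten seconds.
ten_days_ten_s = timedelta(days=10, seconds=10)
assert as_nanoseconds(ten_days_ten_s) // 1000 == as_microseconds(ten_days_ten_s)
```

The keyword-constructed cases in the test still compare against the undivided `expected`, which suggests that only the string-parsing path changes resolution in this PR.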
@@ -434,7 +434,7 @@ def test_td_construction_with_np_dtypes(npdtype, item):
 def test_td_from_repr_roundtrip(val):
     # round-trip both for string and value
     td = Timedelta(val)
-    assert Timedelta(td._value) == td
+    assert Timedelta(td.value) == td
 
     assert Timedelta(str(td)) == td
     assert Timedelta(td._repr_base(format="all")) == td
@@ -443,7 +443,7 @@ def test_td_from_repr_roundtrip(val):
 
 def test_overflow_on_construction():
     # GH#3374
-    value = Timedelta("1day")._value * 20169940
+    value = Timedelta("1day").as_unit("ns")._value * 20169940
     msg = "Cannot cast 1742682816000000000000 from ns to 'ns' without overflow"
     with pytest.raises(OutOfBoundsTimedelta, match=msg):
         Timedelta(value)
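The switch to `.as_unit("ns")` in test_overflow_on_construction can be checked with plain integer arithmetic. The sketch below assumes the stored value is a signed 64-bit integer and that a string-parsed `Timedelta("1day")` now holds a microsecond count; `fits_int64` is an illustrative helper, not a pandas function.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)

NS_PER_DAY = 86_400 * 10**9  # one day as a nanosecond count
US_PER_DAY = 86_400 * 10**6  # one day as a microsecond count
FACTOR = 20169940            # multiplier used in the test


def fits_int64(n: int) -> bool:
    """Return True if n is representable as a signed 64-bit integer."""
    return INT64_MIN <= n <= INT64_MAX


ns_product = NS_PER_DAY * FACTOR
us_product = US_PER_DAY * FACTOR

# The nanosecond-based product matches the number in the expected error
# message and is too large for int64.
assert ns_product == 1742682816000000000000
assert not fits_int64(ns_product)

# The microsecond-based product is 1000 times smaller and fits in int64,
# so it would no longer trigger the overflow the test is checking for.
assert fits_int64(us_product)
```

This is presumably why the test converts to nanoseconds before multiplying: without the conversion the product would stay in range and no OutOfBoundsTimedelta would be raised.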
@@ -705,3 +705,21 @@ def test_non_nano_value():
     # check that the suggested workaround actually works
     result = td.asm8.view("i8")
     assert result == 86400000000
+
+
+def test_parsed_unit():
+    td = Timedelta("1 Day")
+    assert td.unit == "us"
+
+    td = Timedelta("1 Day 2 hours 3 minutes 4 ns")
+    assert td.unit == "ns"
+
+    td = Timedelta("1 Day 2:03:04.012345")
+    assert td.unit == "us"
+
+    td = Timedelta("1 Day 2:03:04.012345000")
+    assert td.unit == "ns"
+
+    # 7 digits after the decimal
+    td = Timedelta("1 Day 2:03:04.0123450")
+    assert td.unit == "ns"
Member
Suggested change
Member
Author
Will update.
I had a hard time understanding what the function was doing on first read, so I think a bit more explanation like the above would help future readers.
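The unit rule that test_parsed_unit exercises can be written out as a small standalone function. This is only a sketch inferred from the five test cases, not the parser in the PR: `guess_parsed_unit` is a hypothetical name, and it only recognizes the literal `ns` suffix and a single fractional-seconds field.

```python
import re


def guess_parsed_unit(text: str) -> str:
    """Guess the resolution a Timedelta string would parse to (hypothetical).

    Rule inferred from test_parsed_unit:
    - an explicit nanosecond component gives "ns";
    - more than six digits after the decimal point gives "ns",
      even when the extra digits are zeros;
    - everything else gives "us".
    """
    # Explicit nanosecond component such as "4 ns".
    if re.search(r"\d\s*ns\b", text):
        return "ns"
    # Fractional seconds with more than microsecond precision.
    frac = re.search(r"\.(\d+)", text)
    if frac and len(frac.group(1)) > 6:
        return "ns"
    return "us"


print(guess_parsed_unit("1 Day"))                  # us
print(guess_parsed_unit("1 Day 2:03:04.0123450"))  # ns
```

Note that the decision depends on the number of digits written, not on their value: `.0123450` and `.012345` denote the same duration but yield different units in the test.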