Fix tuple-typed function args being exploded in extraction #30
Open
bastianhagedorn wants to merge 1 commit into main
When a decorated function has tuple-typed arguments (e.g. mnkl, tile_shape), pd.Series.explode was applied to all DataFrame columns, unintentionally exploding those arg columns. When two tuple args had different lengths, this caused a ValueError. With matching lengths, data was silently corrupted.

Replace df.apply(pd.Series.explode) with df.explode(["Value", "Metric"]) to only explode the columns that contain per-metric data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator
Author
/ok to test 1950793
acollins3 approved these changes on Mar 6, 2026
Collaborator
lgtm. Just need to fix those lint failures
Fix tuple-typed function args being exploded in extraction
`df.apply(pd.Series.explode)` in `extraction.py` explodes all columns, including tuple-typed function arguments (e.g. `mnkl=(8192,8192,8192,1)`, `tile_shape=(128,256,64)`). When two tuple args have different lengths, this crashes with `ValueError: cannot reindex on an axis with duplicate labels`. With matching lengths, data is silently corrupted.

Fix

Replace `df.apply(pd.Series.explode)` with `df.explode(["Value", "Metric"])` so only the metric data columns are exploded.

Tests

- `test_tuple_typed_function_args`: single metric, two tuple args of different lengths
- `test_tuple_typed_function_args_multiple_metrics`: 3 metrics, tuple args with 4 and 2 elements, verifies exactly 3 rows produced with tuples preserved
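
For reviewers who want to see the behavior in isolation, here is a minimal sketch of both failure modes and of the fix. It does not use the code in `extraction.py`: `make_row`, the metric names `m0`..`m2`, and the 3-element `mnkl` in the matching-length case are hypothetical stand-ins, and the only assumption about the real DataFrame is that `Metric` and `Value` hold equal-length lists per row.

```python
import pandas as pd


def make_row(mnkl, tile_shape):
    """Build one hypothetical extracted row with list-valued metric columns."""
    return pd.DataFrame(
        {
            "mnkl": [mnkl],
            "tile_shape": [tile_shape],
            "Metric": [["m0", "m1", "m2"]],
            "Value": [[1.0, 2.0, 3.0]],
        }
    )


# Matching lengths: every column explodes to 3 rows, so nothing raises,
# but the tuple args are flattened into scalars.
same_len = make_row((8192, 8192, 8192), (128, 256, 64))
corrupted = same_len.apply(pd.Series.explode)

# Different lengths (4 vs. 3): the per-column results cannot be aligned.
diff_len = make_row((8192, 8192, 8192, 1), (128, 256, 64))
old_error = None
try:
    diff_len.apply(pd.Series.explode)
except ValueError as exc:
    old_error = exc

# Exploding only the metric columns leaves the tuple args intact.
fixed = diff_len.explode(["Value", "Metric"])
```

In this sketch `corrupted["mnkl"]` holds three scalars instead of one tuple per row, while `fixed` has one row per metric with the full tuples repeated. Passing a list of columns to `DataFrame.explode` requires pandas 1.3 or newer.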