fix: convert ISO datetime string columns before VegaFusion pre-transform#110
fix: convert ISO datetime string columns before VegaFusion pre-transform#110sLightlyDev wants to merge 1 commit into
Conversation
Charts exported from Deepnote cloud store temporal columns as object (ISO 8601 string) dtype. VegaFusion/DataFusion then tries to parse those strings using the Vega-Lite axis display format (e.g. '%B %d, %Y %H:%M'), which raises a ValueError and crashes the chart. Fix by converting only the columns that the Vega-Lite spec declares as "type": "temporal" to datetime64 before the VegaFusion pre-transform. Scoping the conversion to temporal fields avoids falsely converting nominal string axes (year strings, month codes, numeric IDs). Also broadens the VegaFusion except clause to catch ValueError alongside the existing ArrowNotImplementedError so any remaining edge cases surface a friendly ChartError instead of a raw traceback.
📝 WalkthroughWalkthrough
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
📦 Python package built successfully!
|
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
deepnote_toolkit/chart/utils.py (1)
51-56: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winConsider logging conversion failures for debugging.
While intentionally defensive, silently swallowing all exceptions makes debugging difficult when columns fail to convert. Per coding guidelines, errors should be logged with context.
📋 Proposed enhancement
try: converted = pd.to_datetime(non_null, format="ISO8601", utc=True, errors="coerce") if converted.notna().all(): pd_df[col] = pd.to_datetime(pd_df[col], format="ISO8601", utc=True, errors="coerce") - except Exception: - pass + except Exception as e: + logger.debug( + "Skipping datetime conversion for column %r: %s", + col, + e, + )Add logger import if not present:
import logging logger = logging.getLogger(__name__)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepnote_toolkit/chart/utils.py` around lines 51 - 56, The except block in the pd.to_datetime conversion logic is silently swallowing all exceptions without any logging, which makes debugging difficult when column conversions fail. Add logging to the exception handler to capture and log the actual error context. First, ensure the logging module is imported at the top of the file with a logger instance created (e.g., logger = logging.getLogger(__name__)). Then, in the except Exception block, replace the pass statement with a logger.warning or logger.error call that includes the column name (col) and the actual exception details to provide context for debugging conversion failures.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepnote_toolkit/chart/spec_utils.py`:
- Line 332: The function get_temporal_fields_from_vega_lite_spec has a generic
return type hint of set instead of the more explicit Set[str]. Import Set from
the typing module at the top of the file, then update the return type annotation
of get_temporal_fields_from_vega_lite_spec to use Set[str] instead of set to
follow the coding guidelines for explicit type hints.
In `@deepnote_toolkit/chart/utils.py`:
- Line 52: The pd.to_datetime call with format="ISO8601" is incompatible with
pandas versions below 2.0.0, but the project supports pandas >= 1.2.5, causing
silent failures when the exception is caught by the broad except: pass block.
Either add version-specific logic to check the pandas version and conditionally
use format="ISO8601" only for pandas 2.0.0+, removing the parameter for older
versions, or update the project's minimum pandas version requirement to 2.0.0 or
higher. This applies to both instances of this parameter in the converted and
converted_utc variable assignments.
---
Nitpick comments:
In `@deepnote_toolkit/chart/utils.py`:
- Around line 51-56: The except block in the pd.to_datetime conversion logic is
silently swallowing all exceptions without any logging, which makes debugging
difficult when column conversions fail. Add logging to the exception handler to
capture and log the actual error context. First, ensure the logging module is
imported at the top of the file with a logger instance created (e.g., logger =
logging.getLogger(__name__)). Then, in the except Exception block, replace the
pass statement with a logger.warning or logger.error call that includes the
column name (col) and the actual exception details to provide context for
debugging conversion failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1e6e1104-dfc9-430b-a529-f7bfdc3ba61a
📒 Files selected for processing (4)
deepnote_toolkit/chart/deepnote_chart.pydeepnote_toolkit/chart/spec_utils.pydeepnote_toolkit/chart/utils.pytests/unit/test_chart.py
| ) | ||
|
|
||
|
|
||
| def get_temporal_fields_from_vega_lite_spec(vega_lite_spec: Dict[str, Any]) -> set: |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Use explicit type hint for return value.
Return type should be Set[str] instead of set per coding guidelines.
📝 Proposed fix
-def get_temporal_fields_from_vega_lite_spec(vega_lite_spec: Dict[str, Any]) -> set:
+def get_temporal_fields_from_vega_lite_spec(vega_lite_spec: Dict[str, Any]) -> Set[str]:Add Set to imports at top of file:
+from typing import Set🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepnote_toolkit/chart/spec_utils.py` at line 332, The function
get_temporal_fields_from_vega_lite_spec has a generic return type hint of set
instead of the more explicit Set[str]. Import Set from the typing module at the
top of the file, then update the return type annotation of
get_temporal_fields_from_vega_lite_spec to use Set[str] instead of set to follow
the coding guidelines for explicit type hints.
Source: Coding guidelines
| if len(non_null) == 0: | ||
| continue | ||
| try: | ||
| converted = pd.to_datetime(non_null, format="ISO8601", utc=True, errors="coerce") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if minimum pandas version is specified and compatible with ISO8601 format
echo "Checking pandas version constraints..."
# Check pyproject.toml or setup.py for pandas version requirements
fd -t f -e toml -e py -e txt | rg -l "pandas" | head -5 | while read -r file; do
echo "=== $file ==="
rg -C2 "pandas" "$file"
done
# Check if there's a minimum pandas version that guarantees format="ISO8601" support (2.0.0+)
echo -e "\n=== Searching for pandas version constraints ==="
rg -n "pandas\s*[><=]" --type toml --type py --type txtRepository: deepnote/deepnote-toolkit
Length of output: 1602
🏁 Script executed:
cd deepnote_toolkit/chart && wc -l utils.pyRepository: deepnote/deepnote-toolkit
Length of output: 82
🏁 Script executed:
cd deepnote_toolkit/chart && sed -n '45,60p' utils.pyRepository: deepnote/deepnote-toolkit
Length of output: 600
🏁 Script executed:
cd deepnote_toolkit/chart && sed -n '35,75p' utils.pyRepository: deepnote/deepnote-toolkit
Length of output: 1332
🏁 Script executed:
rg -n "pandas|version|ISO8601" deepnote_toolkit/chart/utils.pyRepository: deepnote/deepnote-toolkit
Length of output: 356
Handle pandas < 2.0.0 incompatibility with format="ISO8601".
The format="ISO8601" parameter (lines 52, 54) requires pandas 2.0.0+, but the dependencies allow pandas ≥1.2.5 for Python 3.9–3.11. On older pandas versions, this raises an exception that the broad except: pass silently swallows, preventing temporal field conversion. Add version-specific logic or raise the minimum pandas version requirement.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepnote_toolkit/chart/utils.py` at line 52, The pd.to_datetime call with
format="ISO8601" is incompatible with pandas versions below 2.0.0, but the
project supports pandas >= 1.2.5, causing silent failures when the exception is
caught by the broad except: pass block. Either add version-specific logic to
check the pandas version and conditionally use format="ISO8601" only for pandas
2.0.0+, removing the parameter for older versions, or update the project's
minimum pandas version requirement to 2.0.0 or higher. This applies to both
instances of this parameter in the converted and converted_utc variable
assignments.
|
🚀 Review App Deployment Started
|
Problem
When a chart block is saved in Deepnote cloud and re-run in VS Code, the DataFrame's temporal column arrives as
object(ISO 8601 string) dtype rather thandatetime64.VegaFusion/DataFusion attempts to parse those strings using the Vega-Lite axis display format (e.g.
'%B %d, %Y %H:%M'), not ISO 8601, which raises:The existing
except pa.ArrowNotImplementedErrorblock did not catchValueError, so it propagated as a raw traceback instead of a friendlyChartError.Fix
spec_utils.py— newget_temporal_fields_from_vega_lite_spec()that walks the Vega-Lite encoding tree (handles top-layer, multi-layer v1, multi-layer v2) and returns only field names declared with"type": "temporal".utils.py— new_convert_datetime_string_columns()called fromsanitize_dataframe_for_chart(). Converts ISO 8601 object columns todatetime64[ns, UTC]only for fields listed as temporal in the spec. This avoids falsely converting nominal string axes (year labels"2019", month codes"2024-01", numeric IDs).deepnote_chart.py— callsget_temporal_fields_from_vega_lite_specafter spec attachment and passestemporal_fieldstosanitize_dataframe_for_chart. Broadens theexceptto(pa.ArrowNotImplementedError, ValueError)as a safety net for edge cases that survive the pre-conversion step.Tests
tests/unit/test_chart.py— two new test classes:TestGetTemporalFields(7 cases): single field, no fields, multiple fields, mixed types, layered specs, datum-only encodings, escaped field names.TestConvertDatetimeStringColumns(8 cases): ISO conversion, nominal strings untouched,Nonetemporal_fields is no-op, already-datetime columns, mixed-valid/invalid strings not converted, all-null column skipped, end-to-endsanitize_dataframe_for_chartpath.Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests