fix: convert ISO datetime string columns before VegaFusion pre-transform #110

Draft
sLightlyDev wants to merge 1 commit into main from fix/temporal-string-datetime-conversion

sLightlyDev commented Jun 22, 2026

Problem

When a chart block is saved in Deepnote cloud and re-run in VS Code, the DataFrame's temporal column arrives as object (ISO 8601 string) dtype rather than datetime64.

VegaFusion/DataFusion attempts to parse those strings using the Vega-Lite axis display format (e.g. '%B %d, %Y %H:%M'), not ISO 8601, which raises:

ValueError: DataFusion error: Execution error:
  Error parsing timestamp from '2024-04-17T23:18:06.527738'
  using format '%B %d, %Y %H:%M'

The existing except pa.ArrowNotImplementedError block did not catch ValueError, so it propagated as a raw traceback instead of a friendly ChartError.
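
The format mismatch behind that error can be reproduced with the standard library alone. This is only an analogy: it uses Python's `datetime.strptime` rather than DataFusion's parser, and the helper name `parses_with` is made up for illustration.

```python
from datetime import datetime

ISO_VALUE = "2024-04-17T23:18:06.527738"  # value taken from the traceback above
DISPLAY_FORMAT = "%B %d, %Y %H:%M"         # axis display format from the traceback


def parses_with(value: str, fmt: str) -> bool:
    """Return True if `value` matches the strptime pattern `fmt`."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# The ISO string does not match the display pattern...
display_ok = parses_with(ISO_VALUE, DISPLAY_FORMAT)
# ...but it is a perfectly valid ISO 8601 timestamp.
iso_parsed = datetime.fromisoformat(ISO_VALUE)
```

Here `display_ok` is `False` while `iso_parsed` holds a valid `datetime`, which is the situation the fix below is meant to prevent from ever reaching VegaFusion.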

Fix

spec_utils.py — new get_temporal_fields_from_vega_lite_spec() that walks the Vega-Lite encoding tree (handles top-layer, multi-layer v1, multi-layer v2) and returns only field names declared with "type": "temporal".
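
A rough sketch of what such a walk could look like, for readers who have not opened the diff. The names `temporal_fields` and `_iter_encodings` are hypothetical, the sketch only follows `encoding` and nested `layer` keys, and it does not unescape field names, so it should not be read as the actual implementation in `spec_utils.py`.

```python
from typing import Any, Dict, Iterator, Set


def _iter_encodings(spec: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every `encoding` dict in the spec, descending into nested layers."""
    encoding = spec.get("encoding")
    if isinstance(encoding, dict):
        yield encoding
    layers = spec.get("layer")
    if isinstance(layers, list):
        for layer in layers:
            if isinstance(layer, dict):
                yield from _iter_encodings(layer)


def temporal_fields(spec: Dict[str, Any]) -> Set[str]:
    """Collect field names whose channel is declared with "type": "temporal"."""
    fields: Set[str] = set()
    for encoding in _iter_encodings(spec):
        for channel in encoding.values():
            if not isinstance(channel, dict):
                continue
            field = channel.get("field")
            # Datum-only channels have no "field" key and are skipped here.
            if channel.get("type") == "temporal" and isinstance(field, str):
                fields.add(field)
    return fields


spec = {
    "layer": [
        {"encoding": {"x": {"field": "created_at", "type": "temporal"},
                      "y": {"field": "count", "type": "quantitative"}}},
        {"layer": [{"encoding": {"x": {"field": "year", "type": "nominal"},
                                 "color": {"datum": "A"}}}]},
    ]
}
found = temporal_fields(spec)
```

With this example spec, only `created_at` is reported; the nominal `year` axis and the datum-only `color` channel are ignored.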

utils.py — new _convert_datetime_string_columns() called from sanitize_dataframe_for_chart(). Converts ISO 8601 object columns to datetime64[ns, UTC] only for fields listed as temporal in the spec. This avoids falsely converting nominal string axes (year labels "2019", month codes "2024-01", numeric IDs).
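
The scoping and all-or-nothing behaviour described here can be illustrated without pandas. The sketch below works on plain lists instead of a DataFrame, uses `datetime.fromisoformat` instead of `pd.to_datetime`, and assumes naive timestamps are meant as UTC; `convert_temporal_columns` is a hypothetical name, not the helper added in this PR.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set

Column = List[Optional[object]]


def _to_utc(value: str) -> datetime:
    """Parse an ISO 8601 string and normalise it to UTC."""
    parsed = datetime.fromisoformat(value)  # raises ValueError if not ISO 8601
    if parsed.tzinfo is None:
        return parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


def convert_temporal_columns(
    table: Dict[str, Column], temporal_fields: Optional[Set[str]]
) -> Dict[str, Column]:
    """Convert only the listed columns, and only if every non-null value parses."""
    if not temporal_fields:
        return table  # None or empty set: no-op
    result = dict(table)
    for name in temporal_fields:
        column = table.get(name)
        if column is None:
            continue
        non_null = [v for v in column if v is not None]
        if not non_null or not all(isinstance(v, str) for v in non_null):
            continue  # all-null, or not a string column (e.g. already datetime)
        try:
            converted = [None if v is None else _to_utc(v) for v in column]
        except ValueError:
            continue  # one bad value leaves the whole column untouched
        result[name] = converted
    return result


table = {
    "created_at": ["2024-04-17T23:18:06.527738", None],
    "year": ["2019", "2020"],
    "mixed": ["2024-04-17T23:18:06", "not a date"],
}
out = convert_temporal_columns(table, {"created_at", "mixed"})
```

In this example `created_at` is converted, `year` is left alone because it is not listed as temporal, and `mixed` is left alone because one of its values does not parse.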

deepnote_chart.py — calls get_temporal_fields_from_vega_lite_spec after spec attachment and passes temporal_fields to sanitize_dataframe_for_chart. Broadens the except to (pa.ArrowNotImplementedError, ValueError) as a safety net for edge cases that survive the pre-conversion step.
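
The safety net amounts to wrapping the pre-transform call and re-raising a friendlier error. In the sketch below both exception classes are stand-ins defined locally so that it runs without pyarrow or the toolkit; the real `ChartError` message and the actual call site in `deepnote_chart.py` may differ.

```python
class ChartError(Exception):
    """Stand-in for the toolkit's ChartError; the real class may differ."""


class ArrowNotImplementedError(NotImplementedError):
    """Stand-in for pa.ArrowNotImplementedError so the sketch needs no pyarrow."""


def run_pre_transform(pre_transform, spec):
    """Call `pre_transform` and translate known failures into ChartError."""
    try:
        return pre_transform(spec)
    except (ArrowNotImplementedError, ValueError) as exc:
        # Chain the original exception so the root cause stays inspectable.
        raise ChartError(f"Chart could not be rendered: {exc}") from exc


def failing_pre_transform(spec):
    """Hypothetical callable that fails the way the traceback above does."""
    raise ValueError("Error parsing timestamp from '2024-04-17T23:18:06.527738'")


message = None
cause = None
try:
    run_pre_transform(failing_pre_transform, {})
except ChartError as err:
    message = str(err)
    cause = err.__cause__
```

Using `raise ... from exc` keeps the original `ValueError` attached as `__cause__`, so the friendly message does not hide the underlying parse failure from logs.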

Tests

tests/unit/test_chart.py — two new test classes:

  • TestGetTemporalFields (7 cases): single field, no fields, multiple fields, mixed types, layered specs, datum-only encodings, escaped field names.
  • TestConvertDatetimeStringColumns (8 cases): ISO conversion, nominal strings untouched, None temporal_fields is no-op, already-datetime columns, mixed-valid/invalid strings not converted, all-null column skipped, end-to-end sanitize_dataframe_for_chart path.

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced temporal data handling: charts now automatically detect and convert ISO-8601 datetime strings to proper datetime types for accurate temporal visualizations.
  • Bug Fixes

    • Improved error reporting during chart transformations with more detailed messages to aid troubleshooting.
  • Tests

    • Added comprehensive test coverage for temporal field detection and datetime string conversion.

Charts exported from Deepnote cloud store temporal columns as object
(ISO 8601 string) dtype.  VegaFusion/DataFusion then tries to parse
those strings using the Vega-Lite axis display format (e.g.
'%B %d, %Y %H:%M'), which raises a ValueError and crashes the chart.

Fix by converting only the columns that the Vega-Lite spec declares as
"type": "temporal" to datetime64 before the VegaFusion pre-transform.
Scoping the conversion to temporal fields avoids falsely converting
nominal string axes (year strings, month codes, numeric IDs).

Also broadens the VegaFusion except clause to catch ValueError alongside
the existing ArrowNotImplementedError so any remaining edge cases
surface a friendly ChartError instead of a raw traceback.

coderabbitai Bot (Contributor) commented Jun 22, 2026

📝 Walkthrough

DeepnoteChart.__init__ now calls get_temporal_fields_from_vega_lite_spec on the spec dict and passes the result to sanitize_dataframe_for_chart. That function gains an optional temporal_fields parameter and calls a new _convert_datetime_string_columns helper, which parses ISO-8601 string columns (object dtype, non-null, fully parseable) to UTC-aware datetime64 in-place. The VegaFusion pre_transform_spec error catch is widened to include ValueError, and the ChartError message gains a bullet about unparseable datetime strings.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage: ⚠️ Warning
    Explanation: Docstring coverage is 43.48% which is insufficient. The required threshold is 80.00%.
    Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.
  • Updates Docs: ⚠️ Warning
    Explanation: No documentation was updated to reflect the new chart temporal field handling feature. The PR modifies chart-related code but the docs/ directory contains only configuration guides with no chart-re...
    Resolution: Add documentation for the temporal field conversion feature: either update docs/user or docs/dev with an explanation of the fix and its impact on chart blocks handling datetime columns.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. Title clearly describes the main fix: converting ISO datetime strings before VegaFusion transformation. Directly matches the PR's primary objective.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

@github-actions

📦 Python package built successfully!

  • Version: 2.3.1.dev3+e40627c
  • Wheel: deepnote_toolkit-2.3.1.dev3+e40627c-py3-none-any.whl
  • Install:
    pip install "deepnote-toolkit @ https://deepnote-staging-runtime-artifactory.s3.amazonaws.com/deepnote-toolkit-packages/2.3.1.dev3%2Be40627c/deepnote_toolkit-2.3.1.dev3%2Be40627c-py3-none-any.whl"

codecov Bot commented Jun 22, 2026

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
deepnote_toolkit/chart/utils.py (1)

51-56: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider logging conversion failures for debugging.

While intentionally defensive, silently swallowing all exceptions makes debugging difficult when columns fail to convert. Per coding guidelines, errors should be logged with context.

📋 Proposed enhancement
         try:
             converted = pd.to_datetime(non_null, format="ISO8601", utc=True, errors="coerce")
             if converted.notna().all():
                 pd_df[col] = pd.to_datetime(pd_df[col], format="ISO8601", utc=True, errors="coerce")
-        except Exception:
-            pass
+        except Exception as e:
+            logger.debug(
+                "Skipping datetime conversion for column %r: %s",
+                col,
+                e,
+            )

Add logger import if not present:

import logging
logger = logging.getLogger(__name__)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepnote_toolkit/chart/utils.py` around lines 51 - 56, The except block in
the pd.to_datetime conversion logic is silently swallowing all exceptions
without any logging, which makes debugging difficult when column conversions
fail. Add logging to the exception handler to capture and log the actual error
context. First, ensure the logging module is imported at the top of the file
with a logger instance created (e.g., logger = logging.getLogger(__name__)).
Then, in the except Exception block, replace the pass statement with a
logger.warning or logger.error call that includes the column name (col) and the
actual exception details to provide context for debugging conversion failures.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepnote_toolkit/chart/spec_utils.py`:
- Line 332: The function get_temporal_fields_from_vega_lite_spec has a generic
return type hint of set instead of the more explicit Set[str]. Import Set from
the typing module at the top of the file, then update the return type annotation
of get_temporal_fields_from_vega_lite_spec to use Set[str] instead of set to
follow the coding guidelines for explicit type hints.

In `@deepnote_toolkit/chart/utils.py`:
- Line 52: The pd.to_datetime call with format="ISO8601" is incompatible with
pandas versions below 2.0.0, but the project supports pandas >= 1.2.5, causing
silent failures when the exception is caught by the broad except: pass block.
Either add version-specific logic to check the pandas version and conditionally
use format="ISO8601" only for pandas 2.0.0+, removing the parameter for older
versions, or update the project's minimum pandas version requirement to 2.0.0 or
higher. This applies to both instances of this parameter in the converted and
converted_utc variable assignments.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1e6e1104-dfc9-430b-a529-f7bfdc3ba61a

📥 Commits

Reviewing files that changed from the base of the PR and between 3967dd5 and de0dbab.

📒 Files selected for processing (4)
  • deepnote_toolkit/chart/deepnote_chart.py
  • deepnote_toolkit/chart/spec_utils.py
  • deepnote_toolkit/chart/utils.py
  • tests/unit/test_chart.py

)


def get_temporal_fields_from_vega_lite_spec(vega_lite_spec: Dict[str, Any]) -> set:

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use explicit type hint for return value.

Return type should be Set[str] instead of set per coding guidelines.

📝 Proposed fix
-def get_temporal_fields_from_vega_lite_spec(vega_lite_spec: Dict[str, Any]) -> set:
+def get_temporal_fields_from_vega_lite_spec(vega_lite_spec: Dict[str, Any]) -> Set[str]:

Add Set to imports at top of file:

+from typing import Set

Source: Coding guidelines

        if len(non_null) == 0:
            continue
        try:
            converted = pd.to_datetime(non_null, format="ISO8601", utc=True, errors="coerce")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if minimum pandas version is specified and compatible with ISO8601 format

echo "Checking pandas version constraints..."

# Check pyproject.toml or setup.py for pandas version requirements
fd -t f -e toml -e py -e txt | rg -l "pandas" | head -5 | while read -r file; do
    echo "=== $file ==="
    rg -C2 "pandas" "$file"
done

# Check if there's a minimum pandas version that guarantees format="ISO8601" support (2.0.0+)
echo -e "\n=== Searching for pandas version constraints ==="
rg -n "pandas\s*[><=]" --type toml --type py --type txt

Repository: deepnote/deepnote-toolkit

Length of output: 1602


🏁 Script executed:

cd deepnote_toolkit/chart && wc -l utils.py

Repository: deepnote/deepnote-toolkit

Length of output: 82


🏁 Script executed:

cd deepnote_toolkit/chart && sed -n '45,60p' utils.py

Repository: deepnote/deepnote-toolkit

Length of output: 600


🏁 Script executed:

cd deepnote_toolkit/chart && sed -n '35,75p' utils.py

Repository: deepnote/deepnote-toolkit

Length of output: 1332


🏁 Script executed:

rg -n "pandas|version|ISO8601" deepnote_toolkit/chart/utils.py

Repository: deepnote/deepnote-toolkit

Length of output: 356


Handle pandas < 2.0.0 incompatibility with format="ISO8601".

The format="ISO8601" parameter (lines 52, 54) requires pandas 2.0.0+, but the dependencies allow pandas ≥1.2.5 for Python 3.9–3.11. On older pandas versions, this raises an exception that the broad except: pass silently swallows, preventing temporal field conversion. Add version-specific logic or raise the minimum pandas version requirement.
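
One way the "version-specific logic" option could look is a small helper that builds the keyword arguments from the installed pandas version. The helper name `iso8601_kwargs` is hypothetical and not part of the PR; it takes the version as a string so the sketch itself runs without pandas. Note that on pandas 1.x, dropping `format` means pandas infers the format, which may accept strings that are not strictly ISO 8601.

```python
import re
from typing import Any, Dict


def iso8601_kwargs(pandas_version: str) -> Dict[str, Any]:
    """Build pd.to_datetime keyword arguments suited to the given pandas version."""
    match = re.match(r"(\d+)\.", pandas_version)
    major = int(match.group(1)) if match else 0
    kwargs: Dict[str, Any] = {"utc": True, "errors": "coerce"}
    if major >= 2:
        # format="ISO8601" is only accepted by pandas 2.0.0 and later.
        kwargs["format"] = "ISO8601"
    return kwargs


# Intended call site (assumes `import pandas as pd` in the real module):
#   converted = pd.to_datetime(non_null, **iso8601_kwargs(pd.__version__))
new_kwargs = iso8601_kwargs("2.2.1")
old_kwargs = iso8601_kwargs("1.5.3")
```

The alternative named in the comment, raising the minimum pandas requirement to 2.0.0, avoids this branching entirely at the cost of dropping older environments.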

@deepnote-bot

🚀 Review App Deployment Started

  • 🌍 Review application: ra-110
  • 🔑 Sign-in URL: Click to sign-in
  • 📊 Application logs: View logs
  • 🔄 Actions: Click to redeploy
  • 🚀 ArgoCD deployment: View deployment
  • Last deployed: 2026-06-22 15:27:36 (UTC)
  • 📜 Deployed commit: 12ca7975671d48fb9a36d2173de24aebf361d2e3
  • 🛠️ Toolkit version: e40627c
