Skip to content

Conversation

@JoOkuma
Copy link
Member

@JoOkuma JoOkuma commented Jan 20, 2026

Closes #193 and #243

- Add AttrSchema dataclass with defensive copying
- Create process_attr_key_args() helper to eliminate duplication
- Refactor dtype inference and SQLAlchemy mapping to use dictionaries
- Add overloaded method signatures for convenience and schema modes
- Replace _node/edge_attr_keys lists with _attr_schemas dicts
- Add comprehensive test suite (16 tests passing)

Breaking change: dtype parameter now required. Next phase will update
all calling sites throughout the codebase.
…add_edge_attr_key

This is a breaking change that requires dtype (polars.DataType) to be explicitly
specified when adding attribute keys to graphs.

Core changes:
- Add AttrSchema dataclass to store key, dtype, and default_value together
- Add utility functions: infer_default_value_from_dtype, polars_dtype_to_sqlalchemy_type,
  validate_default_value_dtype_compatibility, and process_attr_key_args
- Update BaseGraph abstract methods with overloaded signatures supporting both
  direct parameters and AttrSchema objects
- Replace _node_attr_keys/_edge_attr_keys lists with _node_attr_schemas dicts
  in RustWorkXGraph and SQLGraph
- Remove _boolean_columns tracking in SQLGraph (now uses AttrSchema)
- Fix _polars_schema_override to use == instead of isinstance for pl.Boolean check

Backend implementations:
- RustWorkXGraph: Use process_attr_key_args helper, store AttrSchema objects
- SQLGraph: Use process_attr_key_args helper, pass AttrSchema to _add_new_column
- GraphView: Delegate to root graph with all parameters preserved
- Add dtype inference fallback for complex objects that polars can't parse

Updated all callers:
- Graph methods: from_other(), match()
- Node operators: RandomNodes, RegionPropsNodes, GenericNodesOperator
- Edge operators: DistanceEdges, GenericEdgesOperator
- Solvers: ILPSolver, NearestNeighborsSolver
- IO: numpy_array, ctc
Update all test files to use the new required dtype parameter for
add_node_attr_key and add_edge_attr_key methods.

Key test fixes:
- Add dtype inference logic in test_add_node_attr_key based on value type
- Fix bbox attribute calls to use pl.Array(pl.Int64, size) instead of
  passing numpy arrays as dtype
- Add missing polars imports where needed
- Fix argument order from (key, False, dtype=pl.Boolean) to
  (key, pl.Boolean, default_value=False)
@codecov-commenter
Copy link

codecov-commenter commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 90.86022% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.07%. Comparing base (37a3361) to head (7940fee).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/tracksdata/graph/_sql_graph.py 85.00% 4 Missing and 2 partials ⚠️
src/tracksdata/graph/_graph_view.py 73.33% 2 Missing and 2 partials ⚠️
src/tracksdata/graph/_rustworkx_graph.py 90.24% 3 Missing and 1 partial ⚠️
src/tracksdata/utils/_dtypes.py 96.61% 1 Missing and 1 partial ⚠️
src/tracksdata/nodes/_mask.py 50.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #244      +/-   ##
==========================================
- Coverage   88.27%   88.07%   -0.21%     
==========================================
  Files          56       56              
  Lines        4077     4176      +99     
  Branches      710      729      +19     
==========================================
+ Hits         3599     3678      +79     
- Misses        291      311      +20     
  Partials      187      187              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JoOkuma
Copy link
Member Author

JoOkuma commented Jan 20, 2026

Hi @TeunHuijben and @yfukai,

I would appreciate it if you both could review this PR, at least the public API, since it involves breaking changes.
Before you do that, I need to give it a pass because a big part of it was generated by an LLM.
I'll tag you again once I'm done.

rx_graph = self.rx_graph
for node_id in rx_graph.node_indices():
rx_graph[node_id][key] = default_value
rx_graph[node_id][schema.key] = schema.default_value
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to add the default_value?
This could be solved when returning the attrs value.

# Add to all existing edges
for edge_attr in self.rx_graph.edges():
edge_attr[key] = default_value
edge_attr[schema.key] = schema.default_value
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@TeunHuijben
Copy link
Contributor

Haven't personally tested the code, but the API looks good!

@JoOkuma
Copy link
Member Author

JoOkuma commented Jan 21, 2026

Hi @yfukai , this should be ready for review.
I'm going to open a separate PR that stops setting the attributes of existing nodes when adding a new key, and it will happen when querying node|edge attributes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Explicit dtype for the node and edge attrs?

3 participants