B2view plotting and other improvements #663

Merged: FrancescAlted merged 24 commits into main from b2view-plotting on Jun 15, 2026

@FrancescAlted FrancescAlted commented Jun 15, 2026


Main improvements:

  1. b2view — plotting & TUI features (the headline)
  • In-terminal plotting: braille line plots of a selected series (p key), peak-preserving min/max envelopes by default so large series render faithfully without reading everything.
  • Exact streamed envelopes for large series — bounded-span streaming keeps the envelope exact for local objects; only remote c2arrays fall back to strided sampling.
  • Row-range zoom in the plot modal, and v to lock the data grid to the plotted row range.
  • h opens a high-res matplotlib view of the plotted range (new optional hires extra, renamed from plot).
  • SChunk preview as a paged hex dump; enter decodes a skipped CTable cell on demand.
  • Several fixes: row paging re-aligns to the page grid after dim-mode scrolls, status chips re-branded yellow, and the data panel now focuses correctly with --path + --panel data.
  2. CTable — pandas-like display & I/O
  • to_string() renders the full table by default (new max_rows/max_width to truncate) — behaviour change.
  • repr() now shows the same truncated table as str(); compact summary moved to ctable.info.
  • Dimensions footer ([N rows x M columns]) now follows pandas conventions.
  • New display options: display_width, display_rows=-1, plus a blosc2.printoptions(...) context manager.
  • to_csv() with no path returns the CSV as a string (byte-for-byte identical to the file).
  3. Performance & correctness (__getitem__ / indexing)
  • Acceleration paths for NDArray.__getitem__ (large strides) and Column.__getitem__ (when logical == physical positions).
  • Fix: negative-step Column.__getitem__ was returning [].
  • Fix: sidecar-handle cache collision for compact-store columns.
  • Cross-column index pruning enabled in compact CTable queries.
  4. WASM wheels on PyPI (the work from this session)
  • New build_wheels_wasm job in cibuildwheels.yml building cp313 (2025 ABI) + cp314 (2026 ABI) and uploading to PyPI, with cp313 pinned to Pyodide 0.29.3 to dodge the 0.29.4 get_slice regression.
  • cli.py wasm guard, installation docs, and the IS_WASM export.

   Include the sidecar's `path` in `_sidecar_handle_cache_key` so sibling
   columns in a compact (.b2z) store no longer collide on a single cache
   entry.  Previously all columns shared the same `_array_key` (identical
   urlpath), so every column received the handle opened first — returning
   wrong data when the handle was read directly (e.g. plots, b2view).
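A minimal sketch of the collision described above, using plain dictionaries rather than blosc2's real cache. The function names, the key layout, and the `/_indexes/...` sidecar paths are all hypothetical; only the idea (add the sidecar's path to the key) comes from the commit message.

```python
# Hypothetical illustration: none of these names are blosc2 internals.

def cache_key_without_path(urlpath, kind):
    # Old behaviour as described: the key depends only on the store's urlpath,
    # which every column of a compact (.b2z) store shares.
    return (urlpath, kind)


def cache_key_with_path(urlpath, kind, sidecar_path):
    # Fixed behaviour: the sidecar's own path disambiguates sibling columns.
    return (urlpath, kind, sidecar_path)


handles_old = {}
handles_new = {}
for column in ("price", "volume"):
    sidecar = f"/_indexes/{column}/summary"  # assumed layout, for illustration
    # setdefault keeps whichever handle was stored first under a given key.
    handles_old.setdefault(cache_key_without_path("store.b2z", "summary"), column)
    handles_new.setdefault(cache_key_with_path("store.b2z", "summary", sidecar), column)

# Looking up the handle for "volume":
old_hit = handles_old[cache_key_without_path("store.b2z", "summary")]
new_hit = handles_new[
    cache_key_with_path("store.b2z", "summary", "/_indexes/volume/summary")
]
```

With the shorter key, the lookup for `volume` returns the handle opened for `price`; with the path included, each column gets its own entry.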
Compact (.b2z) stores share one urlpath, so the index store held only the
first column's descriptor under "__self__" — multi-column predicates pruned
by one column at best. Injecting all columns (per-column tokens +
array_to_col threading) wasn't enough: the segment merge required
left.base is right.base, so cross-column ANDs failed to merge and fell back
to a full scan (~2x regression).

Replace the base-identity check with a row-grid compatibility check
(_grid_compatible_segment_plans): segments are row-aligned, so columns
sharing level/segment_len/shape/row-count map every segment to the same
rows and their candidate masks can be combined directly. _merge_segment_plans
now intersects (AND) / unions (OR) across columns, with a safe fallback when
grids differ (AND -> most-selective side, OR -> full scan).

Cross-column AND now prunes instead of full-scanning; OR prunes when both
sides are segment-selective. Adds tests for AND/OR pruning, correctness vs
scan, and merge semantics.
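The merge rule can be sketched in a few lines. This is not the code in src/blosc2/indexing.py: `SegmentPlan`, `grid_compatible`, and `merge_plans` are hypothetical stand-ins, the `shape` check is omitted, and "most selective" is assumed to mean "fewest candidate rows".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentPlan:
    level: str          # e.g. "chunk" or "block" (illustrative values)
    segment_len: int    # rows per segment
    nrows: int          # total rows covered
    mask: tuple         # one bool per segment: True = may contain matches


def grid_compatible(a, b):
    # Same grid means segment i covers the same rows in both columns.
    return (a.level, a.segment_len, a.nrows, len(a.mask)) == (
        b.level, b.segment_len, b.nrows, len(b.mask))


def candidate_rows(plan):
    return sum(plan.mask) * plan.segment_len


def merge_plans(a, b, op):
    if grid_compatible(a, b):
        combine = (lambda x, y: x and y) if op == "and" else (lambda x, y: x or y)
        mask = tuple(combine(x, y) for x, y in zip(a.mask, b.mask))
        return SegmentPlan(a.level, a.segment_len, a.nrows, mask)
    if op == "and":
        # Safe fallback: either side alone is a superset of the AND result.
        return min((a, b), key=candidate_rows)
    return None  # OR across different grids: signal a full scan


left = SegmentPlan("chunk", 1000, 4000, (True, True, False, False))
right = SegmentPlan("chunk", 1000, 4000, (False, True, True, False))
and_plan = merge_plans(left, right, "and")
or_plan = merge_plans(left, right, "or")

wide = SegmentPlan("chunk", 1000, 4000, (True, True, True, False))
coarse = SegmentPlan("chunk", 2000, 4000, (True, False))
fallback_and = merge_plans(wide, coarse, "and")
fallback_or = merge_plans(wide, coarse, "or")
```

The fallback is conservative by construction: an AND can never match rows outside either operand, so pruning by one side alone stays correct, while an OR has no such bound.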
plot_series capped the exact min/max envelope at _PLOT_FULL_READ_MAX_BYTES
(~1 GB) and fell back to a strided sample above it, which can step over peaks
the envelope exists to preserve.

For local objects (CTable columns, N-D arrays), stream the series in bounded
spans and accumulate per-bucket min/max instead. Since min/max are associative,
arbitrary span boundaries reproduce the single-read result exactly, so the
envelope stays peak-preserving (method="reduce") at O(span) memory. Remote
c2arrays keep the labeled strided sample to avoid many network round-trips.

Adds _minmax_buckets_streaming / _bucket_geometry / _stream_span (reads aligned
to native chunks, ~_PLOT_STREAM_BUFFER_BYTES each) and model-level tests in
tests/b2view/test_plot_model.py: exactness vs full read, a spike a sample would
miss, all-NaN/int/edge cases, and remote-stays-sample.
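Why span boundaries do not matter can be shown with a small pure-Python sketch. The helper names below are hypothetical (they are not `_minmax_buckets_streaming` itself), bucket assignment by `i * nbuckets // n` is an assumption, and NaN handling and chunk alignment are left out.

```python
import math


def _accumulate(span, offset, n, nbuckets, lo, hi):
    # Fold one span into the running per-bucket min/max. The bucket depends
    # only on the absolute row index, never on where the span starts.
    for i, v in enumerate(span, start=offset):
        b = i * nbuckets // n
        if v < lo[b]:
            lo[b] = v
        if v > hi[b]:
            hi[b] = v


def minmax_buckets_full(values, nbuckets):
    n = len(values)
    lo, hi = [math.inf] * nbuckets, [-math.inf] * nbuckets
    _accumulate(values, 0, n, nbuckets, lo, hi)
    return lo, hi


def minmax_buckets_streaming(read_span, n, nbuckets, span_len):
    # read_span(start, stop) returns rows [start, stop); only one span is
    # held in memory at a time.
    lo, hi = [math.inf] * nbuckets, [-math.inf] * nbuckets
    for start in range(0, n, span_len):
        stop = min(start + span_len, n)
        _accumulate(read_span(start, stop), start, n, nbuckets, lo, hi)
    return lo, hi


series = [math.sin(i / 50) for i in range(10_000)]
series[7777] = 100.0  # a single-row spike

full = minmax_buckets_full(series, 80)
streamed = minmax_buckets_streaming(lambda a, b: series[a:b], len(series), 80, 333)
strided_sample = series[::125]  # 80 points, steps over row 7777
```

Here `streamed` equals `full` even though the span length (333) does not divide the bucket width, and the spike survives in the envelope while the strided sample never sees it.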
The 'p' plot showed only a whole-series min/max envelope, so detail within a
region was bucketed away with no way to drill in.

plot_series gains row_start/row_stop: the whole series still uses the fast
SUMMARY tier, while a sub-range is read exactly (reduce/stream, or a strided
sample only for large remote ranges) with x kept in absolute row coordinates.
PlotScreen now holds a fetch closure + total n and re-queries the envelope on
+/- (zoom about centre), left/right (pan), 0 (reset), and g (type an exact
start:stop via the new PlotRangeScreen). A key-hint line and a '?'-help group
advertise the keys.
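The range arithmetic behind zoom and pan might look like the sketch below. The function names, zoom factors, and clamping policy are assumptions for illustration; the commit message only states that zoom is about the centre and that x stays in absolute row coordinates.

```python
def clamp(start, stop, n):
    # Keep the window inside [0, n) while preserving its width when possible.
    width = min(max(stop - start, 1), n)
    start = min(max(start, 0), n - width)
    return start, start + width


def zoom(start, stop, n, factor):
    # factor < 1 zooms in, factor > 1 zooms out; the centre row stays fixed.
    centre = (start + stop) / 2
    half = max((stop - start) * factor / 2, 0.5)
    return clamp(int(round(centre - half)), int(round(centre + half)), n)


def pan(start, stop, n, fraction):
    # Shift the window by a fraction of its width (negative = left).
    shift = int((stop - start) * fraction)
    return clamp(start + shift, stop + shift, n)


n = 1000
zoomed_in = zoom(0, n, n, 0.5)          # halve the visible range
panned = pan(*zoomed_in, n, 0.5)        # move right by half a window
at_edge = pan(*panned, n, 0.5)          # already at the right edge
zoomed_out = zoom(*zoomed_in, n, 4.0)   # wider than the series: clamps to all rows
```

Each resulting `(start, stop)` pair is what would be passed as `row_start`/`row_stop` when re-querying the envelope.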
b2view: 'v' locks the data grid to the plotted row range

Add a public CTable.slice(start, stop=None, /, *, copy=True): range- or
slice-style bounds in live-row space; copy=False returns a zero-copy view
(via _view_from_positions, like head/tail), copy=True a compact copy (via
take), mirroring NDArray.slice.

In the plot modal, 'v' now locks the data grid in place to the navigated
row range instead of just jumping the cursor. For CTable sources the model
registers a copy=False slice view per-path in _window_views (precedence over
_filter_views, so it composes over an active filter); len(view) bounds paging
for free. The app holds self.row_window, reloads in place via
_enter/_exit_row_window, shows a WINDOW a:b header chip, and gains an esc
unlock layer. NDArray plots still fall back to a cursor jump (follow-up:
window them copy-free via the layout).

In the plot modal, 'v' now locks the data grid in place to the navigated
row range instead of jumping the cursor; esc unlocks. CTable sources use a
copy=False slice view registered per-path in the model's _window_views
(precedence over _filter_views, so it composes over a filter); len(view)
bounds paging for free. NDArray sources are clamped copy-free via the
layout: DataSliceLayout gains a row_window field, preview_array_from_layout
reports nrows = stop-start and offsets reads by start. The app holds
self.row_window, reloads in place, shows a WINDOW a:b header chip, and
gains an esc unlock layer in action_dim_exit.
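For the NDArray case, the copy-free clamp boils down to two adjustments: report a shorter row count and offset every read. A rough sketch, with a hypothetical `Layout` class standing in for `DataSliceLayout` and a plain list standing in for the array:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Layout:
    total_rows: int
    row_window: Optional[Tuple[int, int]] = None  # (start, stop) or None

    @property
    def nrows(self):
        # With a window active, paging is bounded by the window length.
        if self.row_window is None:
            return self.total_rows
        start, stop = self.row_window
        return stop - start

    def read_page(self, data, page_start, page_len):
        # page_start is relative to the window; the read is offset by its start.
        base = self.row_window[0] if self.row_window else 0
        page_stop = min(page_start + page_len, self.nrows)
        return data[base + page_start : base + page_stop]


data = list(range(100))
layout = Layout(total_rows=100, row_window=(40, 55))
window_rows = layout.nrows
last_page = layout.read_page(data, 10, 10)  # second page of a 15-row window

unlocked = Layout(total_rows=100)
first_page = unlocked.read_page(data, 0, 10)
```

No data is copied to build the window; clearing `row_window` restores the full view.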
In the plot modal, 'h' renders the raw values of the currently shown
range as a real image over the braille plot; q/esc/h return with the
zoom intact. The braille envelope stays the fast navigator.

model.read_series reads the exact values for [row_start, row_stop)
(same series selection as plot_series, no bucketing). PlotScreen gains a
raw_fetch closure and action_hires, capped at _HIRES_MAX_POINTS (50k;
above that it asks you to zoom in). HiResPlotScreen renders matplotlib
(Agg) to PNG and shows it via textual-image's auto Image
(kitty/iTerm2/sixel -> half-cells), scaled to fill the dialog; a
focusable VerticalScroll body keeps the screen's keys live, and it
closes with pop_screen.

Add textual-image and matplotlib to the 'plot' extra.
Mirror pandas/polars display conventions:

- CTable.to_string() now renders the whole table by default (all rows and
  columns), like pandas DataFrame.to_string(). New max_rows/max_width params
  truncate on demand. Decouple __str__ from to_string so str/print/repr stay
  truncated per the global options.
- repr(t) now shows the same truncated table as str(t) instead of the
  one-line CTable<...> summary; the summary remains on t.info.
- set_printoptions gains display_width (None=auto terminal, -1=all columns,
  int=fixed budget); display_rows now accepts -1 (all rows).
- Add blosc2.printoptions(...) context manager (set + restore on exit).

Behaviour changes: to_string() returns the full table (was truncated), and
repr() returns the table (was the summary).
Make the path argument optional: with no path (or path=None), nothing is
written and the CSV text is returned as a string, like pandas'
DataFrame.to_csv(). Passing a path still writes the file and returns None;
the returned string is byte-for-byte identical to the file content.

No index column is written (matching polars and CTable's index-less data
model).
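One way to get the "string is byte-for-byte identical to the file" guarantee is to route both outputs through the same writer. The sketch below uses the standard `csv` module on a dict of columns; it is an illustration of the contract, not `CTable.to_csv()` itself, and the `"\n"` line terminator and UTF-8 encoding are assumptions.

```python
import csv
import io
import os
import tempfile


def to_csv(columns, path=None):
    # columns: dict mapping column name -> list of values (no index column).
    def write(f):
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(columns.keys())
        writer.writerows(zip(*columns.values()))

    if path is None:
        buf = io.StringIO()
        write(buf)
        return buf.getvalue()       # nothing is written to disk
    # newline="" stops Python from translating the line terminator.
    with open(path, "w", newline="", encoding="utf-8") as f:
        write(f)
    return None


data = {"a": [1, 2], "b": ["x", "y"]}
text = to_csv(data)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")
    result = to_csv(data, path)
    with open(path, "rb") as f:
        file_bytes = f.read()
```

Because both branches share `write`, the returned text and the file content cannot drift apart.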
Expensive CTable columns (list/struct/object/ndarray) render a
`<...; skipped>` placeholder to keep table navigation responsive.
Pressing Enter on such a cell now decodes just that one cell into a
CellDetailScreen modal (pretty-printed, scrollable, esc/q/enter to
return) — the table stays underneath with its position intact.
The data panel showed a "not implemented" stub for SChunk nodes.
Render them instead as an xxd-style hex dump, modeled as a grid preview
so all the existing row navigation applies unchanged.
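A small sketch of an xxd-style dump produced as rows, which is what lets a grid widget page through it like any other table. `hexdump_rows` is a hypothetical helper; the three-column layout (offset, hex pairs, ASCII) follows xxd's default look, not necessarily b2view's exact output.

```python
def hexdump_rows(data, offset=0, width=16):
    # Returns one (offset, hex, ascii) tuple per `width` bytes of input.
    rows = []
    for i in range(0, len(data), width):
        chunk = data[i : i + width]
        # xxd groups bytes in pairs: "426c 6f73 ..."
        hex_part = " ".join(chunk[j : j + 2].hex() for j in range(0, len(chunk), 2))
        # Non-printable bytes are shown as dots.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append((f"{offset + i:08x}", hex_part, ascii_part))
    return rows


rows = hexdump_rows(b"Blosc2 frame\x00\x01\x02\x03\xff")
for row in rows:
    print("  ".join(row))
```

Paging then only needs the byte range for the visible rows: row `r` starts at byte `r * width`, so a page can be read from the SChunk without touching the rest.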
With the braille 'p' plot now built in (textual-plotext is a core dep),
the extra only covers the high-res 'h' image view, so `plot` was
misleading.  Rename it to `hires`.  Unreleased (4.4.6.dev0), so no
deprecation needed.
The WINDOW / DIM MODE indicators used plain [reverse] (white) reverse
video.  Switch them to the brand accent (yellow) so they match the theme:

- new _accent_chip() helper renders dark-text-on-yellow via theme vars
- WINDOW chips (zoomed/locked header and filter-chip path) use it
- the dim-mode full-line highlight becomes dark-on-yellow; the DIM MODE
  label is an inverted cutout (yellow-on-dark) so it still stands out
Startup focus was applied on a fixed 0.05s timer that raced the node
selection; when selection won, its tree.focus() pulled focus back to the
tree, so `b2view store /path --panel data` left the grid unfocused.

Replace the timer with a one-shot flag applied at the end of
update_panels, once the data panel's display and contents have settled,
so focus lands deterministically on the requested panel (the grid for a
leaf, the preview scroll for a group).

Copilot AI left a comment

Pull request overview

This PR extends the b2view terminal UI with plotting and richer data inspection, while also improving core Blosc2 table/array ergonomics (CTable display/CSV), adding performance fast paths for strided reads, and fixing compact-store indexing correctness/performance (sidecar handle cache + cross-column SUMMARY pruning). It also adds a new cibuildwheel WASM/Pyodide job.

Changes:

  • Add b2view features: zoomable plot modal (p), optional high-res matplotlib view (h via hires extra), row-window locking (v), on-demand decode of skipped CTable cells (Enter), and a paged SChunk hex dump preview.
  • Revise CTable display APIs: to_string() shows full table by default, str/repr become the truncated “interactive” view controlled by print options, add display_width, and add blosc2.printoptions(...) context manager; to_csv() can return CSV text when no path is provided.
  • Improve indexing/query planning and strided slicing performance (NDArray sparse-gather subsample path; Column identity fast path; fix sidecar cache keying + cross-column segment-plan merging).

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| todo/b2view.md | Updates b2view roadmap with newly completed features. |
| tests/test_b2view_model.py | Adds unit coverage for SChunk hex preview, on-demand cell decoding, and plot-series envelope behavior. |
| tests/ndarray/test_getitem.py | Adds tests for NDArray strided-slice sparse-gather dispatch/correctness. |
| tests/ctable/test_vlstring_vlbytes.py | Updates repr expectations after CTable display changes. |
| tests/ctable/test_select_describe_cov.py | Updates to_string footer expectations (pandas-style). |
| tests/ctable/test_getitem_access.py | Adds tests for printoptions, to_string truncation controls, and validation. |
| tests/ctable/test_ctable_take.py | Adds coverage for new CTable.slice() API (copy and zero-copy). |
| tests/ctable/test_ctable_indexing.py | Adds regressions for compact-store SUMMARY handle collisions and cross-column pruning. |
| tests/ctable/test_csv_interop.py | Adds coverage for to_csv() returning CSV text when no path is given. |
| tests/ctable/test_column.py | Updates repr/str semantics tests for CTable. |
| tests/ctable/test_column_slice_fastpath.py | New tests for Column identity-case fast path for strided slicing. |
| tests/b2view/test_plot_model.py | New unit tests for streamed plot envelope exactness and remote sampling behavior. |
| tests/b2view/test_basics.py | Adds Pilot regressions for focus, paging alignment, plotting, row window, cell decode modal, and SChunk hex dump paging. |
| src/blosc2/ndarray.py | Adds _try_subsample_gather fast path and updates slicing docs. |
| src/blosc2/indexing.py | Fixes sidecar handle cache keying (include path) and enables cross-column segment plan merging. |
| src/blosc2/ctable.py | Adds display-width options, printoptions context manager, repr/str behavior changes, to_string controls, CTable.slice, and to_csv return-string option. |
| src/blosc2/ctable_indexing.py | Adjusts index token handling for compact stores and passes column mapping into planning. |
| src/blosc2/b2view/model.py | Adds plot/read-series APIs, row-window support, SChunk hex preview, and cell decode API. |
| src/blosc2/b2view/cli.py | Adds a WASM/Pyodide guard with a clear error message. |
| src/blosc2/b2view/app.py | Implements plotting screens, high-res view, cell detail modal, theming, paging alignment, and row-window behavior. |
| src/blosc2/__init__.py | Exposes printoptions in the public API. |
| RELEASE_NOTES.md | Documents display/to_string/to_csv changes and new print options. |
| pyproject.toml | Adds textual-plotext to core deps; introduces hires extra (textual-image, matplotlib). |
| doc/getting_started/installation.rst | Documents extras (hires, parquet) and install syntax. |
| doc/getting_started/b2view.rst | Documents b2view plot/high-res dependencies and points to extras. |
| .github/workflows/cibuildwheels.yml | Adds WASM/Pyodide wheel build matrix and uploads artifacts. |

Comment thread src/blosc2/b2view/app.py
Comment thread src/blosc2/b2view/model.py
Comment thread src/blosc2/b2view/model.py
@FrancescAlted FrancescAlted merged commit 5390017 into main Jun 15, 2026
17 of 18 checks passed
@FrancescAlted FrancescAlted deleted the b2view-plotting branch June 15, 2026 11:24