
feat(mosaico_integration): Mosaico demo with 3-robot fleet variant#57

Merged
mfaferek93 merged 3 commits into main from feat/mosaico-m0-demo
Apr 24, 2026

Conversation


@mfaferek93 mfaferek93 commented Apr 15, 2026

Docker Compose demo showing medkit fault snapshots flowing into mosaicod as queryable sequences.

Two variants:

  • Single-robot (docker-compose.yml) - one sensor-demo + bridge. Proves the pipeline: SSE fault event to bag download to Arrow Flight ingest.
  • Fleet (docker-compose.fleet.yml) - three robots with different fault signatures (LiDAR noise, IMU failure, LiDAR drift) sharing one mosaicod. That is what makes cross-robot .Q queries actually interesting.

Bridge is a separate Python process talking Arrow Flight via the mosaicolabs SDK. mosaicod runs as the unmodified upstream image, no linking or patching.

Run it

cd demos/mosaico_integration
docker compose -f docker-compose.fleet.yml up -d
# wait ~30s for healthchecks + ring buffer prime
./scripts/trigger-fleet-faults.sh
# three sequences in mosaicod within ~45s

See README.md for the single-robot flow and architecture diagram.

Notes

  • Ring buffer widened to 15s pre + 10s post so snapshots have enough baseline for drift-vs-noise comparison.
  • LaserScanAdapter still pinned to Mosaico PR #368 commit 8e090cd until it lands in a release.

@mfaferek93 force-pushed the feat/mosaico-m0-demo branch 3 times, most recently from 883b019 to 2dbc307 (April 15, 2026 17:09)
@mfaferek93 changed the title from "feat(mosaico_integration): Mosaico M0 demo + 3-robot fleet variant" to "feat(mosaico_integration): Mosaico demo with 3-robot fleet variant" (Apr 15, 2026)
@mfaferek93 force-pushed the feat/mosaico-m0-demo branch 2 times, most recently from fc85c79 to 222d947 (April 15, 2026 19:22)
Contributor

@bburda bburda left a comment


Nice demo - clean layout and the fleet scenario genuinely exercises the compound .Q query. Inline comments below: a few real functional bugs (SSE resume, savefig path, silent script timeout), some internal inconsistencies, and a batch of nits.

Comment thread demos/mosaico_integration/bridge/bridge.py Outdated
Comment thread demos/mosaico_integration/bridge/bridge.py
Comment thread demos/mosaico_integration/bridge/bridge.py
Comment thread demos/mosaico_integration/scripts/trigger-fault.sh
Comment thread demos/mosaico_integration/medkit_overrides/medkit_params.yaml Outdated
Comment thread demos/mosaico_integration/notebooks/mosaico_demo.ipynb Outdated
Comment thread demos/mosaico_integration/notebooks/mosaico_demo.ipynb Outdated
Comment thread demos/mosaico_integration/README.md Outdated
@mfaferek93 mfaferek93 self-assigned this Apr 17, 2026
…as queryable Mosaico sequences

A fault fires on the simulated LiDAR, medkit confirms it and flushes
its 15 s pre-fault + 10 s post-fault ring buffer to an .mcap file.
A small Python bridge picks up the SSE event, downloads the bag from
the gateway REST API, and ingests it into mosaicod over Apache Arrow
Flight using Mosaico's own Python SDK. From docker compose up to a
queryable Sequence in mosaicod takes roughly a minute.
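
The SSE pickup step can be illustrated with a minimal, stdlib-only sketch. This is not the bridge's actual code: `iter_sse_events` and `resume_headers` are hypothetical helper names, and the payload shape in the usage lines is assumed for illustration. It parses the standard `id:`, `event:`, and `data:` fields and keeps the last seen id so a reconnect could send it back in a `Last-Event-ID` header.

```python
from typing import Iterable, Iterator, Optional


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of decoded text lines."""
    event_id: Optional[str] = None  # persists across events, as in the SSE spec
    event_type = "message"
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the event; dispatch only if data was seen.
            if data:
                yield {"id": event_id, "event": event_type, "data": "\n".join(data)}
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


def resume_headers(last_event_id: Optional[str]) -> dict:
    """Request headers for a (re)connect; hypothetical helper."""
    headers = {"Accept": "text/event-stream"}
    if last_event_id is not None:
        headers["Last-Event-ID"] = last_event_id
    return headers


# Usage with a canned stream; the JSON payload shape is an assumption.
stream = ["id: 7", "event: fault", 'data: {"fault_code": "LIDAR_SIM"}', ""]
events = list(iter_sse_events(stream))
reconnect = resume_headers(events[-1]["id"])
```

Parsing from an iterable of lines keeps the sketch independent of any particular HTTP client; a real consumer would feed it the decoded response body line by line.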

## Two stacks

- docker-compose.yml: one sensor-demo + one bridge (single-robot).
- docker-compose.fleet.yml: three sensor-demos (warehouse-A, warehouse-B,
  outdoor-yard) each with its own bridge, all feeding one mosaicod.

## Fleet scenario exercises both query types

All three robots fire LIDAR_SIM. Robot-02 is rotating
(IMU drift_rate = 0.3 rad/s) during its fault window, so:

- Step 1 (QueryTopic by LaserScan ontology tag) matches 3 of 3.
- Step 2 (QueryOntologyCatalog, 6-axis IMU stationarity AND) matches
  2 of 3 - robot-02 is excluded because angular_velocity.z sits at
  0.300 rad/s, outside between(-0.1, 0.1).
- Content pull on the two stationary matches shows noise signature
  on robot-01 (range_std spike 0.41 to 0.63) vs drift on robot-03
  (range_mean 2.3 to 3.5 m, range_std collapses to 0 as all beams
  saturate at sensor max).
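
The AND-of-ranges logic behind Step 2 can be sketched in plain Python. This emulates the predicate only and does not use the Mosaico query API. Only the `angular_velocity.z` bound of (-0.1, 0.1) and robot-02's 0.300 rad/s come from the description above; the other bounds, the at-rest means, and the helper name `is_stationary` are assumptions for illustration.

```python
from typing import Mapping, Tuple

# Per-axis (low, high) bounds. Only angular_velocity.z is taken from the text;
# the remaining five are hypothetical placeholders.
STATIONARY_BOUNDS: Mapping[str, Tuple[float, float]] = {
    "angular_velocity.x": (-0.1, 0.1),
    "angular_velocity.y": (-0.1, 0.1),
    "angular_velocity.z": (-0.1, 0.1),
    "linear_acceleration.x": (-0.5, 0.5),
    "linear_acceleration.y": (-0.5, 0.5),
    "linear_acceleration.z": (9.3, 10.3),  # gravity sits on z for a level robot
}


def is_stationary(imu_means: Mapping[str, float],
                  bounds: Mapping[str, Tuple[float, float]] = STATIONARY_BOUNDS) -> bool:
    """True only if every axis mean lies inside its bound (logical AND)."""
    return all(lo <= imu_means[axis] <= hi for axis, (lo, hi) in bounds.items())


# Hypothetical at-rest means; robot-02 differs only in its measured yaw rate.
REST = {
    "angular_velocity.x": 0.0,
    "angular_velocity.y": 0.0,
    "angular_velocity.z": 0.0,
    "linear_acceleration.x": 0.0,
    "linear_acceleration.y": 0.0,
    "linear_acceleration.z": 9.81,
}
fleet = {
    "robot-01": dict(REST),
    "robot-02": {**REST, "angular_velocity.z": 0.300},
    "robot-03": dict(REST),
}
stationary = sorted(name for name, means in fleet.items() if is_stationary(means))
```

Because the predicate is a conjunction, a single axis out of range is enough to exclude a robot, which is why robot-02 drops out even though its other five axes look at rest.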

## Bridge

- Subscribes SSE at /api/v1/faults/stream and resumes via Last-Event-ID
  on reconnect.
- Resolves the SOVD entity that owns each bag by enumerating apps +
  components and HEAD-probing /bulk-data/rosbags/{fault_code}.
  A follow-up in medkit (ros2_medkit#380) can replace the probe with
  an x-medkit SSE extension or per-entity streams.
- FAULT_CODE_ALLOWLIST env var keeps the ingested catalog scoped to
  the topic the demo compares (LIDAR_SIM). The fleet compose sets
  it so the IMU DRIFTING diagnostic that robot-02 emits alongside
  its LIDAR fault does not land as a second Mosaico sequence.
- Verifies the MCAP magic header before calling RosbagInjector so a
  race with the rosbag2 finalizer does not mask itself as an ingest
  failure. Transport errors (FlightError, OSError) are caught;
  programmer errors (AttributeError, KeyError) propagate so SDK drift
  surfaces loudly.
- Sequence name includes robot_id so fleet runs cannot collide when
  two robots hit the same event_id in the same wall-clock second.
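
A rough sketch of the pre-ingest checks listed above, not the bridge's actual implementation. The 8-byte MCAP magic is defined by the MCAP format; everything else is assumed for illustration: the helper names, a comma-separated `FAULT_CODE_ALLOWLIST` format, an empty allowlist meaning "accept everything", and the exact sequence-name layout.

```python
from pathlib import Path
from typing import FrozenSet, Union

# Magic bytes that open (and close) every MCAP file.
MCAP_MAGIC = b"\x89MCAP0\r\n"


def has_mcap_magic(path: Union[str, Path]) -> bool:
    """True if the file starts with the MCAP magic header."""
    with open(path, "rb") as fh:
        return fh.read(len(MCAP_MAGIC)) == MCAP_MAGIC


def parse_allowlist(raw: str) -> FrozenSet[str]:
    """Parse an allowlist string; comma-separated format is an assumption."""
    return frozenset(code.strip() for code in raw.split(",") if code.strip())


def is_allowed(fault_code: str, allowlist: FrozenSet[str]) -> bool:
    """Assumption: an empty allowlist means no filtering."""
    return not allowlist or fault_code in allowlist


def sequence_name(robot_id: str, fault_code: str, event_id: str) -> str:
    """Hypothetical naming scheme; the point is that robot_id is part of it."""
    return f"{robot_id}_{fault_code}_{event_id}"


# Usage: two robots reporting the same event_id still get distinct names.
allow = parse_allowlist("LIDAR_SIM")
name_a = sequence_name("robot-01", "LIDAR_SIM", "42")
name_b = sequence_name("robot-02", "LIDAR_SIM", "42")
```

Checking the magic before ingest lets a half-written bag be retried or skipped with a clear message instead of surfacing later as an opaque ingest error.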

## Mosaico SDK pin

PR #368 (ROS adapters for the futures ontology) merged on 2026-04-13
as commit b3867be. The subsequent mosaicolabs==0.3.2 PyPI wheel is
missing the futures subpackage from the distributed artifact, so the
bridge Dockerfile installs from the upstream repo at b3867be until
a packaging-fixed release ships. Swap for pip install
mosaicolabs==<fixed_version> when it lands.

## Snapshot contents and storage

Four topics captured: /sensors/scan (LaserScan 10 Hz),
/sensors/imu (Imu 100 Hz), /sensors/fix (NavSatFix 1 Hz), /diagnostics.
/sensors/image_raw (30 Hz raw camera) is intentionally excluded from
snapshot capture - that single topic would dominate the bag at
~27 MB/s; drop in a CompressedImage topic when vision forensics are
needed. Bag size is ~2 MB per 25 s snapshot; 24/7 recording of the
same four topics would be ~6 GB/robot/day, so at ~5 confirmed
faults/robot/day the smart-snapshot catalog stays at ~10 MB/robot/day.
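
The storage figures above can be checked with back-of-the-envelope arithmetic. The inputs below are the rounded numbers from the text; the 640x480 RGB8 camera resolution is an assumption chosen only because it is consistent with the quoted ~27 MB/s. With the rounded ~2 MB per snapshot the continuous rate comes out slightly under 7 GB/day, the same order as the ~6 GB quoted.

```python
# Inputs taken from the text (all approximate).
SNAPSHOT_MB = 2.0                 # ~2 MB per snapshot
SNAPSHOT_SECONDS = 15 + 10        # 15 s pre-fault + 10 s post-fault
FAULTS_PER_DAY = 5                # ~5 confirmed faults per robot per day
SECONDS_PER_DAY = 24 * 60 * 60

# Four-topic data rate implied by one snapshot.
rate_mb_per_s = SNAPSHOT_MB / SNAPSHOT_SECONDS

# Recording the same four topics around the clock.
continuous_gb_per_day = rate_mb_per_s * SECONDS_PER_DAY / 1000

# Snapshot-only storage per robot per day.
snapshot_mb_per_day = SNAPSHOT_MB * FAULTS_PER_DAY

# Raw camera rate under an ASSUMED 640x480, 3 bytes/pixel image at 30 Hz.
camera_mb_per_s = 640 * 480 * 3 * 30 / 1e6

print(f"four topics: {rate_mb_per_s:.2f} MB/s")
print(f"continuous:  {continuous_gb_per_day:.1f} GB/robot/day")
print(f"snapshots:   {snapshot_mb_per_day:.0f} MB/robot/day")
print(f"raw camera:  {camera_mb_per_s:.1f} MB/s")
```

Under these assumptions the raw camera topic alone would carry several hundred times the data rate of the four captured topics combined, which is the reason it is left out of the snapshot.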

## License-safe

mosaicod runs as the unmodified upstream Docker image. The bridge is
a separate Python process speaking the public Apache Arrow Flight
protocol via Mosaico's own SDK. We never link or modify mosaicod or
its Rust crates.

## Verified end-to-end

After ./scripts/trigger-fleet-faults.sh, three LIDAR_SIM sequences
land in mosaicod with distinct robot_id metadata. QueryTopic matches
3, and the compound IMU .Q returns 2 (robot-02 is excluded by its
measured angular_velocity.z mean of 0.300 rad/s). The noise and drift
range statistics are visible in the pulled LaserScan data. Lint, YAML,
and nbformat validation all pass.
@mfaferek93 force-pushed the feat/mosaico-m0-demo branch from 58c1e21 to 0d00085 (April 17, 2026 13:08)
@fdicorato

This is awesome!!! 😄🔥

@fdicorato

If you are not in a hurry for merging this PR, I'd like to review the code (on the mosaico side) to propose some updates, if any.

@mfaferek93
Author

@fdicorato

> If you are not in a hurry for merging this PR, I'd like to review the code (on the mosaico side) to propose some updates, if any.

Not in a hurry at all, your review is genuinely welcome. Please take all the time you need :) we'd much rather merge something that has your team's eyes on it than ship fast. Happy to iterate on anything you flag, whether it's on the bridge side or how we use the SDK.


@fdicorato fdicorato left a comment


Great work guys! I've left a few comments inline

Comment thread demos/mosaico_integration/bridge/bridge.py
Comment thread demos/mosaico_integration/README.md
Comment thread demos/mosaico_integration/README.md Outdated
- Replace git-clone+pin SDK install with mosaicolabs>=0.4.0 from PyPI.
  Release 0.4.0 (2026-04-21) is the first wheel shipping the futures
  subpackage correctly; the earlier packaging gap that forced the pin
  is resolved upstream. Bridge image drops ~100 MB.
- Store captured_at_seconds (float) alongside captured_at_iso so
  the timestamp is range-queryable via
  QuerySequence().with_user_metadata("captured_at_seconds", gt=X).
  Mosaico metadata filters only support numeric types for range
  predicates; the ISO string stays for human-readable listing.
- Update README SDK section and troubleshooting tip accordingly.
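
A small sketch of the dual-timestamp idea, assuming the metadata is a flat dict. The function name `capture_metadata` is hypothetical; only the two key names come from the commit message. It derives both values from one timezone-aware datetime so the numeric field used for range filters and the ISO string used for listings cannot disagree.

```python
from datetime import datetime, timezone


def capture_metadata(captured_at: datetime) -> dict:
    """Build both timestamp fields from a single timezone-aware datetime."""
    if captured_at.tzinfo is None:
        # Naive datetimes would be interpreted in local time by .timestamp().
        raise ValueError("captured_at must be timezone-aware")
    utc = captured_at.astimezone(timezone.utc)
    return {
        "captured_at_iso": utc.isoformat(),      # human-readable listing
        "captured_at_seconds": utc.timestamp(),  # float, usable in gt=/lt= style filters
    }


# Usage with an arbitrary example instant.
meta = capture_metadata(datetime(2026, 4, 24, 13, 37, tzinfo=timezone.utc))
```

Rejecting naive datetimes is a design choice in this sketch: it avoids a silent offset between the two fields when the host is not running in UTC.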
@mfaferek93 mfaferek93 requested a review from bburda April 24, 2026 13:37
Comment thread demos/mosaico_integration/notebooks/mosaico_demo.ipynb Outdated
Comment thread demos/mosaico_integration/notebooks/mosaico_demo.ipynb Outdated
- Setup cell: replace the stale git-clone + b3867be pin instructions
  with `pip install 'mosaicolabs>=0.4.0,<0.5' matplotlib pandas`,
  matching the Dockerfile + README bump in 4012fdf.
- Fleet comparison cell: pick the two sequences that are actually
  stationary instead of taking `lidar_sequences[0]` and `[1]`. The
  previous alphabetical selection silently paired robot-01 (noise)
  with robot-02 (noise under rotation), so the "noise vs drift"
  comparison promised in the markdown collapsed to noise vs noise.
  Now we intersect `lidar_sequences` with the stationary result
  from §6 so the plot compares robot-01 (noise) against robot-03
  (drift) as documented.
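
The selection fix can be sketched as a plain list intersection. This is an illustration rather than the notebook's actual cell: `pick_stationary_pair` and the sequence names are hypothetical, and the inputs are assumed to be lists of sequence names.

```python
from typing import Sequence, Tuple


def pick_stationary_pair(lidar_sequences: Sequence[str],
                         stationary_sequences: Sequence[str]) -> Tuple[str, str]:
    """Return the first two LiDAR sequences that also passed the stationarity query."""
    stationary = set(stationary_sequences)
    # Preserve the order of lidar_sequences so the result is deterministic.
    matches = [name for name in lidar_sequences if name in stationary]
    if len(matches) < 2:
        raise ValueError(f"need two stationary sequences, found {len(matches)}")
    return matches[0], matches[1]


# Hypothetical sequence names in alphabetical order.
lidar_sequences = ["robot-01_LIDAR_SIM", "robot-02_LIDAR_SIM", "robot-03_LIDAR_SIM"]
stationary_sequences = ["robot-03_LIDAR_SIM", "robot-01_LIDAR_SIM"]

naive_pair = tuple(lidar_sequences[:2])  # the old behaviour: robot-01 + robot-02
fixed_pair = pick_stationary_pair(lidar_sequences, stationary_sequences)
```

Raising when fewer than two sequences match makes a broken upstream query visible, rather than letting the plot quietly compare the wrong pair again.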
@mfaferek93 mfaferek93 merged commit ccea072 into main Apr 24, 2026
5 checks passed