feat(mosaico_integration): Mosaico demo with 3-robot fleet variant #57
mfaferek93 merged 3 commits into main
bburda left a comment:
Nice demo - clean layout and the fleet scenario genuinely exercises the compound .Q query. Inline comments below: a few real functional bugs (SSE resume, savefig path, silent script timeout), some internal inconsistencies, and a batch of nits.
…as queryable Mosaico sequences
A fault fires on the simulated LiDAR, medkit confirms it and flushes
its 15 s pre-fault + 10 s post-fault ring buffer to an .mcap file.
A small Python bridge picks up the SSE event, downloads the bag from
the gateway REST API, and ingests it into mosaicod over Apache Arrow
Flight using Mosaico's own Python SDK. From docker compose up to a
queryable Sequence in mosaicod takes roughly a minute.
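The SSE pickup step can be sketched as a small parser that tracks the most recent event id, so a reconnect can ask the gateway to resume from it. This is a minimal sketch of the standard SSE framing rules only; the function names and the JSON payload shape in the usage note are assumptions, not the bridge's actual code.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_id, data) pairs from already-decoded SSE lines.

    Hypothetical helper: follows the generic SSE rules (blank line
    dispatches an event, ':' lines are comments, ids persist).
    """
    event_id: Optional[str] = None  # the last seen id sticks until replaced
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_id, "\n".join(data_parts)
            data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment / keepalive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the spec strips one leading space
        if field == "id":
            event_id = value
        elif field == "data":
            data_parts.append(value)


def reconnect_headers(last_event_id: Optional[str]) -> Dict[str, str]:
    """Build request headers for a (re)connect to the stream."""
    headers = {"Accept": "text/event-stream"}
    if last_event_id is not None:
        headers["Last-Event-ID"] = last_event_id
    return headers
```

A caller would remember the id of the last event it fully processed and pass it to `reconnect_headers` before reopening the stream, so an event that was received but not yet ingested is delivered again rather than skipped.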
## Two stacks
- docker-compose.yml: one sensor-demo + one bridge (single-robot).
- docker-compose.fleet.yml: three sensor-demos (warehouse-A, warehouse-B,
outdoor-yard) each with its own bridge, all feeding one mosaicod.
## Fleet scenario exercises both query types
All three robots fire LIDAR_SIM. Robot-02 is rotating
(IMU drift_rate = 0.3 rad/s) during its fault window, so:
- Step 1 (QueryTopic by LaserScan ontology tag) matches 3 of 3.
- Step 2 (QueryOntologyCatalog, 6-axis IMU stationarity AND) matches
2 of 3 - robot-02 is excluded because angular_velocity.z sits at
0.300 rad/s, outside between(-0.1, 0.1).
- Content pull on the two stationary matches shows noise signature
on robot-01 (range_std spike 0.41 to 0.63) vs drift on robot-03
(range_mean 2.3 to 3.5 m, range_std collapses to 0 as all beams
saturate at sensor max).
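The logic behind Step 2 can be illustrated without the Mosaico SDK: a robot counts as stationary only if every IMU axis mean falls inside its band, so a single out-of-band axis excludes it. The sketch below is plain Python, not the `.Q` query itself. Only the `angular_velocity.z` band of (-0.1, 0.1) and robot-02's 0.300 rad/s come from the text above; the other bands, the other sample values, and the inclusive bounds are placeholders.

```python
from typing import Dict, Mapping, Tuple

Band = Tuple[float, float]

# Assumption: only the angular_velocity.z band is taken from the description;
# the remaining five bands are illustrative placeholders.
STATIONARY_BANDS: Dict[str, Band] = {
    "angular_velocity.x": (-0.1, 0.1),
    "angular_velocity.y": (-0.1, 0.1),
    "angular_velocity.z": (-0.1, 0.1),
    "linear_acceleration.x": (-0.5, 0.5),
    "linear_acceleration.y": (-0.5, 0.5),
    "linear_acceleration.z": (9.3, 10.3),  # gravity on the vertical axis
}


def is_stationary(
    imu_means: Mapping[str, float],
    bands: Mapping[str, Band] = STATIONARY_BANDS,
) -> bool:
    """AND across all six axes: every mean must sit inside its band."""
    return all(lo <= imu_means[axis] <= hi for axis, (lo, hi) in bands.items())


def at_rest(**overrides: float) -> Dict[str, float]:
    """Made-up baseline sample for a robot at rest, with optional overrides."""
    sample = {axis: 0.0 for axis in STATIONARY_BANDS}
    sample["linear_acceleration.z"] = 9.81
    sample.update({k.replace("__", "."): v for k, v in overrides.items()})
    return sample


fleet = {
    "robot-01": at_rest(),
    "robot-02": at_rest(angular_velocity__z=0.300),  # rotating during its fault
    "robot-03": at_rest(),
}

stationary = [robot for robot, means in fleet.items() if is_stationary(means)]
print(stationary)
```

With these sample values the filter keeps robot-01 and robot-03 and drops robot-02, mirroring the 2-of-3 result described above.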
## Bridge
- Subscribes SSE at /api/v1/faults/stream and resumes via Last-Event-ID
on reconnect.
- Resolves the SOVD entity that owns each bag by enumerating apps +
components and HEAD-probing /bulk-data/rosbags/{fault_code}.
A follow-up in medkit (ros2_medkit#380) can replace the probe with
an x-medkit SSE extension or per-entity streams.
- FAULT_CODE_ALLOWLIST env var keeps the ingested catalog scoped to
the topic the demo compares (LIDAR_SIM). The fleet compose sets
it so the IMU DRIFTING diagnostic that robot-02 emits alongside
its LIDAR fault does not land as a second Mosaico sequence.
- Verifies the MCAP magic header before calling RosbagInjector so a
race with the rosbag2 finalizer does not mask itself as an ingest
failure. Transport errors (FlightError, OSError) are caught;
programmer errors (AttributeError, KeyError) propagate so SDK drift
surfaces loudly.
- Sequence name includes robot_id so fleet runs cannot collide when
two robots hit the same event_id in the same wall-clock second.
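Three of the bridge behaviours above are small enough to sketch in isolation: the allowlist filter, the MCAP magic check, and a collision-free sequence name. This is an illustrative sketch, not the bridge's code. The comma-separated allowlist format, the helper names, and the exact name layout are assumptions; the magic bytes are the ones defined by the MCAP file format.

```python
import os
from typing import Optional, Set

# MCAP files begin with these 8 bytes (0x89, "MCAP", "0", CR, LF).
MCAP_MAGIC = b"\x89MCAP0\r\n"


def parse_allowlist(raw: Optional[str]) -> Optional[Set[str]]:
    """Assumed format: comma-separated fault codes. None/empty = allow all."""
    if raw is None:
        return None
    codes = {code.strip() for code in raw.split(",") if code.strip()}
    return codes or None


def is_allowed(fault_code: str, allowlist: Optional[Set[str]]) -> bool:
    """True when no allowlist is set or the code is explicitly listed."""
    return allowlist is None or fault_code in allowlist


def has_mcap_magic(path: str) -> bool:
    """Cheap pre-ingest check: does the file start with the MCAP magic?"""
    with open(path, "rb") as fh:
        return fh.read(len(MCAP_MAGIC)) == MCAP_MAGIC


def sequence_name(robot_id: str, fault_code: str, event_id: str) -> str:
    """Hypothetical layout; the point is only that robot_id is part of it."""
    return f"{robot_id}_{fault_code}_{event_id}"


if __name__ == "__main__":
    allowlist = parse_allowlist(os.environ.get("FAULT_CODE_ALLOWLIST"))
    print(is_allowed("LIDAR_SIM", allowlist))
```

A file that fails `has_mcap_magic` is more likely still being written than corrupt, so retrying the download after a short delay is a reasonable response before reporting an ingest failure.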
## Mosaico SDK pin
PR #368 (ROS adapters for the futures ontology) merged on 2026-04-13
as commit b3867be. The subsequent mosaicolabs==0.3.2 PyPI wheel is
missing the futures subpackage from the distributed artifact, so the
bridge Dockerfile installs from the upstream repo at b3867be until
a packaging-fixed release ships. Swap for pip install
mosaicolabs==<fixed_version> when it lands.
## Snapshot contents and storage
Four topics captured: /sensors/scan (LaserScan 10 Hz),
/sensors/imu (Imu 100 Hz), /sensors/fix (NavSatFix 1 Hz), /diagnostics.
/sensors/image_raw (30 Hz raw camera) is intentionally excluded from
snapshot capture - that single topic would dominate the bag at
~27 MB/s; drop in a CompressedImage topic when vision forensics are
needed. Bag size is ~2 MB per 25 s snapshot; 24/7 recording of the
same four topics would be ~6 GB/robot/day, so at ~5 confirmed
faults/robot/day the smart-snapshot catalog stays at ~10 MB/robot/day.
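The storage figures above follow from simple arithmetic, reproduced here with the rounded inputs from the text (2 MB per 25 s bag, 5 faults per robot per day, 27 MB/s for the raw camera). Because the inputs are rounded, the continuous-recording result comes out slightly above the ~6 GB quoted, but in the same range.

```python
# Rounded inputs taken from the description above.
BAG_MB = 2.0            # size of one 25 s snapshot
WINDOW_S = 25.0         # 15 s pre-fault + 10 s post-fault
FAULTS_PER_DAY = 5      # confirmed faults per robot per day
CAMERA_MB_PER_S = 27.0  # raw /sensors/image_raw at 30 Hz
SECONDS_PER_DAY = 24 * 60 * 60

# Snapshot-only catalog growth per robot per day.
snapshot_mb_per_day = BAG_MB * FAULTS_PER_DAY

# Recording the same four topics around the clock at the same data rate.
continuous_mb_per_day = BAG_MB / WINDOW_S * SECONDS_PER_DAY

# What the raw camera topic alone would add to a single 25 s snapshot.
camera_mb_per_snapshot = CAMERA_MB_PER_S * WINDOW_S

print(f"snapshots:  {snapshot_mb_per_day:.0f} MB/robot/day")
print(f"continuous: {continuous_mb_per_day / 1000:.1f} GB/robot/day")
print(f"camera:     {camera_mb_per_snapshot:.0f} MB per snapshot")
```

The camera line shows why the topic is excluded: it would take each snapshot from about 2 MB to several hundred MB.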
## License-safe
mosaicod runs as the unmodified upstream Docker image. The bridge is
a separate Python process speaking the public Apache Arrow Flight
protocol via Mosaico's own SDK. We never link or modify mosaicod or
its Rust crates.
## Verified end-to-end
After ./scripts/trigger-fleet-faults.sh: three LIDAR_SIM sequences
land in mosaicod with distinct robot_id metadata, QueryTopic matches
3, compound IMU .Q returns 2 (robot-02 excluded by measured
angular_velocity.z mean = 0.300 rad/s), noise and drift range
statistics are visible in the pulled LaserScan data. Lint, yaml, and
nbformat validation all pass.
This is awesome!!! 😄🔥
If you are not in a hurry for merging this PR, I'd like to review the code (on the mosaico side) to propose some updates, if any.
Not in a hurry at all, your review is genuinely welcome. Please take all the time you need :) we'd much rather merge something that has your team's eyes on it than ship fast. Happy to iterate on anything you flag, whether it's on the bridge side or how we use the SDK.
fdicorato left a comment:
Great work guys! I've left a few comments inline
- Replace git-clone+pin SDK install with mosaicolabs>=0.4.0 from PyPI.
Release 0.4.0 (2026-04-21) is the first wheel shipping the futures
subpackage correctly; the earlier packaging gap that forced the pin
is resolved upstream. Bridge image drops ~100 MB.
- Store captured_at_seconds (float) alongside captured_at_iso so
the timestamp is range-queryable via
QuerySequence().with_user_metadata("captured_at_seconds", gt=X).
Mosaico metadata filters only support numeric types for range
predicates; the ISO string stays for human-readable listing.
- Update README SDK section and troubleshooting tip accordingly.
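Storing both timestamp forms is straightforward with the standard library. The sketch below shows one way to derive them from a single timezone-aware `datetime`. The two key names come from the commit message; the helper name and the choice to normalise to UTC are assumptions.

```python
from datetime import datetime, timezone
from typing import Dict, Union


def capture_metadata(captured_at: datetime) -> Dict[str, Union[str, float]]:
    """Return the ISO string (for listings) and epoch seconds (for ranges)."""
    if captured_at.tzinfo is None:
        # A naive datetime would be interpreted in local time by timestamp().
        raise ValueError("captured_at must be timezone-aware")
    utc = captured_at.astimezone(timezone.utc)
    return {
        "captured_at_iso": utc.isoformat(),
        "captured_at_seconds": utc.timestamp(),
    }


meta = capture_metadata(datetime(2026, 4, 21, 12, 0, 0, tzinfo=timezone.utc))
print(meta["captured_at_iso"], meta["captured_at_seconds"])
```

Deriving both values from the same `datetime` keeps them consistent, so a range filter on `captured_at_seconds` always selects the same sequences a reader would pick from the ISO listing.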
- Setup cell: replace the stale git-clone + b3867be pin instructions with `pip install 'mosaicolabs>=0.4.0,<0.5' matplotlib pandas`, matching the Dockerfile + README bump in 4012fdf.
- Fleet comparison cell: pick the two sequences that are actually stationary instead of taking `lidar_sequences[0]` and `[1]`. The previous alphabetical selection silently paired robot-01 (noise) with robot-02 (noise under rotation), so the "noise vs drift" comparison promised in the markdown collapsed to noise vs noise. Now we intersect `lidar_sequences` with the stationary result from §6 so the plot compares robot-01 (noise) against robot-03 (drift) as documented.
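The selection fix amounts to intersecting two lists before indexing. A minimal sketch, assuming both results are available as lists of sequence names; `lidar_sequences` is the notebook's variable, while `stationary_sequences`, the helper, and the sample names are hypothetical.

```python
from typing import Iterable, List, Tuple


def pick_stationary_pair(
    lidar_sequences: Iterable[str],
    stationary_sequences: Iterable[str],
) -> Tuple[str, str]:
    """Return the first two LiDAR sequences that also passed the IMU filter."""
    stationary = set(stationary_sequences)
    matches: List[str] = sorted(s for s in lidar_sequences if s in stationary)
    if len(matches) < 2:
        # Fail loudly instead of silently comparing the wrong robots.
        raise ValueError(f"need 2 stationary LiDAR sequences, got {len(matches)}")
    return matches[0], matches[1]


# Hypothetical sequence names for illustration only.
lidar = ["robot-01_LIDAR_SIM", "robot-02_LIDAR_SIM", "robot-03_LIDAR_SIM"]
stationary_hits = ["robot-03_LIDAR_SIM", "robot-01_LIDAR_SIM"]

pair = pick_stationary_pair(lidar, stationary_hits)
print(pair)
```

Raising when fewer than two sequences survive makes a partially ingested fleet visible in the notebook instead of producing a misleading plot.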
Docker Compose demo showing medkit fault snapshots flowing into mosaicod as queryable sequences.
Two variants:
- `docker-compose.yml` - one sensor-demo + bridge. Proves the pipeline: SSE fault event to bag download to Arrow Flight ingest.
- `docker-compose.fleet.yml` - three robots with different fault signatures (LiDAR noise, IMU failure, LiDAR drift) sharing one mosaicod. That is what makes cross-robot `.Q` queries actually interesting.

Bridge is a separate Python process talking Arrow Flight via the `mosaicolabs` SDK. `mosaicod` runs as the unmodified upstream image, no linking or patching.
## Run it
See `README.md` for the single-robot flow and architecture diagram.
## Notes
- `LaserScanAdapter` still pinned to Mosaico PR #368 commit `8e090cd` until it lands in a release.