[FEAT] Evaluate CBOR as the self-describing IR in the type-erased core (alt to serde_json::Value) #156

Description

@lxsaah

Background

AimDB's value proposition is a typed data-plane, yet the core stores records type-erased behind dyn AnyRecord — the concrete T is gone for storage/routing uniformity. The bridge that keeps the edge typed across that erasure is serde_json::Value: a self-describing, structure-preserving stand-in for the lost type.

The mechanism: the codec is captured at config time in with_remote_access (aimdb-core/src/typed_record.rs), the one moment T: RemoteSerialize is still in scope. SerdeJsonCodec (aimdb-core/src/codec.rs:73) is effectively a bottled witness of T's structure that survives into the erased world. See design doc docs/design/032-M16-aimx-json-codec.md.
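The capture step can be illustrated with a minimal, std-only sketch. Everything named here (`Value`, `Describe`, `ErasedRecord`, `with_codec`, `describe_erased`, `Temperature`) is a hypothetical stand-in for illustration, not AimDB's actual API: `Describe` plays the role of `RemoteSerialize`, and `Value` plays the role of `serde_json::Value`.

```rust
use std::any::Any;
use std::collections::BTreeMap;

/// Hypothetical stand-in for a self-describing IR such as `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
    Map(BTreeMap<String, Value>),
}

/// Hypothetical stand-in for `RemoteSerialize`.
trait Describe {
    fn describe(&self) -> Value;
}

/// Monomorphized per `T`; once stored as a plain fn pointer it no longer names `T`.
fn describe_erased<T: Describe + 'static>(any: &dyn Any) -> Option<Value> {
    any.downcast_ref::<T>().map(|t| t.describe())
}

/// Type-erased record: the concrete `T` is gone, the captured codec is not.
struct ErasedRecord {
    data: Box<dyn Any>,
    to_value: fn(&dyn Any) -> Option<Value>,
}

impl ErasedRecord {
    /// The one moment `T` is in scope: the codec is captured here.
    fn with_codec<T: Describe + 'static>(record: T) -> Self {
        ErasedRecord {
            data: Box::new(record),
            to_value: describe_erased::<T>,
        }
    }

    /// Callable from code that has no knowledge of `T`.
    fn latest(&self) -> Option<Value> {
        (self.to_value)(&*self.data)
    }
}

/// Example record type (hypothetical).
struct Temperature {
    sensor: String,
    millidegrees: i64,
}

impl Describe for Temperature {
    fn describe(&self) -> Value {
        let mut m = BTreeMap::new();
        m.insert("sensor".to_string(), Value::Text(self.sensor.clone()));
        m.insert("millidegrees".to_string(), Value::Int(self.millidegrees));
        Value::Map(m)
    }
}
```

A caller holding only an `ErasedRecord` can call `latest()` and receive a tree with field names and value kinds intact, which is the property the IR has to keep regardless of whether it is JSON or CBOR.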

This works because JSON is self-describing: an erased record can still describe itself (field names, nesting, value kinds) to a peer that has no T — the AimX remote-access path, the MCP server, the CLI, the web/WASM dashboard.

Proposal

Evaluate replacing the self-describing IR at the type-erased boundary from serde_json::Value to a binary self-describing format — CBOR via ciborium::Value (RFC 8949), or MessagePack via rmpv::Value.

This is a substitution, not a model change: CBOR and MessagePack are self-describing, and their data models are supersets of JSON's (adding byte strings and non-string map keys), so they preserve "typedness across erasure". The contract is unchanged; only the representation gets cheaper.

Why it's interesting

  • Smaller payloads and faster encode/decode. Notably, a binary format avoids JSON float/int text formatting, a real cost on no_std/MCU targets.
  • Native binary blobs — no base64 tax for byte-valued records.
  • ciborium::Value is a near drop-in for serde_json::Value as the IR — same dynamic tree shape.
  • Keeps every introspection/self-description property JSON has.
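To make the "no base64 tax" point concrete, here is a rough, std-only sketch that hand-encodes a CBOR byte string following the head layout in RFC 8949 and compares its size with the same payload carried as a base64 string in JSON. The function names are hypothetical and this is not the `ciborium` API; it only illustrates the wire-size arithmetic.

```rust
/// Encode a CBOR head (RFC 8949, section 3): the major type sits in the
/// top 3 bits, followed by the argument in its shortest form.
fn cbor_head(major: u8, arg: u64, out: &mut Vec<u8>) {
    let mt = major << 5;
    if arg < 24 {
        // Small arguments are stored directly in the initial byte.
        out.push(mt | arg as u8);
    } else if arg <= u8::MAX as u64 {
        out.push(mt | 24);
        out.push(arg as u8);
    } else if arg <= u16::MAX as u64 {
        out.push(mt | 25);
        out.extend_from_slice(&(arg as u16).to_be_bytes());
    } else if arg <= u32::MAX as u64 {
        out.push(mt | 26);
        out.extend_from_slice(&(arg as u32).to_be_bytes());
    } else {
        out.push(mt | 27);
        out.extend_from_slice(&arg.to_be_bytes());
    }
}

/// CBOR byte string: major type 2, the length, then the raw bytes.
fn cbor_bytes(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 9);
    cbor_head(2, payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    out
}

/// Size of the same payload as a JSON string holding padded base64:
/// 4 characters per 3 input bytes (rounded up), plus two quote characters.
fn json_base64_len(payload_len: usize) -> usize {
    4 * ((payload_len + 2) / 3) + 2
}
```

For a 256-byte payload this gives 259 bytes as a CBOR byte string (3-byte head plus payload) against 346 bytes as a quoted base64 string, before counting the CPU cost of the base64 pass itself.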

Constraints / risks

  • Design risk: low (the self-describing model doesn't change). Surface area: broad. serde_json::Value is the IR across ~13 files / ~100+ sites in aimdb-core (codec.rs; typed_record.rs JsonRecordAccess / latest_json / subscribe_json / set_from_json; buffer/traits.rs JsonBufferReader; session/aimx/dispatch.rs; remote/*).
  • Text-edge transcode: the JSON-RPC/MCP server (tools/aimdb-mcp, JSON-RPC 2.0) and the browser/WASM dashboard want text, so a CBOR↔JSON shim is needed at those edges.
  • Loss of curl/jq/grep debuggability on the wire (mitigated by the transcode at the human-facing edges).
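The text-edge shim mentioned above could look roughly like the following std-only sketch. `Ir` is a hypothetical stand-in for a binary self-describing tree such as `ciborium::Value`, and rendering byte strings as hex is an assumption made for brevity; a real shim would have to pick a convention (base64 is the usual choice) and document it.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a binary self-describing IR.
enum Ir {
    Null,
    Bool(bool),
    Int(i64),
    Text(String),
    Bytes(Vec<u8>),
    Array(Vec<Ir>),
    Map(BTreeMap<String, Ir>),
}

/// Write `s` as a JSON string literal, escaping quotes, backslashes,
/// and control characters.
fn write_json_string(s: &str, out: &mut String) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Transcode the IR tree to compact JSON text for the human-facing edges.
fn to_json_text(v: &Ir, out: &mut String) {
    match v {
        Ir::Null => out.push_str("null"),
        Ir::Bool(b) => out.push_str(if *b { "true" } else { "false" }),
        Ir::Int(i) => out.push_str(&i.to_string()),
        Ir::Text(s) => write_json_string(s, out),
        Ir::Bytes(bytes) => {
            // Assumption: JSON has no byte-string type, so bytes become a hex string.
            out.push('"');
            for b in bytes {
                out.push_str(&format!("{:02x}", b));
            }
            out.push('"');
        }
        Ir::Array(items) => {
            out.push('[');
            for (i, item) in items.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                to_json_text(item, out);
            }
            out.push(']');
        }
        Ir::Map(fields) => {
            out.push('{');
            for (i, (key, val)) in fields.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                write_json_string(key, out);
                out.push(':');
                to_json_text(val, out);
            }
            out.push('}');
        }
    }
}
```

One thing the sketch makes visible: the outbound direction is straightforward, but the inbound direction (JSON text back to the IR) cannot tell a hex or base64 string from ordinary text without extra information, so the shim needs either a convention or the record's captured codec to restore byte-valued fields.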

Relationship to the codec seam

If the pluggable byte-oriented record-codec seam lands first (see related issue), this collapses from "rewrite the IR" to "pick CBOR as the default self-describing codec impl." Recommended sequencing: seam first, then CBOR as a codec choice.

Scope

Investigation + decision capture only. A design document (next number after 036) will follow if/when this is picked up.


⚠️ Distinct from binary formats on the data plane: bincode is non-self-describing and cannot sit at the erased boundary — it would hand the erased core opaque bytes and collapse the typed-edge model. That's the other issue, and it targets type-aware interfaces only.


Related: #155 (the byte-oriented codec seam that enables this as a codec choice).
