From 16a1117785b5dd8116b6530b927291c8c0a7ffe3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9ment=20Drouin?=
Date: Mon, 17 Nov 2025 18:16:23 +0100
Subject: [PATCH] docs(rfd): introduce messageId field

---
 docs/docs.json           |   6 +-
 docs/rfds/message-id.mdx | 334 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 339 insertions(+), 1 deletion(-)
 create mode 100644 docs/rfds/message-id.mdx

diff --git a/docs/docs.json b/docs/docs.json
index 3513b2f..6d18e13 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -91,7 +91,11 @@
   "rfds/about",
   {
     "group": "Draft",
-    "pages": ["rfds/session-list", "rfds/session-config-options"]
+    "pages": [
+      "rfds/session-list",
+      "rfds/session-config-options",
+      "rfds/message-id"
+    ]
   },
   { "group": "Preview", "pages": [] },
   { "group": "Completed", "pages": ["rfds/introduce-rfd-process"] }
diff --git a/docs/rfds/message-id.mdx b/docs/rfds/message-id.mdx
new file mode 100644
index 0000000..ddca073
--- /dev/null
+++ b/docs/rfds/message-id.mdx
@@ -0,0 +1,334 @@
+---
+title: "Message ID"
+---
+
+Author(s): [@michelTho](https://github.com/michelTho), [@nemtecl](https://github.com/nemtecl)
+
+## Elevator pitch
+
+Add a `messageId` field to `agent_message_chunk` and `user_message_chunk` session updates, and to `session/prompt` responses, to uniquely identify individual messages within a conversation. This enables clients to distinguish between different messages beyond changes in update type and lays the groundwork for future capabilities like message editing.
+
+## Status quo
+
+Currently, when an Agent sends message chunks via `session/update` notifications, there is no explicit identifier for the message being streamed:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "method": "session/update",
+  "params": {
+    "sessionId": "sess_abc123def456",
+    "update": {
+      "sessionUpdate": "agent_message_chunk",
+      "content": {
+        "type": "text",
+        "text": "Let me analyze your code..."
+      }
+    }
+  }
+}
+```
+
+This creates several limitations:
+
+1. **Ambiguous message boundaries** - When the Agent sends multiple messages in sequence (e.g., alternating between agent and user messages, or multiple agent messages), Clients can only infer message boundaries by detecting a change in the `sessionUpdate` type. If an Agent sends consecutive messages of the same type, Clients cannot distinguish where one message ends and another begins.
+
+2. **Non-standard workarounds** - Currently, implementations rely on the `_meta` field to work around this limitation. While functional, this approach is not standardized and each implementation may use different conventions.
+
+3. **Limited future capabilities** - Without stable message identifiers, it's difficult to build features like:
+   - Message editing or updates
+   - Message-specific metadata or annotations
+   - Message threading or references
+   - Undo/redo functionality
+
+As an example, consider this sequence where a Client cannot reliably determine message boundaries:
+
+```json
+// First agent message chunk
+{ "sessionUpdate": "agent_message_chunk", "content": { "type": "text", "text": "Analyzing..." } }
+
+// More chunks... but is this still the same message or a new one?
+{ "sessionUpdate": "agent_message_chunk", "content": { "type": "text", "text": "Found issues." } }
+
+// Tool call happens
+{ "sessionUpdate": "tool_call", ... }
+
+// Another agent message - definitely a new message
+{ "sessionUpdate": "agent_message_chunk", "content": { "type": "text", "text": "Fixed the issues." } }
+```
+
+## What we propose to do about it
+
+Add a `messageId` field to `AgentMessageChunk` and `UserMessageChunk` session updates, and to the `session/prompt` response. This field would:
+
+1. **Provide stable message identification** - Each message gets a unique identifier that remains constant across all chunks of that message.
+
+2. **Enable reliable message boundary detection** - Clients can definitively determine when a new message starts by observing a change in `messageId`.
+
+3. **Create an extension point for future features** - Message IDs can be referenced in future protocol enhancements.
+
+### Proposed Structure
+
+When the Client sends a user message via `session/prompt`:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 2,
+  "method": "session/prompt",
+  "params": {
+    "sessionId": "sess_abc123def456",
+    "prompt": [
+      {
+        "type": "text",
+        "text": "Can you analyze this code?"
+      }
+    ]
+  }
+}
+```
+
+The Agent assigns a `messageId` to the user message and returns it in the response:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 2,
+  "result": {
+    "messageId": "msg_user_001",
+    "stopReason": "end_turn"
+  }
+}
+```
+
+For agent message chunks, the Agent includes the `messageId`:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "method": "session/update",
+  "params": {
+    "sessionId": "sess_abc123def456",
+    "update": {
+      "sessionUpdate": "agent_message_chunk",
+      "messageId": "msg_agent_001",
+      "content": {
+        "type": "text",
+        "text": "Let me analyze your code..."
+      }
+    }
+  }
+}
+```
+
+If the Agent sends `user_message_chunk` updates, it uses the user message ID:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "method": "session/update",
+  "params": {
+    "sessionId": "sess_abc123def456",
+    "update": {
+      "sessionUpdate": "user_message_chunk",
+      "messageId": "msg_user_001",
+      "content": {
+        "type": "text",
+        "text": "Can you..."
+      }
+    }
+  }
+}
+```
+
+The `messageId` field would be:
+
+- **Required** on `agent_message_chunk` and `user_message_chunk` updates
+- **Required** in `session/prompt` responses as `messageId`
+- **Unique per message** within a session
+- **Stable across chunks** - all chunks belonging to the same message share the same `messageId`
+- **Opaque** - Clients treat it as an identifier without parsing its structure
+- **Agent-generated** - The Agent generates and manages all message IDs, consistent with how the protocol handles `sessionId`, `terminalId`, and `toolCallId`
+
+## Shiny future
+
+Once this feature exists:
+
+1. **Clear message boundaries** - Clients can reliably render distinct message bubbles in the UI, even when multiple messages of the same type are sent consecutively.
+
+2. **Better streaming UX** - Clients know exactly which message element to append chunks to, enabling smoother visual updates.
+
+3. **Foundation for editing** - With stable message identifiers, future protocol versions could add:
+   - `message/edit` - Agent updates the content of a previously sent message
+   - `message/delete` - Agent removes a message from the conversation
+   - `message/replace` - Agent replaces an entire message with new content
+
+4. **Message metadata** - Future capabilities could reference messages by ID:
+   - Annotations or reactions to specific messages
+   - Citation or cross-reference between messages
+   - Tool calls that reference which message triggered them
+
+5. **Enhanced debugging** - Implementations can trace message flow more easily with explicit IDs in logs and debugging tools.
+
+Example future editing capability:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "method": "session/update",
+  "params": {
+    "sessionId": "sess_abc123def456",
+    "update": {
+      "sessionUpdate": "message_update",
+      "messageId": "msg_abc123",
+      "updateType": "replace",
+      "content": {
+        "type": "text",
+        "text": "Actually, let me correct that analysis..."
+      }
+    }
+  }
+}
+```
+
+## Implementation details and plan
+
+### Phase 1: Core Protocol Changes
+
+1. **Update schema** (`schema/schema.json`):
+   - Add required `messageId` field (type: `string`) to `AgentMessageChunk`
+   - Add required `messageId` field (type: `string`) to `UserMessageChunk`
+   - Add required `messageId` field (type: `string`) to `PromptResponse`
+
+2. **Update Rust SDK** (`rust/client.rs` and `rust/agent.rs`):
+   - Add `message_id: String` field to `ContentChunk` struct
+   - Add `message_id: String` field to `PromptResponse` struct
+   - Update serialization to include `messageId` in JSON output
+
+3. **Update TypeScript SDK** (if applicable):
+   - Add `messageId` field to corresponding types
+
+4. **Update documentation** (`docs/protocol/prompt-turn.mdx`):
+   - Document the `messageId` field and its semantics
+   - Clarify that the Agent generates all message IDs
+   - Show that `messageId` is returned in prompt responses
+   - Add examples showing message boundaries
+   - Explain that `messageId` changes indicate new messages
+
+### Phase 2: Reference Implementation
+
+5. **Update example agents**:
+   - Modify example agents to generate and include `messageId` in chunks
+   - Use simple ID generation (e.g., incrementing counter, UUID)
+   - Demonstrate consistent IDs across chunks of the same message
+
+6. **Update example clients**:
+   - Update clients to consume `messageId` field
+   - Use IDs to properly group chunks into messages
+   - Demonstrate clear message boundary rendering
+
+### Backward Compatibility
+
+Since this adds a **required** field, this would be a **breaking change** and should be part of a major version bump of the protocol. Agents and Clients will need to coordinate upgrades.
+
+Alternatively, the field could initially be made **optional** to allow gradual adoption:
+
+- Agents that support it advertise a capability flag during initialization
+- Clients check for the capability before relying on `messageId`
+- After wide adoption, make it required in a future version
+
+## Frequently asked questions
+
+### What alternative approaches did you consider, and why did you settle on this one?
+
+1. **Continue using `_meta` field** - This is the current workaround but:
+   - Not standardized across implementations
+   - Doesn't signal semantic importance
+   - Easy to overlook or implement inconsistently
+
+2. **Detect message boundaries heuristically** - Clients could infer boundaries from timing, content types, or session state:
+   - Unreliable and fragile
+   - Doesn't work for all scenarios (e.g., consecutive same-type messages)
+   - Creates inconsistent behavior across implementations
+
+3. **Use explicit "message start/end" markers** - Wrap messages with begin/end notifications:
+   - More complex protocol interaction
+   - Requires additional notifications
+   - More state to track on both sides
+
+4. **Client-generated message IDs** - Have the Client generate IDs for user messages:
+   - Inconsistent with protocol patterns (Agent generates `sessionId`, `terminalId`, `toolCallId`)
+   - Adds complexity to Client implementations
+   - Requires coordination on ID namespace to avoid collisions
+   - Agent is better positioned as single source of truth
+
+The proposed approach with `messageId` is:
+
+- **Simple** - Just one new field with clear semantics
+- **Flexible** - Enables future capabilities without further protocol changes
+- **Consistent** - Aligns with how other resources (sessions, terminals, tool calls) are identified in the protocol
+- **Centralized** - Agent as single source of truth for all IDs simplifies uniqueness guarantees
+
+### Who generates message IDs?
+
+The **Agent generates all message IDs**, for both user and agent messages:
+
+- **For user messages**: When the Client sends `session/prompt`, the Agent assigns a message ID and returns it as `messageId` in the response
+- **For agent messages**: The Agent generates the ID when creating its response
+
+This is consistent with how the protocol handles other resource identifiers:
+
+- `sessionId` - generated by Agent in `session/new` response
+- `terminalId` - generated by Agent in `terminal/create` response
+- `toolCallId` - generated by Agent in tool call notifications
+
+Benefits of this approach:
+
+- **Single source of truth** - Agent controls all ID generation
+- **Simpler for Clients** - No ID generation logic needed
+- **Better uniqueness guarantees** - Agent controls the namespace
+- **Protocol consistency** - Matches established patterns
+
+### Should this field be required or optional?
+
+While making it required provides the clearest semantics, it would be a breaking change. The recommendation is to:
+
+1. Make it **optional** initially with a capability flag
+2. Strongly encourage adoption in the documentation
+3. Make it **required** in the next major protocol version
+
+This provides a migration path while moving toward a stronger protocol guarantee.
+
+### How should Agents generate message IDs?
+
+The protocol doesn't mandate a specific format. Agents may use:
+
+- UUIDs (e.g., `msg_550e8400-e29b-41d4-a716-446655440000`)
+- Prefixed sequential IDs (e.g., `msg_1`, `msg_2`, ...)
+- Hash-based IDs
+- Any other unique identifier scheme
+
+Clients **MUST** treat `messageId` as an opaque string and not rely on any particular format or structure.
+
+### What about message IDs across session loads?
+
+When a session is loaded via `session/load`, the Agent may:
+
+- Preserve original message IDs if replaying the conversation history
+- Generate new message IDs if only exposing current state
+
+The protocol doesn't require message IDs to be stable across session loads, though Agents MAY choose to make them stable if their implementation supports it.
+
+### Does this apply to other session updates like tool calls or plan updates?
+
+This RFD specifically addresses `agent_message_chunk` and `user_message_chunk` updates. Other session update types (like `tool_call`, `agent_thought_chunk`, `plan`) already have their own identification mechanisms:
+
+- Tool calls use `toolCallId`
+- Plan entries can be tracked by their position in the `entries` array
+- Agent thoughts could benefit from message IDs if they're considered distinct messages
+
+Future RFDs may propose extending `messageId` to other update types if use cases emerge.
+
+## Revision history
+
+- **2025-11-09**: Initial draft
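
Reviewer note: the boundary-detection rule proposed in this RFD (chunks that share a `messageId` belong to one message, and a new `messageId` starts a new message) might look roughly like the following on the Client side. This is only a sketch under stated assumptions: the `MessageChunk` and `AssembledMessage` types and the `groupChunks` function are hypothetical names invented for illustration, not part of any SDK, and only the `sessionUpdate`, `messageId`, and text `content` fields are taken from the RFD.

```typescript
// Hypothetical shapes for illustration; only the field names
// `sessionUpdate`, `messageId`, and `content` come from the RFD.
type MessageChunk = {
  sessionUpdate: "agent_message_chunk" | "user_message_chunk";
  messageId: string;
  content: { type: "text"; text: string };
};

type AssembledMessage = {
  messageId: string;
  role: "agent" | "user";
  text: string;
};

// Append each chunk to the message that shares its messageId, creating a
// new message the first time an ID is seen. IDs are only compared for
// equality, never parsed, since the RFD treats them as opaque.
function groupChunks(chunks: MessageChunk[]): AssembledMessage[] {
  const byId = new Map<string, AssembledMessage>();
  for (const chunk of chunks) {
    let message = byId.get(chunk.messageId);
    if (message === undefined) {
      message = {
        messageId: chunk.messageId,
        role: chunk.sessionUpdate === "agent_message_chunk" ? "agent" : "user",
        text: "",
      };
      byId.set(chunk.messageId, message);
    }
    message.text += chunk.content.text;
  }
  // A Map iterates in insertion order, so messages are returned in the
  // order their first chunk arrived.
  return Array.from(byId.values());
}

// The ambiguous sequence from "Status quo", with made-up IDs attached.
const messages = groupChunks([
  { sessionUpdate: "agent_message_chunk", messageId: "msg_agent_001", content: { type: "text", text: "Analyzing..." } },
  { sessionUpdate: "agent_message_chunk", messageId: "msg_agent_001", content: { type: "text", text: "Found issues." } },
  { sessionUpdate: "agent_message_chunk", messageId: "msg_agent_002", content: { type: "text", text: "Fixed the issues." } },
]);
console.log(messages.length);
```

Keying on `messageId` rather than on a change of `sessionUpdate` type is what lets the two consecutive agent messages above be told apart. A Client that also needs to interleave tool calls would have to track ordering separately, which this sketch does not attempt.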