TanStack · jherr · Jan 20, 2026 · Jan 20, 2026 · Jan 22, 2026 · Feb 19, 2026
diff --git a/.changeset/realtime-chat.md b/.changeset/realtime-chat.md
@@ -0,0 +1,11 @@
+---
+'@tanstack/ai': minor
+'@tanstack/ai-client': minor
+'@tanstack/ai-openai': minor
+'@tanstack/ai-elevenlabs': minor
+'@tanstack/ai-react': minor
+---
+
+feat: add realtime chat support with OpenAI and ElevenLabs adapters
+
+Adds realtime voice/text chat capabilities including a provider-agnostic realtime client, OpenAI Realtime API adapter, ElevenLabs conversational AI adapter, React `useRealtimeChat` hook, and shared realtime types across the core, client, and framework packages.
diff --git a/docs/architecture/realtime-chat.md b/docs/architecture/realtime-chat.md
@@ -0,0 +1,313 @@
+---
+title: Realtime Voice Chat Architecture
+id: realtime-chat-architecture
+---
+
+# Realtime Voice Chat Architecture
+
+This document describes the architecture of TanStack AI's realtime voice-to-voice chat capability, which enables browser-based voice conversations with AI models.
+
+## Overview
+
+The realtime chat system provides a vendor-neutral, type-safe abstraction for voice-to-voice AI interactions. It currently supports:
+
+- **OpenAI Realtime API** - WebRTC-based connection with GPT-4o realtime models
+- **ElevenLabs Conversational AI** - SDK-based connection for voice conversations
+
+## Architecture Layers
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                        React Application                         │
+│  ┌─────────────────────────────────────────────────────────┐    │
+│  │                   useRealtimeChat()                      │    │
+│  │  - Connection state (status, mode)                       │    │
+│  │  - Messages & transcripts                                │    │
+│  │  - Audio visualization (levels, waveforms)               │    │
+│  │  - Control methods (connect, disconnect, interrupt)      │    │
+│  └─────────────────────────────────────────────────────────┘    │
+└─────────────────────────────────────────────────────────────────┘
+                                │
+                                ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                       @tanstack/ai-client                        │
+│  ┌─────────────────────────────────────────────────────────┐    │
+│  │                    RealtimeClient                        │    │
+│  │  - Connection lifecycle management                       │    │
+│  │  - Token refresh scheduling                              │    │
+│  │  - Event subscription & dispatch                         │    │
+│  │  - Tool execution coordination                           │    │
+│  └─────────────────────────────────────────────────────────┘    │
+└─────────────────────────────────────────────────────────────────┘
+                                │
+                                ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                      Provider Adapters                           │
+│  ┌──────────────────────┐    ┌──────────────────────┐          │
+│  │   openaiRealtime()   │    │ elevenlabsRealtime() │          │
+│  │  - WebRTC connection │    │  - SDK wrapper       │          │
+│  │  - Audio I/O         │    │  - Signed URL auth   │          │
+│  │  - Event mapping     │    │  - Event mapping     │          │
+│  └──────────────────────┘    └──────────────────────┘          │
+└─────────────────────────────────────────────────────────────────┘
+                                │
+                                ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                         Server-Side                              │
+│  ┌─────────────────────────────────────────────────────────┐    │
+│  │              Token Generation Endpoint                   │    │
+│  │  - openaiRealtimeToken() - ephemeral client secrets     │    │
+│  │  - elevenlabsRealtimeToken() - signed URLs              │    │
+│  └─────────────────────────────────────────────────────────┘    │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+## Key Components
+
+### 1. Token Adapters (Server-Side)
+
+Token adapters generate short-lived credentials for client-side connections. This keeps API keys secure on the server.
+
+```typescript
+// Server-side token endpoint
+import { realtimeToken } from '@tanstack/ai'
+import { openaiRealtimeToken } from '@tanstack/ai-openai'
+
+const token = await realtimeToken({
+  adapter: openaiRealtimeToken({
+    model: 'gpt-4o-realtime-preview',
+    voice: 'alloy',
+    instructions: 'You are a helpful assistant.',
+    turnDetection: {
+      type: 'server_vad',
+      threshold: 0.5,
+      silence_duration_ms: 500,
+    },
+  }),
+})
+```
+
+**Token Structure:**
+```typescript
+interface RealtimeToken {
+  provider: string        // 'openai' | 'elevenlabs'
+  token: string          // Ephemeral token or signed URL
+  expiresAt: number      // Expiration timestamp (ms)
+  config: RealtimeSessionConfig  // Session configuration
+}
+```
+
+### 2. Client Adapters (Browser-Side)
+
+Client adapters handle the actual connection to provider APIs, managing:
+- WebRTC or WebSocket connections
+- Audio capture and playback
+- Event translation to common format
+- Audio visualization data
+
+```typescript
+// Client-side adapter usage
+import { openaiRealtime } from '@tanstack/ai-openai'
+
+const adapter = openaiRealtime({
+  connectionMode: 'webrtc', // default
+})
+```
+
+### 3. RealtimeClient
+
+The `RealtimeClient` class manages the connection lifecycle:
+
+- **Connection Management**: Connect, disconnect, reconnect
+- **Token Refresh**: Automatically refreshes tokens before expiry
+- **Event Handling**: Subscribes to adapter events and dispatches to callbacks
+- **State Management**: Tracks status, mode, messages, transcripts
+- **Tool Execution**: Coordinates client-side tool calls
+
+### 4. useRealtimeChat Hook
+
+The React hook provides a reactive interface:
+
+```typescript
+const {
+  // Connection state
+  status,      // 'idle' | 'connecting' | 'connected' | 'reconnecting' | 'error'
+  error,
+  connect,
+  disconnect,
+
+  // Conversation state
+  mode,        // 'idle' | 'listening' | 'thinking' | 'speaking'
+  messages,
+  pendingUserTranscript,
+  pendingAssistantTranscript,
+
+  // Voice control
+  startListening,
+  stopListening,
+  interrupt,
+
+  // Audio visualization
+  inputLevel,
+  outputLevel,
+  getInputTimeDomainData,
+  getOutputTimeDomainData,
+} = useRealtimeChat({
+  getToken: () => fetch('/api/realtime-token').then(r => r.json()),
+  adapter: openaiRealtime(),
+})
+```
+
+## Connection Flow
+
+### OpenAI WebRTC Flow
+
+```mermaid
+sequenceDiagram
+    participant Browser
+    participant Server
+    participant OpenAI
+
+    Browser->>Server: POST /api/realtime-token
+    Server->>OpenAI: POST /v1/realtime/sessions
+    OpenAI-->>Server: { client_secret, expires_at }
+    Server-->>Browser: RealtimeToken
+
+    Browser->>Browser: getUserMedia() - request mic
+    Browser->>Browser: Create RTCPeerConnection
+    Browser->>Browser: Add audio track to PC
+    Browser->>Browser: createOffer()
+
+    Browser->>OpenAI: POST /v1/realtime?model=...
+    Note right of Browser: SDP offer + ephemeral token
+    OpenAI-->>Browser: SDP answer
+
+    Browser->>Browser: setRemoteDescription()
+    Note over Browser,OpenAI: WebRTC connection established
+
+    Browser->>OpenAI: Audio via WebRTC
+    OpenAI-->>Browser: Audio + events via WebRTC
+```
+
+### ElevenLabs Flow
+
+```mermaid
+sequenceDiagram
+    participant Browser
+    participant Server
+    participant ElevenLabs
+
+    Browser->>Server: POST /api/realtime-token
+    Server->>ElevenLabs: POST /v1/convai/conversation/get_signed_url
+    ElevenLabs-->>Server: { signed_url }
+    Server-->>Browser: RealtimeToken
+
+    Browser->>ElevenLabs: Conversation.startSession(signedUrl)
+    Note over Browser,ElevenLabs: SDK handles WebSocket/WebRTC
+
+    Browser->>ElevenLabs: Audio via SDK
+    ElevenLabs-->>Browser: Audio + events via SDK
+```
+
+## Audio Visualization
+
+The system provides real-time audio visualization through the `AudioVisualization` interface:
+
+```typescript
+interface AudioVisualization {
+  inputLevel: number           // 0-1 normalized input volume (RMS)
+  outputLevel: number          // 0-1 normalized output volume (RMS)
+  getInputFrequencyData(): Uint8Array   // FFT frequency bins
+  getOutputFrequencyData(): Uint8Array
+  getInputTimeDomainData(): Uint8Array  // Raw waveform samples
+  getOutputTimeDomainData(): Uint8Array
+  inputSampleRate: number
+  outputSampleRate: number
+}
+```
+
+The OpenAI adapter uses Web Audio API `AnalyserNode` for visualization:
+- `fftSize: 2048` for high-resolution analysis
+- RMS (Root Mean Square) calculation for accurate volume levels
+- Separate analysers for input (microphone) and output (AI voice)
+
+## Event System
+
+Adapters emit standardized events:
+
+| Event | Payload | Description |
+|-------|---------|-------------|
+| `status_change` | `{ status }` | Connection status changed |
+| `mode_change` | `{ mode }` | Conversation mode changed |
+| `transcript` | `{ role, transcript, isFinal }` | Speech-to-text update |
+| `message_complete` | `{ message }` | Full message received |
+| `tool_call` | `{ toolCallId, toolName, input }` | Tool invocation requested |
+| `interrupted` | `{ messageId? }` | Response was interrupted |
+| `error` | `{ error }` | Error occurred |
+
+## Current Status
+
+### Implemented Features
+
+- [x] OpenAI Realtime API integration (WebRTC)
+- [x] ElevenLabs Conversational AI integration
+- [x] Token generation and refresh
+- [x] Audio capture and playback
+- [x] Real-time transcription display
+- [x] Audio visualization (levels, waveforms)
+- [x] Interrupt capability
+- [x] React hook (`useRealtimeChat`)
+- [x] Demo application at `/realtime` route
+
+### Known Limitations
+
+- **Device Selection**: Currently uses system default audio devices. Custom device selection not yet implemented.
+- **ElevenLabs SDK**: Using `@11labs/client@0.2.0` which has limited TypeScript support.
+- **Push-to-Talk**: Manual VAD mode implemented but not exposed in demo UI.
+- **Tool Calling**: Framework supports tools but demo doesn't showcase them.
+
+### Demo Application
+
+The `examples/ts-react-chat` application includes a realtime voice chat demo at the `/realtime` route:
+
+**Features:**
+- Provider selection (OpenAI / ElevenLabs)
+- Connection status indicator
+- Conversation mode indicator (Listening/Thinking/Speaking)
+- Message history with transcripts
+- Audio level meters
+- Waveform visualization (debug mode)
+- Interrupt button during AI speech
+
+**Required Environment Variables:**
+```bash
+OPENAI_API_KEY=sk-...
+ELEVENLABS_API_KEY=xi-...      # Optional, for ElevenLabs
+ELEVENLABS_AGENT_ID=...        # Optional, for ElevenLabs
+```
+
+## Files Reference
+
+### Core Types
+- `packages/typescript/ai/src/realtime/types.ts` - Core type definitions
+- `packages/typescript/ai-client/src/realtime-types.ts` - Client-side types
+
+### Token Generation (Server)
+- `packages/typescript/ai/src/realtime/index.ts` - `realtimeToken()` function
+- `packages/typescript/ai-openai/src/realtime/token.ts` - OpenAI token adapter
+- `packages/typescript/ai-elevenlabs/src/realtime/token.ts` - ElevenLabs token adapter
+
+### Client Adapters
+- `packages/typescript/ai-openai/src/realtime/adapter.ts` - OpenAI WebRTC adapter
+- `packages/typescript/ai-elevenlabs/src/realtime/adapter.ts` - ElevenLabs SDK adapter
+
+### Client Library
+- `packages/typescript/ai-client/src/realtime-client.ts` - RealtimeClient class
+
+### React Integration
+- `packages/typescript/ai-react/src/use-realtime-chat.ts` - React hook
+- `packages/typescript/ai-react/src/realtime-types.ts` - Hook types
+
+### Demo Application
+- `examples/ts-react-chat/src/routes/realtime.tsx` - Demo UI component
+- `examples/ts-react-chat/src/routes/api.realtime-token.ts` - Token API endpoint
diff --git a/examples/ts-react-chat/.env.example b/examples/ts-react-chat/.env.example
@@ -1,3 +1,11 @@
 # OpenAI API Key
 # Get yours at: https://platform.openai.com/api-keys
-OPENAI_API_KEY=sk-...
+OPENAI_API_KEY=sk-...
+
+# ElevenLabs API Key (for realtime voice)
+# Get yours at: https://elevenlabs.io/app/settings/api-keys
+ELEVENLABS_API_KEY=xi-...
+
+# ElevenLabs Agent ID (for realtime voice)
+# Create an agent at: https://elevenlabs.io/app/conversational-ai
+ELEVENLABS_AGENT_ID=...
diff --git a/examples/ts-react-chat/package.json b/examples/ts-react-chat/package.json
@@ -13,6 +13,7 @@
     "@tanstack/ai": "workspace:*",
     "@tanstack/ai-anthropic": "workspace:*",
     "@tanstack/ai-client": "workspace:*",
+    "@tanstack/ai-elevenlabs": "workspace:*",
     "@tanstack/ai-gemini": "workspace:*",
     "@tanstack/ai-grok": "workspace:*",
     "@tanstack/ai-ollama": "workspace:*",

diff --git a/examples/ts-react-chat/src/components/Header.tsx b/examples/ts-react-chat/src/components/Header.tsx
@@ -1,7 +1,7 @@
 import { Link } from '@tanstack/react-router'
 
 import { useState } from 'react'
-import { Guitar, Home, Menu, X } from 'lucide-react'
+import { Guitar, Home, Menu, Mic, X } from 'lucide-react'
 
 export default function Header() {
   const [isOpen, setIsOpen] = useState(false)
@@ -71,6 +71,19 @@ export default function Header() {
             <Guitar size={20} />
             <span className="font-medium">Guitar Demo</span>
           </Link>
+
+          <Link
+            to="/realtime"
+            onClick={() => setIsOpen(false)}
+            className="flex items-center gap-3 p-3 rounded-lg hover:bg-gray-800 transition-colors mb-2"
+            activeProps={{
+              className:
+                'flex items-center gap-3 p-3 rounded-lg bg-cyan-600 hover:bg-cyan-700 transition-colors mb-2',
+            }}
+          >
+            <Mic size={20} />
+            <span className="font-medium">Voice Chat (Realtime)</span>
+          </Link>
         </nav>
       </aside>
     </>