Merged
57 changes: 36 additions & 21 deletions docs/getting-started/advanced-topics/scaling.md
@@ -326,27 +326,42 @@ For the full setup guide, see [OpenTelemetry Monitoring](/reference/monitoring/o

Here's what a production-ready scaled deployment typically looks like:

```
┌─────────────────────────────────────────────────────┐
│                    Load Balancer                    │
│             (Nginx, HAProxy, Cloud LB)              │
└──────────┬──────────┬──────────┬────────────────────┘
           │          │          │
     ┌─────▼──┐ ┌─────▼──┐ ┌────▼───┐
     │ WebUI  │ │ WebUI  │ │ WebUI  │   ← Stateless containers
     │ Pod 1  │ │ Pod 2  │ │ Pod N  │
     └───┬────┘ └───┬────┘ └───┬────┘
         │          │          │
    ┌────▼──────────▼──────────▼────┐
    │          PostgreSQL           │   ← Shared database
    │     (+ PGVector for RAG)      │   ← Vector DB (or other Vector DB)
    └───────────────────────────────┘
    ┌───────────────────────────────┐
    │             Redis             │   ← Shared state & websockets
    └───────────────────────────────┘
    ┌───────────────────────────────┐
    │  Shared Storage (NFS or S3)   │   ← Shared file storage
    └───────────────────────────────┘
```

```mermaid
flowchart TD
%% Main Flow
LB["Load Balancer<br/>(Nginx, HAProxy, Cloud LB)"]

subgraph Pods ["Stateless Containers"]
direction LR
P1["WebUI<br/>Pod 1"]
P2["WebUI<br/>Pod 2"]
PN["WebUI<br/>Pod N"]
end

subgraph Shared ["Shared Infrastructure"]
direction LR
DB[("PostgreSQL<br/>(+ PGVector for RAG)")]
Redis{{"Redis"}}
Storage[/"Shared Storage<br/>(NFS or S3)"/]
end

%% Annotations
DBNote["Shared database<br/>+ Vector DB"]
RedisNote["Shared state & websockets"]
StoreNote["Shared file storage"]

%% Connections
LB --> P1
LB --> P2
LB --> PN
P1 --> Shared
P2 --> Shared
PN --> Shared

%% Alignment Links
DB -.-> DBNote
Redis -.-> RedisNote
Storage -.-> StoreNote
```
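
As a rough illustration of why the WebUI pods in the diagram can be stateless, the sketch below models the topology in plain Python. It is a toy model, not Open WebUI code: `SharedState`, `Pod`, and `RoundRobinBalancer` are hypothetical names, and an in-memory dict stands in for the shared tier (PostgreSQL, Redis, shared storage).

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class SharedState:
    """Hypothetical stand-in for the shared tier (database, Redis, storage)."""
    sessions: dict = field(default_factory=dict)


@dataclass
class Pod:
    """A stateless replica: it holds a reference to shared state, no user data."""
    name: str
    shared: SharedState

    def handle(self, user: str, message: str) -> str:
        # All per-user data lives in the shared tier, never on the pod itself.
        history = self.shared.sessions.setdefault(user, [])
        history.append(message)
        return f"{self.name} sees {len(history)} message(s) from {user}"


class RoundRobinBalancer:
    """Toy load balancer that hands each request to the next pod in turn."""

    def __init__(self, pods):
        self._pods = cycle(pods)

    def route(self, user: str, message: str) -> str:
        return next(self._pods).handle(user, message)


shared = SharedState()
balancer = RoundRobinBalancer([Pod(f"pod-{i}", shared) for i in (1, 2, 3)])

# Three requests from the same user land on three different pods.
replies = [balancer.route("alice", text) for text in ("hi", "still me", "again")]
for reply in replies:
    print(reply)
```

Each request lands on a different pod, yet every pod sees the full history because nothing is stored on the pod itself. In this model, that is what lets the load balancer send any request to any replica.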

**Running into issues?** The [Scaling & HA Troubleshooting](/troubleshooting/multi-replica) guide covers common problems (login loops, WebSocket failures, database locks, worker crashes) and their solutions. For performance tuning at scale, see [Optimization, Performance & RAM Usage](/troubleshooting/performance).
17 changes: 10 additions & 7 deletions docs/getting-started/quick-start/connect-an-agent/index.md
@@ -41,13 +41,16 @@ The agent decides when and how to use these tools based on your message, and Ope

Regardless of which agent you connect, the architecture is the same:

```
┌──────────────┐         ┌──────────────────┐         ┌──────────────┐
│              │  HTTP   │                  │  Tools  │              │
│  Open WebUI  │────────▶│  Agent Gateway   │────────▶│  Terminal,   │
│  (frontend)  │◀────────│  (API server)    │◀────────│  Files, Web  │
│              │ Stream  │                  │  Results│              │
└──────────────┘         └──────────────────┘         └──────────────┘
```

```mermaid
flowchart LR
A["Open WebUI<br/>(frontend)"]
B["Agent Gateway<br/>(API server)"]
C["Terminal,<br/>Files, Web"]

A -- HTTP --> B
B -- Tools --> C
C -- Results --> B
B -- Stream --> A
```
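
The request/response loop in the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual gateway protocol: `TOOLS`, `agent_gateway`, and `frontend` are hypothetical names, the tools are stubs, and a trivial first-word rule stands in for the agent's real decision about when to call a tool.

```python
from typing import Callable, Dict, Iterator

# Hypothetical tool registry; real agents expose terminal, file, and web tools.
TOOLS: Dict[str, Callable[[str], str]] = {
    "terminal": lambda arg: f"ran `{arg}`",
    "files": lambda arg: f"read {arg}",
}


def agent_gateway(message: str) -> Iterator[str]:
    """Hypothetical gateway: maybe call a tool, then stream the reply in chunks."""
    tool_name, _, arg = message.partition(" ")
    if tool_name in TOOLS:
        result = TOOLS[tool_name](arg)  # "Tools" out, "Results" back
        reply = f"Tool {tool_name} returned: {result}"
    else:
        reply = f"No tool needed for: {message}"
    for word in reply.split(" "):  # "Stream" back to the frontend
        yield word + " "


def frontend(message: str) -> str:
    """Stand-in for Open WebUI: send the message and assemble streamed chunks."""
    return "".join(agent_gateway(message)).rstrip()


answer = frontend("files notes.txt")
print(answer)
```

Using a generator for `agent_gateway` mirrors the streamed response: the frontend can render each chunk as it arrives instead of waiting for the whole reply.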

1. **You type a message** in Open WebUI