diff --git a/docs/getting-started/advanced-topics/scaling.md b/docs/getting-started/advanced-topics/scaling.md
index 3d778ded5..b02872a4d 100644
--- a/docs/getting-started/advanced-topics/scaling.md
+++ b/docs/getting-started/advanced-topics/scaling.md
@@ -326,27 +326,42 @@ For the full setup guide, see [OpenTelemetry Monitoring](/reference/monitoring/o
 
 Here's what a production-ready scaled deployment typically looks like:
 
-```
-┌─────────────────────────────────────────────────────┐
-│                    Load Balancer                    │
-│             (Nginx, HAProxy, Cloud LB)              │
-└──────────┬──────────┬──────────┬────────────────────┘
-           │          │          │
-     ┌─────▼──┐ ┌─────▼──┐ ┌────▼───┐
-     │ WebUI  │ │ WebUI  │ │ WebUI  │   ← Stateless containers
-     │ Pod 1  │ │ Pod 2  │ │ Pod N  │
-     └───┬────┘ └───┬────┘ └───┬────┘
-         │          │          │
-    ┌────▼──────────▼──────────▼────┐
-    │          PostgreSQL           │   ← Shared database
-    │     (+ PGVector for RAG)      │   ← Vector DB (or other Vector DB)
-    └───────────────────────────────┘
-    ┌───────────────────────────────┐
-    │             Redis             │   ← Shared state & websockets
-    └───────────────────────────────┘
-    ┌───────────────────────────────┐
-    │  Shared Storage (NFS or S3)   │   ← Shared file storage
-    └───────────────────────────────┘
+```mermaid
+flowchart TD
+    %% Main Flow
+    LB["Load Balancer<br/>(Nginx, HAProxy, Cloud LB)"]
+
+    subgraph Pods ["Stateless Containers"]
+        direction LR
+        P1["WebUI<br/>Pod 1"]
+        P2["WebUI<br/>Pod 2"]
+        PN["WebUI<br/>Pod N"]
+    end
+
+    subgraph Shared ["Shared Infrastructure"]
+        direction LR
+        DB[("PostgreSQL<br/>(+ PGVector for RAG)")]
+        Redis{{"Redis"}}
+        Storage[/"Shared Storage<br/>(NFS or S3)"/]
+    end
+
+    %% Annotations
+    DBNote["Shared database<br/>+ Vector DB"]
+    RedisNote["Shared state & websockets"]
+    StoreNote["Shared file storage"]
+
+    %% Connections
+    LB --> P1
+    LB --> P2
+    LB --> PN
+    P1 --> Shared
+    P2 --> Shared
+    PN --> Shared
+
+    %% Alignment Links
+    DB -.-> DBNote
+    Redis -.-> RedisNote
+    Storage -.-> StoreNote
 ```
 
 **Running into issues?** The [Scaling & HA Troubleshooting](/troubleshooting/multi-replica) guide covers common problems (login loops, WebSocket failures, database locks, worker crashes) and their solutions. For performance tuning at scale, see [Optimization, Performance & RAM Usage](/troubleshooting/performance).
diff --git a/docs/getting-started/quick-start/connect-an-agent/index.md b/docs/getting-started/quick-start/connect-an-agent/index.md
index 405076140..da43647db 100644
--- a/docs/getting-started/quick-start/connect-an-agent/index.md
+++ b/docs/getting-started/quick-start/connect-an-agent/index.md
@@ -41,13 +41,16 @@ The agent decides when and how to use these tools based on your message, and Ope
 
 Regardless of which agent you connect, the architecture is the same:
 
-```
-┌──────────────┐         ┌──────────────────┐         ┌──────────────┐
-│              │  HTTP   │                  │  Tools  │              │
-│  Open WebUI  │────────▶│  Agent Gateway   │────────▶│  Terminal,   │
-│  (frontend)  │◀────────│  (API server)    │◀────────│  Files, Web  │
-│              │  Stream │                  │  Results│              │
-└──────────────┘         └──────────────────┘         └──────────────┘
+```mermaid
+flowchart LR
+    A["Open WebUI<br/>(frontend)"]
+    B["Agent Gateway<br/>(API server)"]
+    C["Terminal,<br/>Files, Web"]
+
+    A -- HTTP --> B
+    B -- Tools --> C
+    C -- Results --> B
+    B -- Stream --> A
 ```
 
 1. **You type a message** in Open WebUI