diff --git a/docs/getting-started/advanced-topics/scaling.md b/docs/getting-started/advanced-topics/scaling.md
index 3d778ded5..b02872a4d 100644
--- a/docs/getting-started/advanced-topics/scaling.md
+++ b/docs/getting-started/advanced-topics/scaling.md
@@ -326,27 +326,42 @@ For the full setup guide, see [OpenTelemetry Monitoring](/reference/monitoring/o
Here's what a production-ready scaled deployment typically looks like:
-```
-┌─────────────────────────────────────────────────────┐
-│ Load Balancer │
-│ (Nginx, HAProxy, Cloud LB) │
-└──────────┬──────────┬──────────┬────────────────────┘
- │ │ │
- ┌─────▼──┐ ┌─────▼──┐ ┌────▼───┐
- │ WebUI │ │ WebUI │ │ WebUI │ ← Stateless containers
- │ Pod 1 │ │ Pod 2 │ │ Pod N │
- └───┬────┘ └───┬────┘ └───┬────┘
- │ │ │
- ┌────▼──────────▼──────────▼────┐
- │ PostgreSQL │ ← Shared database
- │ (+ PGVector for RAG) │ ← Vector DB (or other Vector DB)
- └───────────────────────────────┘
- ┌───────────────────────────────┐
- │ Redis │ ← Shared state & websockets
- └───────────────────────────────┘
- ┌───────────────────────────────┐
- │ Shared Storage (NFS or S3) │ ← Shared file storage
- └───────────────────────────────┘
+```mermaid
+flowchart TD
+ %% Main Flow
+ LB["Load Balancer<br/>(Nginx, HAProxy, Cloud LB)"]
+
+ subgraph Pods ["Stateless Containers"]
+ direction LR
+ P1["WebUI<br/>Pod 1"]
+ P2["WebUI<br/>Pod 2"]
+ PN["WebUI<br/>Pod N"]
+ end
+
+ subgraph Shared ["Shared Infrastructure"]
+ direction LR
+ DB[("PostgreSQL<br/>(+ PGVector for RAG)")]
+ Redis{{"Redis"}}
+ Storage[/"Shared Storage<br/>(NFS or S3)"/]
+ end
+
+ %% Annotations
+ DBNote["Shared database<br/>+ Vector DB"]
+ RedisNote["Shared state & websockets"]
+ StoreNote["Shared file storage"]
+
+ %% Connections
+ LB --> P1
+ LB --> P2
+ LB --> PN
+ P1 --> Shared
+ P2 --> Shared
+ PN --> Shared
+
+ %% Alignment Links
+ DB -.-> DBNote
+ Redis -.-> RedisNote
+ Storage -.-> StoreNote
```
**Running into issues?** The [Scaling & HA Troubleshooting](/troubleshooting/multi-replica) guide covers common problems (login loops, WebSocket failures, database locks, worker crashes) and their solutions. For performance tuning at scale, see [Optimization, Performance & RAM Usage](/troubleshooting/performance).
diff --git a/docs/getting-started/quick-start/connect-an-agent/index.md b/docs/getting-started/quick-start/connect-an-agent/index.md
index 405076140..da43647db 100644
--- a/docs/getting-started/quick-start/connect-an-agent/index.md
+++ b/docs/getting-started/quick-start/connect-an-agent/index.md
@@ -41,13 +41,16 @@ The agent decides when and how to use these tools based on your message, and Ope
Regardless of which agent you connect, the architecture is the same:
-```
-┌──────────────┐ ┌──────────────────┐ ┌──────────────┐
-│ │ HTTP │ │ Tools │ │
-│ Open WebUI │────────▶│ Agent Gateway │────────▶│ Terminal, │
-│ (frontend) │◀────────│ (API server) │◀────────│ Files, Web │
-│ │ Stream │ │ Results│ │
-└──────────────┘ └──────────────────┘ └──────────────┘
+```mermaid
+flowchart LR
+ A["Open WebUI<br/>(frontend)"]
+ B["Agent Gateway<br/>(API server)"]
+ C["Terminal,<br/>Files, Web"]
+
+ A -- HTTP --> B
+ B -- Tools --> C
+ C -- Results --> B
+ B -- Stream --> A
```
1. **You type a message** in Open WebUI