diff --git a/plans/slow-queries-analysis-2.md b/plans/slow-queries-analysis-2.md
new file mode 100644
index 0000000000..e12e06c3a1
--- /dev/null
+++ b/plans/slow-queries-analysis-2.md
@@ -0,0 +1,184 @@
+# SMP Server Postgres: Slow Query Analysis
+
+Data from three production servers (A, B, C), ~3.5 day observation window.
+
+## Top queries by total time
+
+| Rank | Query | Server A ms | Server B ms | Server C ms |
+|------|-------|-------------|-------------|-------------|
+| 1 | getEntityCounts (6 COUNT subqueries) | 1,682,874 | 1,639,325 | 1,619,892 |
+| 2 | write_message() | 303,393 | 458,262 | 280,375 |
+| 3 | try_del_peek_msg() | 352,912 | 386,036 | 333,877 |
+| 4 | expire_old_messages() | 246,034 | 220,003 | 160,232 |
+| 5 | UPDATE SET updated_at | 234,146 | 216,911 | 211,480 |
+| 6 | INSERT INTO messages | 184,430 | 323,617 | 169,149 |
+| 7 | expire batch cursor (array_agg) | 122,739 | 99,061 | 39,975 |
+| 8 | Batch recipient_id IN lookups | ~134K | ~79K | ~81K |
+| 9 | Batch notifier_id IN lookups | ~143K | ~64K | ~64K |
+| 10 | msg_peek (SELECT FROM messages) | ~112K | ~102K | ~126K |
+
+getEntityCounts alone is **45-48%** of total query time on all three servers.
+
+---
+
+## Verified fixes
+
+### 1. getEntityCounts: replace ③④ with SUM(queue_count)
+
+**Query** (`QueueStore/Postgres.hs:160-167`):
+
+```sql
+SELECT
+  (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL) AS queue_count,                                            -- ①
+  (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL AND notifier_id IS NOT NULL) AS notifier_count,              -- ②
+  (SELECT COUNT(1) FROM services WHERE service_role = ?) AS rcv_service_count,                                           -- trivial
+  (SELECT COUNT(1) FROM services WHERE service_role = ?) AS ntf_service_count,                                           -- trivial
+  (SELECT COUNT(1) FROM msg_queues WHERE rcv_service_id IS NOT NULL AND deleted_at IS NULL) AS rcv_service_queues_count,  -- ③
+  (SELECT COUNT(1) FROM msg_queues WHERE ntf_service_id IS NOT NULL AND deleted_at IS NULL) AS ntf_service_queues_count   -- ④
+```
+
+| Server | Calls | Avg ms | Max ms | Total ms |
+|--------|-------|--------|--------|----------|
+| A | 5,058 | 332.7 | 2,061 | 1,682,874 |
+| B | 5,055 | 324.3 | 1,844 | 1,639,325 |
+| C | 5,053 | 320.6 | 1,250 | 1,619,892 |
+
+**Problem**: 4 subqueries scan `msg_queues`. Indexes exist for ③
+(`idx_msg_queues_rcv_service_id(rcv_service_id, deleted_at)`), ④
+(`idx_msg_queues_ntf_service_id(ntf_service_id, deleted_at)`), and potentially ①
+(`idx_msg_queues_updated_at_recipient_id(deleted_at, ...)`), but actual query plans
+and per-subquery cost are unknown without `EXPLAIN ANALYZE`.
+
+**Fix**: Replace ③ and ④:
+
+```sql
+COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'M'), 0) AS rcv_service_queues_count,
+COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'N'), 0) AS ntf_service_queues_count
+```
+
+**Verification**:
+- Trigger logic traced for all transitions (NULL→set, change, soft-delete, physical delete) — correct.
+- FK `rcv_service_id REFERENCES services(service_id)` guarantees equivalence.
+- `queue_count + p_change` is atomic under READ COMMITTED.
+- `update_all_aggregates()` exists as repair mechanism.
+
+**Savings**: Eliminates 2 of 4 msg_queues scans. Exact per-subquery cost unknown — needs `EXPLAIN ANALYZE`.
+
+---
+
+### 2. expire_old_messages: remove trailing COUNTs
+
+At the end of `expire_old_messages` (`Migrations.hs`):
+
+```sql
+r_stored_msgs_count := (SELECT COUNT(1) FROM messages);
+r_stored_queues := (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL);
+```
+
+| Server | expire avg | r_stored_queues avg | r_stored_msgs avg | COUNTs combined | % of procedure |
+|--------|-----------|---------------------|-------------------|-----------------|----------------|
+| A | 11,716ms | 695ms | 110ms | 805ms | 6.9% |
+| B | 10,476ms | 631ms | 17ms | 648ms | 6.2% |
+| C | 7,630ms | 588ms | 53ms | 641ms | 8.4% |
+
+**How used** (`Server.hs:485-488`):
+- `storedMsgsCount` → resets `msgCount` stat (also maintained incrementally: +1 on send, -1 on ACK)
+- `storedQueues` → **only logged** via `printMessageStats`. Same value available from `getEntityCounts`.
+
+**Fix**: Remove both COUNTs. Return only `r_expired_msgs_count`.
+
+**Verification**: All usages traced. `storedQueues` is only logged. `storedMsgsCount` resets
+an incrementally-maintained counter — removing means potential drift, corrected on restart.
+
+**Savings**: 641–805ms per cycle × 21 cycles = **13.5–16.9s total** (CSV-verified).
+
+---
+
+### 3. Trigger WHEN clause: skip PL/pgSQL call for non-service updates
+
+**Current** (`Migrations.hs:566-568`):
+
+```sql
+CREATE TRIGGER tr_queue_update
+AFTER UPDATE ON msg_queues
+FOR EACH ROW EXECUTE PROCEDURE on_queue_update();
+```
+
+Server C data — ~2.6M updates, only ~110K (4.3%) change service fields:
+
+| UPDATE pattern | Calls | Service fields? |
+|----------------|-------|-----------------|
+| SET updated_at | ~1,275K | No |
+| SET msg_can_write/size/expire (write_message) | ~600K | No |
+| SET msg_can_write/size/expire (try_del_*) | ~331K | No |
+| SET msg_can_write/size/expire (try_del reset) | ~258K | No |
+| SET sender_key | ~17K | No |
+| SET msg_can_write/size/expire (delete_expired) | ~3K | No |
+| **SET rcv_service_id** | **~101K** | **Yes** |
+| **SET deleted_at** | **~10K** | **Yes** |
+
+**Fix**: Add `WHEN` clause — evaluated in C by PostgreSQL, skips function call entirely:
+
+```sql
+CREATE TRIGGER tr_queue_update
+AFTER UPDATE ON msg_queues
+FOR EACH ROW
+WHEN (
+  OLD.deleted_at IS DISTINCT FROM NEW.deleted_at
+  OR OLD.rcv_service_id IS DISTINCT FROM NEW.rcv_service_id
+  OR OLD.ntf_service_id IS DISTINCT FROM NEW.ntf_service_id
+  OR OLD.notifier_id IS DISTINCT FROM NEW.notifier_id
+)
+EXECUTE PROCEDURE on_queue_update();
+```
+
+**Verification**:
+- PostgreSQL supports `OLD`/`NEW` in `WHEN` for `AFTER UPDATE` triggers.
+- 4 conditions match exactly the fields checked inside `on_queue_update()`.
+- Behavioral change: **none**.
+
+**Savings**: ~2.5M PL/pgSQL calls avoided. Per-call overhead estimated ~0.02–0.05ms.
+Total: ~50–125s estimated, not measured.
+
+---
+
+## Needs EXPLAIN ANALYZE
+
+### 4. Partial indexes for getEntityCounts ① and ②
+
+After fix #1, subqueries ① and ② remain. ② has no index covering both
+`deleted_at IS NULL` and `notifier_id IS NOT NULL`.
+
+```sql
+CREATE INDEX idx_msg_queues_live ON msg_queues ((1)) WHERE deleted_at IS NULL;
+CREATE INDEX idx_msg_queues_live_notifier ON msg_queues ((1)) WHERE deleted_at IS NULL AND notifier_id IS NOT NULL;
+```
+
+**Trade-off**: Write overhead on every INSERT/UPDATE/DELETE. Need `EXPLAIN ANALYZE`
+to confirm PostgreSQL uses these for COUNT vs choosing seq scan.
+
+---
+
+## Not problems
+
+- **write_message / try_del_peek_msg** (ranks #2-3): 0.5-0.7ms avg. High total from volume (~600K calls).
+  Max spikes (490-523ms) are lock contention on `FOR UPDATE` — architectural, not fixable.
+
+- **UPDATE SET updated_at** (rank #5): 0.17ms avg, ~1.3M calls. Already minimal.
+  Fix #3 eliminates the trigger overhead on these.
+
+- **SET rcv_service_id** (Server C only, 100K calls): CSV shows rows_affected = calls — all
+  legitimate associations. Haskell guard at `Postgres.hs:487` works correctly.
+
+- **Batch lookups** (ranks #8-9): 1.8-2.0ms avg for ~135 PK probes. Near-optimal.
+
+---
+
+## Summary
+
+| # | Fix | Per-call savings | Calls | Total savings | Verified |
+|---|-----|-----------------|-------|---------------|----------|
+| 1 | getEntityCounts ③④ → `SUM(queue_count)` | 0–158ms (unknown split) | ~5,050 | 0–800s | Correctness: yes. Savings: needs EXPLAIN ANALYZE |
+| 2 | Remove trailing COUNTs from expire_old_messages | 641–805ms | 21 | 13.5–16.9s | Yes (CSV) |
+| 3 | Add WHEN clause to tr_queue_update | ~0.02–0.05ms | ~2.5M skipped | ~50–125s est. | Correctness: yes. Savings: estimated |
+| 4 | Partial indexes for ①② | Unknown | ~5,050 | Unknown | Needs EXPLAIN ANALYZE |
diff --git a/plans/slow-queries-analysis-3.md b/plans/slow-queries-analysis-3.md
new file mode 100644
index 0000000000..d31bb4710d
--- /dev/null
+++ b/plans/slow-queries-analysis-3.md
@@ -0,0 +1,186 @@
+# SMP Server Postgres: Slow Query Analysis (post-reset)
+
+Data from three production servers (A, B, C), ~5.5 hour window after stats reset.
+EXPLAIN ANALYZE from a large server (~30M rows in msg_queues).
+
+## Top queries by total time
+
+| Rank | Query | A total ms | B total ms | C total ms |
+|------|-------|-----------|-----------|-----------|
+| 1 | getEntityCounts (6 COUNT subqueries) | 99,799 | 101,946 | 121,495 |
+| 2 | UPDATE SET updated_at | 35,799 | 30,797 | 26,551 |
+| 3 | try_del_peek_msg() | 28,903 | 23,147 | 26,482 |
+| 4 | write_message() | 23,103 | 18,942 | 20,301 |
+| 5 | Batch recipient_id IN lookups | 16,062 | 12,529 | 9,977 |
+| 6 | INSERT INTO messages | 14,395 | 11,911 | 12,882 |
+| 7 | msg_peek (SELECT FROM messages) | 12,882 | 11,992 | 14,661 |
+| 8 | expire_old_messages() | 9,762 | 11,619 | 10,863 |
+| 9 | delete_expired_msgs() | 7,801 | 5,456 | 6,256 |
+| 10 | expire batch cursor (array_agg) | 4,421 | 5,679 | 4,566 |
+
+Grand totals: A ~292s, B ~317s, C ~288s.
+
+getEntityCounts is **34-42%** of all query time across all three servers.
+
+---
+
+## EXPLAIN ANALYZE results for getEntityCounts
+
+Run on a large server (~30M rows in msg_queues, cold cache):
+
+| Subquery | Time | % of 23.4s | Plan | Rows scanned | Key detail |
+|----------|------|-----------|------|-------------|------------|
+| ① queue_count | **7,851ms** | **33.5%** | Parallel Seq Scan | 30.6M (97% match) | No useful index |
+| ② notifier_count | **6,382ms** | **27.2%** | Parallel Seq Scan | 30.6M (4M match) | No useful index |
+| ③ rcv_service_queues | 0.5ms | 0% | Index Only Scan | 0 rows | No rcv services on this server; similar cost to ④ on servers with rcv services |
+| ④ ntf_service_queues | **8,914ms** | **38.0%** | Parallel Index Only Scan | 3.7M match | **2.7M heap fetches** |
+| Services (③④) | 1.8ms | 0% | Index Only / Bitmap | 0 + 6 rows | Trivial |
+| JIT + Planning | 1,341ms | 5.7% | — | — | JIT compilation overhead |
+| **Total** | **23,440ms** | | | | |
+
+The CSV averages (302-368ms) reflect warm-cache performance. This EXPLAIN is cold cache
+(`shared read=3.4M` vs `shared hit=44K` — 98.7% read from disk).
+
+---
+
+## Verified fixes
+
+### 1. getEntityCounts: replace ③④ with SUM(queue_count)
+
+```sql
+-- Current ③ and ④:
+(SELECT COUNT(1) FROM msg_queues WHERE rcv_service_id IS NOT NULL AND deleted_at IS NULL)  -- ③: 0.5ms
+(SELECT COUNT(1) FROM msg_queues WHERE ntf_service_id IS NOT NULL AND deleted_at IS NULL)  -- ④: 8,914ms
+
+-- Fix:
+COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'M'), 0)  -- ~0ms
+COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'N'), 0)  -- ~0ms
+```
+
+**EXPLAIN ANALYZE confirmed**: ③ returns 0 rows (already free). ④ costs 8,914ms due to
+Parallel Index Only Scan with 2.7M heap fetches on `idx_msg_queues_ntf_service_id`.
+
+**Verification**:
+- Trigger logic traced for all transitions (NULL→set, change, soft-delete, physical delete) — correct.
+- FK `rcv_service_id REFERENCES services(service_id)` guarantees equivalence.
+- `queue_count + p_change` is atomic under READ COMMITTED.
+- `update_all_aggregates()` exists as repair mechanism.
+
+**Savings**: ~8.9s cold cache (38% of query). Warm cache proportionally less but still dominant ④ cost.
+
+---
+
+### 2. getEntityCounts: partial indexes for ① and ②
+
+```sql
+CREATE INDEX idx_msg_queues_active ON msg_queues ((1)) WHERE deleted_at IS NULL;
+CREATE INDEX idx_msg_queues_active_notifier ON msg_queues ((1)) WHERE deleted_at IS NULL AND notifier_id IS NOT NULL;
+```
+
+**EXPLAIN ANALYZE confirmed**: ① does Parallel Seq Scan (7,851ms), ② does Parallel Seq Scan (6,382ms).
+No existing index is used for these subqueries despite `idx_msg_queues_updated_at_recipient_id`
+having `deleted_at` as first column — PostgreSQL chose seq scan because 97% of rows match.
+
+Partial indexes contain only matching rows, enabling fast index-only COUNT without scanning
+the full table.
+
+**Trade-off**: Write overhead on every INSERT/UPDATE/DELETE that changes `deleted_at` or
+`notifier_id`. For the ~30M row table with ~300K updates per 5.5h, this is acceptable.
+
+**Savings**: ~14.2s cold cache (61% of query). Combined with fix #1: **23.1s → ~1.3s** (JIT only).
+
+---
+
+### 3. expire_old_messages: remove trailing COUNTs
+
+At the end of `expire_old_messages` (`Migrations.hs`):
+
+```sql
+r_stored_msgs_count := (SELECT COUNT(1) FROM messages);
+r_stored_queues := (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL);
+```
+
+| Server | expire avg | r_stored_queues | r_stored_msgs | COUNTs combined | % of procedure |
+|--------|-----------|-----------------|---------------|-----------------|----------------|
+| A | 9,762ms | 535ms | 50ms | 585ms | 6.0% |
+| B | 11,619ms | 390ms | 12ms | 402ms | 3.5% |
+| C | 10,863ms | 659ms | 25ms | 684ms | 6.3% |
+
+**Usage** (`Server.hs:485-488`):
+- `storedMsgsCount` → resets `msgCount` stat (also maintained incrementally: +1 on send, -1 on ACK).
+- `storedQueues` → **only logged**. Same value available from `getEntityCounts`.
+
+**Fix**: Remove both COUNTs. Return only `r_expired_msgs_count`.
+
+**Verification**: All usages traced. `storedQueues` is only logged. `storedMsgsCount` resets
+an incrementally-maintained counter — removing means potential drift, corrected on restart.
+
+**Savings**: 402–684ms per cycle (CSV-verified).
+
+---
+
+### 4. Trigger WHEN clause: skip PL/pgSQL call for non-service updates
+
+**Current** (`Migrations.hs:566-568`):
+
+```sql
+CREATE TRIGGER tr_queue_update
+AFTER UPDATE ON msg_queues
+FOR EACH ROW EXECUTE PROCEDURE on_queue_update();
+```
+
+Server C data — ~297K updates, only ~670 (0.2%) change service-related fields:
+
+| UPDATE pattern | Calls | Service fields? |
+|----------------|-------|-----------------|
+| SET updated_at | ~213K | No |
+| SET msg_can_write/size/expire (write_message) | ~40K | No |
+| SET msg_can_write/size/expire (try_del_*) | ~27K | No |
+| SET msg_can_write/size (try_del reset) | ~14K | No |
+| SET sender_key | ~2K | No |
+| **SET deleted_at** | **~563** | **Yes** |
+| **ntf_service_id/notifier_id changes** | **~108** | **Yes** |
+
+**Fix**: Add `WHEN` clause — evaluated in C by PostgreSQL, skips function call entirely:
+
+```sql
+CREATE TRIGGER tr_queue_update
+AFTER UPDATE ON msg_queues
+FOR EACH ROW
+WHEN (
+  OLD.deleted_at IS DISTINCT FROM NEW.deleted_at
+  OR OLD.rcv_service_id IS DISTINCT FROM NEW.rcv_service_id
+  OR OLD.ntf_service_id IS DISTINCT FROM NEW.ntf_service_id
+  OR OLD.notifier_id IS DISTINCT FROM NEW.notifier_id
+)
+EXECUTE PROCEDURE on_queue_update();
+```
+
+**Verification**:
+- PostgreSQL supports `OLD`/`NEW` in `WHEN` for `AFTER UPDATE` triggers.
+- 4 conditions match exactly the fields checked inside `on_queue_update()`.
+- Behavioral change: **none**.
+
+**Savings**: ~296K PL/pgSQL calls avoided (99.8% of trigger fires). Per-call overhead
+estimated ~0.02–0.05ms. Total: ~6–15s estimated over this 5.5h window.
+
+---
+
+## Not problems
+
+- **write_message / try_del_peek_msg**: 0.5-0.65ms avg. High total from volume. Max spikes are lock contention — architectural.
+- **UPDATE SET updated_at**: 0.12-0.17ms avg. Fix #4 eliminates trigger overhead.
+- **Batch lookups**: 1.7-2.6ms avg for ~135 PK probes. Near-optimal.
+
+---
+
+## Summary
+
+| # | Fix | Savings | Verified |
+|---|-----|---------|----------|
+| 1 | getEntityCounts ③④ → `SUM(queue_count)` | ~8.9s/call cold, ④ eliminated (EXPLAIN ANALYZE) | Yes |
+| 2 | Partial indexes for getEntityCounts ①② | ~14.2s/call cold, ①② eliminated (EXPLAIN ANALYZE) | Yes — plan confirmed, index benefit to verify after creation |
+| 3 | Remove trailing COUNTs from expire_old_messages | 402–684ms/cycle (CSV) | Yes |
+| 4 | Add WHEN clause to tr_queue_update | ~6–15s est. over 5.5h (296K calls skipped) | Correctness: yes. Savings: estimated |
+
+Fixes #1 + #2 combined: getEntityCounts **23.4s → ~1.3s cold cache** (94% reduction).
diff --git a/plans/slow-queries-analysis.md b/plans/slow-queries-analysis.md
new file mode 100644
index 0000000000..1dd69ff09b
--- /dev/null
+++ b/plans/slow-queries-analysis.md
@@ -0,0 +1,199 @@
+# SMP Server Postgres: Slow Query Analysis
+
+Data from three production servers (A, B, C) over a multi-day observation window.
+
+## Verified fixes
+
+### 1. getEntityCounts: replace ③④ with SUM(queue_count)
+
+**Query** (`QueueStore/Postgres.hs:160-167`):
+
+```sql
+SELECT
+  (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL) AS queue_count,                                            -- ① scan
+  (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL AND notifier_id IS NOT NULL) AS notifier_count,              -- ② scan
+  (SELECT COUNT(1) FROM services WHERE service_role = ?) AS rcv_service_count,                                           -- trivial (<10 rows)
+  (SELECT COUNT(1) FROM services WHERE service_role = ?) AS ntf_service_count,                                           -- trivial (<10 rows)
+  (SELECT COUNT(1) FROM msg_queues WHERE rcv_service_id IS NOT NULL AND deleted_at IS NULL) AS rcv_service_queues_count,  -- ③ scan
+  (SELECT COUNT(1) FROM msg_queues WHERE ntf_service_id IS NOT NULL AND deleted_at IS NULL) AS ntf_service_queues_count   -- ④ scan
+```
+
+**Performance**: ~315ms avg, ~2s max, ~2500 calls. #1 slow query by total time (~800s).
+
+**Problem**: 4 scans of `msg_queues`. Indexes exist for some subqueries
+(`idx_msg_queues_rcv_service_id(rcv_service_id, deleted_at)` for ③,
+`idx_msg_queues_ntf_service_id(ntf_service_id, deleted_at)` for ④,
+`idx_msg_queues_updated_at_recipient_id(deleted_at, ...)` potentially for ①),
+but whether PostgreSQL uses them for COUNT and the actual per-subquery cost is
+unknown without `EXPLAIN ANALYZE`.
+
+**Fix**: Replace ③ and ④:
+
+```sql
+COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'M'), 0) AS rcv_service_queues_count,
+COALESCE((SELECT SUM(queue_count) FROM services WHERE service_role = 'N'), 0) AS ntf_service_queues_count
+```
+
+**Verification**:
+- Trigger logic traced for all transitions (NULL→set, change, soft-delete, physical delete) — correct.
+- FK `rcv_service_id REFERENCES services(service_id)` guarantees every non-NULL value maps to a row.
+- `queue_count + p_change` is atomic under READ COMMITTED — concurrent-safe.
+- `update_all_aggregates()` exists as repair mechanism.
+- `services` table has <10 rows — SUM is O(1) vs full table scan.
+
+**Savings**: Eliminates 2 of 4 msg_queues scans (③④ → trivial SUM on <10 rows).
+Exact savings unknown — if ③④ already use indexes efficiently, savings may be modest.
+`EXPLAIN ANALYZE` needed to measure actual per-subquery cost.
+
+---
+
+### 2. expire_old_messages: remove trailing COUNTs
+
+**Stored procedure** (`Migrations.hs`), at the end of `expire_old_messages`:
+
+```sql
+r_expired_msgs_count := total_deleted;
+r_stored_msgs_count := (SELECT COUNT(1) FROM messages);           -- 13-114ms per call (CSV)
+r_stored_queues := (SELECT COUNT(1) FROM msg_queues WHERE deleted_at IS NULL);  -- 544-719ms per call (CSV)
+```
+
+**Performance**: 10 calls per observation window. Per-call cost of these COUNTs (from CSV):
+
+| Server | expire_old_messages avg | r_stored_queues avg | r_stored_msgs avg | COUNTs combined | % of procedure |
+|---|---|---|---|---|---|
+| A | 11,798ms | 719ms | 114ms | 833ms | 7.1% |
+| B | 11,242ms | 544ms | 13ms | 557ms | 5.0% |
+| C | 7,296ms | 588ms | 67ms | 655ms | 9.0% |
+
+**How results are used** (`Server.hs:485-488`):
+
+```haskell
+Right msgStats@MessageStats {storedMsgsCount = stored, expiredMsgsCount = expired} -> do
+  atomicWriteIORef (msgCount stats) stored     -- resets msgCount from storedMsgsCount
+  atomicModifyIORef'_ (msgExpired stats) (+ expired)
+  printMessageStats "STORE: messages" msgStats  -- logs all three fields
+```
+
+- `expiredMsgsCount` — computed incrementally in the loop. **Needed, already cheap.**
+- `storedMsgsCount` — used to reset `msgCount` stat. But `msgCount` is also maintained
+  incrementally (`+1` on send at line 1963, `-1` on ACK at line 1916). The reset corrects drift.
+- `storedQueues` — **used only for logging**. Same value available from `getEntityCounts`
+  which runs every ~60s via Prometheus.
+
+**Fix**: Remove both COUNTs from the stored procedure. Return only `r_expired_msgs_count`.
+
+For `storedMsgsCount`: either trust the incremental `msgCount` counter, or query
+`SELECT COUNT(1) FROM messages` separately (in parallel, not blocking the procedure).
+
+For `storedQueues`: use the value from the most recent `getEntityCounts` call.
+
+**Verification**: Traced all usages of `MessageStats` fields from `expireOldMessages` in
+`Server.hs`. `storedQueues` is only logged. `storedMsgsCount` resets a counter that's already
+maintained incrementally — removing the reset means potential drift, but the counter is
+corrected on next server restart anyway.
+
+**Savings**: 560-830ms per expiration cycle × 10 cycles = **5.6-8.3s total** over observation window.
+
+---
+
+### 3. Trigger WHEN clause: skip function call for non-service updates
+
+**Current trigger** (`Migrations.hs:566-568`):
+
+```sql
+CREATE TRIGGER tr_queue_update
+AFTER UPDATE ON msg_queues
+FOR EACH ROW EXECUTE PROCEDURE on_queue_update();
+```
+
+Fires on **every UPDATE** to `msg_queues`. From Server C data, ~1.2M updates per observation
+window, but only ~105K (8.7%) actually change service-related fields:
+
+| UPDATE pattern | Calls | Changes service fields? |
+|---|---|---|
+| SET updated_at | ~427K | No |
+| SET msg_can_write/size/expire (write_message) | ~336K | No |
+| SET msg_can_write/size/expire (try_del_*) | ~196K | No |
+| SET msg_can_write/size/expire (delete_expired) | ~136K | No |
+| SET sender_key | ~8K | No |
+| SET status | ~2K | No |
+| **SET rcv_service_id** | **~101K** | **Yes** |
+| **SET deleted_at** | **~5K** | **Yes** |
+
+The `on_queue_update()` PL/pgSQL function evaluates 8-12 boolean conditions on every call,
+then returns without calling `update_aggregates` for 91% of invocations.
+
+**Fix**: Add a `WHEN` clause to the trigger definition. PostgreSQL evaluates `WHEN` in C code
+before calling the PL/pgSQL function — no function entry overhead at all:
+
+```sql
+CREATE TRIGGER tr_queue_update
+AFTER UPDATE ON msg_queues
+FOR EACH ROW
+WHEN (
+  OLD.deleted_at IS DISTINCT FROM NEW.deleted_at
+  OR OLD.rcv_service_id IS DISTINCT FROM NEW.rcv_service_id
+  OR OLD.ntf_service_id IS DISTINCT FROM NEW.ntf_service_id
+  OR OLD.notifier_id IS DISTINCT FROM NEW.notifier_id
+)
+EXECUTE PROCEDURE on_queue_update();
+```
+
+**Verification**:
+- PostgreSQL supports `OLD`/`NEW` in `WHEN` clauses for `AFTER UPDATE` triggers.
+- The 4 conditions match exactly the fields checked inside `on_queue_update()`.
+- When WHEN is false, the function is **never called** — zero PL/pgSQL overhead.
+- When WHEN is true, the function runs identically to today.
+- Behavioral change: **none** — same aggregates updated in same cases.
+
+**Savings**: ~1.1M PL/pgSQL function calls avoided. Each call has fixed overhead
+(function entry, OLD/NEW row extraction, condition evaluation, return). Exact savings
+need measurement, but function call overhead is non-trivial at this volume.
+
+---
+
+## Fixes that need EXPLAIN ANALYZE
+
+### 4. Partial indexes for getEntityCounts ① and ②
+
+Subqueries ① and ② still scan msg_queues. ① may use
+`idx_msg_queues_updated_at_recipient_id(deleted_at, ...)` but ② has no index covering
+both `deleted_at IS NULL` and `notifier_id IS NOT NULL`. Actual plans unknown.
+
+Candidate indexes:
+
+```sql
+-- For ① queue_count: enables index-only COUNT
+CREATE INDEX idx_msg_queues_live ON msg_queues ((1)) WHERE deleted_at IS NULL;
+
+-- For ② notifier_count: enables index-only COUNT
+CREATE INDEX idx_msg_queues_live_notifier ON msg_queues ((1)) WHERE deleted_at IS NULL AND notifier_id IS NOT NULL;
+```
+
+**Trade-off**: Each index adds write overhead on every INSERT/UPDATE/DELETE touching the
+filtered columns. Need `EXPLAIN ANALYZE` to confirm the COUNT actually uses the index
+(PostgreSQL may choose seq scan if the partial index covers most rows).
+
+---
+
+## Not fixable (architectural)
+
+- **write_message / try_del_peek_msg max times (490-523ms)**: Lock contention on
+  `FOR UPDATE` of the same `recipient_id` row. Inherent to concurrent queue access — cannot
+  use `SKIP LOCKED` because these operations require the lock for correctness.
+
+- **UPDATE msg_queues SET updated_at (~430K calls, 83-90s total, 0.20ms avg)**: Per-call cost
+  is already minimal. Trigger does zero aggregate work for this pattern (verified — all
+  IS DISTINCT FROM checks fail, no `update_aggregates` called). Fix #3 eliminates even
+  the function call overhead.
+
+---
+
+## Summary
+
+| # | Fix | Per-call savings | Calls | Total savings | Verified |
+|---|-----|-----------------|-------|---------------|----------|
+| 1 | getEntityCounts ③④ → `SUM(queue_count)` | 0–158ms (unknown split across 4 subqueries) | ~2,500 | 0–395s | Correctness: yes. Savings: needs EXPLAIN ANALYZE |
+| 2 | Remove trailing COUNTs from expire_old_messages | 557–833ms (CSV-verified) | 10 | 5.6–8.3s | Yes (CSV verified) |
+| 3 | Add WHEN clause to tr_queue_update | ~0.02–0.05ms (PL/pgSQL entry overhead estimate) | ~1.1M skipped | ~22–55s | Correctness: yes. Savings: estimated, not measured |
+| 4 | Partial indexes for ①② | Unknown | ~2,500 | Unknown | No — needs EXPLAIN ANALYZE |