From 6baa52ad5ba564d2ad4c125a6becf06d7dca75b3 Mon Sep 17 00:00:00 2001
From: sokoliva <sokolaj@google.com>
Date: Fri, 8 May 2026 11:23:10 +0000
Subject: [PATCH 1/4] docs: consolidate agent guidance in AGENTS.md and improve
 mistake-reflection workflow

---
 .agents/skills/mistake-reflection/SKILL.md | 109 +++++++++++++++++++++
 AGENTS.md                                  |  54 +++++++++-
 GEMINI.md                                  |  51 +---------
 docs/ai/evidence_rules.md                  |  24 +++++
 docs/ai/mandatory_checks.md                |  31 ++----
 5 files changed, 200 insertions(+), 69 deletions(-)
 create mode 100644 .agents/skills/mistake-reflection/SKILL.md
 create mode 100644 docs/ai/evidence_rules.md
diff --git a/.agents/skills/mistake-reflection/SKILL.md b/.agents/skills/mistake-reflection/SKILL.md
new file mode 100644
index 000000000..d73be0c1e
--- /dev/null
+++ b/.agents/skills/mistake-reflection/SKILL.md
@@ -0,0 +1,109 @@
+---
+name: mistake-reflection
+description: Use when you discover you made a mistake — caught by the user, by a tool result, by your own re-reading, or by a failed check. Appends a structured entry to docs/ai/ai_learnings.md and re-reads recent entries to avoid repeats.
+---
+
+# Mistake Reflection
+
+Implements the mistake-handling step of `AGENTS.md` §"Mandatory
+workflow".
+
+## When to load this skill
+
+Trigger on ANY of these, without waiting for the user to ask:
+
+- The user corrects a factual claim, code change, or assumption.
+- A tool result contradicts something you just stated or did
+  (lint failure, test failure, type-check failure, file not found,
+  non-zero exit from a command you said would succeed).
+- You re-read a file or doc and realize a prior statement was wrong
+  or unverified.
+- You realize mid-task that you skipped a required step
+  (e.g. didn't read `docs/ai/coding_conventions.md` /
+  `docs/ai/mandatory_checks.md` at task start).
+- You stated an inference as a fact without a `file:line` citation
+  and later had to walk it back.
+
+If unsure whether something counts: it counts. False positives are
+cheap; false negatives are how the same mistake recurs.
+
+## Procedure
+
+Do these in order. Do NOT defer to the end of the task.
+
+1. **Acknowledge the mistake to the user explicitly** in the current
+   response. One or two sentences. No hedging, no minimization.
+2. **Read recent entries** in `docs/ai/ai_learnings.md` (at minimum
+   the last 5 entries, or the whole file if shorter). If the current
+   mistake is a recurrence of an existing rule, say so explicitly and
+   reference the prior entry's date — do not silently duplicate.
+3. **Append a new entry** to `docs/ai/ai_learnings.md` using the
+   template below. Append; do not rewrite existing entries.
+4. **Continue the original task** only after steps 1–3 are done.
+
+## Entry template
+
+Copy this verbatim, fill in each field, append to the end of the file
+(after the existing `---` separator):
+
+```markdown
+## YYYY-MM-DD — <one-line summary>
+
+- **Mistake**: What went wrong. Be concrete. Quote the wrong claim or
+  describe the wrong action. Include `file:line` references where
+  applicable.
+- **Trigger**: How the mistake surfaced (user correction, tool output,
+  self-review). Include the specific signal if it was a tool result.
+- **Root cause**: Why it happened. Distinguish between (a) missing
+  knowledge, (b) skipped verification step, (c) false assumption from
+  pattern-matching, (d) workflow gap. Avoid generic "I didn't think
+  carefully" — name the specific failure mode.
+- **Recurrence of**: If this matches an existing rule, link to the
+  prior entry's date. Otherwise write "new".
+- **Rule**: A concrete, checkable rule that would have prevented this.
+  Phrase as an imperative ("Before X, do Y"). If the rule already
+  exists and was violated, the rule should be about *enforcement*
+  (e.g. a check to add to a skill, a step to add to AGENTS.md), not a
+  restatement of the existing rule.
+```
+
+## Anti-patterns to avoid
+
+- **Don't restate the same lesson with new wording.** If
+  you'd write essentially the same rule again, the real fix is to
+  make the rule self-enforcing (update a skill or `AGENTS.md`), not
+  to add a third entry.
+- **Don't let rules go stale.** When you read prior entries, flag stale
+  tooling references and either update them or note the staleness in your
+  new entry.
+- **Don't write rules that depend on you remembering to follow them.**
+  If a rule is "remember to do X at the start of every task", it will
+  be skipped. Prefer rules that bind to a tool, a skill trigger, or a
+  CI check.
+- **Don't bury the acknowledgement.** Tell the user up front in the
+  response that you got it wrong, before describing the fix.
+
+## Cleanup ritual
+
+Before appending, check the file's length:
+
+- **≥ 10 entries**: pause and propose to the user that one or more
+  entries be either (a) deleted (if obsolete or one-off), or (b)
+  promoted into the workflow somewhere it will actually be read. If
+  the candidate rule is about claims/citations/evidence specifically,
+  `docs/ai/evidence_rules.md` is a natural target — otherwise leave
+  the choice of destination to the user. Do this *before* adding the
+  new entry, so the file doesn't grow monotonically and stop being read.
+
+This ritual is the only mechanism preventing `ai_learnings.md` from
+becoming a write-only graveyard.
+
+## Repo-specific notes
+
+- `docs/ai/ai_learnings.md` is **gitignored** (`.gitignore:15`).
+  Entries are local to the developer's checkout and will not be seen
+  by other agents or in CI. The file is for the human developer to
+  improve `AGENTS.md` / skills based on patterns.
+- The protocol source and trigger pointer both live in `AGENTS.md`
+  §"Mandatory workflow". `GEMINI.md` is a deprecated stub.
+- Date format is `YYYY-MM-DD` to match existing entries.
diff --git a/AGENTS.md b/AGENTS.md
index 05b234a01..6cc4bf2f8 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,3 +1,53 @@
-Always check @./GEMINI.md for the full instruction list.
+# AGENTS.md
+
+Python SDK for the [Agent2Agent (A2A) Protocol](https://a2a-protocol.org/latest/specification/)
+(`a2a` module, `a2a-sdk` distribution).  It handles complex messaging, task management, 
+and communication across different transports (REST, gRPC, JSON-RPC).
+
+## Technology Stack & Architecture
+
+- **Language**: Python 3.10+
+- **Package Manager**: `uv`
+- **Lead Transports**: Starlette (REST/JSON-RPC), gRPC
+- **Data Layer**: SQLAlchemy (SQL), Pydantic (Logic/Legacy), Protobuf (Modern Messaging)
+- **Key Directories**:
+    - `/src`: Core implementation logic.
+    - `/tests`: Comprehensive test suite.
+    - `/docs`: AI guides and migration documentation.
+
+## Mandatory workflow
+
+You MUST do all of the following:
+
+1. **At the start of every task that touches files**, read
+   `docs/ai/coding_conventions.md`, `docs/ai/mandatory_checks.md`,
+   and `docs/ai/evidence_rules.md`.
+2. **Before declaring any task done**, run the full check sequence
+   in `docs/ai/mandatory_checks.md` — including for
+   markdown/comment/whitespace-only changes.
+3. **On any mistake**, load the `mistake-reflection` skill at
+   `.agents/skills/mistake-reflection/SKILL.md` **before** continuing
+   your response. The skill appends a structured entry to
+   `docs/ai/ai_learnings.md` (gitignored local journal) so the user
+   can use those findings to improve the workflow.
+
+   When unsure: load the skill. False positives are free; false
+   negatives are how the same mistake recurs.
+
+## Layout footguns
+
+- `src/a2a/types/` and `src/a2a/compat/v0_3/*_pb2*` — generated
+  protobuf, **do not hand-edit**. Excluded from `ty`, `ruff`, coverage.
+  Regenerate via `scripts/gen_proto.sh` /
+  `scripts/gen_proto_compat.sh`.
+- `tck/`, `itk/`, `tests/` — subprojects with their own
+  `pyproject.toml`; not part of the main test run.
+- `samples/` is minimal; real samples live in `a2aproject/a2a-samples`.
+
+## Optional extras
+
+`pyproject.toml` defines extras (`grpc`, `telemetry`, `postgresql`,
+etc.). The dev group installs `a2a-sdk[all]`, so anything gated behind
+an extra must still **import lazily** at runtime — the install-smoke
+harness verifies this per profile.
 
-This file exists for compatibility with tools that look for AGENTS.md.
diff --git a/GEMINI.md b/GEMINI.md
index e6bf43b65..6393f21d2 100644
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -1,48 +1,7 @@
-# Agent Command Center
+# GEMINI.md
 
-## 1. Project Overview & Purpose
-**Primary Goal**: This is the Python SDK for the Agent2Agent (A2A) Protocol. It allows developers to build and run agentic applications as A2A-compliant servers. It handles complex messaging, task management, and communication across different transports (REST, gRPC, JSON-RPC).
-**Specification**: [A2A-Protocol](https://a2a-protocol.org/latest/specification/)
+> This file exists because Gemini auto-loads it.
+> The source of truth for agent guidance is
+> [`AGENTS.md`](./AGENTS.md). Please read that file.
 
-## 2. Technology Stack & Architecture
-
-- **Language**: Python 3.10+
-- **Package Manager**: `uv`
-- **Lead Transports**: Starlette (REST/JSON-RPC), gRPC
-- **Data Layer**: SQLAlchemy (SQL), Pydantic (Logic/Legacy), Protobuf (Modern Messaging)
-- **Key Directories**:
-    - `/src`: Core implementation logic.
-    - `/tests`: Comprehensive test suite.
-    - `/docs`: AI guides.
-
-## 3. Style Guidelines & Mandatory Checks
-- **Style Guidelines**: Follow the rules in @./docs/ai/coding_conventions.md for every response involving code.
-- **Mandatory Checks**: Run the commands in @./docs/ai/mandatory_checks.md after making any changes to the code and before committing.
-
-## 4. Mandatory AI Workflow for Coding Tasks
-1. **Required Reading**: You MUST read the contents of @./docs/ai/coding_conventions.md and @./docs/ai/mandatory_checks.md at the very beginning of EVERY coding task.
-2. **Initial Checklist**: Every `task.md` you create MUST include a section for **Mandatory Checks** from @./docs/ai/mandatory_checks.md.
-3. **Verification Requirement**: You MUST run all mandatory checks before declaring any task finished.
-
-## 5. Mistake Reflection Protocol
-
-> [!NOTE] for Users:
-> `docs/ai/ai_learnings.md` is a local-only file (excluded from git) meant to be
-> read by the developer to improve AI assistant behavior on this project. Use its
-> findings to improve the GEMINI.md setup.
-
-When you realise you have made a mistake — whether caught by the user,
-by a tool, or by your own reasoning — you MUST:
-
-1. **Acknowledge the mistake explicitly** and explain what went wrong.
-2. **Reflect on the root cause**: was it a missing check, a false assumption, skipped verification, or a gap in the workflow?
-3. **Immediately append a new entry to `docs/ai/ai_learnings.md`** — this is not optional and does not require user confirmation. Do it before continuing, then update the user about the workflow change.
-
-   **Entry format:**
-   - **Mistake**: What went wrong.
-   - **Root cause**: Why it happened.
-   - **Rule**: The concrete rule added to prevent recurrence.
-
-The goal is to treat every mistake as a signal that the workflow is
-incomplete, and to improve it in place so the same mistake cannot
-happen again.
+See [`AGENTS.md`](./AGENTS.md).
diff --git a/docs/ai/evidence_rules.md b/docs/ai/evidence_rules.md
new file mode 100644
index 000000000..5da67b13a
--- /dev/null
+++ b/docs/ai/evidence_rules.md
@@ -0,0 +1,24 @@
+# Evidence Rules
+
+Rules for what counts as adequate evidence when making claims about
+this codebase. These are graduated learnings — promoted from
+`docs/ai/ai_learnings.md` (local journal) once a rule has earned a
+permanent home.
+
+When in doubt, the bar is: **a future agent reading your response
+should be able to verify the claim from the citations alone, without
+re-doing your investigation.**
+
+## Claims about runtime behavior
+
+Back any claim about how code behaves at runtime with a `file:line`
+reference from a tool call in the same response, or with a runnable
+demonstration.
+
+The citation must support the specific claim. The *existence* of code
+is not evidence of its *behavior*: a function being defined doesn't
+mean it's called; an exception being raised doesn't mean it
+propagates; a parameter being declared doesn't mean it's honored; a
+config option existing doesn't mean it takes effect. Behavior claims
+require control-flow evidence (call chain, test output, log) — not
+just a definition site.
diff --git a/docs/ai/mandatory_checks.md b/docs/ai/mandatory_checks.md
index c950cc8bf..5595d4011 100644
--- a/docs/ai/mandatory_checks.md
+++ b/docs/ai/mandatory_checks.md
@@ -1,25 +1,14 @@
-### Test and Fix Commands
+# Mandatory Checks
 
-Exact shell commands required to test the project and fix formatting issues.
+Run in this order before declaring any task done — including for
+markdown/comment/whitespace-only changes:
 
-1. **Formatting & Linting**:
-   ```bash
-   uv run ruff check --fix
-   uv run ruff format
-   ```
+```bash
+./scripts/lint.sh        # ruff check --fix, ruff format, ty check
+uv run pytest
 
-2. **Type Checking**:
-   ```bash
-   uv run ty check
-   ```
+# Only before commit, when src/ changed:
+uv run pytest --cov=src --cov-report=term-missing
+```
 
-3. **Testing**:
-   ```bash
-   uv run pytest
-   ```
-
-4. **Coverage**:
-Only run this command after adding new source code and before committing.
-   ```bash
-   uv run pytest --cov=src --cov-report=term-missing
-   ```
+CI enforces `--cov-fail-under=88` on the `a2a` package.

From 1affeba10c2202fb2a299dfe597c829f0b6c92b5 Mon Sep 17 00:00:00 2001
From: sokoliva <sokolaj@google.com>
Date: Fri, 8 May 2026 11:35:59 +0000
Subject: [PATCH 2/4] docs: remove Layout footguns section and trailing whitespace in AGENTS.md

---
 AGENTS.md | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 6cc4bf2f8..a4fbea5b2 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,7 +1,7 @@
 # AGENTS.md
 
 Python SDK for the [Agent2Agent (A2A) Protocol](https://a2a-protocol.org/latest/specification/)
-(`a2a` module, `a2a-sdk` distribution).  It handles complex messaging, task management, 
+(`a2a` module, `a2a-sdk` distribution).  It handles complex messaging, task management,
 and communication across different transports (REST, gRPC, JSON-RPC).
 
 ## Technology Stack & Architecture
@@ -34,15 +34,6 @@ You MUST do all of the following:
    When unsure: load the skill. False positives are free; false
    negatives are how the same mistake recurs.
 
-## Layout footguns
-
-- `src/a2a/types/` and `src/a2a/compat/v0_3/*_pb2*` — generated
-  protobuf, **do not hand-edit**. Excluded from `ty`, `ruff`, coverage.
-  Regenerate via `scripts/gen_proto.sh` /
-  `scripts/gen_proto_compat.sh`.
-- `tck/`, `itk/`, `tests/` — subprojects with their own
-  `pyproject.toml`; not part of the main test run.
-- `samples/` is minimal; real samples live in `a2aproject/a2a-samples`.
 
 ## Optional extras
 

From 0c8f51798830119d9ee219f825a4bdb1ff6e5399 Mon Sep 17 00:00:00 2001
From: sokoliva <sokolaj@google.com>
Date: Fri, 8 May 2026 11:37:28 +0000
Subject: [PATCH 3/4] docs: drop .gitignore line reference from mistake-reflection skill

---
 .agents/skills/mistake-reflection/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.agents/skills/mistake-reflection/SKILL.md b/.agents/skills/mistake-reflection/SKILL.md
index d73be0c1e..2167c6fc7 100644
--- a/.agents/skills/mistake-reflection/SKILL.md
+++ b/.agents/skills/mistake-reflection/SKILL.md
@@ -100,7 +100,7 @@ becoming a write-only graveyard.
 
 ## Repo-specific notes
 
-- `docs/ai/ai_learnings.md` is **gitignored** (`.gitignore:15`).
+- `docs/ai/ai_learnings.md` is **gitignored**.
   Entries are local to the developer's checkout and will not be seen
   by other agents or in CI. The file is for the human developer to
   improve `AGENTS.md` / skills based on patterns.

From fdc629ed01879b7a8657eca5e15ea037939dbf62 Mon Sep 17 00:00:00 2001
From: sokoliva <sokolaj@google.com>
Date: Fri, 8 May 2026 11:38:33 +0000
Subject: [PATCH 4/4] docs: add evidence_rules.md to skipped-step trigger in mistake-reflection

---
 .agents/skills/mistake-reflection/SKILL.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/.agents/skills/mistake-reflection/SKILL.md b/.agents/skills/mistake-reflection/SKILL.md
index 2167c6fc7..dfb176989 100644
--- a/.agents/skills/mistake-reflection/SKILL.md
+++ b/.agents/skills/mistake-reflection/SKILL.md
@@ -19,8 +19,9 @@ Trigger on ANY of these, without waiting for the user to ask:
 - You re-read a file or doc and realize a prior statement was wrong
   or unverified.
 - You realize mid-task that you skipped a required step
-  (e.g. didn't read `docs/ai/coding_conventions.md` /
-  `docs/ai/mandatory_checks.md` at task start).
+  (e.g. didn't read `docs/ai/coding_conventions.md`,
+  `docs/ai/mandatory_checks.md`, or `docs/ai/evidence_rules.md` at
+  task start).
 - You stated an inference as a fact without a `file:line` citation
   and later had to walk it back.