From 6baa52ad5ba564d2ad4c125a6becf06d7dca75b3 Mon Sep 17 00:00:00 2001
From: sokoliva
Date: Fri, 8 May 2026 11:23:10 +0000
Subject: [PATCH 1/4] docs: consolidate agent guidance in AGENTS.md and improve mistake-reflection workflow

---
 .agents/skills/mistake-reflection/SKILL.md | 109 +++++++++++++++++++++
 AGENTS.md                                  |  54 +++++++++-
 GEMINI.md                                  |  51 +---------
 docs/ai/evidence_rules.md                  |  24 +++++
 docs/ai/mandatory_checks.md                |  31 ++----
 5 files changed, 200 insertions(+), 69 deletions(-)
 create mode 100644 .agents/skills/mistake-reflection/SKILL.md
 create mode 100644 docs/ai/evidence_rules.md

diff --git a/.agents/skills/mistake-reflection/SKILL.md b/.agents/skills/mistake-reflection/SKILL.md
new file mode 100644
index 000000000..d73be0c1e
--- /dev/null
+++ b/.agents/skills/mistake-reflection/SKILL.md
@@ -0,0 +1,109 @@
+---
+name: mistake-reflection
+description: Use when you discover you made a mistake — caught by the user, by a tool result, by your own re-reading, or by a failed check. Appends a structured entry to docs/ai/ai_learnings.md and re-reads recent entries to avoid repeats.
+---
+
+# Mistake Reflection
+
+Implements the mistake-handling step of `AGENTS.md` §"Mandatory
+workflow".
+
+## When to load this skill
+
+Trigger on ANY of these, without waiting for the user to ask:
+
+- The user corrects a factual claim, code change, or assumption.
+- A tool result contradicts something you just stated or did
+  (lint failure, test failure, type-check failure, file not found,
+  command exit non-zero on something you said would succeed).
+- You re-read a file or doc and realize a prior statement was wrong
+  or unverified.
+- You realize mid-task that you skipped a required step
+  (e.g. didn't read `docs/ai/coding_conventions.md` /
+  `docs/ai/mandatory_checks.md` at task start).
+- You stated an inference as a fact without a `file:line` citation
+  and later had to walk it back.
+
+If unsure whether something counts: it counts. False positives are
+cheap; false negatives are how the same mistake recurs.
+
+## Procedure
+
+Do these in order. Do NOT defer to the end of the task.
+
+1. **Acknowledge the mistake to the user explicitly** in the current
+   response. One or two sentences. No hedging, no minimization.
+2. **Read recent entries** in `docs/ai/ai_learnings.md` (at minimum
+   the last 5 entries, or the whole file if shorter). If the current
+   mistake is a recurrence of an existing rule, say so explicitly and
+   reference the prior entry's date — do not silently duplicate.
+3. **Append a new entry** to `docs/ai/ai_learnings.md` using the
+   template below. Append; do not rewrite existing entries.
+4. **Continue the original task** only after steps 1–3 are done.
+
+## Entry template
+
+Copy this verbatim, fill in each field, append to the end of the file
+(after the existing `---` separator):
+
+```markdown
+## YYYY-MM-DD —
+
+- **Mistake**: What went wrong. Be concrete. Quote the wrong claim or
+  describe the wrong action. Include `file:line` references where
+  applicable.
+- **Trigger**: How the mistake surfaced (user correction, tool output,
+  self-review). Include the specific signal if it was a tool result.
+- **Root cause**: Why it happened. Distinguish between (a) missing
+  knowledge, (b) skipped verification step, (c) false assumption from
+  pattern-matching, (d) workflow gap. Avoid generic "I didn't think
+  carefully" — name the specific failure mode.
+- **Recurrence of**: If this matches an existing rule, link to the
+  prior entry's date. Otherwise write "new".
+- **Rule**: A concrete, checkable rule that would have prevented this.
+  Phrase as an imperative ("Before X, do Y"). If the rule already
+  exists and was violated, the rule should be about *enforcement*
+  (e.g. a check to add to a skill, a step to add to AGENTS.md), not a
+  restatement of the existing rule.
+```
+
+## Anti-patterns to avoid
+
+- **Don't restate the same lesson with new wording.** If
+  you'd write essentially the same rule again, the real fix is to
+  make the rule self-enforcing (update a skill or `AGENTS.md`), not
+  to add a third entry.
+- **Don't let rules go stale.** When you read prior entries, flag stale
+  tooling references and either update them or note the staleness in your
+  new entry.
+- **Don't write rules that depend on you remembering to follow them.**
+  If a rule is "remember to do X at the start of every task", it will
+  be skipped. Prefer rules that bind to a tool, a skill trigger, or a
+  CI check.
+- **Don't bury the acknowledgement.** Tell the user up front in the
+  response that you got it wrong, before describing the fix.
+
+## Cleanup ritual
+
+Before appending, check the file's length:
+
+- **≥ 10 entries**: pause and propose to the user that one or more
+  entries be either (a) deleted (if obsolete or one-off), or (b)
+  promoted into the workflow somewhere it will actually be read. If
+  the candidate rule is about claims/citations/evidence specifically,
+  `docs/ai/evidence_rules.md` is a natural target — otherwise leave
+  the choice of destination to the user. Do this *before* adding the
+  new entry, so the file doesn't grow monotonically and stop being read.
+
+This ritual is the only mechanism preventing `ai_learnings.md` from
+becoming a write-only graveyard.
+
+## Repo-specific notes
+
+- `docs/ai/ai_learnings.md` is **gitignored** (`.gitignore:15`).
+  Entries are local to the developer's checkout and will not be seen
+  by other agents or in CI. The file is for the human developer to
+  improve `AGENTS.md` / skills based on patterns.
+- The protocol source and trigger pointer both live in `AGENTS.md`
+  §"Mandatory workflow". `GEMINI.md` is a deprecated stub.
+- Date format is `YYYY-MM-DD` to match existing entries.
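Note on the SKILL.md hunk above (commentary, not part of the patch): the append step and the cleanup-ritual threshold could be sketched roughly as below. This is a minimal sketch under stated assumptions. The skill describes a manual procedure and ships no script, so `count_entries`, `needs_cleanup`, and `render_entry` are hypothetical names, and the `title` argument is an assumption about what follows the date in the template heading.

```python
import re
from datetime import date

# Hypothetical helper names; SKILL.md describes a manual procedure, not a script.
ENTRY_HEADING = re.compile(r"^## \d{4}-\d{2}-\d{2}\b", re.MULTILINE)
CLEANUP_THRESHOLD = 10  # the ">= 10 entries" bound from the Cleanup ritual section
HEADING_DASH = "\u2014"  # separator that follows the date in the entry template


def count_entries(journal_text: str) -> int:
    """Count journal entries by their dated level-2 headings."""
    return len(ENTRY_HEADING.findall(journal_text))


def needs_cleanup(journal_text: str) -> bool:
    """True when the cleanup ritual should run before appending."""
    return count_entries(journal_text) >= CLEANUP_THRESHOLD


def render_entry(title, mistake, trigger, root_cause, rule,
                 recurrence_of="new", today=None):
    """Fill the five template fields in the order SKILL.md lists them."""
    # `title` is an assumed field; the template heading only shows the date.
    day = (today or date.today()).isoformat()
    return (
        f"## {day} {HEADING_DASH} {title}\n\n"
        f"- **Mistake**: {mistake}\n"
        f"- **Trigger**: {trigger}\n"
        f"- **Root cause**: {root_cause}\n"
        f"- **Recurrence of**: {recurrence_of}\n"
        f"- **Rule**: {rule}\n"
    )
```

A caller would read `docs/ai/ai_learnings.md`, check `needs_cleanup` first, and only then append the rendered entry, which mirrors the "before appending" ordering in the skill.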
diff --git a/AGENTS.md b/AGENTS.md
index 05b234a01..6cc4bf2f8 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,3 +1,53 @@
-Always check @./GEMINI.md for the full instruction list.
+# AGENTS.md
+
+Python SDK for the [Agent2Agent (A2A) Protocol](https://a2a-protocol.org/latest/specification/)
+(`a2a` module, `a2a-sdk` distribution). It handles complex messaging, task management,
+and communication across different transports (REST, gRPC, JSON-RPC).
+
+## Technology Stack & Architecture
+
+- **Language**: Python 3.10+
+- **Package Manager**: `uv`
+- **Lead Transports**: Starlette (REST/JSON-RPC), gRPC
+- **Data Layer**: SQLAlchemy (SQL), Pydantic (Logic/Legacy), Protobuf (Modern Messaging)
+- **Key Directories**:
+  - `/src`: Core implementation logic.
+  - `/tests`: Comprehensive test suite.
+  - `/docs`: AI guides and migration documentation.
+
+## Mandatory workflow
+
+You MUST do all of the following:
+
+1. **At the start of every task that touches files**, read
+   `docs/ai/coding_conventions.md`, `docs/ai/mandatory_checks.md`,
+   and `docs/ai/evidence_rules.md`.
+2. **Before declaring any task done**, run the full check sequence
+   in `docs/ai/mandatory_checks.md` — including for
+   markdown/comment/whitespace-only changes.
+3. **On any mistake**, load the `mistake-reflection` skill at
+   `.agents/skills/mistake-reflection/SKILL.md` **before** continuing
+   your response. The skill appends a structured entry to
+   `docs/ai/ai_learnings.md` (gitignored local journal) so the user
+   can use those findings to improve the workflow.
+
+   When unsure: load the skill. False positives are free; false
+   negatives are how the same mistake recurs.
+
+## Layout footguns
+
+- `src/a2a/types/` and `src/a2a/compat/v0_3/*_pb2*` — generated
+  protobuf, **do not hand-edit**. Excluded from `ty`, `ruff`, coverage.
+  Regenerate via `scripts/gen_proto.sh` /
+  `scripts/gen_proto_compat.sh`.
+- `tck/`, `itk/`, `tests/` — subprojects with their own
+  `pyproject.toml`; not part of the main test run.
+- `samples/` is minimal; real samples live in `a2aproject/a2a-samples`.
+
+## Optional extras
+
+`pyproject.toml` defines extras (`grpc`, `telemetry`, `postgresql`,
+etc.). The dev group installs `a2a-sdk[all]`, so anything gated behind
+an extra must still **import lazily** at runtime — the install-smoke
+harness verifies this per profile.

-This file exists for compatibility with tools that look for AGENTS.md.
diff --git a/GEMINI.md b/GEMINI.md
index e6bf43b65..6393f21d2 100644
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -1,48 +1,7 @@
-# Agent Command Center
+# GEMINI.md

-## 1. Project Overview & Purpose
-**Primary Goal**: This is the Python SDK for the Agent2Agent (A2A) Protocol. It allows developers to build and run agentic applications as A2A-compliant servers. It handles complex messaging, task management, and communication across different transports (REST, gRPC, JSON-RPC).
-**Specification**: [A2A-Protocol](https://a2a-protocol.org/latest/specification/)
+> This file exists for Gemini auto-loads.
+> The source of truth for agent guidance is
+> [`AGENTS.md`](./AGENTS.md). Please read that file.

-## 2. Technology Stack & Architecture
-
-- **Language**: Python 3.10+
-- **Package Manager**: `uv`
-- **Lead Transports**: Starlette (REST/JSON-RPC), gRPC
-- **Data Layer**: SQLAlchemy (SQL), Pydantic (Logic/Legacy), Protobuf (Modern Messaging)
-- **Key Directories**:
-  - `/src`: Core implementation logic.
-  - `/tests`: Comprehensive test suite.
-  - `/docs`: AI guides.
-
-## 3. Style Guidelines & Mandatory Checks
-- **Style Guidelines**: Follow the rules in @./docs/ai/coding_conventions.md for every response involving code.
-- **Mandatory Checks**: Run the commands in @./docs/ai/mandatory_checks.md after making any changes to the code and before committing.
-
-## 4. Mandatory AI Workflow for Coding Tasks
-1. **Required Reading**: You MUST read the contents of @./docs/ai/coding_conventions.md and @./docs/ai/mandatory_checks.md at the very beginning of EVERY coding task.
-2. **Initial Checklist**: Every `task.md` you create MUST include a section for **Mandatory Checks** from @./docs/ai/mandatory_checks.md.
-3. **Verification Requirement**: You MUST run all mandatory checks before declaring any task finished.
-
-## 5. Mistake Reflection Protocol
-
-> [!NOTE] for Users:
-> `docs/ai/ai_learnings.md` is a local-only file (excluded from git) meant to be
-> read by the developer to improve AI assistant behavior on this project. Use its
-> findings to improve the GEMINI.md setup.
-
-When you realise you have made a mistake — whether caught by the user,
-by a tool, or by your own reasoning — you MUST:
-
-1. **Acknowledge the mistake explicitly** and explain what went wrong.
-2. **Reflect on the root cause**: was it a missing check, a false assumption, skipped verification, or a gap in the workflow?
-3. **Immediately append a new entry to `docs/ai/ai_learnings.md`** — this is not optional and does not require user confirmation. Do it before continuing, then update the user about the workflow change.
-
-   **Entry format:**
-   - **Mistake**: What went wrong.
-   - **Root cause**: Why it happened.
-   - **Rule**: The concrete rule added to prevent recurrence.
-
-The goal is to treat every mistake as a signal that the workflow is
-incomplete, and to improve it in place so the same mistake cannot
-happen again.
+See [`AGENTS.md`](./AGENTS.md).
diff --git a/docs/ai/evidence_rules.md b/docs/ai/evidence_rules.md
new file mode 100644
index 000000000..5da67b13a
--- /dev/null
+++ b/docs/ai/evidence_rules.md
@@ -0,0 +1,24 @@
+# Evidence Rules
+
+Rules for what counts as adequate evidence when making claims about
+this codebase. These are graduated learnings — promoted from
+`docs/ai/ai_learnings.md` (local journal) once a rule has earned a
+permanent home.
+
+When in doubt, the bar is: **a future agent reading your response
+should be able to verify the claim from the citations alone, without
+re-doing your investigation.**
+
+## Claims about runtime behavior
+
+Back any claim about how code behaves at runtime with a `file:line`
+reference from a tool call in the same response, or with a runnable
+demonstration.
+
+The citation must support the specific claim. The *existence* of code
+is not evidence of its *behavior*: a function being defined doesn't
+mean it's called; an exception being raised doesn't mean it
+propagates; a parameter being declared doesn't mean it's honored; a
+config option existing doesn't mean it takes effect. Behavior claims
+require control-flow evidence (call chain, test output, log) — not
+just a definition site.
diff --git a/docs/ai/mandatory_checks.md b/docs/ai/mandatory_checks.md
index c950cc8bf..5595d4011 100644
--- a/docs/ai/mandatory_checks.md
+++ b/docs/ai/mandatory_checks.md
@@ -1,25 +1,14 @@
-### Test and Fix Commands
+# Mandatory Checks

-Exact shell commands required to test the project and fix formatting issues.
+Run in this order before declaring any task done — including for
+markdown/comment/whitespace-only changes:

-1. **Formatting & Linting**:
-   ```bash
-   uv run ruff check --fix
-   uv run ruff format
-   ```
+```bash
+./scripts/lint.sh  # ruff check --fix, ruff format, ty check
+uv run pytest

-2. **Type Checking**:
-   ```bash
-   uv run ty check
-   ```
+# Only before commit, when src/ changed:
+uv run pytest --cov=src --cov-report=term-missing
+```

-3. **Testing**:
-   ```bash
-   uv run pytest
-   ```
-
-4. **Coverage**:
-Only run this command after adding new source code and before committing.
-   ```bash
-   uv run pytest --cov=src --cov-report=term-missing
-   ```
+CI enforces `--cov-fail-under=88` on the `a2a` package.
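Note on the mandatory_checks.md hunk above (commentary, not part of the patch): the ordering rule it introduces could be sketched as a small runner like the one below. This is only an illustration under stated assumptions. The three commands are copied from the hunk, while `plan_checks`, `run_checks`, and the `src_changed` / `before_commit` flags are hypothetical names that do not exist in the repository.

```python
import subprocess

# Commands copied from the new docs/ai/mandatory_checks.md.
LINT = ["./scripts/lint.sh"]
TESTS = ["uv", "run", "pytest"]
COVERAGE = ["uv", "run", "pytest", "--cov=src", "--cov-report=term-missing"]


def plan_checks(src_changed: bool, before_commit: bool) -> list[list[str]]:
    """Return the check commands in the order the doc prescribes."""
    commands = [LINT, TESTS]
    # Coverage is only required before a commit, and only when src/ changed.
    if src_changed and before_commit:
        commands.append(COVERAGE)
    return commands


def run_checks(commands, run=subprocess.run) -> bool:
    """Run the commands in order and stop at the first non-zero exit code."""
    for command in commands:
        if run(command).returncode != 0:
            return False
    return True
```

Calling `run_checks(plan_checks(src_changed=True, before_commit=True))` from the repository root would run lint, tests, and coverage in sequence. The `run` parameter exists so the ordering can be exercised without launching real processes.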
From 1affeba10c2202fb2a299dfe597c829f0b6c92b5 Mon Sep 17 00:00:00 2001
From: sokoliva
Date: Fri, 8 May 2026 11:35:59 +0000
Subject: [PATCH 2/4] fixes

---
 AGENTS.md | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 6cc4bf2f8..a4fbea5b2 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,7 +1,7 @@
 # AGENTS.md

 Python SDK for the [Agent2Agent (A2A) Protocol](https://a2a-protocol.org/latest/specification/)
-(`a2a` module, `a2a-sdk` distribution). It handles complex messaging, task management,
+(`a2a` module, `a2a-sdk` distribution). It handles complex messaging, task management,
 and communication across different transports (REST, gRPC, JSON-RPC).

 ## Technology Stack & Architecture
@@ -34,15 +34,6 @@ You MUST do all of the following:
    When unsure: load the skill. False positives are free; false
    negatives are how the same mistake recurs.

-## Layout footguns
-
-- `src/a2a/types/` and `src/a2a/compat/v0_3/*_pb2*` — generated
-  protobuf, **do not hand-edit**. Excluded from `ty`, `ruff`, coverage.
-  Regenerate via `scripts/gen_proto.sh` /
-  `scripts/gen_proto_compat.sh`.
-- `tck/`, `itk/`, `tests/` — subprojects with their own
-  `pyproject.toml`; not part of the main test run.
-- `samples/` is minimal; real samples live in `a2aproject/a2a-samples`.

 ## Optional extras

From 0c8f51798830119d9ee219f825a4bdb1ff6e5399 Mon Sep 17 00:00:00 2001
From: sokoliva
Date: Fri, 8 May 2026 11:37:28 +0000
Subject: [PATCH 3/4] does anybody read these?

---
 .agents/skills/mistake-reflection/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.agents/skills/mistake-reflection/SKILL.md b/.agents/skills/mistake-reflection/SKILL.md
index d73be0c1e..2167c6fc7 100644
--- a/.agents/skills/mistake-reflection/SKILL.md
+++ b/.agents/skills/mistake-reflection/SKILL.md
@@ -100,7 +100,7 @@ becoming a write-only graveyard.

 ## Repo-specific notes

-- `docs/ai/ai_learnings.md` is **gitignored** (`.gitignore:15`).
+- `docs/ai/ai_learnings.md` is **gitignored**.
   Entries are local to the developer's checkout and will not be seen
   by other agents or in CI. The file is for the human developer to
   improve `AGENTS.md` / skills based on patterns.
From fdc629ed01879b7a8657eca5e15ea037939dbf62 Mon Sep 17 00:00:00 2001
From: sokoliva
Date: Fri, 8 May 2026 11:38:33 +0000
Subject: [PATCH 4/4] change

---
 .agents/skills/mistake-reflection/SKILL.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/.agents/skills/mistake-reflection/SKILL.md b/.agents/skills/mistake-reflection/SKILL.md
index 2167c6fc7..dfb176989 100644
--- a/.agents/skills/mistake-reflection/SKILL.md
+++ b/.agents/skills/mistake-reflection/SKILL.md
@@ -19,8 +19,9 @@ Trigger on ANY of these, without waiting for the user to ask:
 - You re-read a file or doc and realize a prior statement was wrong
   or unverified.
 - You realize mid-task that you skipped a required step
-  (e.g. didn't read `docs/ai/coding_conventions.md` /
-  `docs/ai/mandatory_checks.md` at task start).
+  (e.g. didn't read `docs/ai/coding_conventions.md`,
+  `docs/ai/mandatory_checks.md`, or `docs/ai/evidence_rules.md` at
+  task start).
 - You stated an inference as a fact without a `file:line` citation
   and later had to walk it back.