17 changes: 17 additions & 0 deletions docs/academic_research.md
@@ -417,6 +417,21 @@ This document explains the cognitive and social-science mechanisms that justify

---

### 28. Active Listening — Paraphrase-Clarify-Summarize

| | |
|---|---|
| **Source** | Rogers, C. R., & Farson, R. E. (1957). *Active Listening*. Industrial Relations Center, University of Chicago. |
| **Date** | 1957 |
| **URL** | — |
| **Alternative** | McNaughton, D. et al. (2008). Learning to Listen. *Topics in Early Childhood Special Education*, 27(4), 223–231. (LAFF strategy: Listen, Ask, Focus, Find) |
| **Status** | Confirmed — foundational clinical research; widely replicated across professional and educational contexts |
| **Core finding** | Active listening — paraphrasing what was heard in the listener's own words, asking clarifying questions, then summarizing the main points and intent — reduces misunderstanding, builds trust, and confirms mutual understanding before proceeding. The three-step responding sequence (Paraphrase → Clarify → Summarize) is the operationalizable form of the broader active listening construct. |
| **Mechanism** | Paraphrasing forces the listener to reconstruct the speaker's meaning in their own language, surfacing gaps immediately. Clarifying questions address residual ambiguity. Summarizing creates a shared record that both parties can confirm or correct. Together they eliminate the assumption that "I heard" equals "I understood." Without this protocol, agents (human or AI) proceed on partial or misread requirements, producing work that is technically complete but semantically wrong. |
| **Where used** | PO summarization protocol in `scope/SKILL.md`: after each interview round, the PO must produce a "Here is what I understood" block (paraphrase → clarify → summarize) before moving to Phase 3 (Stories) or Phase 4 (Criteria). The stakeholder confirms or corrects before the PO proceeds. |

---

## Bibliography

1. Cialdini, R. B. (2001). *Influence: The Psychology of Persuasion* (rev. ed.). HarperBusiness.
@@ -450,3 +465,5 @@ This document explains the cognitive and social-science mechanisms that justify
29. Liu, N. F. et al. (2023). Lost in the Middle: How Language Models Use Long Contexts. *Transactions of the Association for Computational Linguistics*. arXiv:2307.03172. https://arxiv.org/abs/2307.03172
30. McKinnon, R. (2025). arXiv:2511.05850. https://arxiv.org/abs/2511.05850
31. Sharma, A., & Henley, A. (2026). Modular Prompt Optimization. arXiv:2601.04055. https://arxiv.org/abs/2601.04055
32. Rogers, C. R., & Farson, R. E. (1957). *Active Listening*. Industrial Relations Center, University of Chicago.
33. McNaughton, D., Hamlin, D., McCarthy, J., Head-Reeves, D., & Schreiner, M. (2008). Learning to Listen: Teaching an Active Listening Strategy to Preservice Education Professionals. *Topics in Early Childhood Special Education*, 27(4), 223–231.
258 changes: 258 additions & 0 deletions feedback.md
@@ -0,0 +1,258 @@
# Workflow Improvement Feedback

Collected and clarified: 2026-04-17

---

## 1. PO Summarization Before Proceeding

**Problem:** The PO moves on too quickly without demonstrating understanding of what the stakeholder said.

**Research basis:** Active listening (Rogers & Farson, 1957) — the listener paraphrases what they heard in their own words, asks clarifying questions, then offers a concise summary of main points and intent before proceeding. This reduces misunderstanding, builds trust, and confirms the PO captured the right requirements.

**Proposed fix:** After each interview round, the PO must produce a "Here is what I understood" block before moving to stories or criteria:
1. Paraphrase the stakeholder's intent in the PO's own words
2. Identify any remaining ambiguities and ask targeted follow-up questions
3. Summarize the main points and confirm with the stakeholder before freezing discovery

This applies at Phase 1 (project discovery), Phase 2 (feature discovery), and before baseline.
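
A possible shape for the block (illustrative only; the exact wording belongs to the `scope/SKILL.md` protocol):

```
## Here is what I understood

**Paraphrase:** In my own words, you want <restated intent>.

**Clarifying questions:**
1. <remaining ambiguity #1>
2. <remaining ambiguity #2>

**Summary:** The main points are <A>, <B>, <C>. Please confirm or correct
before I freeze discovery.
```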

---

## 2. Developer Avoids OO and Design Patterns — Code Smell Detection

**Problem:** The developer defaults to plain functions to avoid classes, SOLID, and Object Calisthenics. It does not inspect the code for smells that signal when a simple function should be refactored into a class or design pattern.

**Root cause:** The rules list principles but do not teach the developer to recognize when complexity warrants a structural upgrade. The developer lacks a smell-triggered refactoring instinct.

**Proposed fix:** Add a code smell detection step to the REFACTOR phase. When a solution grows complex (e.g. a function accumulates state, multiple functions share data, behavior varies by type), the developer must ask: "Does this smell indicate I should refactor to a class or design pattern?" The answer drives the refactor, not just the line count or nesting rules.

See also: Item 5 (self-check examples) and Item 6 (ObjCal rule clarity).
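
As an illustrative sketch (hypothetical names), this is the kind of trigger meant here — several functions passing the same state around is a data-clump smell that warrants a class:

```
# Smell: two functions share the same (balance, history) data clump.
def deposit(balance, history, amount):
    history.append(("deposit", amount))
    return balance + amount

def withdraw(balance, history, amount):
    history.append(("withdraw", amount))
    return balance - amount

# Refactor: the shared data and the behavior on it become one class.
class Account:
    def __init__(self, balance=0):
        self.balance = balance
        self.history = []

    def deposit(self, amount):
        self.history.append(("deposit", amount))
        self.balance += amount

    def withdraw(self, amount):
        self.history.append(("withdraw", amount))
        self.balance -= amount
```

Neither version violates a line-count or nesting rule; only the smell check catches the upgrade opportunity.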

---

## 3. Design Principle Priority Order Misleads

**Problem:** `YAGNI > KISS > DRY > SOLID > ObjCal > design patterns` implies that design patterns are a last resort and rarely needed. This is incorrect. Good design patterns are better than complex code.

**Python Zen:** The Zen of Python is missing from the priority order. Specifically: "Complex is better than complicated." This matters because:
- Good design patterns > complex code (patterns reduce complexity)
- Complex code > complicated code (complicated is hard to reason about)
- Complicated code > failing code (at least it runs)
- Failing code > no code (at least it exists)

**Proposed fix:** Replace the flat priority order with a quality hierarchy that reflects this:

```
1. No code (nothing implemented) ← worst
2. Failing code (broken)
3. Complicated code (hard to reason about)
4. Complex code (many parts, but clear)
5. Code following YAGNI/KISS/DRY/SOLID/OC
6. Code using appropriate design patterns ← best
```

Add the Python Zen reference: "Simple is better than complex. Complex is better than complicated." The goal is to reach level 6, not to stop at level 5 because "YAGNI says don't add patterns."
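
To make the "patterns reduce complexity" claim concrete, a minimal sketch (hypothetical names): a growing if/elif chain is complicated, while a Strategy-style registry is complex but clear:

```
import json

# Complicated: every new format means editing this chain,
# and the chain grows one branch per format.
def export_chain(data, fmt):
    if fmt == "json":
        return json.dumps(data)
    elif fmt == "csv":
        return ",".join(str(v) for v in data.values())
    else:
        raise ValueError(fmt)

# Complex but clear: a Strategy-style registry. More parts, but each part
# is trivial, and a new format is one entry, not a new branch.
EXPORTERS = {
    "json": json.dumps,
    "csv": lambda data: ",".join(str(v) for v in data.values()),
}

def export_registry(data, fmt):
    try:
        return EXPORTERS[fmt](data)
    except KeyError:
        raise ValueError(fmt) from None
```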

---

## 4. Architecture Approval by PO Is Hollow

**Problem:** The PO approves architecture at Step 2 but has no knowledge of ObjCal, SOLID, or whether entities are properly modeled. The PO always approves, making the gate meaningless.

**Additional problem:** The developer starts architecture for the in-progress feature without reading the full backlog. This leads to solutions that work for the current feature but require extensive rework when the next feature arrives, because the architecture did not account for the big picture.

**Proposed fix:**
- Remove PO architecture approval
- Replace it with a self-declared developer mental check covering:
1. Read all backlog and in-progress feature files (discovery + entities sections at minimum)
2. Identify entities, interactions, and constraints across all planned features
3. Declare: "I have considered the full feature set. This architecture is the best design for the known requirements."
- The developer must justify the architecture against the full feature set, not just the current feature

---

## 5. Self-Check Table Lost Generalized Examples

**Problem:** The self-check table contains examples like `_x`, but the AI treats `_x` as a literal match rather than understanding it represents any short, meaningless variable name (e.g. `_val`, `_tmp`, `_data`). The rule lacks generalization guidance.

**Proposed fix:** For each ObjCal rule (and SOLID rule), add:
- A plain-English explanation of what the rule means
- A "bad" code example showing a violation
- A "good" code example showing compliance
- A generalization note: e.g. "This applies to any single-letter or abbreviation variable name, not just `_x`"

This makes the rules learnable and independently verifiable by both the developer and the reviewer.
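
For example, the naming rule's entry might pair a violation with a compliant version (illustrative sketch; the real rule text lives in the self-check table):

```
# Bad: the names say nothing about role. This applies to _x, _val, _tmp,
# _data, or any other short, meaningless identifier — not the literal "_x".
def f(_x):
    return _x * 86400

# Good: both the function and its parameter name the domain concept.
def days_to_seconds(days):
    return days * 86400
```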

---

## 6. Self-Declaration Accountability Format

**Problem:** The current self-declaration checklist is passive. The developer ticks boxes without being accountable for each claim.

**Proposed format:**

```
As a [agent-alias] I declare [item] follows [rule] — YES | NO
```

**If NO:**
- The developer generates a self-correction plan for the failed item
- The developer restarts the Red-Green-Refactor cycle from the affected tests
- Affected tests are marked as rework in TODO.md (format: open to proposal — consider `[R]` prefix or a `## Rework` section, respecting the 15-line limit)
- The cycle does not proceed to the reviewer until all declarations are YES

**If all YES:** proceed to reviewer as normal.
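
Since Item 10 lists "self-declaration formatting validation" as a script candidate, a minimal sketch of such a check (hypothetical function name; the regex mirrors the format above):

```
import re

# Matches e.g.: As a [dev-1] I declare [test_login] follows [ObjCal-1] — YES
DECLARATION = re.compile(
    r"^As a \[(?P<agent>[^\]]+)\] I declare \[(?P<item>[^\]]+)\] "
    r"follows \[(?P<rule>[^\]]+)\] — (?P<verdict>YES|NO)$"
)

def parse_declaration(line):
    """Return (agent, item, rule, verdict), or None if the line is malformed."""
    m = DECLARATION.match(line.strip())
    return m.groups() if m else None
```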

---

## 7. Reviewer Must Independently Verify — No Blind Acceptance

**Problem:** The reviewer accepts self-declared YES claims without independently verifying them. Worse, when the reviewer does not understand a rule (e.g. "types are first class" in ObjCal), it skips the check or accepts the developer's claim unchallenged.

**Two-part fix:**

1. **ObjCal (and SOLID) rules must have plain-English explanations + code examples** (see Item 5). The reviewer should never accept a claim it cannot independently verify.

2. **Reviewer scope:** The reviewer only audits YES declarations. Self-declared NO items are already known failures — the reviewer does not need to re-report them. The reviewer's job is to try to break what the developer claims is correct.

---

## 8. PO Not Refining Enough Before Proceeding

**Problem:** The PO moves through discovery phases without pushing back, asking follow-up questions, or confirming understanding. Stories and criteria are written on incomplete understanding.

**Proposed fix:** Enforce the active listening summarization protocol (Item 1) at every phase transition. The PO must not move to Phase 3 (Stories) until the stakeholder has confirmed the PO's paraphrase is accurate. The PO must not move to Phase 4 (Criteria) until each Rule has been validated against the stakeholder's intent.

---

## 9. Workflow Verbosity — Test Output and Fail-Fast

**Problem:** The workflow has unnecessary checks, fast test output is too verbose, and there is no fail-fast limit.

**Proposed fixes:**
- Fast test path (`test-fast`) should suppress passing test output — show only failures. Follow pytest best practice: use `-q` (quiet) flag or equivalent for the fast path.
- Add a fail-fast threshold configurable in `pyproject.toml` (e.g. `--maxfail=N`). Suggested default: 5.
- Remove redundant checks that are already covered by tooling (see Item 13).
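
Assuming pytest reads its options from `pyproject.toml` (as it does via `[tool.pytest.ini_options]`), the fast path could be configured roughly as:

```
[tool.pytest.ini_options]
# Quiet output (only failures are detailed) and a fail-fast ceiling.
# Suggested default for the fast path; the full path can override with -v.
addopts = "-q --maxfail=5"
```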

---

## 10. Offload Templated Checks to Scripts

**Problem:** Repetitive checks (e.g. verifying `@id` uniqueness, orphan detection) are performed manually by agents, consuming context and introducing error.

**Proposed fix:** Identify all templated checks currently done by agents and create scripts for them. Agents invoke the script and act on the result. Candidates include:
- `@id` uniqueness check (already partially done by `gen-tests`)
- Orphan test detection (`gen-tests --orphans`)
- Self-declaration formatting validation
- Session state consistency check

---

## 11. docs/workflow.md Is Out of Date

**Problem:** `docs/workflow.md` does not reflect the current workflow. Specifically:
- It references a separate `discovery.md` file; discovery is now merged into `.feature` files
- The feature folder structure and conventions have changed
- The self-declaration section references a 21-item checklist that may no longer match

**Proposed fix:** Update `docs/workflow.md` to reflect the current state of the workflow, including:
- Discovery merged into `.feature` file description section
- Current folder structure (`backlog/`, `in-progress/`, `completed/`)
- Current self-declaration format (post Item 6 changes)
- Removal of references to `discovery.md` as a separate file

---

## 12. Squash on Merge for Feature Branches

**Problem:** Feature branches produce many small commits (one per test). Merging into `main` with a standard merge commit preserves all of them, making history noisy.

**Proposed fix:** Merges of feature branches into `main` should be squashed. Document this in the `pr-management` skill as a required step: one squash commit per feature, with a summary message covering all tests implemented.

---

## 13. Code Smell in Self-Declaration

**Problem:** The self-declaration checklist does not include code smell detection. A developer can declare all SOLID/ObjCal rules as YES while the code is full of smells.

**Proposed fix:** Add a code smell section to the self-declaration, covering both categories:

**Standard smells (language-agnostic):**
- Long method
- Feature envy (method uses another class's data more than its own)
- Data clumps (same group of variables appear together repeatedly)
- Primitive obsession (using primitives instead of small objects)
- Shotgun surgery (one change requires many small changes across many classes)
- Divergent change (one class changed for many different reasons)
- Middle man (class delegates most of its work)

**Python-specific smells:**
- Bare `except:` clauses
- Mutable default arguments
- LBYL (Look Before You Leap) where EAFP (Easier to Ask Forgiveness than Permission) is idiomatic
- Using `type()` instead of `isinstance()`
- Overuse of `*args` / `**kwargs` hiding interface contracts

---

## 14. Session Continuity — Pick Up Where Left Off

**Problem:** When a session ends and a new one begins, the agent cannot reliably determine the current step, cycle phase, and next action. TODO.md provides some context but lacks precision for mid-cycle resumption.

**Proposed fix:** Open to proposal. Consider extending TODO.md with a `## Cycle State` section that captures:
- Current step (1-6)
- Current `@id` under work
- Current phase (RED / GREEN / REFACTOR / SELF-DECLARE / REVIEWER / COMMITTED)
- Last completed action
- Exact next action

The session-workflow skill should enforce reading and updating this section at session start and end. The goal: any agent, in any session, can read TODO.md and know exactly what to do next without re-reading the entire feature file.
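
One possible shape for the section (field values are hypothetical; the format is open to proposal):

```
## Cycle State
- Step: 4 (Implementation)
- Current @id: F2-T3
- Phase: REFACTOR
- Last action: GREEN passed for F2-T3
- Next action: run smell check, then self-declare F2-T3
```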

---

## 15. ID Checks Are Redundant for Agents

**Problem:** Agents manually verify `@id` uniqueness and coverage. This is already done by `gen-tests`. Duplicating this check wastes context and distracts agents from implementation cycles.

**Proposed fix:** Remove manual `@id` verification from all agent checklists. Rely on `gen-tests` for ID validation. Agents should only run `gen-tests` and act on its output.

---

## 16. Session Memory and State Tracking

**Problem:** Agents lose session state between sessions. TODO.md is a 15-line bookmark but may not capture enough metadata to track complex multi-session features reliably.

**Proposed fix:** Open to proposal. Evaluate whether TODO.md extensions (Item 14) are sufficient, or whether a separate lightweight state file (e.g. `CYCLE.md` or `.opencode/state.json`) is needed. The artifact should be:
- Machine-readable by agents
- Human-readable for debugging
- Minimal — only what is needed to resume a session

---

## 17. AGENTS.md Is Generally Outdated

**Problem:** `AGENTS.md` does not fully reflect the current workflow, conventions, and tooling.

**Proposed fix:** After all other items are resolved, perform a full pass on `AGENTS.md` to align it with:
- Current feature file structure (discovery merged into `.feature`)
- Updated self-declaration format
- Updated principle priority order
- Removal of hollow PO architecture approval
- Any new scripts or tools added

---

## 18. Developer Does Not Read Full Backlog Before Architecture

**Problem:** The developer starts implementing the in-progress feature without reading the full backlog. This produces a working solution that requires extensive rework when the next feature arrives, because the architecture did not account for future requirements.

**Concrete example:** A feature was implemented correctly in isolation, but the next feature required significant structural changes because the original architecture assumed a design that did not compose well.

**Proposed fix:** At Step 2 (Architecture), the developer must:
1. Read the discovery and entities sections of every feature in `backlog/` and `in-progress/`
2. Identify cross-feature entities, shared interfaces, and likely extension points
3. Design the architecture to accommodate the known future, not just the current feature
4. Self-declare: "I have read all backlog features and this architecture accounts for the full known feature set"

This is distinct from Item 4 (hollow PO approval) — the fix here is about the developer's reading obligation before making architectural decisions.