chore: plan 427, PR 2 of agent-first development plan#478
chore: plan 427, PR 2 of agent-first development plan#478
Conversation
Greptile SummaryThis PR populates 10 previously-stub architecture documentation files and adds GitHub issue/PR templates plus a skill update to support the agent-assisted contribution workflow (Phase 3 of plan #427). The documentation was written from source code analysis and is generally accurate — cross-references between docs are internally consistent and the descriptions of the plugin system, MCP subsystem, CLI layering, dataset builders, and agent introspection subsystem all match the source. Key changes:
|
| Filename | Overview |
|---|---|
| architecture/models.md | Comprehensive model subsystem doc; one factual inaccuracy in the Design Decisions section — the '429s excluded from transport retries' claim is only true for the async engine path, not the sync path. |
| architecture/dataset-builders.md | Accurate and detailed description of sequential and async dataset builder execution, DAG, CompletionTracker, and DatasetBatchManager, consistent with source code. |
| architecture/overview.md | Clear high-level architecture overview with accurate package layout, dependency direction, and end-to-end data flow description. |
| .github/PULL_REQUEST_TEMPLATE.md | New PR template with Summary, Related Issue, Changes, Testing checklist, and Checklist sections — well-structured and aligned with the updated SKILL.md. |
| .github/ISSUE_TEMPLATE/bug-report.yml | Adds optional Agent Diagnostic textarea and a checklist (two required, one optional) for structured bug reports. |
| .agents/skills/create-pr/SKILL.md | Updated to produce PR descriptions conforming to the new PR template structure (Summary, Related Issue, Changes, Testing, Checklist), with clearer section guidelines. |
Sequence Diagram
sequenceDiagram
participant G as ColumnGenerator
participant MF as ModelFacade
participant TM as ThrottledModelClient
participant RT as RetryTransport
participant API as LLM API
G->>MF: completion(messages)
MF->>TM: chat(messages)
TM->>TM: acquire_sync/async (AIMD)
TM->>RT: POST /chat/completions
alt Transient error (502/503/504)
RT-->>RT: retry (transport layer)
end
alt Rate limit 429 — async engine (strip_rate_limit_codes=True)
RT-->>TM: 429 surfaces immediately
TM->>TM: release_rate_limited (AIMD decrease)
TM->>TM: cooldown + retry
end
alt Rate limit 429 — sync engine (strip_rate_limit_codes=False)
RT-->>RT: 429 retried at transport layer
end
API-->>RT: 200 OK
RT-->>TM: response
TM->>TM: release_success (AIMD increase)
TM-->>MF: ChatCompletion
opt MCP tool calls in response
MF->>MF: MCPFacade.process_completion_response
MF->>API: tool results → next completion
end
MF-->>G: parsed result
Comments Outside Diff (1)
-
architecture/models.md, line 73 (link)Inaccurate claim: 429s are not excluded from transport retries on the sync path
The Design Decisions section says "429s excluded from transport retries" without qualification, but the actual behavior is engine-mode dependent:
- Async engine (
strip_rate_limit_codes=True): 429s are stripped from the retryable status set and surface to theThrottledModelClientAIMD feedback loop. ✓ - Sync engine (
strip_rate_limit_codes=False): 429s are retried at the transport layer transparently; there is no AIMD salvage path for the sync engine.
This split is confirmed in
packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py(sync path at line 107, async path at line 125) and documented in theRetryConfigdocstring inretry.py.The Key Components section above already gets this right by saying "excluded ... when
strip_rate_limit_codes=True". The Design Decisions entry should match: - Async engine (
Prompt To Fix All With AI
This is a comment left during a code review.
Path: architecture/models.md
Line: 73
Comment:
**Inaccurate claim: 429s are not excluded from transport retries on the sync path**
The Design Decisions section says "429s excluded from transport retries" without qualification, but the actual behavior is engine-mode dependent:
- **Async engine** (`strip_rate_limit_codes=True`): 429s are stripped from the retryable status set and surface to the `ThrottledModelClient` AIMD feedback loop. ✓
- **Sync engine** (`strip_rate_limit_codes=False`): 429s **are** retried at the transport layer transparently; there is no AIMD salvage path for the sync engine.
This split is confirmed in `packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py` (sync path at line 107, async path at line 125) and documented in the `RetryConfig` docstring in `retry.py`.
The Key Components section above already gets this right by saying "excluded ... when `strip_rate_limit_codes=True`". The Design Decisions entry should match:
```suggestion
- **429s excluded from transport retries in async mode** so rate-limit signals reach the throttle manager for AIMD backoff. In the sync engine, 429s are handled directly by the transport retry layer (no salvage queue).
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (2): Last reviewed commit: "Merge branch 'main' into nmulepati/docs/..." | Re-trigger Greptile
📋 Summary
Implements PR 2 of the agent-first development plan (#427), covering Phase 3 (GitHub machinery) and architecture doc population. PR 1 (#454) restructured the top-level documentation and created the architecture stubs; this PR fills them with content and adds the GitHub templates that make the agent-assisted contribution workflow concrete.
🔗 Related Issue
PR-2 for #427
🔄 Changes
Architecture Documentation (10 files populated)
All
architecture/stubs from PR 1 are now populated with content covering overview, key components, data flow, design decisions, and cross-references:overview.md— System architecture: three-package layout, PEP 420 namespace packages, end-to-end data flow, dual execution enginesconfig.md— Config layer: builder API, column configs, discriminated unions, model configs, plugin injection, lazy importsengine.md— Engine layer: compilation pipeline, registry system, column generator hierarchy, ResourceProvidermodels.md— Model subsystem: facade pattern, AIMD throttling, retry transport, usage tracking, MCP tool loopsmcp.md— MCP subsystem: MCPIOService, session pooling, tool schema coalescing, turn limitsdataset-builders.md— Dataset builders: sequential/async execution, ExecutionGraph, CompletionTracker, DAG, DatasetBatchManagersampling.md— Sampling: DatasetGenerator, constraint system, person/entity generation, managed datasetscli.md— CLI: lazy command loading, controller/service/repo pattern, generation commandsagent-introspection.md— Agent introspection: FamilySpec, type discovery, state commands, error handlingplugins.md— Plugin system: entry-point discovery, PluginRegistry, union injection, custom columns comparisonGitHub Templates (Phase 3)
.github/PULL_REQUEST_TEMPLATE.md— New PR template: Summary, Related Issue, Changes, Testing checklist, Checklist.github/ISSUE_TEMPLATE/bug-report.yml— Added Agent Diagnostic field and investigation checklist.github/ISSUE_TEMPLATE/feature-request.yml— Added Agent Investigation field and review checklist.github/ISSUE_TEMPLATE/development-task.yml— Added Investigation/Context and Agent Plan fields.github/ISSUE_TEMPLATE/config.yml— Updated Discussions link copy to mention agent-assisted workflowSkill Template Conformance
.agents/skills/create-pr/SKILL.md— Updated to produce PR descriptions matching the new PR template structure (Summary, Related Issue, Changes, Testing, Checklist)Not Yet Addressed (from plan step list)
gh label createafter merge, or in a follow-up🔍 Attention Areas
dataset-builders.md(async engine) andmodels.md(AIMD throttling).🧪 Testing
✅ Checklist