Skip to content

Refactor/issue 349 shared imports#446

Open
wxj006007 wants to merge 11 commits into
shareAI-lab:mainfrom
wxj006007:refactor/issue-349-shared-imports
Open

Refactor/issue 349 shared imports#446
wxj006007 wants to merge 11 commits into
shareAI-lab:mainfrom
wxj006007:refactor/issue-349-shared-imports

Conversation

@wxj006007

Copy link
Copy Markdown

Reduce lesson code duplication by extracting shared mechanisms (issue #349)

Summary

Closes #349.

The issue reported that each lesson's code.py re-copied earlier lessons' code, growing from ~110 lines (s01) to 2076 lines (s20) — readers had to scroll through previously-taught material to find the new content. A prior refactor (common.py) only extracted pure boilerplate; this PR addresses the mechanism-layer duplication that was the issue's actual pain point.

Introduces a mechanisms/ package of 5 shared modules and rewrites s11-s20 to import from it, while preserving the first-appearance rule (the lesson that introduces a concept shows it inline; only later lessons abstract it).

What changed

New: mechanisms/ package (621 lines across 5 modules)

Module Origin lesson Reused by Kept inline
tasks.py s12 s13-s20
prompt_assembly.py s10 s11-s18 s19/s20 (lesson-specific extensions)
messagebus.py s15 s16-s20
background.py s13 s14-s16 s20 (variant signature)
cron.py s14 s15 s20 (65% variant, too divergent)

Each module uses a consistent pattern:

  • init_<mech>(workdir) binds path/state (mirrors how lessons already treat WORKDIR)
  • make_<mech>(...) factory for mechanisms needing lesson-specific dependencies (Background takes execute_tool; Prompt Assembly takes sections + verbose)
  • Closure-private state avoids module-level global collisions across lessons in one process

Per-lesson changes

  • s01: inlined init_env + run_repl (was importing from common on first appearance — backwards). Now self-contained, +48 lines.
  • s11-s18: Prompt Assembly inline pair → make_prompt_assembly() factory call (verbose flag matches each lesson's existing behavior)
  • s13-s16: Background Tasks state + functions → make_background(execute_tool) factory; execute_tool stays local (handler dict is lesson-specific)
  • s14/s15: Cron Scheduler → init_cron(WORKDIR) (s14 keeps inline as origin)
  • s15-s20: MessageBus class → from mechanisms.messagebus import ...
  • s13-s20: Task System dataclass + 8 functions → from mechanisms.tasks import ...
    • s17-s20 inline an enhanced claim_task (they teach the owner-check + missing-deps split; base module carries the simple version)
    • Task carries an optional worktree slot (forward-compatible: s12-s17 leave it None; s18-s20 set it via bind_task_to_worktree)

Tests

  • test_all_lessons_import_commontest_all_lessons_import_shared_modules (s01 now self-contained by design)
  • New test_task_system_not_reduplicated: asserts only s12 (the origin) defines class Task:
  • All 49 tests + 30 subtests pass on every commit

Design decisions

First-appearance rule. A concept is shown inline in its origin lesson (e.g. s12 inlines the Task System), and only later lessons import the abstraction. This keeps each origin lesson self-contained for teaching, while eliminating copy-paste in the 7-9 lessons that follow.

s20 kept inline for several mechanisms. s20 is the capstone and carries divergent variants (Background's start_background_task(block, handlers) signature; Cron with 65% similarity). Forcing these into the shared module would require parameterizing the abstractions for a single consumer — net negative.

s19/s20 keep assemble_system_prompt inline. They extend it with MCP server list / skills catalog / current time sections that the shared factory doesn't cover. Future work could parameterize extra sections if desired.

Numbers

  • 19 files changed, +899 / −1468 (net −569 lines in lessons; +621 in mechanisms/)
  • s15: 945 → 569 lines (−40%)
  • s16: 841 → 647 (−23%)
  • s17: 762 → 657 (−14%)
  • s20: 2076 → 1984 (−4%; capstone keeps most variants inline)
  • grep "class Task:" hits 1 file (s12) instead of 9

Commits

Five focused commits, one per mechanism, each independently green:

  1. refactor(issue #349): extract Task System mechanism to mechanisms/tasks.py
  2. refactor(issue #349): extract Prompt Assembly mechanism to mechanisms/prompt_assembly.py
  3. refactor(issue #349): extract MessageBus mechanism to mechanisms/messagebus.py
  4. refactor(issue #349): extract Background Tasks mechanism to mechanisms/background.py
  5. refactor(issue #349): extract Cron Scheduler mechanism to mechanisms/cron.py

What's not in this PR

  • Memory mechanism: s10-s20 only share 2 constants + 4-line read in update_context — too light to justify abstraction.
  • ProtocolState (s16-s20): only 4 consumers, and s16 is its origin; left inline.
  • s20's Cron/Background variants: divergent enough that sharing would hurt readability.

wxj006007 added 11 commits June 29, 2026 21:40
Extracts the verbatim-duplicated foundation from every lesson into common.py:
- init_env(): readline + .env + Anthropic client + MODEL_ID + WORKDIR
- make_base_tools(workdir): safe_path/run_bash/run_read/run_write/run_edit/
  run_glob as workdir-bound closures with original signatures
- BASE_TOOLS: the 5 base tool schemas + select_tools(names) helper
- run_repl(): the standard CLI REPL shared by every lesson's __main__

run_bash uses the utf-8-safe version (from s02) for correct non-ASCII output
on Windows; the dangerous-command check is intentionally absent (s01 teaches
it inline, s03+ handle it via permission/hooks).

Refs shareAI-lab#349
s01-s07 now import init_env/make_base_tools/BASE_TOOLS/run_repl from common.py
instead of duplicating the env/client setup, readline config, base tool
implementations and base tool schemas verbatim. Each lesson file keeps only
the mechanism it teaches:

  s01  run_bash inline (teaches dangerous-cmd check) + agent loop
  s02  4 file tools + dispatch map (intro lesson, kept inline)
  s03  three-gate permission pipeline
  s04  hook system
  s05  todo_write + nag reminder
  s06  subagent (fresh messages, summary only)
  s07  skill loading (catalog in SYSTEM, content on demand)

sys.path bootstrap at the top of each file makes rom common import ...
work both when run directly and when loaded by tests via spec_from_file_location.

Net -493 lines across s01-s07. Existing tests still pass (24 + 30 subtests).
s08-s11 import init_env/make_base_tools/select_tools from common.py; s10/s11
use select_tools for the 3 base tool schemas, s08/s09 keep BASE_TOOLS + their
own compaction/memory layers. s12/s13/s14 keep the task system inline (lesson
content) and use select_tools for read_file/write_file; s13/s14 re-define
run_bash(command, run_in_background=False) locally to delegate to the base.

__main__ stays inline for s11-s14 (dict-style error text blocks / s14's
queue_processor_loop daemon) where run_repl's print doesn't fit.

Fixes an import-os bug in s11 (was using a walrus __import__ workaround).
s15_agent_teams, s16_team_protocols: custom bash (run_in_background) stays
inline; read_file/write_file schemas via select_tools. run_bash redefined
locally to drop the run_in_background arg before delegating.

s17_autonomous_agents: standard bash/read_file/write_file via select_tools
(no signature drift). __main__ stays inline (consume_lead_inbox routing).

s18_worktree_isolation, s19_mcp_plugin: cwd-param drift — local _base_tools(cwd)
helper delegates to make_base_tools(cwd or WORKDIR); base tools are thin
wrappers. Standard schemas via select_tools. __main__ stays inline.

s20_comprehensive: cwd + run_in_background + offset drift — _base_tools(cwd)
helper; run_read stays local (offset param base lacks). Both BUILTIN_TOOLS and
SUB_TOOLS use select_tools for standard schemas, keeping custom bash/read_file
inline. __main__ stays inline (terminal_print thread safety).

All __main__ REPLs stay inline (event queues, inbox routing, terminal_print).
Verified: py_compile, pytest (24 passed, 30 subtests), spec_from_file_location
smoke load with correct tool counts.

Net: +157/-392 across 6 files.
tests/test_lessons_compile.py covers:
- py_compile on all 20 s*/code.py files (parametrized)
- common.py exists, compiles, and is importable with anthropic/dotenv stubs
- common.py exposes init_env, make_base_tools, BASE_TOOLS, select_tools, run_repl
- select_tools round-trips the full BASE_TOOLS set
- every lesson source contains 'from common import'

Guards against syntax errors and broken imports from the issue shareAI-lab#349 refactor.
All 48 tests pass (24 new + 24 existing, 30 subtests).
Explain in README.md (How to Read + Project Structure) and README-zh.md
(版本说明 + 项目结构) that each lesson's code.py now only contains what
that lesson teaches, while the repeated boilerplate (readline/.env/client
init, base file tools, schemas, standard REPL) lives in common.py and is
imported via 'from common import ...'.

code.py description updated from 'standalone runnable code' to
'runnable code, imports shared boilerplate from common.py'. common.py added
to the Project Structure tree.

README-ja.md left for a follow-up translation pass.
…hanisms/tasks.py

The Task System dataclass + 8 functions (create/save/load/list/get/can_start/
claim/complete) was copy-pasted verbatim into s12-s20 (9 files, ~90 lines each).
Previous common.py refactor only handled pure boilerplate; this addresses the
mechanism-layer duplication that was the issue's actual pain point.

Changes:
- Add mechanisms/tasks.py: Task dataclass + 8 functions, sourced from s12
  (first-appearance rule: s12 keeps it inline, s13+ import)
- Task carries optional  slot (forward-compatible: s12-s17 leave
  it None; s18-s20 set it via bind_task_to_worktree)
- Add get_task_json alias for s18-s20's tool schemas
- s13-s16: replace 92-line inline block with 3-line import + init_tasks
- s17: import base + inline enhanced claim_task (s17 teaches owner-check +
  missing-deps split); scan_unclaimed_tasks uses list_tasks()
- s18-s20: import base + inline same enhanced claim_task (s17's enhancement,
  carried forward); scan_unclaimed_tasks uses list_tasks()
- s01: inline init_env + run_repl (first-appearance fix — was importing from
  common on first appearance, which is backwards)
- common.py: docstring notes mirror of s01
- tests: rename test_all_lessons_import_common -> test_all_lessons_import_shared_modules
  (allow s01 self-contained); add test_task_system_not_reduplicated (asserts
  only s12 defines class Task)

Verified: 49 tests + 30 subtests pass; worktree round-trip + legacy JSON
compat verified via behavior smoke test.

Net: -714/+285 lines. grep 'class Task:' now hits only s12.

Refs: shareAI-lab#349
… mechanisms/prompt_assembly.py

The assemble_system_prompt + get_system_prompt pair (with _last_context_*
memoization globals) was copy-pasted into s10-s18. s10 is the origin (keeps
it inline). s11-s16 reused it verbatim with cache-hit logging; s17-s18
reused a silent variant. s19-s20 extend assemble_system_prompt with
lesson-specific sections (MCP, skills catalog, current time) so they
keep it inline.

Changes:
- Add mechanisms/prompt_assembly.py: make_prompt_assembly(sections, verbose)
  factory returning (assemble_system_prompt, get_system_prompt) closures
  bound to the lesson's PROMPT_SECTIONS. Closure-private memoization state
  avoids module-level global collisions.
- s11-s16: replace inline assemble+get pair (~20 lines each) with factory
  call (verbose=True, matches s10's cache-hit logging)
- s17-s18: same but verbose=False (matches their silent simplification)
- PROMPT_SECTIONS stays local in each lesson — the tools list differs per
  lesson (each adds new tools)

s19-s20 kept inline (they add MCP/skills/time sections that the shared
factory does not cover). Future work could parameterize extra sections.

Verified: 49 tests + 30 subtests pass.

Net: -176 lines across 8 files; mechanisms/prompt_assembly.py adds 76.

Refs: shareAI-lab#349
…anisms/messagebus.py

The MessageBus class (send + read_inbox + peek) was copy-pasted into s15-s20
(6 files, ~35 lines each). s15 is the origin (keeps it inline). s16 added the
optional `metadata` parameter to `send` — that variant is carried forward
here. s17-s20 reused s16's variant verbatim.

Changes:
- Add mechanisms/messagebus.py: MessageBus class + init_messagebus(workdir)
  that binds MAILBOX_DIR (same late-bind pattern as init_tasks). Carries the
  s16 variant (send with optional metadata) + s15's peek method (harmless for
  s16-s20 which don't use it).
- s16-s20: replace 30-34 line inline MessageBus block with import +
  init_messagebus + BUS + active_teammates.
- s15 keeps its own inline version (origin, first-appearance rule).

Verified: 49 tests + 30 subtests pass.

Net: -156 lines across 5 files; mechanisms/messagebus.py adds 62.

Refs: shareAI-lab#349
…o mechanisms/background.py

The background-task state (_bg_counter + dicts + lock) + is_slow_operation +
should_run_background + start_background_task + collect_background_results
+ has_pending_background was copy-pasted into s13-s16. s13 is the origin (keeps
it inline). execute_tool is lesson-specific (handler dict differs per lesson).

Changes:
- Add mechanisms/background.py: is_slow_operation + should_run_background
  (pure module-level) + make_background(execute_fn) factory returning
  (start_background_task, collect_background_results, has_pending_background)
  with closure-private state. Each lesson passes its own execute_tool.
- s14-s16: replace 60-70 line inline block with import + factory call.
  execute_tool stays local (lesson-specific handler dict). s16's execute_tool
  lives in a separate Tool Dispatch section — factory call placed there.
- s20 keeps inline (variant: start_background_task accepts handlers param).

Verified: 49 tests + 30 subtests pass.

Net: -210/+121 lines across 4 files (3 lessons + new module).

Refs: shareAI-lab#349
…mechanisms/cron.py

The Cron Scheduler (CronJob dataclass + cron matching/validation + scheduler
thread + durable persistence) was copy-pasted from s14 into s15 (217 lines,
99.4% identical). The mechanism is fully self-contained (no execute_tool
dependency).

Changes:
- Add mechanisms/cron.py: CronJob + _cron_field_matches / cron_matches /
  validate_cron / save_durable_jobs / load_durable_jobs / schedule_job /
  cancel_job / cron_scheduler_loop / consume_cron_queue / has_cron_queue /
  list_scheduled_jobs (new, for thread-safe snapshot). init_cron(workdir)
  binds DURABLE_PATH, loads durable jobs, starts scheduler thread.
- s15: replace 202-line inline Cron block with import + init_cron(WORKDIR).
  run_list_crons now calls list_scheduled_jobs() instead of touching
  cron_lock + scheduled_jobs directly.
- s14 keeps inline (origin, first-appearance rule).
- s20 keeps inline (65.6% variant — too many differences to share).

Verified: 49 tests + 30 subtests pass.

Net: -202 lines in s15; mechanisms/cron.py adds 244.

Refs: shareAI-lab#349
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

建议每节课只展示新代码,旧代码用 import 引用

1 participant