feat(loop): <promise> completion tag for early exit#46
Open
paulnsorensen wants to merge 4 commits intocomputerlovetech:mainfrom
Open
feat(loop): <promise> completion tag for early exit#46paulnsorensen wants to merge 4 commits intocomputerlovetech:mainfrom
paulnsorensen wants to merge 4 commits intocomputerlovetech:mainfrom
Conversation
Adds opt-in early-exit to the loop via a structured completion tag. When stop_on_completion_signal: true is set in frontmatter and the agent emits a matching <TEXT> tag in its output, ralphify stops after the current iteration instead of continuing to the iteration budget. - _promise.py: strict TEXT parser with whitespace normalization - Two optional frontmatter fields: completion_signal and stop_on_completion_signal - examples/promise-completion/RALPH.md: canonical example - Docs updated across getting-started, cli, how-it-works, agents, cookbook, quick-reference - Tests in test_promise.py, test_agent.py, test_cli.py, test_engine.py (~98% coverage)
#6) * feat: seed CLI adapter layer with Claude, Codex, Copilot adapters Introduces a pluggable CLIAdapter Protocol and lands the three concrete adapters that exercise it, so the abstraction ships with demonstrated value on day one. - src/ralphify/adapters/__init__.py — CLIAdapter Protocol, AdapterEvent, first-match ADAPTERS registry, select_adapter() dispatch with GenericAdapter fallback. - src/ralphify/adapters/_generic.py — blocking-path fallback adapter. - src/ralphify/adapters/claude.py — stem "claude", --output-format stream-json --verbose, parses tool_use blocks in assistant messages, extracts completion signal from terminal result event. - src/ralphify/adapters/codex.py — stem "codex", --json, maps CollabToolCall/McpToolCall/CommandExecution to tool_use events, scans TurnCompleted/TaskComplete for promise tags. - src/ralphify/adapters/copilot.py — stem "copilot" (not "gh"), --output-format json, renders_structured=False, supports_soft_windown =False. install_wind_down_hook is part of the Protocol surface but every concrete adapter currently raises NotImplementedError — the actual hook plumbing (soft wind-down, AgentHook fanout, max_turns enforcement) lands in subsequent PRs. This PR is pure additions with no engine wiring: adapters are registered and discoverable, but the run loop still uses the old hard-coded Claude path. Tests: 50 new tests covering each adapter's event parsing, command building, completion-signal extraction, and registry dispatch. Co-Authored-By: WOZCODE <contact@withwoz.com> * feat: route Claude detection and promise completion through adapter Wire the CLI adapter layer into the three sites that previously hard-coded Claude knowledge: - _agent.py: execute_agent resolves the adapter (or accepts one from the caller) and delegates flag construction to adapter.build_command. _run_agent_streaming no longer appends --output-format/--verbose itself — the cmd it receives is already flagged. - _console_emitter.py: startup-hint text picks "live activity on" vs "live output on" via adapter.renders_structured instead of a claude-binary check. - engine.py: promise completion runs through adapter.extract_completion_signal(captured_stdout, signal). ClaudeAdapter scans the terminal result event only; GenericAdapter scans full stdout to preserve behavior for unknown CLIs. The engine-side _record_promise_fragment incremental scanner is gone. Tightens ClaudeAdapter.extract_completion_signal to return False when no result event is present so embedded <promise> tags inside status or assistant JSON can't trigger early completion. * fix: address copilot review feedback on adapter seed PR - Rename `supports_soft_windown` → `supports_soft_wind_down` across the Protocol and all adapters to match the `install_wind_down_hook` / "wind-down" terminology used elsewhere (caught before it baked into the public adapter surface). - Claude and Copilot `build_command()` now detect the `--output-format` flag pair and overwrite a user-supplied value instead of treating the flag and value tokens independently (which produced invalid argv when a caller pre-set `--output-format <other>`). - Align Claude `parse_event()` docstring with actual behavior: `result` events return `kind="result"` and non-assistant events return `kind="message"` rather than `None`. - Flip Codex `renders_structured` to False so the peek panel stays in raw-line mode until the emitter is adapter-driven; Codex's streaming path currently feeds an `_IterationPanel` that only understands Claude's stream-json schema. * refactor: split renders_structured into supports_streaming and renders_structured_peek Addresses the architectural concern in PR #6 review: the old `renders_structured` flag conflated two orthogonal capabilities: 1. Whether the adapter emits a structured JSON stream that the execution path can parse into on_activity callbacks. 2. Whether the console peek panel understands the adapter's event schema and should render it via `_IterationPanel` instead of raw lines. Before the split, an adapter had to choose between a useful streaming path with an empty peek panel (Codex's previous state) or a working raw peek with no activity callbacks. The split makes the correct combination representable: - ClaudeAdapter: supports_streaming=True, renders_structured_peek=True - CodexAdapter: supports_streaming=True, renders_structured_peek=False - CopilotAdapter / GenericAdapter: both False Codex now takes the streaming path again (on_activity callbacks fire) while the peek panel stays in raw-line mode until the emitter is adapter-driven. Callers updated: - _agent.execute_agent dispatches on adapter.supports_streaming - _console_emitter consumes adapter.renders_structured_peek via _agent_renders_structured_peek (renamed) - tests/conftest autouse fixture forces both flags to False * perf: skip stdout capture when adapter exposes result_text The previous adapter seed forced capture_stdout=True on every iteration so the engine could feed the buffered transcript into adapter.extract_completion_signal. For verbose streaming agents this regressed memory vs the prior tail-scan approach, and Claude in particular was paying for a full transcript buffer it did not need: agent.result_text already carries the terminal assistant message. Add a requires_full_stdout_for_completion capability flag and a keyword-only extract_completion_signal(result_text=, stdout=, user_signal=) signature so each adapter declares its own input. Claude sets the flag False and reads result_text directly; Codex / Copilot / Generic keep the full-stdout path. The engine now gates capture on log_dir or (stop_on_completion_signal AND requires_full_stdout_for_completion), so streaming agents skip the buffer entirely unless logging is on, and non-streaming agents only buffer when the user opts into completion signalling. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: WOZCODE <contact@withwoz.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
) * refactor: extract adapter Protocol and registry into _protocol.py The adapter package's __init__ both imports concrete adapter modules (to populate ADAPTERS) and defines the Protocol those modules depend on. That's a latent circular-import hazard: it works today only because __init__ defines the Protocol before it imports concrete adapters. As soon as an adapter grows a dependency that could re-enter the package namespace, the order-of-execution becomes load-bearing. Split the Protocol, the AdapterEvent NamedTuple, the Literal kind enums, and the ADAPTERS list into a new leaf module _protocol.py. Concrete adapters (claude, codex, copilot, _generic) now import from _protocol directly. __init__.py re-exports the public names so existing `from ralphify.adapters import CLIAdapter` imports keep working, and keeps select_adapter plus the builtin-registration bootstrap. No behaviour change. * feat: add turn-cap event types and max_turns frontmatter fields Preparatory foundation for the forthcoming turn-cap enforcement work: no enforcement yet, just the event-type surface and the validated RunConfig fields that enforcement will read. - _events.py: TOOL_USE, ITERATION_TURN_APPROACHING_LIMIT, and ITERATION_TURN_CAPPED event kinds with typed data payloads (ToolUseData, TurnApproachingLimitData, TurnCappedData) threaded through the EventData union. - _frontmatter.py: FIELD_MAX_TURNS and FIELD_MAX_TURNS_GRACE constants. - _run_types.py: max_turns (int | None, default None) and max_turns_grace (int, default 2) fields on RunConfig. - cli.py: _validate_max_turns / _validate_max_turns_grace helpers with range and type checks; grace must be strictly less than the cap when the cap is set. _build_run_config threads both values into the returned RunConfig. - tests/test_cli_frontmatter_fields.py: unit coverage for the validators plus end-to-end _build_run_config assertions for the default, valid, and invalid cases. Nothing consumes max_turns yet; it will be read by the adapter-driven enforcement path in a follow-up PR. * fix: mock shutil.which in frontmatter field tests CI runners don't have 'claude' on PATH, so _build_run_config's agent validation was failing before reaching the turn-cap assertions.
#10) * workspace: bootstrap improve-codebase Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant `if parts else ""` — empty join already returns "" `" · ".join([])` returns `""`, so the explicit empty-case guard in `_format_params` is dead. Behavior is identical either way. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_format_params` simplification + verified-live audit - iterations.md: log 4ccfa9a - backlog.md: note vulture false-positives confirmed live; add two future-win candidates to triage - coverage/_console_emitter.md: first coverage note for this module Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: extract `_stop_compact_live_unlocked` to dedupe compact-Live teardown The three-line `if self._live is not None: self._live.stop(); self._live = None` pattern appeared in three call sites (`_stop_live_unlocked`, `enter_fullscreen`, `_on_iteration_ended`). Collapsed into a single helper — behavior unchanged. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_stop_compact_live_unlocked` extraction + backlog note Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant `_iteration_order` list — dict insertion order suffices Python dicts preserve insertion order since 3.7, so the parallel `_iteration_order` list tracking the same sequence as `_iteration_history` was pure duplication. Archive re-orders via pop-then-insert, eviction iterates the dict (oldest-first), and enter_fullscreen's most-recent lookup uses `reversed()`. Behavior is unchanged; the only test hit was a direct field reference. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_iteration_order` removal Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: extract `_call_safely` helper for best-effort observer callbacks The same guarded-callback pattern (`if cb is not None: try: cb(...); except Exception: pass`, with identical "best-effort; draining must not stop" comment) was duplicated three times across `_read_agent_stream` and `_pump_stream`. Consolidate into a single `_call_safely` helper defined next to the callback type aliases. Behavior unchanged — the helper preserves the None guard, suppresses all exceptions, and only fires the callback once with the provided arguments. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_call_safely` extraction + seed `_agent.py` coverage note Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant max(total - visible, 1) guard in `_scrollbar_metrics` The early return `if total <= visible` above guarantees `total > visible`, so `total - visible` is always ≥ 1. Inline the subtraction directly and note the invariant in a comment. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record scrollbar-metrics simplification Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant empty-dict guard in `_format_categories` `" · ".join([])` already returns `""`, so the early-return on an empty `_tool_categories` dict was dead code — same pattern as 4ccfa9a's `_format_params` cleanup. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_format_categories` empty-dict guard removal Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: extract `_step_iteration` to dedupe prev/next iteration browsing `prev_iteration` and `next_iteration` were 12-line mirror images differing only in step direction and eviction-fallback choice (oldest vs newest). Extracted to a single direction-parametric helper — behavior preserved. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_step_iteration` extraction Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: use public `iteration_id` property in history eviction The one cross-class `_iteration_id` access in the module now goes through the public property, keeping `_FullscreenPeek`'s private attribute private. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `iteration_id` property-access cleanup Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop `_reset_view` — identical to `scroll_to_bottom` `_FullscreenPeek._reset_view` set `_offset = 0` and `_auto_scroll = True`, the same two assignments as `scroll_to_bottom`. Dropped the private helper and called `scroll_to_bottom()` from the two `_step_iteration` sites instead. Added the docstring that used to live on `_reset_view` to the surviving method so the "snap to newest line + follow" intent is still recorded. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_reset_view` removal Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: defer `panel_for` guard to `is_live` to dedupe identical condition Both methods checked `self._current_iteration == iteration_id and self._active_renderable is not None`. Rewrite `panel_for` to call `is_live` so the condition lives in exactly one place — if the "this is the live iteration" check ever needs to grow (e.g. add a paused state), `panel_for` follows automatically. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `panel_for` / `is_live` dedup Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `total_in` alias in `_format_tokens` The local `total_in` variable was a pointless rename of `self._input_tokens` — it hinted at an aggregated "total" that no longer exists (cache-read tokens are intentionally excluded). Read `self._input_tokens` directly so the label reflects what the value actually is. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `total_in` alias inlining Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant f-string wrap around `_plural` result `_plural` already returns a str, so `f"{_plural(...)}"` constructs the same string via an extra format step. Pass the value straight through, matching the style of the sibling `header.append(...)` calls. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_plural` f-string wrap cleanup Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant empty-user_args branch in `resolve_args` `_ARGS_RE.sub` with a callable that returns `dict.get(name, "")` already resolves every placeholder to the empty string when the dict is empty, so the `if not user_args: return _ARGS_RE.sub("", prompt)` early return was dead code producing the same output as the general path. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `resolve_args` empty-branch cleanup Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: promote 40-line fallback height to `_DEFAULT_CONSOLE_HEIGHT` constant Both `_FullscreenPeek._console_height` (pre-render default) and `_fullscreen_page_size`'s except-branch fallback used the literal 40 to mean "reasonable terminal height when the real value isn't available." Naming the shared intent in one place keeps the two in lockstep. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_DEFAULT_CONSOLE_HEIGHT` constant promotion Shift current phase from Phase 1 (dead code — exhausted) to Phase 3 (magic values), update the _console_emitter.py coverage note with the new sha, and log the iteration. Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: narrow `name_col` scope to `if arg:` in `_apply_assistant` The tool_use branch computed the padded `name_col` unconditionally, then only used it when `arg` was truthy — the `else` branch uses the raw `name` instead. Move the pad-to-column formatting inside `if arg:` so the helper variable only exists where it's actually rendered, and no unnecessary f-string work happens when a tool call has no primary arg. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `name_col` scope narrowing and open Phase 4 Phase 3 (magic values) is essentially drained — mark it so and open Phase 4 (complex conditionals & long functions) as the current phase. The 134078d narrowing is the first concrete Phase 4 item. Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant `_active_renderable` guard in `_on_iteration_started` `_archive_current_iteration_unlocked` already no-ops when no iteration is active (guarded at its entry), so wrapping the call in `if self._active_renderable is not None:` added nothing. Dropped the `if` and updated the surrounding comment to note the archive call's no-op behavior. Same shape as 5337d88 and 4ccfa9a's dead-guard removals. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_on_iteration_started` guard removal Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: move `_structured_agent` check out of lock in `_on_agent_output_line` Mirrors the pattern already used in `_on_agent_activity`: the `_structured_agent` flag is write-once (set in `_on_run_started` before any iteration events can flow), so the short-circuit check is safe lock-free. Also avoids acquiring the console lock for every stdout line when running Claude in structured mode. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_on_agent_output_line` lock-free check move Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: skip dead `"".join(...)` in `_run_agent_streaming` when not logging The trailing block joined `stream.stdout_lines` and `stderr_lines` unconditionally, then the AgentResult ternary discarded both joined strings when `log_dir is None` (they were only ever consumed by `_write_log` and the `captured_stdout`/`captured_stderr` fields). Fold the `if log_dir is not None` check up into the join so the no-log path skips the work entirely — same shape as the already-lazy `_run_agent_blocking` tail, which uses `"".join(...) if x is not None else None` exactly once. Drops the duplicated ternary on the AgentResult fields too. Same observable behavior (verified by `test_captured_output_set_when_logging` and `test_no_log_when_dir_not_set`). Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_run_agent_streaming` lazy-join refactor Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop `line_count` alias in `_IterationSpinner._build_footer` The local was computed unconditionally but only read on the truthy branch (both as predicate and as `_plural` arg). Inline the truthy check on `self._scroll_lines` and call `len()` only inside the branch that needs it — matches the sibling `_IterationPanel._build_footer` style (`if self._tool_count > 0:` with direct attribute references). Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_IterationSpinner._build_footer` alias drop Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `agent` alias in `_on_run_started` The local was read exactly once — the only use was the immediately following `_is_claude_command(agent)` call. Reading `data["agent"]` directly matches the style of the sibling fc5e1cb inline of `total_in` and keeps the function focused on what it does. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `agent` alias inline in `_on_run_started` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop `new_offset` alias in `_FullscreenPeek.scroll_down` The local was assigned immediately to `self._offset`, then the follow-mode check re-read it as `new_offset == 0`. Reading `self._offset` directly after assignment has identical semantics — the alias added no value. Sibling `scroll_up` keeps its local because it compares the new value against the old one *before* the assignment (to conditionally disable auto-scroll), which requires two distinct values. `scroll_down` has no such comparison, so the alias is dead. Same shape as ef176bf (`line_count`) and 134078d (`name_col`). Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_FullscreenPeek.scroll_down` alias drop Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `msg` alias in `_apply_assistant` The `msg = raw.get("message", {})` local was read exactly once on the next line as `msg.get("usage")`. Reading `raw.get("message", {}).get("usage")` directly matches the chained-get style already used in `_iter_content_blocks` and the inline-alias style established by 497c028 (`agent`) and fc5e1cb (`total_in`). Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `msg` alias inline in `_apply_assistant` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `_fullscreen_page_size()` into space/b lambdas The `page` local in `_handle_fullscreen_key` was computed unconditionally in the non-exit branch but only consumed by the space/b action lambdas (page-down / page-up). Inlining `self._fullscreen_page_size()` into those two lambdas means it's only evaluated when space or b is actually pressed — j/k/g/G/[/] now skip the call entirely. Behavior is identical: the dict is built and a single action is invoked per keypress, and the page value is read at action-invocation time under the same lock. Same Phase 4 shape as 134078d / ef176bf / b19625e — narrow the scope of a local that was only live on one branch. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `_fullscreen_page_size()` lambda-inline refactor Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: use `next(reversed(...), None)` in `enter_fullscreen` history fallback The compound `if initial_id is None and self._iteration_history:` guard was doing two jobs: pick the fallback only when there's no live iteration, *and* avoid `next(reversed({}))` raising StopIteration on an empty dict. Switching to `next(reversed(...), None)` lets the standard sentinel default handle the empty case so the outer `if` only encodes the "fall back when nothing is live" intent. Same observable behavior: when both `_current_iteration` is None and `_iteration_history` is empty, `initial_id` ends up None and the existing `if initial_id is None or self.panel_for(...) is None:` branch prints "Full peek: no iterations yet" and returns False — covered by `test_enter_without_iteration_prints_hint`. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `enter_fullscreen` history-fallback simplification Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: drop redundant BOM-startswith guard before `removeprefix` `str.removeprefix` is already a no-op when the prefix is absent — the `if text.startswith(_UTF8_BOM)` guard was dead defensive code. Same shape as 5337d88 / 4ccfa9a / 8cb0d47's "drop the guard when the operation already handles the empty/no-match case." Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record BOM-guard drop in `parse_frontmatter` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: replace `parsed = None` sentinel with try/except/else in `_read_agent_stream` The sentinel was a no-op placeholder for the error path: setting `parsed = None` just so `isinstance(parsed, dict)` would skip the forwarding block on the next line. Using the `try/except/else` structure expresses the same intent directly — the dict-handling logic runs only when `json.loads` succeeds — and drops the sentinel assignment plus the redundant dict check on the JSON-error path. Behavior preserved: the inner `isinstance(parsed, dict)` still gates forwarding for valid-but-non-dict JSON (lists, strings, numbers). Covered by the existing stream tests (`test_agent.py::test_stream_json_*`). Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record try/except/else refactor in `_read_agent_stream` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: replace `source._outcome` cross-class access with public `outcome` property `_FullscreenPeek._build_header` was reading `source._outcome` directly on a `_LivePanelBase` instance, crossing class boundaries into a private attribute. Expose an `outcome` property on `_LivePanelBase` (matching the `iteration_id` property added in ef9a178 for the parallel case) so the fullscreen header reads the value through a public API. Same observable behavior; the attribute is still the single source of truth inside `freeze`. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `outcome` property refactor on `_LivePanelBase` Iteration note, coverage update, and backlog entry for d0060b3 — the ef9a178-style public-property substitution applied to the remaining `source._outcome` cross-class access in the fullscreen header. Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: narrow `secs` scope into `if minutes < _MINUTES_PER_HOUR:` branch `secs = total % _SECONDS_PER_MINUTE` was computed unconditionally but only consumed by the `f"{minutes}m {secs}s"` return on the truthy branch. The hours branch derives its own `mins`/`hours` pair and never references `secs`. Moving the computation inside the branch that uses it keeps the local co-located with its use site. Same Phase 4 narrow-the-scope shape as 134078d (`name_col`), ef176bf (`line_count`), 59b0e34 (`page`), b19625e (`new_offset`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `secs` scope-narrowing in `format_duration` Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `text` alias in `collect_output` The local was assigned from `ensure_str(stream)` and read exactly once on the next line as `parts.append(text)`. Inlining matches the alias-inline style from fc5e1cb / 497c028 / 52e0272 — the helper name already documents the decode step. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `text` alias inline in `collect_output` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: narrow `line` scope past early-return in `_on_agent_output_line` `escape_markup(data["line"])` was computed unconditionally before the `if not isinstance(target, _IterationSpinner): return` guard, so the work was wasted whenever the event hit the early-return branch (no active panel, or structured panel for a non-structured-agent edge case). Moving the binding below the guard keeps the same output for the _IterationSpinner branch and drops the dead work for the other branch. Same Phase 4 shape as 134078d's `name_col` narrowing and 7730dd4's `secs` narrowing. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `line` scope narrowing in `_on_agent_output_line` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `reason` alias in `run_loop` stop-event emit The `reason = state.status.reason` local was read exactly once on the next line as the `reason=` kwarg to `RunStoppedData(...)`. Collapsing to `reason=state.status.reason` matches the inline-alias pattern from ce487d3 (`text`) / 52e0272 (`msg`) / 497c028 (`agent`) / fc5e1cb (`total_in`). The property access is safe at this call site: the immediately-preceding `if state.status == RunStatus.RUNNING: state.status = RunStatus.COMPLETED` ensures `state.status` is always a terminal status (STOPPED / COMPLETED / FAILED), so `RunStatus.reason` cannot raise. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `reason` alias inline in `run_loop` Added coverage/engine.md noting the inline-alias refactor and the safety argument for the `state.status.reason` property access. Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `binary` alias in `_supports_stream_json` The local was read exactly once on the next line as `binary == CLAUDE_BINARY`. Collapsing to `Path(cmd[0]).stem == CLAUDE_BINARY` matches the sibling check in `_console_emitter.py:_is_claude_command`, which already uses the chained form, and the inline-alias pattern from ce487d3 / 52e0272 / 497c028 / fc5e1cb. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `binary` alias inline in `_supports_stream_json` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `visible` alias in `_LivePanelBase._build_body` The `visible = self._scroll_lines[-_MAX_VISIBLE_SCROLL:]` local was read exactly once as the iterable of the very next `for line in ...` loop. Collapsing to `for line in self._scroll_lines[-_MAX_VISIBLE_SCROLL:]:` matches the inline-alias pattern from 497c028 / fc5e1cb / 52e0272 / ce487d3 / e1ad87a — the slice already documents "the last N scroll lines" without needing a named intermediate. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `visible` alias inline in `_build_body` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `reader` thread handle in `_read_agent_stream` The `reader` local served only to call `.start()` — the thread is never joined explicitly because termination is signalled through the queue's None sentinel and the daemon flag. Collapsing into the fluent `Thread(...).start()` form matches the fire-and-forget intent and drops an unused binding. Python's `threading` module keeps live threads reachable via `threading._active`, so dropping the local reference does not affect thread lifetime. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `reader` thread handle inline in `_read_agent_stream` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `remaining` alias in `_read_agent_stream` timeout calc The `remaining = deadline - time.monotonic()` local served a single use on the very next line as `max(remaining, 0)`. Collapsing to `max(deadline - time.monotonic(), 0)` matches the inline-alias style established by e1ad87a (`binary`), 497c028 (`agent`), fc5e1cb (`total_in`), 52e0272 (`msg`), and ce487d3 (`text`). Updated the adjacent comment to refer to "clamp to 0" since the `remaining` name no longer exists. Behavior preserved — same value reaches `line_q.get(timeout=...)` on every iteration of the read loop. Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `remaining` alias inline in `_read_agent_stream` Co-authored-by: Ralphify <noreply@ralphify.co> * refactor: inline `stream_cmd` into Popen call in `_run_agent_streaming` The local was constructed and read exactly once on the very next statement as the first positional arg to `subprocess.Popen(...)`. The three appended tokens are already named constants (`_OUTPUT_FORMAT_FLAG`, `_STREAM_FORMAT`, `_VERBOSE_FLAG`) so the intent reads clearly at the call site without the intermediate binding. Same Phase 4 inline-alias shape as 66d6c60 (`remaining`), b24accf (`reader`), 2fda4f0 (`visible`), and e1ad87a (`binary`). Co-authored-by: Ralphify <noreply@ralphify.co> * workspace: record `stream_cmd` inline in `_run_agent_streaming` Co-authored-by: Ralphify <noreply@ralphify.co> --------- Co-authored-by: Kasper Junge <kasperjunge@Kaspers-MacBook-Pro.local> Co-authored-by: Ralphify <noreply@ralphify.co> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds opt-in early-exit to the loop via a structured completion tag. When
stop_on_completion_signal: trueis set in frontmatter and the agentemits a matching
<promise>TEXT</promise>tag in its output, ralphifystops after the current iteration instead of continuing to the iteration
budget. The default inner text is
RALPH_PROMISE_COMPLETE; override itwith the optional
completion_signalfield.Why the
<promise>tag formatThe tag format intentionally mirrors what the Claude Code ralphify
plugin (Ralph-Wiggum) already uses to signal structured completion
back to its harness. Aligning on the same
<promise>...</promise>shapemeans a single prompt body works on both sides — a ralph written against
the Claude plugin can stop a ralphify loop with no adapter code, and
vice versa. Ralphify still runs its own command/prompt loop; only the
completion-tag format is shared.
What's new
src/ralphify/_promise.py— strict<promise>TEXT</promise>parserwith whitespace normalization (bare marker text no longer counts)
completion_signal— inner text of the promise tag (defaultRALPH_PROMISE_COMPLETE)stop_on_completion_signal— bool gate for early exit (defaultfalse)examples/promise-completion/RALPH.md— canonical example, surfacedfrom
ralph scaffoldhelp and the cookbookralph run --helpandralph scaffold --helpnow explain the patterncookbook, quick-reference
tests/test_promise.py,tests/test_agent.py,tests/test_cli.py,tests/test_engine.py(changed modules land at ~98% coverage)
Backwards compatibility
Default behavior is unchanged. Loops still run to their iteration budget
unless both
stop_on_completion_signal: trueand the configured tagare present. Existing ralphs are untouched.
Test plan
uv run pytest— 661 passed, 1 xpasseduv run ruff check .uv run ruff format --check .uv run ty checkuv run mkdocs build --strictexamples/promise-completion/exits early on<promise>COMPLETE</promise>Note on help-text test robustness
tests/test_cli.py::TestHelpasserts that<promise>...</promise>appears in
--helpoutput. Rich wraps help text differently in headlessCI vs. a local TTY, which splits the tag across lines and breaks a
naïve substring check. The tests strip ANSI codes and collapse
whitespace before asserting, so they're stable on both sides.