feat: Integrate DeerFlow Agent Runner and optimize streaming #5581
zouyonghe wants to merge 22 commits into AstrBotDevs:master from
Conversation
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly expands the system's capabilities by integrating DeerFlow as a new third-party agent runner. It introduces a dedicated API client and runner implementation designed for robust streaming and error management, ensuring a smoother and more reliable interaction with external agent services. The changes also include necessary configuration updates and UI enhancements to support DeerFlow, making it a fully integrated option for users.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="95" />
<code_context>
+
+ _MAX_VALUES_HISTORY = 200
+
+ @dataclass
+ class _StreamState:
+ streamed_text: str = ""
</code_context>
<issue_to_address>
**issue (complexity):** Consider tightening the output contract and restructuring the streaming/non-streaming paths to reduce helper layers and conditional checks while preserving the new runner semantics.
You can keep all the new semantics (error flagging, DeerFlow support, runner closing) while shaving off a couple of layers of complexity, mostly by:
1. Making `_ThirdPartyRunnerOutput.chain` non-optional and never yielding `None`.
2. Removing `_iter_runner_output_chain` entirely.
3. Simplifying the non-streaming fallback accumulation.
### 1. Make `_ThirdPartyRunnerOutput.chain` non-optional and never yield `None`
Right now you need `_iter_runner_output_chain` solely because `_ThirdPartyRunnerOutput.chain` can be `None`. If you tighten the contract so `run_third_party_agent` never yields `chain=None`, you can drop this helper and a class of `if output.chain` checks.
```python
@dataclass
class _ThirdPartyRunnerOutput:
    chain: MessageChain
    is_error: bool = False


async def run_third_party_agent(
    runner: "BaseAgentRunner",
    stream_to_general: bool = False,
    custom_error_message: str | None = None,
) -> AsyncGenerator["_ThirdPartyRunnerOutput", None]:
    try:
        async for resp in runner.step_until_done(max_step=30):  # type: ignore[misc]
            if resp.type == "streaming_delta":
                if stream_to_general:
                    continue
                yield _ThirdPartyRunnerOutput(chain=resp.data["chain"], is_error=False)
            elif resp.type == "llm_result":
                if stream_to_general:
                    yield _ThirdPartyRunnerOutput(chain=resp.data["chain"], is_error=False)
            elif resp.type == "err":
                # assume resp.data["chain"] is always a MessageChain for err
                yield _ThirdPartyRunnerOutput(chain=resp.data["chain"], is_error=True)
    except Exception as e:
        logger.error(f"Third party agent runner error: {e}")
        err_msg = custom_error_message or (
            f"Error occurred during AI execution.\n"
            f"Error Type: {type(e).__name__} (3rd party)\n"
            f"Error Message: {str(e)}"
        )
        yield _ThirdPartyRunnerOutput(
            chain=MessageChain().message(err_msg),
            is_error=True,
        )
```
With that, you can delete `_iter_runner_output_chain` and simplify usages.
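As a standalone illustration of the tightened contract, here is a toy sketch (simple strings stand in for the real `MessageChain` and runner API): every yielded output carries a real chain, so consumers can drop their `if output.chain` guards entirely.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class RunnerOutput:
    chain: str            # stand-in for MessageChain; never None under the contract
    is_error: bool = False


async def run_agent():
    # Every yield carries a real chain -- error chunks included -- so
    # consumers need no `if output.chain` guards.
    yield RunnerOutput(chain="partial result")
    yield RunnerOutput(chain="Error: backend unavailable", is_error=True)


async def main() -> None:
    outputs = [out async for out in run_agent()]
    print([o.chain for o in outputs])        # ['partial result', 'Error: backend unavailable']
    print(any(o.is_error for o in outputs))  # True


asyncio.run(main())
```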
### 2. Simplify streaming path (no `_iter_runner_output_chain` + fewer checks)
You don’t need an extra generator layer here once the output type is normalized:
```python
if streaming_response and not stream_to_general:
    stream_has_runner_error = False

    async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
        nonlocal stream_has_runner_error
        async for runner_output in run_third_party_agent(
            runner,
            stream_to_general=False,
            custom_error_message=custom_error_message,
        ):
            if runner_output.is_error:
                stream_has_runner_error = True
                event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, True)
            # chain is guaranteed non-None
            yield runner_output.chain

    event.set_result(
        MessageEventResult()
        .set_result_content_type(ResultContentType.STREAMING_RESULT)
        .set_async_stream(_stream_runner_chain()),
    )
    yield
    if runner.done():
        final_resp = runner.get_final_llm_resp()
        if final_resp and final_resp.result_chain:
            is_runner_error = stream_has_runner_error or final_resp.role == "err"
            event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
            event.set_result(
                MessageEventResult(
                    chain=final_resp.result_chain.chain or [],
                    result_content_type=ResultContentType.STREAMING_FINISH,
                ),
            )
```
This keeps all your current behavior (extra flag, streaming + final resp) but drops one async-generator hop and all `if runner_output.chain` branches.
### 3. Simplify non-streaming fallback accumulation
You can avoid keeping a `list[MessageChain]` and then merging them later; instead, build the final `merged_chain` list as you iterate, and track only a boolean for errors:
```python
else:
    # Non-streaming response, or converted to a general response
    merged_chain: list = []
    fallback_is_error = False
    async for output in run_third_party_agent(
        runner,
        stream_to_general=stream_to_general,
        custom_error_message=custom_error_message,
    ):
        merged_chain.extend(output.chain.chain or [])
        if output.is_error:
            fallback_is_error = True
    yield
    final_resp = runner.get_final_llm_resp()
    if not final_resp or not final_resp.result_chain:
        if merged_chain:
            logger.warning(
                "Agent Runner returned no final response, fallback to streamed error/result chain."
            )
            content_type = (
                ResultContentType.AGENT_RUNNER_ERROR
                if fallback_is_error
                else ResultContentType.LLM_RESULT
            )
            event.set_extra(
                THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY,
                fallback_is_error,
            )
            event.set_result(
                MessageEventResult(
                    chain=merged_chain,
                    result_content_type=content_type,
                ),
            )
            yield
            return
        logger.warning("Agent Runner returned no final result.")
        return
    content_type = (
        ResultContentType.AGENT_RUNNER_ERROR
        if final_resp.role == "err"
        else ResultContentType.LLM_RESULT
    )
    event.set_extra(
        THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY,
        final_resp.role == "err",
    )
    event.set_result(
        MessageEventResult(
            chain=final_resp.result_chain.chain or [],
            result_content_type=content_type,
        ),
    )
    yield
```
This keeps the same behavior (fallback vs final_resp decision, error typing, extras) but reduces:
- `fallback_chains` list of `MessageChain` → a single list of message items.
- `if output.chain` checks → removed under the non-optional contract.
- Nested merging loop at the end → unnecessary.
These changes should make the data flow much easier to follow while preserving all new DeerFlow and error semantics as implemented.
</issue_to_address>
@sourcery-ai review
Code Review
This PR successfully integrates DeerFlow as a new third-party Agent Runner and makes important optimizations and refactors to the streaming and error-handling mechanisms for third-party runners. The main changes are:

- Added `DeerFlowAgentRunner` and its `DeerFlowAPIClient`, implementing integration with a LangGraph-compatible backend, including robust SSE stream handling.
- Improved the lifecycle management of third-party agent runners, adding support for a `close` hook so resources are released correctly.
- Strengthened error-handling semantics: the new `AGENT_RUNNER_ERROR` result type and the `_ThirdPartyRunnerOutput` dataclass let a runner's error state propagate more clearly through the processing pipeline.
- Updated the related configuration, migration scripts, and frontend UI accordingly to support the DeerFlow integration.

Overall code quality is high and the changes are well considered. I have left some refactoring suggestions aimed at improving readability and maintainability, mainly around eliminating redundant code. Please see the individual review comments.
```python
async def _stream_sse(resp: ClientResponse) -> AsyncGenerator[dict[str, Any], None]:
    """Parse SSE response blocks into event/data dictionaries."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""

    async for chunk in resp.content.iter_chunked(8192):
        buffer += _normalize_sse_newlines(decoder.decode(chunk))

        while "\n\n" in buffer:
            block, buffer = buffer.split("\n\n", 1)
            if not block.strip():
                continue

            event_name = "message"
            data_lines: list[str] = []

            for line in block.splitlines():
                if line.startswith("event:"):
                    event_name = line[6:].strip()
                elif line.startswith("data:"):
                    data_lines.append(line[5:].lstrip())

            if not data_lines:
                continue

            data = _parse_sse_data_lines(data_lines)

            yield {"event": event_name, "data": data}

    # flush any remaining buffered text
    buffer += _normalize_sse_newlines(decoder.decode(b"", final=True))
    if not buffer.strip():
        return

    event_name = "message"
    data_lines = []
    for line in buffer.splitlines():
        if line.startswith("event:"):
            event_name = line[6:].strip()
        elif line.startswith("data:"):
            data_lines.append(line[5:].lstrip())
    if not data_lines:
        return

    data = _parse_sse_data_lines(data_lines)
    yield {"event": event_name, "data": data}
```
In `_stream_sse`, the logic for processing SSE blocks is duplicated between the inside of the `while` loop (intermediate blocks) and after the loop (the trailing buffer). This duplicated code could be extracted into a helper function to improve readability and maintainability.
```python
async def _stream_sse(resp: ClientResponse) -> AsyncGenerator[dict[str, Any], None]:
    """Parse SSE response blocks into event/data dictionaries."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""

    def _parse_block(block_content: str) -> dict[str, Any] | None:
        if not block_content.strip():
            return None
        event_name = "message"
        data_lines: list[str] = []
        for line in block_content.splitlines():
            if line.startswith("event:"):
                event_name = line[6:].strip()
            elif line.startswith("data:"):
                data_lines.append(line[5:].lstrip())
        if not data_lines:
            return None
        data = _parse_sse_data_lines(data_lines)
        return {"event": event_name, "data": data}

    async for chunk in resp.content.iter_chunked(8192):
        buffer += _normalize_sse_newlines(decoder.decode(chunk))
        while "\n\n" in buffer:
            block, buffer = buffer.split("\n\n", 1)
            if event := _parse_block(block):
                yield event

    # flush any remaining buffered text
    final_block = buffer + _normalize_sse_newlines(decoder.decode(b"", final=True))
    if event := _parse_block(final_block):
        yield event
```

```python
            stream_has_runner_error = True
            event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, True)
        yield runner_output.chain

event.set_result(
    MessageEventResult()
    .set_result_content_type(ResultContentType.STREAMING_RESULT)
    .set_async_stream(_stream_runner_chain()),
)
yield
if runner.done():
    final_resp = runner.get_final_llm_resp()
    if final_resp and final_resp.result_chain:
        is_runner_error = (
```
The implementation of `_stream_runner_chain` can be simplified. Currently, `_iter_runner_output_chain` already filters out outputs with an empty chain, and then `_stream_runner_chain` checks `if runner_output.chain:` again, which is redundant.
Suggestion: handle this directly in `_stream_runner_chain`, which removes the `_iter_runner_output_chain` helper and makes the code more concise.
Suggested change:

```python
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
    nonlocal stream_has_runner_error
    async for runner_output in run_third_party_agent(
        runner,
        stream_to_general=False,
        custom_error_message=custom_error_message,
    ):
        if not runner_output.chain:
            continue
        if runner_output.is_error:
            stream_has_runner_error = True
            event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, True)
        yield runner_output.chain
```
Hey - I've found 2 issues, and left some high level feedback:
- The `ThirdPartyAgentSubStage.process` method has become quite complex with separate streaming/non-streaming paths and duplicated logic for setting `THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY` and `ResultContentType`; consider extracting small helpers (e.g. for final response resolution and error-flag propagation) to reduce branching and keep the state transitions easier to reason about.
- The new `_ThirdPartyRunnerOutput` + `THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY` + `ResultContentType.AGENT_RUNNER_ERROR` protocol defines runner error semantics in several places; you might centralize this into a small utility or shared abstraction so that any future third-party runners or callers don't need to manually replicate the flag/content-type mapping.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `ThirdPartyAgentSubStage.process` method has become quite complex with separate streaming/non-streaming paths and duplicated logic for setting `THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY` and `ResultContentType`; consider extracting small helpers (e.g. for final response resolution and error-flag propagation) to reduce branching and keep the state transitions easier to reason about.
- The new `_ThirdPartyRunnerOutput` + `THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY` + `ResultContentType.AGENT_RUNNER_ERROR` protocol defines runner error semantics in several places; you might centralize this into a small utility or shared abstraction so that any future third-party runners or callers don’t need to manually replicate the flag/content-type mapping.
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="323-324" />
<code_context>
+ ),
+ )
+ yield
+ finally:
+ await _close_runner_if_supported(runner)
asyncio.create_task(
</code_context>
<issue_to_address>
**issue (bug_risk):** Closing the runner in `finally` may race with ongoing streaming consumption and prematurely tear down resources.
In the streaming path, the async generator (`_stream_runner_chain` / `run_third_party_agent`) is still being consumed after `process` returns. With `finally: await _close_runner_if_supported(runner)`, the runner and its client (e.g., DeerFlow’s SSE session) may be closed while the consumer is still iterating, leading to mid-stream errors. Cleanup for the streaming case should be deferred until consumption is complete (e.g., via a callback from the consumer, or by only closing once `runner.done()` is true).
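A minimal sketch of the deferred-cleanup idea with toy classes (not the actual runner API): closing the runner in the stream generator's `finally` means teardown happens only after the consumer finishes iterating, so nothing is torn down mid-stream.

```python
import asyncio


class Runner:
    """Toy runner whose stream would break if closed mid-iteration."""

    def __init__(self) -> None:
        self.closed = False

    async def step(self):
        for item in ["a", "b"]:
            if self.closed:
                raise RuntimeError("runner closed mid-stream")
            yield item

    async def close(self) -> None:
        self.closed = True


async def stream(runner: Runner):
    try:
        async for item in runner.step():
            yield item
    finally:
        # teardown is deferred until the consumer stops iterating
        await runner.close()


async def main() -> None:
    runner = Runner()
    items = [item async for item in stream(runner)]
    print(items, runner.closed)  # ['a', 'b'] True


asyncio.run(main())
```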
</issue_to_address>
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="213-222" />
<code_context>
+ fallback_is_error = True
+ yield
- final_resp = runner.get_final_llm_resp()
+ final_resp = runner.get_final_llm_resp()
- if not final_resp or not final_resp.result_chain:
- logger.warning("Agent Runner 未返回最终结果。")
- return
+ if not final_resp or not final_resp.result_chain:
+ if merged_chain:
+ logger.warning(
+ "Agent Runner returned no final response, fallback to streamed error/result chain."
+ )
+ content_type = (
+ ResultContentType.AGENT_RUNNER_ERROR
+ if fallback_is_error
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Non-streaming error state is derived only from `final_resp.role` and ignores intermediate `is_error` flags.
In the non-streaming path, `fallback_is_error` is only used when `final_resp` is missing. When `final_resp` exists, `content_type` and `THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY` depend solely on `final_resp.role == "err"`, so any earlier `_ThirdPartyRunnerOutput.is_error` is ignored. This can misclassify results as normal `LLM_RESULT` when an error chunk was emitted earlier. Please fold `fallback_is_error` into the final decision (e.g., OR with `final_resp.role == "err"`) so intermediate error signals are preserved.
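A tiny illustration of folding the intermediate flag into the final decision (the `classify` helper and enum values are illustrative stand-ins for the pipeline's real types):

```python
from enum import Enum


class ResultType(Enum):
    LLM_RESULT = "llm_result"
    AGENT_RUNNER_ERROR = "agent_runner_error"


def classify(fallback_is_error: bool, final_role: str) -> ResultType:
    # OR the intermediate error flag with the final role so earlier
    # error chunks are not masked by a normal-looking final response.
    is_error = fallback_is_error or final_role == "err"
    return ResultType.AGENT_RUNNER_ERROR if is_error else ResultType.LLM_RESULT


print(classify(True, "assistant").value)   # agent_runner_error (previously misclassified)
print(classify(False, "err").value)        # agent_runner_error
print(classify(False, "assistant").value)  # llm_result
```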
</issue_to_address>
@sourcery-ai review
SourceryAI left a comment
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="268-277" />
<code_context>
+ async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
</code_context>
<issue_to_address>
**issue (bug_risk):** Runner cleanup is skipped if the streaming iterator is never consumed, which can leak connections/resources.
In the streaming path, `defer_runner_close_to_stream = True` means we rely on `_stream_runner_chain`'s `finally` to call `_close_runner_once()`. If `async_stream` is set but never iterated (e.g. early return, cancellation, or a bug), that `finally` never runs, and the outer `finally` in `process` also skips `_close_runner_once()` because `defer_runner_close_to_stream` is True. This can leak connections/resources. Please add a fallback so `_close_runner_once()` is guaranteed to run even if the stream is never consumed (e.g. only defer after the first item is pulled, or ensure closure is scheduled on timeout/cancellation).
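A minimal demonstration of this failure mode with toy classes, plus one possible fallback: calling `aclose()` on a never-started async generator skips its body, so `finally`-based cleanup never runs unless the owner also closes the runner itself.

```python
import asyncio


class Runner:
    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def stream(runner: Runner):
    try:
        yield "chunk"
    finally:
        await runner.close()


async def main() -> None:
    runner = Runner()
    gen = stream(runner)
    # The stream is dropped without ever being iterated: aclose() on a
    # never-started generator skips the body, so the finally never runs.
    await gen.aclose()
    leaked = not runner.closed
    # Fallback: the owner closes the runner itself when the stream never ran.
    if not runner.closed:
        await runner.close()
    print(leaked, runner.closed)  # True True


asyncio.run(main())
```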
</issue_to_address>
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="140" />
<code_context>
+ logger.warning(f"Failed to close third-party runner cleanly: {e}")
class ThirdPartyAgentSubStage(Stage):
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the runner result aggregation, error classification, and lifecycle management into small helpers/context managers so that `ThirdPartyAgentSubStage.process` reads as a short, linear orchestration method.
You can keep all the newly added functionality while reducing the mental overhead with a few targeted refactors.
---
### 1. Localize `_ThirdPartyRunnerOutput` usage and centralize error classification
Right now `is_error` is threaded through `run_third_party_agent`, the streaming/non-streaming logic, and the final result setting. You can keep `_ThirdPartyRunnerOutput` but hide it behind small local helpers in `process`, so the rest of the main method only deals with `(chain, is_error)` at one point.
Example: for the non-streaming branch, move the merge and the final decision into a helper:
```python
async def _collect_non_stream_result(
    runner: "BaseAgentRunner",
    stream_to_general: bool,
    custom_error_message: str | None,
) -> tuple[list, bool] | None:
    merged_chain: list = []
    fallback_is_error = False
    async for output in run_third_party_agent(
        runner,
        stream_to_general=stream_to_general,
        custom_error_message=custom_error_message,
    ):
        merged_chain.extend(output.chain.chain or [])
        if output.is_error:
            fallback_is_error = True
    final_resp = runner.get_final_llm_resp()
    if not final_resp or not final_resp.result_chain:
        if not merged_chain:
            return None
        return merged_chain, fallback_is_error
    chain = final_resp.result_chain.chain or []
    is_error = fallback_is_error or final_resp.role == "err"
    return chain, is_error
```
Then the non-streaming branch of `process` becomes a single, linear call:
```python
# Non-streaming response, or converted to a general response
result = await _collect_non_stream_result(
    runner,
    stream_to_general=stream_to_general,
    custom_error_message=custom_error_message,
)
if result is None:
    logger.warning("Agent Runner returned no final result.")
    return
chain, is_error = result
_set_non_stream_runner_result(event=event, chain=chain, is_error=is_error)
yield
```
This keeps the wrapper type while sparing the main method from having to reason about `merged_chain`, `fallback_is_error`, and `final_resp` in separate places.
A similar helper works for the streaming branch: compute `is_runner_error` once after the stream finishes, instead of mutating `stream_has_runner_error` and updating the event's extras inside the generator.
---
### 2. Wrap the runner lifecycle in a small async context manager
Right now you have to juggle `runner_closed`, `defer_runner_close_to_stream`, `_close_runner_once`, and a conditionally closing `finally`. All of this can be wrapped in a tiny context manager that guarantees the runner is closed at most once:
```python
class _RunnerLifetime:
    def __init__(self, runner: "BaseAgentRunner") -> None:
        self._runner = runner
        self._closed = False

    async def close(self) -> None:
        if self._closed:
            return
        self._closed = True
        await _close_runner_if_supported(self._runner)

    async def __aenter__(self) -> "BaseAgentRunner":
        return self._runner

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.close()
```
Then `process` can become:
```python
async with _RunnerLifetime(runner) as managed_runner:
    await managed_runner.reset(...)
    if streaming_response and not stream_to_general:
        # assign a stream that closes the runner at the end
        async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
            try:
                async for output in run_third_party_agent(
                    managed_runner,
                    stream_to_general=False,
                    custom_error_message=custom_error_message,
                ):
                    ...
                    yield output.chain
            finally:
                await lifetime.close()  # explicit, but __aexit__ is also a safety net

        event.set_result(...set_async_stream(_stream_runner_chain()))
        yield
        # post-stream final result handling here (no extra close logic)
    else:
        # non-streaming path here (no defer flags, no extra finally)
```
Key points:
- `ThirdPartyAgentSubStage.process` no longer needs `runner_closed` or `defer_runner_close_to_stream`.
- All close-related semantics are concentrated in `_RunnerLifetime` and `_close_runner_if_supported`, which is easier to understand and test.
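A runnable sketch of the at-most-once behavior with a fake runner (toy names; `_close_runner_if_supported` is replaced by a direct call), showing that a manual early `close()` and `__aexit__` compose safely:

```python
import asyncio


class FakeRunner:
    def __init__(self) -> None:
        self.close_calls = 0

    async def close(self) -> None:
        self.close_calls += 1


class RunnerLifetime:
    """At-most-once closer: manual close() and __aexit__ can both fire safely."""

    def __init__(self, runner: FakeRunner) -> None:
        self._runner = runner
        self._closed = False

    async def close(self) -> None:
        if self._closed:
            return
        self._closed = True
        await self._runner.close()

    async def __aenter__(self) -> FakeRunner:
        return self._runner

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.close()


async def main() -> None:
    runner = FakeRunner()
    lifetime = RunnerLifetime(runner)
    async with lifetime:
        await lifetime.close()   # early close, e.g. from the stream's finally
    print(runner.close_calls)    # 1 -- __aexit__ became a no-op


asyncio.run(main())
```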
---
### 3. Let the streaming generator focus on emitting data
Currently `_stream_runner_chain` does two things at once:
- yields chains to the client;
- mutates `stream_has_runner_error` and calls `_set_runner_error_extra`.
You can keep the error tracking but move the metadata handling into a step that runs after the stream ends. For example:
```python
async def _stream_runner_chain(
    event: "AstrMessageEvent",
    runner: "BaseAgentRunner",
    custom_error_message: str | None,
) -> AsyncGenerator[MessageChain, None]:
    nonlocal stream_has_runner_error
    try:
        async for output in run_third_party_agent(
            runner,
            stream_to_general=False,
            custom_error_message=custom_error_message,
        ):
            if output.is_error:
                stream_has_runner_error = True
            yield output.chain
    finally:
        await lifetime.close()
```
Then, after the `yield` in `process`:
```python
if runner.done():
    final_resp = runner.get_final_llm_resp()
    if final_resp and final_resp.result_chain:
        is_runner_error = stream_has_runner_error or final_resp.role == "err"
        _set_runner_error_extra(event, is_runner_error)
        event.set_result(
            MessageEventResult(
                chain=final_resp.result_chain.chain or [],
                result_content_type=ResultContentType.STREAMING_FINISH,
            ),
        )
```
This way the generator focuses solely on streaming output, while `process` performs classification and metadata handling in one place.
---
These changes preserve:
- the classification of errors vs. normal results, including intermediate error signals;
- the DeerFlow integration;
- safe, idempotent runner closing.
But they move most of the branching and state into small, named helpers/context managers, making `ThirdPartyAgentSubStage.process` shorter and easier to follow.
</issue_to_address>
感谢你通过评论 @sourcery-ai review 来体验 Sourcery!🚀
安装 sourcery-ai bot 即可在每一个 Pull Request 上自动获得代码审查 ✨
帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进后续的审查。Original comment in English
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="268-277" />
<code_context>
+ async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
</code_context>
<issue_to_address>
**issue (bug_risk):** Runner cleanup is skipped if the streaming iterator is never consumed, which can leak connections/resources.
In the streaming path `defer_runner_close_to_stream = True` means we rely on `_stream_runner_chain`'s `finally` to call `_close_runner_once()`. If `async_stream` is set but never iterated (e.g. early return, cancellation, or a bug), that `finally` never runs and the outer `finally` in `process` also skips `_close_runner_once()` because `defer_runner_close_to_stream` is True. This can leak connections/resources. Please add a fallback so `_close_runner_once()` is guaranteed to run even if the stream is never consumed (e.g. only defer after the first item is pulled, or ensure closure is scheduled on timeout/cancellation).
</issue_to_address>
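To make the risk concrete: an async generator's `finally` clause only runs once the body has started, so a stream that is assigned but never consumed really does skip cleanup. A minimal, self-contained sketch (using a `DummyRunner` stand-in rather than the real `BaseAgentRunner`) showing the leak and a caller-side fallback:

```python
import asyncio

class DummyRunner:
    """Stand-in for BaseAgentRunner; only close-tracking matters here."""
    def __init__(self):
        self.closed = False
    async def close(self):
        self.closed = True

async def demo():
    runner = DummyRunner()
    started = False

    async def stream():
        nonlocal started
        started = True
        try:
            yield "chunk"
        finally:
            await runner.close()

    gen = stream()
    # The generator object exists but was never iterated, so its body
    # (including the finally clause) never ran; aclose() on an
    # unstarted generator does not execute the body either.
    await gen.aclose()
    leaked = not runner.closed  # True: the runner leaked

    # Fallback: the caller closes the runner itself whenever the
    # stream never actually started.
    if not started:
        await runner.close()
    return leaked, runner.closed

leaked, closed = asyncio.run(demo())
```

This is why deferring the close only after the first item is pulled (or keeping a caller-side safety net) matters.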
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="140" />
<code_context>
+ logger.warning(f"Failed to close third-party runner cleanly: {e}")
class ThirdPartyAgentSubStage(Stage):
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the runner result aggregation, error classification, and lifecycle management into small helpers/context managers so that `ThirdPartyAgentSubStage.process` reads as a short, linear orchestration method.
You can keep all the new functionality but reduce the mental overhead with a couple of targeted refactors.
---
### 1. Localize `_ThirdPartyRunnerOutput` usage and centralize error classification
Right now `is_error` is threaded through `run_third_party_agent`, streaming/non‑streaming, and final result setting. You can keep `_ThirdPartyRunnerOutput` but hide it behind small local helpers in `process`, so the rest of the method only deals with `(chain, is_error)` at one point.
Example: for non‑streaming, move the merge + final decision into a helper:
```python
async def _collect_non_stream_result(
runner: "BaseAgentRunner",
stream_to_general: bool,
custom_error_message: str | None,
) -> tuple[list, bool] | None:
merged_chain: list = []
fallback_is_error = False
async for output in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(output.chain.chain or [])
if output.is_error:
fallback_is_error = True
final_resp = runner.get_final_llm_resp()
if not final_resp or not final_resp.result_chain:
if not merged_chain:
return None
return merged_chain, fallback_is_error
chain = final_resp.result_chain.chain or []
is_error = fallback_is_error or final_resp.role == "err"
return chain, is_error
```
Then `process`’s non‑streaming branch becomes a single, linear call:
```python
    # Non-streaming response, or stream converted to a general response
result = await _collect_non_stream_result(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
if result is None:
logger.warning("Agent Runner 未返回最终结果。")
return
chain, is_error = result
_set_non_stream_runner_result(event=event, chain=chain, is_error=is_error)
yield
```
This keeps the wrapper type but removes the need to reason about `merged_chain`, `fallback_is_error`, and `final_resp` scattered across the main method.
A similar helper can be used for streaming to compute `is_runner_error` once, after streaming is done, instead of mutating `stream_has_runner_error` + event extras from inside the generator.
---
### 2. Encapsulate runner lifecycle with a small async context manager
You currently juggle `runner_closed`, `defer_runner_close_to_stream`, `_close_runner_once`, and a `finally` that conditionally closes. All of that can be wrapped in a tiny context manager that guarantees single close:
```python
class _RunnerLifetime:
def __init__(self, runner: "BaseAgentRunner") -> None:
self._runner = runner
self._closed = False
async def close(self) -> None:
if self._closed:
return
self._closed = True
await _close_runner_if_supported(self._runner)
async def __aenter__(self) -> "BaseAgentRunner":
return self._runner
async def __aexit__(self, exc_type, exc, tb) -> None:
await self.close()
```
Then `process` becomes:
```python
async with _RunnerLifetime(runner) as managed_runner:
await managed_runner.reset(...)
if streaming_response and not stream_to_general:
# assign a stream that closes the runner at the end
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
try:
async for output in run_third_party_agent(
managed_runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
...
yield output.chain
finally:
await lifetime.close() # explicit, but __aexit__ is also a safety net
event.set_result(...set_async_stream(_stream_runner_chain()))
yield
# post‑stream final result handling here (no extra close logic)
else:
# non‑streaming path here (no defer flags, no extra finally)
```
Key points:
- `ThirdPartyAgentSubStage.process` no longer needs `runner_closed` or `defer_runner_close_to_stream`.
- All closing semantics live in `_RunnerLifetime` and `_close_runner_if_supported`, which are easy to reason about and test.
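As a self-contained illustration of the idempotent-close guarantee (using a hypothetical `DummyRunner` call counter instead of the real runner and `_close_runner_if_supported`):

```python
import asyncio

class DummyRunner:
    def __init__(self):
        self.close_calls = 0
    async def close(self):
        self.close_calls += 1

class RunnerLifetime:
    """Simplified sketch of _RunnerLifetime: guarantees a single close."""
    def __init__(self, runner):
        self._runner = runner
        self._closed = False
    async def close(self):
        if self._closed:
            return
        self._closed = True
        await self._runner.close()
    async def __aenter__(self):
        return self._runner
    async def __aexit__(self, exc_type, exc, tb):
        await self.close()

async def demo():
    runner = DummyRunner()
    lifetime = RunnerLifetime(runner)
    async with lifetime:
        await lifetime.close()  # explicit close inside the block
    # __aexit__ also ran, but the second close was a no-op.
    return runner.close_calls

calls = asyncio.run(demo())
```

Even with both the explicit `close()` and `__aexit__`, the underlying runner is closed exactly once.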
---
### 3. Make the streaming generator only responsible for emitting data
Currently `_stream_runner_chain` both:
- Emits chains to the client, and
- Mutates `stream_has_runner_error` and calls `_set_runner_error_extra`.
You can keep error tracking but move the metadata application to a single post‑stream step. For example:
```python
async def _stream_runner_chain(
event: "AstrMessageEvent",
runner: "BaseAgentRunner",
custom_error_message: str | None,
) -> AsyncGenerator[MessageChain, None]:
nonlocal stream_has_runner_error
try:
async for output in run_third_party_agent(
runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
if output.is_error:
stream_has_runner_error = True
yield output.chain
finally:
await lifetime.close()
```
Then, after `yield` in `process`:
```python
if runner.done():
final_resp = runner.get_final_llm_resp()
if final_resp and final_resp.result_chain:
is_runner_error = stream_has_runner_error or final_resp.role == "err"
_set_runner_error_extra(event, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_resp.result_chain.chain or [],
result_content_type=ResultContentType.STREAMING_FINISH,
),
)
```
The generator focuses on streaming; `process` does the classification and metadata in one place.
---
These changes preserve:
- Error vs normal classification, including intermediate error signals.
- DeerFlow integration.
- Safe, idempotent runner closing.
But they pull most of the branching and state into small, named helpers/context managers, making `ThirdPartyAgentSubStage.process` shorter and easier to follow.
</issue_to_address>Hi @zouyonghe! 👋
Thanks for trying out Sourcery by commenting with @sourcery-ai review! 🚀
Install the sourcery-ai bot to get automatic code reviews on every pull request ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py
astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="264-273" />
<code_context>
+ f"DeerFlow agent reached max_step ({max_step}) without completion."
+ )
+
+ def _extract_new_messages_from_values(
+ self,
+ values_messages: list[T.Any],
+ state: _StreamState,
+ ) -> list[dict[str, T.Any]]:
+ new_messages: list[dict[str, T.Any]] = []
+ for msg in values_messages:
+ if not isinstance(msg, dict):
+ continue
+ msg_id = get_message_id(msg)
+ if not msg_id or msg_id in state.seen_message_ids:
+ continue
</code_context>
<issue_to_address>
**issue (bug_risk):** Messages without an `id` are permanently ignored in values events, which can break streaming for backends that don't assign message IDs.
In `_extract_new_messages_from_values`, the condition `if not msg_id or msg_id in state.seen_message_ids: continue` drops all messages without an `id` from `new_messages` and `run_values_messages`. For backends or custom message types that omit `id`, this means:
- The baseline run sets `baseline_initialized` but never records those messages.
- All later values updates without IDs are skipped.
- Streaming output from `extract_latest_ai_text(state.run_values_messages)` stays empty, and the non-streaming path may have to fall back on `fallback_stream_text`/task failures.
Please add a fallback for missing `id`s (e.g. index-based tracking, treating any dict without an `id` as new, or generating synthetic IDs per stream) so values-based updates still work when upstream messages lack stable IDs.
</issue_to_address>
### Comment 2
<location path="astrbot/core/agent/runners/deerflow/deerflow_api_client.py" line_range="150-153" />
<code_context>
+ ) -> AsyncGenerator[dict[str, Any], None]:
+ session = self._get_session()
+ url = f"{self.api_base}/api/langgraph/threads/{thread_id}/runs/stream"
+ logger.debug(f"deerflow stream_run payload: {payload}")
+ # For long-running SSE streams, avoid aiohttp total timeout.
+ # Use socket read timeout so active heartbeats/chunks can keep the stream alive.
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Streaming payload logging may expose sensitive user content in debug logs.
`logger.debug(f"deerflow stream_run payload: {payload}")` logs the entire request, including prompts, images, and other potentially sensitive fields, which may end up in centralized log stores.
Please either remove this log, redact/summarize the payload (e.g. log only IDs and sizes/lengths), or gate it behind a dedicated debug/trace flag so raw user content is not written to logs.
```suggestion
session = self._get_session()
url = f"{self.api_base}/api/langgraph/threads/{thread_id}/runs/stream"
# Log only a minimal summary to avoid exposing sensitive user content.
logger.debug(
"deerflow stream_run payload summary: thread_id=%s, keys=%s",
thread_id,
list(payload.keys()),
)
# For long-running SSE streams, avoid aiohttp total timeout.
```
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Original comment in English
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="264-273" />
<code_context>
+ f"DeerFlow agent reached max_step ({max_step}) without completion."
+ )
+
+ def _extract_new_messages_from_values(
+ self,
+ values_messages: list[T.Any],
+ state: _StreamState,
+ ) -> list[dict[str, T.Any]]:
+ new_messages: list[dict[str, T.Any]] = []
+ for msg in values_messages:
+ if not isinstance(msg, dict):
+ continue
+ msg_id = get_message_id(msg)
+ if not msg_id or msg_id in state.seen_message_ids:
+ continue
</code_context>
<issue_to_address>
**issue (bug_risk):** Messages without an `id` are permanently ignored in values events, which can break streaming for backends that don't assign message IDs.
In `_extract_new_messages_from_values`, the condition `if not msg_id or msg_id in state.seen_message_ids: continue` causes all messages without an `id` to be dropped from `new_messages` and `run_values_messages`. For backends or custom message types that omit `id`, this means:
- The baseline run sets `baseline_initialized` but never records those messages.
- All later values updates without IDs are skipped.
- Streaming output from `extract_latest_ai_text(state.run_values_messages)` stays empty, and non-streaming may have to rely on `fallback_stream_text`/task failures.
Please add a fallback for missing `id`s (e.g. index-based tracking, treating any dict-without-`id` as new, or generating synthetic IDs per stream) so values-based updates still work when upstream messages lack stable IDs.
</issue_to_address>
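One possible shape of the synthetic-ID fallback, sketched with hypothetical helpers (`synthetic_message_id` and a simplified `extract_new_messages`) rather than the actual runner code — it derives a stable ID from the message's snapshot index plus a content digest:

```python
import hashlib
import json

def get_message_id(msg):
    return msg.get("id")

def synthetic_message_id(msg, index):
    """Hypothetical fallback: derive a stable ID from position + content."""
    digest = hashlib.sha1(
        json.dumps(msg, sort_keys=True, default=str).encode()
    ).hexdigest()[:12]
    return f"noid-{index}-{digest}"

def extract_new_messages(values_messages, seen_ids):
    """Simplified sketch of _extract_new_messages_from_values with the
    fallback applied: id-less messages are tracked instead of dropped."""
    new_messages = []
    for index, msg in enumerate(values_messages):
        if not isinstance(msg, dict):
            continue
        msg_id = get_message_id(msg) or synthetic_message_id(msg, index)
        if msg_id in seen_ids:
            continue
        seen_ids.add(msg_id)
        new_messages.append(msg)
    return new_messages

seen = set()
first = extract_new_messages([{"content": "a"}, {"id": "m1", "content": "b"}], seen)
# Re-sending the same values snapshot yields no duplicates, even
# for the message that has no id.
second = extract_new_messages([{"content": "a"}, {"id": "m1", "content": "b"}], seen)
```

Because `values` events carry full state snapshots, index + digest is stable enough for de-duplication; any other per-stream synthetic ID scheme would work as well.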
### Comment 2
<location path="astrbot/core/agent/runners/deerflow/deerflow_api_client.py" line_range="150-153" />
<code_context>
+ ) -> AsyncGenerator[dict[str, Any], None]:
+ session = self._get_session()
+ url = f"{self.api_base}/api/langgraph/threads/{thread_id}/runs/stream"
+ logger.debug(f"deerflow stream_run payload: {payload}")
+ # For long-running SSE streams, avoid aiohttp total timeout.
+ # Use socket read timeout so active heartbeats/chunks can keep the stream alive.
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Streaming payload logging may expose sensitive user content in debug logs.
`logger.debug(f"deerflow stream_run payload: {payload}")` logs the entire request, including prompts, images, and other potentially sensitive fields, which may end up in centralized log stores.
Please either remove this log, redact/summarize the payload (e.g., IDs and size/lengths only), or gate it behind a dedicated debug/trace flag so raw user content is not written to logs.
```suggestion
session = self._get_session()
url = f"{self.api_base}/api/langgraph/threads/{thread_id}/runs/stream"
# Log only a minimal summary to avoid exposing sensitive user content.
logger.debug(
"deerflow stream_run payload summary: thread_id=%s, keys=%s",
thread_id,
list(payload.keys()),
)
# For long-running SSE streams, avoid aiohttp total timeout.
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@sourcery-ai review
Hey - I've found 4 issues, and left some high-level feedback:
- In `third_party.py`, `_close_runner_if_supported` uses `inspect.isawaitable`, but `inspect` is not imported in the diff; double-check that this module imports `inspect`, or add it to avoid a `NameError` at runtime.
- When building the DeerFlow API base URL in `DeerFlowAgentRunner.reset`, you currently `raise Exception` on an invalid format; consider a more specific exception type (e.g. `ValueError`) so callers can distinguish configuration errors from runtime failures.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `third_party.py`, `_close_runner_if_supported` uses `inspect.isawaitable` but `inspect` is not imported in the diff; double-check that this module imports `inspect` or add it to avoid a `NameError` at runtime.
- When building the DeerFlow API base URL in `DeerFlowAgentRunner.reset`, you currently `raise Exception` on invalid format; consider using a more specific exception type (e.g., `ValueError`) so callers can distinguish configuration errors from runtime failures.
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="603-604" />
<code_context>
+
+ final_text = self._resolve_final_text(state)
+
+ chain = MessageChain(chain=[Comp.Plain(final_text)])
+ self.final_llm_resp = LLMResponse(role="assistant", result_chain=chain)
+ self._transition_state(AgentState.DONE)
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Timeouts and task-failure-only runs are surfaced as successful assistant responses, which prevents downstream error handling from recognizing them as runner errors.
Here `LLMResponse` is always created with `role="assistant"`, even when `state.timed_out` is true or the response consists only of `task_failures`. But `ThirdPartyAgentSubStage` relies on `final_resp.role == "err"` (or `fallback_is_error`) to distinguish `AGENT_RUNNER_ERROR` from a normal `LLM_RESULT`. As a result, DeerFlow timeouts and pure subtask failures are reported as normal assistant output instead of runner errors. Please set an error-specific role (e.g. `"err"`) when `state.timed_out` is true, or when `final_text` is derived solely from task failures, so the outer pipeline can correctly classify these as `AGENT_RUNNER_ERROR`.
Suggested implementation:
```python
final_text = self._resolve_final_text(state)
# Determine whether this final response represents an error
# rather than a normal assistant result. We treat timeouts and
# "pure task failure" runs as runner errors so that upstream
# stages (e.g. ThirdPartyAgentSubStage) can classify them as
# AGENT_RUNNER_ERROR instead of LLM_RESULT.
timed_out = bool(getattr(state, "timed_out", False))
# Heuristic: if we have task failures but no successful / normal
# outputs, consider this a pure-failure run.
task_failures = getattr(state, "task_failures", None)
has_task_failures = bool(task_failures)
# The exact fields used to track successful results may differ
# across implementations; we conservatively check for several
# common containers.
successful_tasks = getattr(state, "successful_tasks", None)
results = getattr(state, "results", None)
messages = getattr(state, "messages", None)
has_non_failure_results = bool(successful_tasks) or bool(results) or bool(messages)
failures_only = has_task_failures and not has_non_failure_results
is_error = timed_out or failures_only
role = "err" if is_error else "assistant"
chain = MessageChain(chain=[Comp.Plain(final_text)])
self.final_llm_resp = LLMResponse(role=role, result_chain=chain)
self._transition_state(AgentState.DONE)
```
The implementation above assumes the `state` object exposes `task_failures` and at least one of `successful_tasks`, `results`, or `messages` to distinguish "pure failure" runs from mixed success/failure runs. If your DeerFlow state structure uses different attribute names, adjust the `successful_tasks`, `results`, and `messages` lookups accordingly so that `failures_only` is `True` exactly when `final_text` is derived solely from task failures. If there is already an explicit flag or method indicating "the final result consists only of failures", use it instead of this heuristic (e.g. `failures_only = state.final_result_from_failures_only`).
</issue_to_address>
### Comment 2
<location path="astrbot/core/agent/runners/deerflow/deerflow_stream_utils.py" line_range="22-25" />
<code_context>
+ return extract_text(content["kwargs"].get("content"))
+ if isinstance(content, list):
+ parts: list[str] = []
+ for item in content:
+ if isinstance(item, str):
+ parts.append(item)
+ elif isinstance(item, dict):
+ item_type = item.get("type")
+ if item_type == "text" and isinstance(item.get("text"), str):
+ parts.append(item["text"])
+ elif "content" in item:
+ parts.append(str(item["content"]))
+ return "\n".join([p for p in parts if p]).strip()
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Nested content in list items is stringified instead of being recursively flattened, which can degrade output quality.
In `extract_text`, when `content` is a list and an element is a dict, this branch:
```python
elif "content" in item:
parts.append(str(item["content"]))
```
emits the raw `repr` of nested content (often dicts/lists) instead of readable text. This is inconsistent with the recursive handling in the dict case above and can expose internal structures (e.g. tool-call payloads). You can fix this by calling `extract_text(item["content"])` here so nested content is flattened consistently.
```suggestion
if item_type == "text" and isinstance(item.get("text"), str):
parts.append(item["text"])
elif "content" in item:
parts.append(extract_text(item["content"]))
```
</issue_to_address>
### Comment 3
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="242" />
<code_context>
- provider_config=self.prov_cfg,
- streaming=streaming_response,
- )
+ runner_closed = False
+ defer_runner_close_to_stream = False
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the streaming and non-streaming handling (plus shared error aggregation) into helper functions to flatten `process` and reduce nested control flow and flags.
You can keep all the new behavior while reducing the cognitive load in `process` by:
1. **Extracting the streaming vs. non-streaming branches into helpers**
This removes the nested closures and flag variables from `process` and localizes the logic.
```python
async def _handle_streaming_runner(
self,
runner: "BaseAgentRunner",
event: "AstrMessageEvent",
custom_error_message: str | None,
) -> None:
stream_has_runner_error = False
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
nonlocal stream_has_runner_error
async for runner_output in run_third_party_agent(
runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
if runner_output.is_error:
stream_has_runner_error = True
_set_runner_error_extra(event, True)
yield runner_output.chain
event.set_result(
MessageEventResult()
.set_result_content_type(ResultContentType.STREAMING_RESULT)
.set_async_stream(_stream_runner_chain()),
)
# caller is responsible for yielding once to start streaming
if runner.done():
final_resp = runner.get_final_llm_resp()
if final_resp and final_resp.result_chain:
is_runner_error = _aggregate_runner_error(
has_intermediate_error=stream_has_runner_error,
final_resp=final_resp,
)
_set_runner_error_extra(event, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_resp.result_chain.chain or [],
result_content_type=ResultContentType.STREAMING_FINISH,
),
)
```
```python
async def _handle_non_streaming_runner(
self,
runner: "BaseAgentRunner",
event: "AstrMessageEvent",
stream_to_general: bool,
custom_error_message: str | None,
) -> None:
merged_chain: list = []
has_intermediate_error = False
async for output in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(output.chain.chain or [])
if output.is_error:
has_intermediate_error = True
final_resp = runner.get_final_llm_resp()
if not final_resp or not final_resp.result_chain:
if merged_chain:
logger.warning(
"Agent Runner returned no final response, "
"fallback to streamed error/result chain."
)
_set_non_stream_runner_result(
event=event,
chain=merged_chain,
is_error=has_intermediate_error,
)
return
logger.warning("Agent Runner 未返回最终结果。")
return
is_runner_error = _aggregate_runner_error(
has_intermediate_error=has_intermediate_error,
final_resp=final_resp,
)
_set_non_stream_runner_result(
event=event,
chain=final_resp.result_chain.chain or [],
is_error=is_runner_error,
)
```
With this, `process` becomes much flatter:
```python
runner_closed = False
async def _close_runner_once() -> None:
nonlocal runner_closed
if runner_closed:
return
runner_closed = True
await _close_runner_if_supported(runner)
try:
await runner.reset(...)
if streaming_response and not stream_to_general:
await self._handle_streaming_runner(
runner=runner,
event=event,
custom_error_message=custom_error_message,
)
else:
await self._handle_non_streaming_runner(
runner=runner,
event=event,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
yield
finally:
await _close_runner_once()
```
This removes `defer_runner_close_to_stream` entirely; streaming cleanup is handled implicitly by the `_stream_runner_chain` generator finishing and by the single `_close_runner_once` in `finally`.
2. **Unifying the error-aggregation logic**
Right now you compute the same thing in different branches:
```python
is_runner_error = fallback_is_error or final_resp.role == "err"
# and
is_runner_error = stream_has_runner_error or final_resp.role == "err"
```
You can centralize this in a small helper shared by both branches:
```python
def _aggregate_runner_error(
has_intermediate_error: bool,
final_resp: "LLMResponse | None",
) -> bool:
if not final_resp:
return has_intermediate_error
return has_intermediate_error or final_resp.role == "err"
```
This keeps the current semantics unchanged while making the error behavior easier to audit, and avoids duplicating logic across the streaming/non-streaming paths.
These two extractions (streaming vs. non-streaming handlers plus a shared `_aggregate_runner_error`) preserve your new features (`_ThirdPartyRunnerOutput`, error extras, runner closing) while substantially reducing the nesting and flag-based control flow in `process`.
</issue_to_address>
### Comment 4
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="46" />
<code_context>
+
+ _MAX_VALUES_HISTORY = 200
+
+ @dataclass
+ class _StreamState:
+ streamed_text: str = ""
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the streaming state and timeout handling by unifying text tracking into a single `latest_text` flow and centralizing timeout handling at the end of the request.
You can significantly reduce complexity, without losing any behavior, by unifying the streaming text state and centralizing timeout handling.
### 1. Merge `streamed_text`, `fallback_stream_text`, and `run_values_messages`
Currently `_StreamState` tracks:
- `streamed_text` (built from `values`),
- `fallback_stream_text` (built from `messages-*`),
- `run_values_messages` (the canonical messages for the final result).
You can make `run_values_messages` the single source of truth whenever `values` events exist, falling back to `messages`-based deltas only when no `values` updates arrive at all.
Concretely:
- use `latest_text` and `prev_text_for_streaming` instead of the separate `streamed_text` + `fallback_stream_text`;
- in `_handle_values_event`, update `state.latest_text` from `run_values_messages` and compute deltas against `prev_text_for_streaming`;
- in `_handle_message_event`, set `latest_text` only when `values` has never provided text (pure fallback mode).
This lets `_resolve_final_text` become much simpler.
```python
@dataclass
class _StreamState:
latest_text: str = ""
prev_text_for_streaming: str = ""
clarification_text: str = ""
task_failures: list[str] = field(default_factory=list)
# de-dup machinery as-is
seen_message_ids: set[str] = field(default_factory=set)
seen_message_order: deque[str] = field(default_factory=deque)
no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
baseline_initialized: bool = False
run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
timed_out: bool = False
```
```python
def _handle_values_event(self, data: T.Any, state: _StreamState) -> list[AgentResponse]:
responses: list[AgentResponse] = []
values_messages = extract_messages_from_values_data(data)
if not values_messages:
return responses
if not state.baseline_initialized:
state.baseline_initialized = True
# existing baseline init logic unchanged...
...
return responses
new_messages = self._extract_new_messages_from_values(values_messages, state)
if new_messages:
state.run_values_messages.extend(new_messages)
if len(state.run_values_messages) > self._MAX_VALUES_HISTORY:
state.run_values_messages = state.run_values_messages[-self._MAX_VALUES_HISTORY :]
latest_text = extract_latest_ai_text(state.run_values_messages) or ""
latest_clarification = extract_latest_clarification_text(state.run_values_messages)
if latest_clarification:
state.clarification_text = latest_clarification
else:
latest_text = ""
if not self.streaming or not latest_text:
state.latest_text = latest_text or state.latest_text
return responses
# single streaming path based on latest_text vs prev_text_for_streaming
if latest_text.startswith(state.prev_text_for_streaming):
delta = latest_text[len(state.prev_text_for_streaming) :]
else:
delta = latest_text
if delta:
state.latest_text = latest_text
state.prev_text_for_streaming = latest_text
responses.append(
AgentResponse(
type="streaming_delta",
data=AgentResponseData(chain=MessageChain().message(delta)),
)
)
return responses
```
```python
def _handle_message_event(self, data: T.Any, state: _StreamState) -> AgentResponse | None:
delta = extract_ai_delta_from_event_data(data)
response: AgentResponse | None = None
# Only use messages as a fallback when values never provided text
if delta and not state.run_values_messages:
state.latest_text += delta
if self.streaming:
response = AgentResponse(
type="streaming_delta",
data=AgentResponseData(chain=MessageChain().message(delta)),
)
maybe_clarification = extract_clarification_from_event_data(data)
if maybe_clarification:
state.clarification_text = maybe_clarification
return response
```
```python
def _resolve_final_text(self, state: _StreamState) -> str:
if state.clarification_text:
final_text = state.clarification_text
else:
# Prefer canonical values-based text, else fallback `latest_text`
values_text = extract_latest_ai_text(state.run_values_messages)
final_text = values_text or state.latest_text
if not final_text:
final_text = build_task_failure_summary(state.task_failures)
if not final_text:
logger.warning("DeerFlow returned no text content in stream events.")
final_text = "DeerFlow returned an empty response."
return final_text
```
This removes the parallel `streamed_text` and `fallback_stream_text` concepts, keeping a single `latest_text` that is easier to reason about, while still preserving:
- the canonical values-based final text,
- the message-based fallback,
- streaming delta output.
### 2. Centralize timeout behavior in `_execute_deerflow_request`
Timeout handling is currently spread across:
- `_format_exception`,
- `_execute_deerflow_request` (`state.timed_out = True`),
- `_resolve_final_text` (re-raising `TimeoutError` when there is no text).
You can keep the external behavior unchanged while clarifying the control flow by deciding on timeouts once at the end of `_execute_deerflow_request` and making `_resolve_final_text` never raise.
```python
async def _execute_deerflow_request(self):
...
state = self._StreamState()
try:
async for event in self.api_client.stream_run(
thread_id=thread_id,
payload=payload,
timeout=self.timeout,
):
...
except (asyncio.TimeoutError, TimeoutError):
state.timed_out = True
final_text = self._resolve_final_text(state)
if state.timed_out:
timeout_note = (
f"DeerFlow stream timed out after {self.timeout}s. "
"Returning partial result."
)
final_text = f"{final_text}\n\n{timeout_note}" if final_text else timeout_note
chain = MessageChain(chain=[Comp.Plain(final_text)])
...
```
This way `_resolve_final_text` doesn't need to know about `timed_out` or raise `TimeoutError`; it only computes "the best text available right now". User-visible behavior stays the same, while the timeout decision lives in one place, making the overall flow easier to follow.
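The streaming-delta rule used in the sketch above (emit only the new suffix when `latest_text` extends the previous text, otherwise re-emit everything) can be isolated as a tiny helper to illustrate the behavior:

```python
def compute_delta(prev_text: str, latest_text: str) -> str:
    """Prefix-diff rule from the sketch: if the latest text extends the
    previous text, emit only the new suffix; otherwise the upstream
    rewrote its answer, so re-emit the full latest text."""
    if latest_text.startswith(prev_text):
        return latest_text[len(prev_text):]
    return latest_text

d1 = compute_delta("", "Hello")                      # first chunk
d2 = compute_delta("Hello", "Hello, world")          # incremental suffix
d3 = compute_delta("Hello, world", "Rewritten answer")  # full rewrite
```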
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Original comment in English
Hey - I've found 4 issues, and left some high level feedback:
- In `third_party.py`, `_close_runner_if_supported` uses `inspect.isawaitable` but `inspect` is not imported in the diff; double-check that this module imports `inspect` or add it to avoid a `NameError` at runtime.
- When building the DeerFlow API base URL in `DeerFlowAgentRunner.reset`, you currently `raise Exception` on invalid format; consider using a more specific exception type (e.g., `ValueError`) so callers can distinguish configuration errors from runtime failures.
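For reference, a minimal sketch of what such a duck-typed close helper typically looks like — the real `_close_runner_if_supported` may differ; this only illustrates the `inspect.isawaitable` dependency flagged above:

```python
import asyncio
import inspect

async def close_runner_if_supported(runner) -> bool:
    """Hypothetical sketch: call close() whether it is sync or async,
    and report whether anything was closed."""
    close = getattr(runner, "close", None)
    if not callable(close):
        return False
    result = close()
    if inspect.isawaitable(result):  # NameError at runtime if inspect is missing
        await result
    return True

class SyncRunner:
    def close(self):
        self.closed = True

class AsyncRunner:
    async def close(self):
        self.closed = True

sync_ok = asyncio.run(close_runner_if_supported(SyncRunner()))
async_ok = asyncio.run(close_runner_if_supported(AsyncRunner()))
no_close = asyncio.run(close_runner_if_supported(object()))
```

The `inspect.isawaitable` branch is the line that fails at runtime, not import time, which is why the missing import is easy to miss in review.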
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `third_party.py`, `_close_runner_if_supported` uses `inspect.isawaitable` but `inspect` is not imported in the diff; double-check that this module imports `inspect` or add it to avoid a `NameError` at runtime.
- When building the DeerFlow API base URL in `DeerFlowAgentRunner.reset`, you currently `raise Exception` on invalid format; consider using a more specific exception type (e.g., `ValueError`) so callers can distinguish configuration errors from runtime failures.
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="603-604" />
<code_context>
+
+ final_text = self._resolve_final_text(state)
+
+ chain = MessageChain(chain=[Comp.Plain(final_text)])
+ self.final_llm_resp = LLMResponse(role="assistant", result_chain=chain)
+ self._transition_state(AgentState.DONE)
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Timeouts and task-failure-only runs are surfaced as successful assistant responses, which prevents downstream error handling from recognizing them as runner errors.
Here `LLMResponse` is always created with `role="assistant"`, even when `state.timed_out` is true or the response consists only of `task_failures`. But `ThirdPartyAgentSubStage` relies on `final_resp.role == "err"` (or `fallback_is_error`) to distinguish `AGENT_RUNNER_ERROR` from a normal `LLM_RESULT`. As a result, DeerFlow timeouts and pure subtask failures are reported as normal assistant outputs instead of runner errors. Please set an error-specific role (e.g., `"err"`) when `state.timed_out` or when `final_text` is derived solely from task failures so the outer pipeline can correctly classify these as `AGENT_RUNNER_ERROR`.
Suggested implementation:
```python
final_text = self._resolve_final_text(state)
# Determine whether this final response represents an error
# rather than a normal assistant result. We treat timeouts and
# "pure task failure" runs as runner errors so that upstream
# stages (e.g. ThirdPartyAgentSubStage) can classify them as
# AGENT_RUNNER_ERROR instead of LLM_RESULT.
timed_out = bool(getattr(state, "timed_out", False))
# Heuristic: if we have task failures but no successful / normal
# outputs, consider this a pure-failure run.
task_failures = getattr(state, "task_failures", None)
has_task_failures = bool(task_failures)
# The exact fields used to track successful results may differ
# across implementations; we conservatively check for several
# common containers.
successful_tasks = getattr(state, "successful_tasks", None)
results = getattr(state, "results", None)
messages = getattr(state, "messages", None)
has_non_failure_results = bool(successful_tasks) or bool(results) or bool(messages)
failures_only = has_task_failures and not has_non_failure_results
is_error = timed_out or failures_only
role = "err" if is_error else "assistant"
chain = MessageChain(chain=[Comp.Plain(final_text)])
self.final_llm_resp = LLMResponse(role=role, result_chain=chain)
self._transition_state(AgentState.DONE)
```
The above implementation assumes the `state` object exposes `task_failures` and at least one of `successful_tasks`, `results`, or `messages` to distinguish "pure failure" runs from mixed success/failure runs. If your `DeerFlow` state structure uses different attribute names, adjust the `successful_tasks`, `results`, and `messages` lookups accordingly so that `failures_only` is `True` exactly when `final_text` is derived solely from task failures. If there is already an explicit flag or method indicating "final result is failure only", use that instead of the heuristic (e.g., `failures_only = state.final_result_from_failures_only`).
</issue_to_address>
### Comment 2
<location path="astrbot/core/agent/runners/deerflow/deerflow_stream_utils.py" line_range="22-25" />
<code_context>
+ return extract_text(content["kwargs"].get("content"))
+ if isinstance(content, list):
+ parts: list[str] = []
+ for item in content:
+ if isinstance(item, str):
+ parts.append(item)
+ elif isinstance(item, dict):
+ item_type = item.get("type")
+ if item_type == "text" and isinstance(item.get("text"), str):
+ parts.append(item["text"])
+ elif "content" in item:
+ parts.append(str(item["content"]))
+ return "\n".join([p for p in parts if p]).strip()
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Nested content in list items is stringified instead of being recursively flattened, which can degrade output quality.
In `extract_text`, when `content` is a list and an element is a dict, this branch:
```python
elif "content" in item:
parts.append(str(item["content"]))
```
emits the raw `repr` of nested content (often dicts/lists) instead of readable text. This is inconsistent with the recursive handling in the dict case above and can expose internal structures (e.g., tool call payloads). You can fix this by calling `extract_text(item["content"])` here so nested content is flattened consistently.
```suggestion
if item_type == "text" and isinstance(item.get("text"), str):
parts.append(item["text"])
elif "content" in item:
parts.append(extract_text(item["content"]))
```
</issue_to_address>
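A simplified, self-contained version of `extract_text` with the recursive fix applied shows the difference (this is a sketch of the described logic, not the module's exact code):

```python
def extract_text(content) -> str:
    """Simplified extract_text sketch: nested `content` values are
    flattened via recursion rather than stringified with str()."""
    if isinstance(content, str):
        return content
    if isinstance(content, dict):
        if isinstance(content.get("kwargs"), dict):
            return extract_text(content["kwargs"].get("content"))
        if "content" in content:
            return extract_text(content["content"])
        return ""
    if isinstance(content, list):
        parts = []
        for item in content:
            if isinstance(item, str):
                parts.append(item)
            elif isinstance(item, dict):
                if item.get("type") == "text" and isinstance(item.get("text"), str):
                    parts.append(item["text"])
                elif "content" in item:
                    # The fix: recurse instead of str(item["content"]),
                    # which would emit a raw dict/list repr.
                    parts.append(extract_text(item["content"]))
        return "\n".join(p for p in parts if p).strip()
    return ""

nested = [{"content": [{"type": "text", "text": "inner"}]}]
flat = extract_text(nested)
```

With `str(item["content"])` the result would have been the repr `[{'type': 'text', 'text': 'inner'}]`; with recursion it is just the readable text.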
### Comment 3
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="242" />
<code_context>
- provider_config=self.prov_cfg,
- streaming=streaming_response,
- )
+ runner_closed = False
+ defer_runner_close_to_stream = False
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the streaming and non-streaming handling (plus shared error aggregation) into helper functions to flatten `process` and reduce nested control flow and flags.
You can keep all the new behavior but reduce the cognitive load in `process` by:
1. **Extracting streaming vs non‑streaming branches into helpers**
This removes the nested closures and flags from `process` and localizes the logic.
```python
async def _handle_streaming_runner(
self,
runner: "BaseAgentRunner",
event: "AstrMessageEvent",
custom_error_message: str | None,
) -> None:
stream_has_runner_error = False
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
nonlocal stream_has_runner_error
async for runner_output in run_third_party_agent(
runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
if runner_output.is_error:
stream_has_runner_error = True
_set_runner_error_extra(event, True)
yield runner_output.chain
event.set_result(
MessageEventResult()
.set_result_content_type(ResultContentType.STREAMING_RESULT)
.set_async_stream(_stream_runner_chain()),
)
# caller is responsible for yielding once to start streaming
if runner.done():
final_resp = runner.get_final_llm_resp()
if final_resp and final_resp.result_chain:
is_runner_error = _aggregate_runner_error(
has_intermediate_error=stream_has_runner_error,
final_resp=final_resp,
)
_set_runner_error_extra(event, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_resp.result_chain.chain or [],
result_content_type=ResultContentType.STREAMING_FINISH,
),
)
```
```python
async def _handle_non_streaming_runner(
self,
runner: "BaseAgentRunner",
event: "AstrMessageEvent",
stream_to_general: bool,
custom_error_message: str | None,
) -> None:
merged_chain: list = []
has_intermediate_error = False
async for output in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(output.chain.chain or [])
if output.is_error:
has_intermediate_error = True
final_resp = runner.get_final_llm_resp()
if not final_resp or not final_resp.result_chain:
if merged_chain:
logger.warning(
"Agent Runner returned no final response, "
"fallback to streamed error/result chain."
)
_set_non_stream_runner_result(
event=event,
chain=merged_chain,
is_error=has_intermediate_error,
)
return
logger.warning("Agent Runner 未返回最终结果。")
return
is_runner_error = _aggregate_runner_error(
has_intermediate_error=has_intermediate_error,
final_resp=final_resp,
)
_set_non_stream_runner_result(
event=event,
chain=final_resp.result_chain.chain or [],
is_error=is_runner_error,
)
```
Then `process` becomes flatter:
```python
runner_closed = False
async def _close_runner_once() -> None:
nonlocal runner_closed
if runner_closed:
return
runner_closed = True
await _close_runner_if_supported(runner)
try:
await runner.reset(...)
if streaming_response and not stream_to_general:
await self._handle_streaming_runner(
runner=runner,
event=event,
custom_error_message=custom_error_message,
)
else:
await self._handle_non_streaming_runner(
runner=runner,
event=event,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
yield
finally:
await _close_runner_once()
```
This removes `defer_runner_close_to_stream` entirely; streaming cleanup is implicitly handled by the `_stream_runner_chain` generator finishing and the single `_close_runner_once` in the `finally`.
2. **Unifying error aggregation logic**
Instead of recomputing:
```python
is_runner_error = fallback_is_error or final_resp.role == "err"
# and
is_runner_error = stream_has_runner_error or final_resp.role == "err"
```
you can centralize this in a tiny helper used by both branches:
```python
def _aggregate_runner_error(
has_intermediate_error: bool,
final_resp: "LLMResponse | None",
) -> bool:
if not final_resp:
return has_intermediate_error
return has_intermediate_error or final_resp.role == "err"
```
This keeps all current semantics but makes the error behavior easier to audit and avoids duplicated logic in streaming/non‑streaming paths.
These two extractions (streaming vs non‑streaming handlers + shared `_aggregate_runner_error`) retain your new features (`_ThirdPartyRunnerOutput`, error extras, runner closing) while significantly reducing nesting and flag‑based control flow in `process`.
</issue_to_address>
### Comment 4
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="46" />
<code_context>
+
+ _MAX_VALUES_HISTORY = 200
+
+ @dataclass
+ class _StreamState:
+ streamed_text: str = ""
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the streaming state and timeout handling by unifying text tracking into a single `latest_text` flow and resolving timeouts in one place at the end of the request.
You can trim a noticeable amount of complexity by unifying the streaming text state and centralizing timeout handling, without losing any behavior.
### 1. Collapse `streamed_text` vs `fallback_stream_text` vs `run_values_messages`
Right now `_StreamState` tracks:
- `streamed_text` (built from `values`),
- `fallback_stream_text` (built from `messages-*`),
- `run_values_messages` (canonical messages for final result).
You can keep `run_values_messages` as the single authoritative source whenever `values` events are present, and only fall back to `messages` deltas when no `values` updates are available at all.
Concretely:
- Track `latest_text` and `prev_text_for_streaming` instead of separate `streamed_text` + `fallback_stream_text`.
- In `_handle_values_event`, update `state.latest_text` from `run_values_messages`, and compute deltas from `prev_text_for_streaming`.
- In `_handle_message_event`, only set `latest_text` if `values` have never produced text (pure fallback mode).
That lets `_resolve_final_text` become much simpler.
```python
@dataclass
class _StreamState:
latest_text: str = ""
prev_text_for_streaming: str = ""
clarification_text: str = ""
task_failures: list[str] = field(default_factory=list)
# de-dup machinery as-is
seen_message_ids: set[str] = field(default_factory=set)
seen_message_order: deque[str] = field(default_factory=deque)
no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
baseline_initialized: bool = False
run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
timed_out: bool = False
```
```python
def _handle_values_event(self, data: T.Any, state: _StreamState) -> list[AgentResponse]:
responses: list[AgentResponse] = []
values_messages = extract_messages_from_values_data(data)
if not values_messages:
return responses
if not state.baseline_initialized:
state.baseline_initialized = True
# existing baseline init logic unchanged...
...
return responses
new_messages = self._extract_new_messages_from_values(values_messages, state)
if new_messages:
state.run_values_messages.extend(new_messages)
if len(state.run_values_messages) > self._MAX_VALUES_HISTORY:
state.run_values_messages = state.run_values_messages[-self._MAX_VALUES_HISTORY :]
latest_text = extract_latest_ai_text(state.run_values_messages) or ""
latest_clarification = extract_latest_clarification_text(state.run_values_messages)
if latest_clarification:
state.clarification_text = latest_clarification
else:
latest_text = ""
if not self.streaming or not latest_text:
state.latest_text = latest_text or state.latest_text
return responses
# single streaming path based on latest_text vs prev_text_for_streaming
if latest_text.startswith(state.prev_text_for_streaming):
delta = latest_text[len(state.prev_text_for_streaming) :]
else:
delta = latest_text
if delta:
state.latest_text = latest_text
state.prev_text_for_streaming = latest_text
responses.append(
AgentResponse(
type="streaming_delta",
data=AgentResponseData(chain=MessageChain().message(delta)),
)
)
return responses
```
```python
def _handle_message_event(self, data: T.Any, state: _StreamState) -> AgentResponse | None:
delta = extract_ai_delta_from_event_data(data)
response: AgentResponse | None = None
# Only use messages as a fallback when values never provided text
if delta and not state.run_values_messages:
state.latest_text += delta
if self.streaming:
response = AgentResponse(
type="streaming_delta",
data=AgentResponseData(chain=MessageChain().message(delta)),
)
maybe_clarification = extract_clarification_from_event_data(data)
if maybe_clarification:
state.clarification_text = maybe_clarification
return response
```
```python
def _resolve_final_text(self, state: _StreamState) -> str:
if state.clarification_text:
final_text = state.clarification_text
else:
# Prefer canonical values-based text, else fallback `latest_text`
values_text = extract_latest_ai_text(state.run_values_messages)
final_text = values_text or state.latest_text
if not final_text:
final_text = build_task_failure_summary(state.task_failures)
if not final_text:
logger.warning("DeerFlow returned no text content in stream events.")
final_text = "DeerFlow returned an empty response."
return final_text
```
This removes the parallel `streamed_text` vs `fallback_stream_text` concept and keeps a single `latest_text` that’s easy to reason about, while preserving:
- values-based canonical final text,
- message-based fallback, and
- streaming deltas.
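The single streaming path described above reduces to a prefix-delta computation, which can be sketched and tested in isolation (a sketch, not the runner's actual method):

```python
def compute_stream_delta(latest_text: str, prev_text: str) -> str:
    """Return only the newly appended suffix when the text grows monotonically;
    if upstream rewrote the text (no common prefix), re-emit it in full."""
    if latest_text.startswith(prev_text):
        return latest_text[len(prev_text):]
    return latest_text
```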
### 2. Centralize timeout behavior in `_execute_deerflow_request`
Timeout handling is currently split across:
- `_format_exception`,
- `_execute_deerflow_request` (`state.timed_out = True`),
- `_resolve_final_text` (re-raises `TimeoutError` when no text).
You can keep the external behavior but make the control flow clearer by resolving the timeout at the end of `_execute_deerflow_request` and not letting `_resolve_final_text` raise.
```python
async def _execute_deerflow_request(self):
...
state = self._StreamState()
try:
async for event in self.api_client.stream_run(
thread_id=thread_id,
payload=payload,
timeout=self.timeout,
):
...
except (asyncio.TimeoutError, TimeoutError):
state.timed_out = True
final_text = self._resolve_final_text(state)
if state.timed_out:
timeout_note = (
f"DeerFlow stream timed out after {self.timeout}s. "
"Returning partial result."
)
final_text = f"{final_text}\n\n{timeout_note}" if final_text else timeout_note
chain = MessageChain(chain=[Comp.Plain(final_text)])
...
```
And then `_resolve_final_text` no longer needs to know about `timed_out` or raise `TimeoutError`; it just computes “best available text.” This keeps the same user-facing messages but concentrates the timeout decision in one place, making the overall flow easier to follow.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py
@sourcery-ai review
SourceryAI
left a comment
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="57" />
<code_context>
+ event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_error)
+
+
+def _runner_result_content_type(is_error: bool) -> ResultContentType:
+ return (
+ ResultContentType.AGENT_RUNNER_ERROR
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining the simple error/result helper functions so the non-streaming and streaming handlers directly express their error aggregation and result-setting logic.
You can reduce some of the new complexity without losing any of the added behavior by inlining the tiny error/result helpers and removing a layer of indirection.
### 1. Inline `_aggregate_runner_error` and `_set_non_stream_runner_result`
Right now `_aggregate_runner_error` and `_set_non_stream_runner_result` obscure very simple logic and scatter the error flag handling:
```python
def _set_runner_error_extra(event: "AstrMessageEvent", is_error: bool) -> None:
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_error)
def _runner_result_content_type(is_error: bool) -> ResultContentType:
return (
ResultContentType.AGENT_RUNNER_ERROR
if is_error
else ResultContentType.LLM_RESULT
)
def _set_non_stream_runner_result(
event: "AstrMessageEvent",
chain: list,
is_error: bool,
) -> None:
_set_runner_error_extra(event, is_error)
event.set_result(
MessageEventResult(
chain=chain,
result_content_type=_runner_result_content_type(is_error),
),
)
def _aggregate_runner_error(
has_intermediate_error: bool,
final_resp: "LLMResponse | None",
) -> bool:
if not final_resp:
return has_intermediate_error
return has_intermediate_error or final_resp.role == "err"
```
And usages:
```python
is_runner_error = _aggregate_runner_error(
has_intermediate_error=has_intermediate_error,
final_resp=final_resp,
)
_set_non_stream_runner_result(
event=event,
chain=final_resp.result_chain.chain or [],
is_error=is_runner_error,
)
```
You can make the control flow clearer by inlining the logic directly where it’s used and dropping `_aggregate_runner_error` and `_set_non_stream_runner_result`. For example:
```python
# keep this helper – it's actually useful
def _set_runner_error_extra(event: "AstrMessageEvent", is_error: bool) -> None:
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_error)
```
In `_handle_non_streaming_runner`:
```python
final_resp = runner.get_final_llm_resp()
if not final_resp or not final_resp.result_chain:
if merged_chain:
logger.warning(
"Agent Runner returned no final response, fallback to streamed error/result chain."
)
is_error = has_intermediate_error
_set_runner_error_extra(event, is_error)
event.set_result(
MessageEventResult(
chain=merged_chain,
result_content_type=(
ResultContentType.AGENT_RUNNER_ERROR
if is_error
else ResultContentType.LLM_RESULT
),
)
)
yield
return
logger.warning("Agent Runner 未返回最终结果。")
return
is_error = has_intermediate_error or final_resp.role == "err"
_set_runner_error_extra(event, is_error)
event.set_result(
MessageEventResult(
chain=final_resp.result_chain.chain or [],
result_content_type=(
ResultContentType.AGENT_RUNNER_ERROR
if is_error
else ResultContentType.LLM_RESULT
),
)
)
yield
```
And in `_handle_streaming_runner` finalization:
```python
if runner.done():
final_resp = runner.get_final_llm_resp()
if final_resp and final_resp.result_chain:
is_runner_error = stream_has_runner_error or final_resp.role == "err"
_set_runner_error_extra(event, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_resp.result_chain.chain or [],
result_content_type=ResultContentType.STREAMING_FINISH,
)
)
```
This keeps:
- The `THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY` behavior.
- The distinction between normal and error `ResultContentType`.
- The intermediate vs final error aggregation logic.
But removes two helper functions and the extra `_runner_result_content_type` indirection, making the error flow and final result setting visible at the call site and easier to reason about.
</issue_to_address>
### Comment 2
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="41" />
<code_context>
+ from typing_extensions import override
+
+
+class DeerFlowAgentRunner(BaseAgentRunner[TContext]):
+ """DeerFlow Agent Runner via LangGraph HTTP API."""
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the streaming aggregation logic into a dedicated helper class so the runner focuses only on HTTP orchestration and hooks.
You can reduce the cognitive load substantially by moving the stream aggregation (values/messages/custom, dedupe, clarification, timeout, final text selection) into a dedicated helper, and having the runner only orchestrate HTTP + hooks.
### 1. Extract a dedicated stream aggregator
Right now `DeerFlowAgentRunner` owns:
- `_StreamState`
- `_extract_new_messages_from_values`
- `_fingerprint_message`
- `_remember_seen_message_id`
- `_handle_values_event`
- `_handle_message_event`
- `_resolve_final_text`
- timeout/error vs final text wiring in `_execute_deerflow_request`
All of that can be encapsulated in a small, focused helper that exposes a minimal interface, keeping behavior identical but reducing the runner’s complexity.
Example extraction (interfaces only, bodies moved from your current methods):
```python
@dataclass
class DeerFlowFinalResult:
text: str
is_error: bool
reason: str | None = None
class DeerFlowStreamAggregator:
_MAX_VALUES_HISTORY = 200
@dataclass
class _State:
streamed_text: str = ""
fallback_stream_text: str = ""
clarification_text: str = ""
task_failures: list[str] = field(default_factory=list)
seen_message_ids: set[str] = field(default_factory=set)
seen_message_order: deque[str] = field(default_factory=deque)
no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
baseline_initialized: bool = False
run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
timed_out: bool = False
def __init__(self, streaming: bool, timeout: int):
self.streaming = streaming
self.timeout = timeout
self.state = self._State()
def handle_values(self, data: T.Any) -> list[AgentResponse]:
# move current _handle_values_event logic here
...
def handle_messages(self, data: T.Any) -> list[AgentResponse]:
# move current _handle_message_event logic here
...
def handle_custom(self, data: T.Any) -> None:
# move custom event task_failure accumulation
...
def mark_timeout(self) -> None:
self.state.timed_out = True
def finalize(self) -> DeerFlowFinalResult:
# move current _resolve_final_text + timeout/error assembly here
...
```
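The de-dup bookkeeping the aggregator would absorb can also be sketched independently; the field names mirror `_State` above, and the eviction policy is an assumption modeled on `_MAX_VALUES_HISTORY`:

```python
from collections import deque

class SeenMessages:
    """Bounded de-dup of message ids (sketch; mirrors the seen_message_ids +
    seen_message_order pair in the runner's stream state)."""

    def __init__(self, max_history: int = 200):
        self.max_history = max_history
        self.ids: set[str] = set()
        self.order: deque[str] = deque()

    def remember(self, msg_id: str) -> bool:
        """Return True if the id is new, evicting the oldest ids beyond the cap."""
        if msg_id in self.ids:
            return False
        self.ids.add(msg_id)
        self.order.append(msg_id)
        while len(self.order) > self.max_history:
            self.ids.discard(self.order.popleft())
        return True
```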
Then `_execute_deerflow_request` becomes much easier to follow:
```python
async def _execute_deerflow_request(self):
...
aggregator = DeerFlowStreamAggregator(
streaming=self.streaming,
timeout=self.timeout,
)
try:
async for event in self.api_client.stream_run(
thread_id=thread_id,
payload=payload,
timeout=self.timeout,
):
event_type = event.get("event")
data = event.get("data")
if event_type == "values":
for resp in aggregator.handle_values(data):
yield resp
elif event_type in {"messages-tuple", "messages", "message"}:
for resp in aggregator.handle_messages(data):
yield resp
elif event_type == "custom":
aggregator.handle_custom(data)
elif event_type == "error":
raise Exception(f"DeerFlow stream returned error event: {data}")
elif event_type == "end":
break
except (asyncio.TimeoutError, TimeoutError):
aggregator.mark_timeout()
final = aggregator.finalize()
role = "err" if final.is_error else "assistant"
chain = MessageChain(chain=[Comp.Plain(final.text)])
self.final_llm_resp = LLMResponse(role=role, result_chain=chain)
self._transition_state(AgentState.DONE)
try:
await self.agent_hooks.on_agent_done(self.run_context, self.final_llm_resp)
except Exception as e:
logger.error(f"Error in on_agent_done hook: {e}", exc_info=True)
yield AgentResponse(type="llm_result", data=AgentResponseData(chain=chain))
```
This keeps:
- All dedupe/baseline/clarification behavior exactly as-is.
- Timeout vs task-failure vs normal-result semantics unchanged.
But it:
- Shrinks `DeerFlowAgentRunner` responsibility to “prepare payload, stream events, feed aggregator, emit final response”.
- Centralizes final result logic (`text`, `is_error`, `reason`) into a single `finalize()` call, instead of being spread across `_execute_deerflow_request` + `_resolve_final_text`.
</issue_to_address>Hi @zouyonghe! 👋
Thanks for trying out Sourcery by commenting with @sourcery-ai review! 🚀
Install the sourcery-ai bot to get automatic code reviews on every pull request ✨
astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py
Hey - I've found 2 issues, plus some overall feedback:
- The DeerFlow provider template exposes a `proxy` field, but `DeerFlowAPIClient` never uses it (and accepts no proxy-related parameters); consider passing the provider-config proxy through to the underlying `ClientSession` or the request calls so this console setting actually takes effect.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The DeerFlow provider template exposes a `proxy` field, but `DeerFlowAPIClient` never uses it (and accepts no proxy-related parameters); pass the provider-config proxy through to the underlying `ClientSession` or the request calls so this console setting actually takes effect.
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="98-107" />
<code_context>
+
+ self._transition_state(AgentState.RUNNING)
+
+ try:
+ async for response in self._execute_deerflow_request():
+ yield response
+ except Exception as e:
+ err_msg = self._format_exception(e)
+ logger.error(f"DeerFlow request failed: {err_msg}", exc_info=True)
+ self._transition_state(AgentState.ERROR)
+ err_chain = MessageChain().message(f"DeerFlow request failed: {err_msg}")
+ self.final_llm_resp = LLMResponse(
+ role="err",
+ completion_text=f"DeerFlow request failed: {err_msg}",
+ result_chain=err_chain,
+ )
+ yield AgentResponse(
+ type="err",
+ data=AgentResponseData(
</code_context>
<issue_to_address>
**issue (bug_risk):** Cancellation is funneled through the generic error handling; `asyncio.CancelledError` should normally propagate rather than be converted into an error response.
On Python versions before 3.8, `asyncio.CancelledError` inherits from `Exception`, so this broad `except Exception` would intercept task cancellation and turn it into an `'err'` response instead of letting the task actually cancel; on 3.8+ it inherits from `BaseException` and passes through, but an explicit re-raise still makes the intent unambiguous. Swallowing cancellation can cause surprising behavior during shutdown or timeouts.
Let cancellation bubble up explicitly:
```python
try:
async for response in self._execute_deerflow_request():
yield response
except asyncio.CancelledError:
raise # let caller handle cancellation
except Exception as e:
err_msg = self._format_exception(e)
... # existing error handling
```
</issue_to_address>
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="184" />
<code_context>
logger.debug("Failed to resolve persona custom error message: %s", e)
return None
+ async def _handle_streaming_runner(
+ self,
+ runner: "BaseAgentRunner",
</code_context>
<issue_to_address>
**issue (complexity):** Consider unifying the runner handling and lifecycle management into a single driver function and an async context manager, simplifying control flow and error handling while preserving current behavior.
You can keep all the new behavior (error extras, AGENT_RUNNER_ERROR, DeerFlow, cleanup logic) while reducing incidental complexity by:
1. **Unifying the streaming/non-streaming handlers into one driver**
2. **Centralizing the runner lifecycle in an async context manager**
### 1. Replace the two async-generator handlers with a single driver
Currently:
- `_handle_streaming_runner` and `_handle_non_streaming_runner` are both `AsyncGenerator[None, None]`
- `process` merely does `async for _ in ...: yield` over them
- the error-aggregation logic is duplicated on both sides
Replace them with a single helper that:
- drives `run_third_party_agent`
- aggregates `merged_chain`, `has_intermediate_error`, and `final_resp`
- returns a small result object instead of yielding
An example result object:
```python
@dataclass
class _RunnerResult:
chain: list
is_error: bool
content_type: ResultContentType
has_stream_finish: bool = False # for streaming finish events
```
Then a unified driver:
```python
async def _run_runner(
self,
runner: "BaseAgentRunner",
event: AstrMessageEvent,
streaming: bool,
stream_to_general: bool,
custom_error_message: str | None,
) -> _RunnerResult:
merged_chain: list = []
has_intermediate_error = False
async for output in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(output.chain.chain or [])
if output.is_error:
has_intermediate_error = True
_set_runner_error_extra(event, True)
# For streaming mode we still let process() yield control
if streaming and not stream_to_general:
await asyncio.sleep(0) # or just return control to caller
final_resp = runner.get_final_llm_resp()
if not final_resp or not final_resp.result_chain:
if merged_chain:
logger.warning(
"Agent Runner returned no final response, fallback to streamed error/result chain."
)
is_error = has_intermediate_error
return _RunnerResult(
chain=merged_chain,
is_error=is_error,
content_type=_runner_result_content_type(is_error),
)
logger.warning("Agent Runner 未返回最终结果。")
return _RunnerResult(chain=[], is_error=False, content_type=ResultContentType.LLM_RESULT)
is_error = _aggregate_runner_error(
has_intermediate_error=has_intermediate_error,
final_resp=final_resp,
)
return _RunnerResult(
chain=final_resp.result_chain.chain or [],
is_error=is_error,
content_type=_runner_result_content_type(is_error),
)
```
`process` can then:
- in streaming mode: set up the stream once, then call `_run_runner` to compute the final `STREAMING_FINISH` result;
- in non-streaming mode: call `_run_runner` directly and use the returned `_RunnerResult` to set `event.result`.
For streaming, you can keep the existing inner streaming generator but have it only emit `MessageChain`s, without any error bookkeeping:
```python
async def _stream_runner_chain(
runner: "BaseAgentRunner",
custom_error_message: str | None,
) -> AsyncGenerator[MessageChain, None]:
async for output in run_third_party_agent(
runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
yield output.chain
```
在 `process` 中则可以这样使用:
```python
if streaming_response and not stream_to_general:
event.set_result(
MessageEventResult()
.set_result_content_type(ResultContentType.STREAMING_RESULT)
.set_async_stream(
_stream_runner_chain(
runner=runner,
custom_error_message=custom_error_message,
)
)
)
yield
result = await self._run_runner(
runner=runner,
event=event,
streaming=True,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
_set_runner_error_extra(event, result.is_error)
if result.chain:
event.set_result(
MessageEventResult(
chain=result.chain,
result_content_type=ResultContentType.STREAMING_FINISH,
),
)
else:
result = await self._run_runner(
runner=runner,
event=event,
streaming=False,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
if result.chain:
_set_runner_error_extra(event, result.is_error)
event.set_result(
MessageEventResult(
chain=result.chain,
result_content_type=result.content_type,
)
)
yield
```
这样就可以移除两个 async‑generator handler,并把下面这些逻辑集中到一个地方:
- 错误聚合
- 最终结果选择
- content type 选择
同时保持 `_ThirdPartyRunnerOutput` 和所有错误语义不变。
### 2. 用一个小的 async context manager 封装 reset/close
可以替代当前的做法:
- `_close_runner_if_supported`
- `runner_closed` + `_close_runner_once` 闭包
- 在 reset + handler 周围的多个 `try` / `except` / `finally` 块
使用一个简单的本地异步上下文管理器:
```python
from contextlib import asynccontextmanager
@asynccontextmanager
async def _runner_session(runner: "BaseAgentRunner", **reset_kwargs):
try:
await runner.reset(**reset_kwargs)
yield runner
finally:
await _close_runner_if_supported(runner)
```
然后 `process` 可以写成:
```python
async with _runner_session(
runner,
request=req,
run_context=AgentContextWrapper(
context=astr_agent_ctx,
tool_call_timeout=60,
),
agent_hooks=MAIN_AGENT_HOOKS,
provider_config=self.prov_cfg,
streaming=streaming_response,
):
if streaming_response and not stream_to_general:
# streaming 分支,使用 _stream_runner_chain + _run_runner(如上)
...
else:
# non-streaming 分支,使用 _run_runner(如上)
...
```
这样可以保留:
- “如果支持则关闭”的行为
- 在成功和异常情况下都能正确清理
但去掉了 `runner_closed` 状态 + 闭包,以及在每个分支周围重复的关闭调用。
综合来看,这些修改在保留你新增的所有行为(error extras、AGENT_RUNNER_ERROR、DeerFlow、异常处理、streaming 完成事件)的前提下,减少了层次和重复逻辑,使当前版本更易阅读和维护。
</issue_to_address>帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈持续改进 review 质量。
Original comment in English
Hey - I've found 2 issues, and left some high level feedback:
- The DeerFlow provider template exposes a `proxy` field but `DeerFlowAPIClient` never uses it (nor accepts a proxy-related argument), so consider wiring the provider config proxy through to the underlying `ClientSession` or request calls to make the dashboard option effective.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The DeerFlow provider template exposes a `proxy` field but `DeerFlowAPIClient` never uses it (nor accepts a proxy-related argument), so consider wiring the provider config proxy through to the underlying `ClientSession` or request calls to make the dashboard option effective.
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="98-107" />
<code_context>
+
+ self._transition_state(AgentState.RUNNING)
+
+ try:
+ async for response in self._execute_deerflow_request():
+ yield response
+ except Exception as e:
+ err_msg = self._format_exception(e)
+ logger.error(f"DeerFlow request failed: {err_msg}", exc_info=True)
+ self._transition_state(AgentState.ERROR)
+ err_chain = MessageChain().message(f"DeerFlow request failed: {err_msg}")
+ self.final_llm_resp = LLMResponse(
+ role="err",
+ completion_text=f"DeerFlow request failed: {err_msg}",
+ result_chain=err_chain,
+ )
+ yield AgentResponse(
+ type="err",
+ data=AgentResponseData(
</code_context>
<issue_to_address>
**issue (bug_risk):** Cancellation is treated as a generic error; `asyncio.CancelledError` should typically be propagated instead of converted to an error response.
On Python versions before 3.8, `asyncio.CancelledError` subclassed `Exception`, so a broad `except Exception` would intercept cancellations and turn them into `'err'` responses instead of allowing the task to be cancelled. Since Python 3.8 it derives from `BaseException` and is no longer caught by this handler, but re-raising it explicitly still makes the intent clear and avoids surprising behavior during shutdown or timeouts.
Consider explicitly letting cancellations propagate:
```python
try:
async for response in self._execute_deerflow_request():
yield response
except asyncio.CancelledError:
raise # let caller handle cancellation
except Exception as e:
err_msg = self._format_exception(e)
... # existing error handling
```
</issue_to_address>
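As a standalone check of how `except Exception` interacts with cancellation (behavior shown is for Python 3.8+, where `asyncio.CancelledError` derives from `BaseException`):

```python
import asyncio

# On Python 3.8+, asyncio.CancelledError is a BaseException subclass,
# so a broad `except Exception` does not swallow cancellation by itself.
print(issubclass(asyncio.CancelledError, Exception))      # False on 3.8+
print(issubclass(asyncio.CancelledError, BaseException))  # True


async def worker():
    try:
        await asyncio.sleep(10)
    except Exception:
        # This clause would have swallowed cancellation on Python < 3.8.
        return "swallowed"


async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(0)  # let worker start and suspend in sleep()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return task.result()


print(asyncio.run(main()))  # -> cancelled
```

An explicit `except asyncio.CancelledError: raise`, as suggested above, still documents the intent regardless of the interpreter version.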
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="184" />
<code_context>
logger.debug("Failed to resolve persona custom error message: %s", e)
return None
+ async def _handle_streaming_runner(
+ self,
+ runner: "BaseAgentRunner",
</code_context>
<issue_to_address>
**issue (complexity):** Consider unifying the runner handling and lifecycle management into a single driver and async context manager to simplify control flow and error handling while preserving existing behaviors.
You can keep all the new behaviors (error extras, AGENT_RUNNER_ERROR, DeerFlow, cleanup) but reduce incidental complexity by:
1. **Unifying the streaming/non‑streaming handlers into one driver**
2. **Centralizing runner lifecycle into an async context manager**
### 1. Replace the two async‑generator handlers with a single driver
Right now:
- `_handle_streaming_runner` and `_handle_non_streaming_runner` are both `AsyncGenerator[None, None]`
- `process` just `async for _ in ...: yield` on them
- Error aggregation logic is duplicated across both
You can replace them with a single helper that:
- Drives `run_third_party_agent`
- Aggregates `merged_chain`, `has_intermediate_error`, and `final_resp`
- Returns a small result object instead of yielding
Example shape:
```python
@dataclass
class _RunnerResult:
    chain: list
    is_error: bool
    content_type: ResultContentType
    has_stream_finish: bool = False  # for streaming finish events
```
Then a single driver:
```python
async def _run_runner(
    self,
    runner: "BaseAgentRunner",
    event: AstrMessageEvent,
    streaming: bool,
    stream_to_general: bool,
    custom_error_message: str | None,
) -> _RunnerResult:
    merged_chain: list = []
    has_intermediate_error = False
    async for output in run_third_party_agent(
        runner,
        stream_to_general=stream_to_general,
        custom_error_message=custom_error_message,
    ):
        merged_chain.extend(output.chain.chain or [])
        if output.is_error:
            has_intermediate_error = True
            _set_runner_error_extra(event, True)
        # For streaming mode we still let process() yield control
        if streaming and not stream_to_general:
            await asyncio.sleep(0)  # or just return control to caller
    final_resp = runner.get_final_llm_resp()
    if not final_resp or not final_resp.result_chain:
        if merged_chain:
            logger.warning(
                "Agent Runner returned no final response, fallback to streamed error/result chain."
            )
            is_error = has_intermediate_error
            return _RunnerResult(
                chain=merged_chain,
                is_error=is_error,
                content_type=_runner_result_content_type(is_error),
            )
        logger.warning("Agent Runner 未返回最终结果。")
        return _RunnerResult(chain=[], is_error=False, content_type=ResultContentType.LLM_RESULT)
    is_error = _aggregate_runner_error(
        has_intermediate_error=has_intermediate_error,
        final_resp=final_resp,
    )
    return _RunnerResult(
        chain=final_resp.result_chain.chain or [],
        is_error=is_error,
        content_type=_runner_result_content_type(is_error),
    )
```
`process` can then:
- In streaming mode: set up the stream once, then call `_run_runner` to compute the final `STREAMING_FINISH` result.
- In non‑streaming mode: call `_run_runner` and set `event.result` from the returned `_RunnerResult`.
For streaming, you can keep the inner streaming generator but make it only responsible for emitting `MessageChain`, not for error bookkeeping:
```python
async def _stream_runner_chain(
    runner: "BaseAgentRunner",
    custom_error_message: str | None,
) -> AsyncGenerator[MessageChain, None]:
    async for output in run_third_party_agent(
        runner,
        stream_to_general=False,
        custom_error_message=custom_error_message,
    ):
        yield output.chain
```
And in `process`:
```python
if streaming_response and not stream_to_general:
    event.set_result(
        MessageEventResult()
        .set_result_content_type(ResultContentType.STREAMING_RESULT)
        .set_async_stream(
            _stream_runner_chain(
                runner=runner,
                custom_error_message=custom_error_message,
            )
        )
    )
    yield
    result = await self._run_runner(
        runner=runner,
        event=event,
        streaming=True,
        stream_to_general=stream_to_general,
        custom_error_message=custom_error_message,
    )
    _set_runner_error_extra(event, result.is_error)
    if result.chain:
        event.set_result(
            MessageEventResult(
                chain=result.chain,
                result_content_type=ResultContentType.STREAMING_FINISH,
            ),
        )
else:
    result = await self._run_runner(
        runner=runner,
        event=event,
        streaming=False,
        stream_to_general=stream_to_general,
        custom_error_message=custom_error_message,
    )
    if result.chain:
        _set_runner_error_extra(event, result.is_error)
        event.set_result(
            MessageEventResult(
                chain=result.chain,
                result_content_type=result.content_type,
            )
        )
    yield
```
This removes the two async‑generator handlers and centralizes:
- error aggregation
- final result selection
- content type selection
while keeping `_ThirdPartyRunnerOutput` and all error semantics intact.
### 2. Encapsulate reset/close into a small async context manager
Instead of:
- `_close_runner_if_supported`
- `runner_closed` + `_close_runner_once` closure
- multiple `try`/`except`/`finally` blocks around reset + handlers
You can use a simple local async context manager:
```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def _runner_session(runner: "BaseAgentRunner", **reset_kwargs):
    try:
        await runner.reset(**reset_kwargs)
        yield runner
    finally:
        await _close_runner_if_supported(runner)
```
Then `process` becomes:
```python
async with _runner_session(
    runner,
    request=req,
    run_context=AgentContextWrapper(
        context=astr_agent_ctx,
        tool_call_timeout=60,
    ),
    agent_hooks=MAIN_AGENT_HOOKS,
    provider_config=self.prov_cfg,
    streaming=streaming_response,
):
    if streaming_response and not stream_to_general:
        # streaming branch using _stream_runner_chain + _run_runner as above
        ...
    else:
        # non-streaming branch using _run_runner as above
        ...
```
This keeps:
- the “close if supported” behavior
- correct cleanup on both success and exceptions
but removes the `runner_closed` state + closure and the duplicated close calls around each branch.
Together, these changes keep all your new behaviors (error extras, AGENT_RUNNER_ERROR, DeerFlow, exception handling, streaming finish) but reduce the layering and duplicated logic that make the current version harder to follow.
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@sourcery-ai review
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="328-336" />
<code_context>
+ dropped = state.seen_message_order.popleft()
+ state.seen_message_ids.discard(dropped)
+
+ def _is_likely_base64_image(self, value: str) -> bool:
+ if " " in value:
+ return False
+
+ compact = value.replace("\n", "").replace("\r", "")
+ if not compact or len(compact) % 4 != 0:
+ return False
+
+ base64_chars = (
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
+ )
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Base64 detection is overly permissive and may incorrectly treat normal text as an image payload.
This heuristic (`len % 4 == 0` and all chars in the base64 alphabet) will flag many short plain strings as base64 (e.g. `"test"`, tokens without spaces), so normal `image_urls` text may be wrapped as `data:image/png;base64,...` and yield broken images. Please tighten the detection, e.g. by enforcing a minimum length and/or padding, or by attempting `base64.b64decode(..., validate=True)` in a try/except before treating it as a base64 image.
Suggested implementation:
```python
import base64
from ...hooks import BaseAgentRunHooks
```
```python
def _remember_seen_message_id(self, state: _StreamState, msg_id: str) -> None:
    if not msg_id or msg_id in state.seen_message_ids:
        return
    state.seen_message_ids.add(msg_id)
    state.seen_message_order.append(msg_id)
    while len(state.seen_message_order) > self._MAX_VALUES_HISTORY:
        dropped = state.seen_message_order.popleft()
        state.seen_message_ids.discard(dropped)


def _is_likely_base64_image(self, value: str) -> bool:
    # Reject values containing whitespace early
    if " " in value:
        return False

    # Strip common line separators used in base64-encoded payloads
    compact = value.replace("\n", "").replace("\r", "")

    # Enforce a minimum length to avoid misclassifying short strings.
    # Typical image base64 payloads are far longer than 32 characters.
    if not compact or len(compact) < 32 or len(compact) % 4 != 0:
        return False

    base64_chars = (
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
    )
    # Fast path: ensure all characters are in the base64 alphabet
    if any(ch not in base64_chars for ch in compact):
        return False

    # Final verification: attempt strict base64 decode.
    # validate=True ensures only correctly padded/base64 strings pass.
    try:
        base64.b64decode(compact, validate=True)
    except Exception:
        return False

    return True

from ...hooks import BaseAgentRunHooks
```
</issue_to_address>
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="292-301" />
<code_context>
+
+ final_resp = runner.get_final_llm_resp()
+
+ if not final_resp or not final_resp.result_chain:
+ if merged_chain:
+ logger.warning(
+ "Agent Runner returned no final response, fallback to streamed error/result chain."
+ )
+ _set_non_stream_runner_result(
+ event=event,
+ chain=merged_chain,
+ is_error=has_intermediate_error,
+ )
+ yield
+ return
+ logger.warning("Agent Runner 未返回最终结果。")
+ return
+
</code_context>
<issue_to_address>
**issue:** When neither a final response nor any intermediate chain exist, the user receives no message at all.
In `_handle_non_streaming_runner`, when `final_resp` is falsy and `merged_chain` is empty, we only log and return without setting `event.result`, so downstream sees a silent no-op. Consider constructing a generic error `MessageChain` (e.g., "Agent Runner did not return any result") in this branch and passing it to `_set_non_stream_runner_result(..., is_error=True)` so callers consistently receive a concrete error payload instead of nothing.
</issue_to_address>
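A minimal standalone sketch of that fallback, with the PR's `MessageChain` and result-setting helpers replaced by stand-ins (`MessageChainStub` and `resolve_result_chain` are illustrative names, not code from the PR):

```python
class MessageChainStub:
    """Illustrative stand-in for the PR's MessageChain."""

    def __init__(self) -> None:
        self.chain: list = []

    def message(self, text: str) -> "MessageChainStub":
        self.chain.append(text)
        return self


def resolve_result_chain(
    final_resp, merged_chain: list, has_intermediate_error: bool
) -> tuple[list, bool]:
    """Return (chain, is_error); never yields an empty payload silently."""
    if final_resp is not None and getattr(final_resp, "result_chain", None):
        return final_resp.result_chain.chain, has_intermediate_error
    if merged_chain:
        # Fall back to whatever was streamed so far.
        return merged_chain, has_intermediate_error
    # Neither a final response nor intermediate output: emit a concrete error
    # so downstream callers always receive something to render.
    err = MessageChainStub().message("Agent Runner did not return any result")
    return err.chain, True


print(resolve_result_chain(None, [], False))
# -> (['Agent Runner did not return any result'], True)
```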
### Comment 3
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="152" />
<code_context>
+ logger.warning(f"Failed to close third-party runner cleanly: {e}")
+
+
+@asynccontextmanager
+async def _runner_session(
+ runner: "BaseAgentRunner",
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new runner lifecycle and error-handling flow by removing extra indirection (context manager, sequencing-only async generators, and tiny helpers) and handling it directly in `process` and the handler methods.
You can keep the new DeerFlow integration and error semantics while trimming a lot of indirection. Two focused areas that are likely worth simplifying:
---
### 1. Collapse `_runner_session` + `defer_runner_close` into a local `try/finally`
Right now `_runner_session`, `close_runner_once`, `defer_runner_close`, and `_handle_streaming_runner` coordinate to decide when to close the runner. This is hard to follow and spreads lifecycle logic across three functions.
You can keep the “don’t close while streaming is active” behavior with a local flag and a single `try/finally` in `process`, and drop `_runner_session`, `defer_runner_close`, and `close_runner_once` entirely.
**Before (current structure)**
```python
async with _runner_session(
    runner=runner,
    request=req,
    run_context=AgentContextWrapper(...),
    agent_hooks=MAIN_AGENT_HOOKS,
    provider_config=self.prov_cfg,
    streaming=streaming_response,
) as (close_runner_once, defer_runner_close):
    if streaming_response and not stream_to_general:
        stream_started = False
        async for _ in self._handle_streaming_runner(
            runner=runner,
            event=event,
            custom_error_message=custom_error_message,
            close_runner_once=close_runner_once,
        ):
            if not stream_started:
                defer_runner_close()
                stream_started = True
            yield
    else:
        async for _ in self._handle_non_streaming_runner(...):
            yield
```
**After (simpler, equivalent behavior)**
```python
close_called = False


async def close_runner_once() -> None:
    nonlocal close_called
    if close_called:
        return
    close_called = True
    await _close_runner_if_supported(runner)


await runner.reset(
    request=req,
    run_context=AgentContextWrapper(
        context=astr_agent_ctx,
        tool_call_timeout=60,
    ),
    agent_hooks=MAIN_AGENT_HOOKS,
    provider_config=self.prov_cfg,
    streaming=streaming_response,
)

streaming_started = False
try:
    if streaming_response and not stream_to_general:
        async for _ in self._handle_streaming_runner(
            runner=runner,
            event=event,
            custom_error_message=custom_error_message,
            close_runner_once=close_runner_once,  # still used inside
        ):
            streaming_started = True
            yield
    else:
        async for _ in self._handle_non_streaming_runner(
            runner=runner,
            event=event,
            stream_to_general=stream_to_general,
            custom_error_message=custom_error_message,
        ):
            yield
finally:
    # only auto-close if streaming never started; otherwise the
    # streaming handler is responsible for calling close_runner_once()
    if not streaming_started:
        await close_runner_once()
```
This keeps:
- `runner.reset(...)` in `process`
- Close logic centralized and visible in one `try/finally`
- The requirement that streaming cleanup happens after the consumer finishes (still handled in `_handle_streaming_runner` by calling `close_runner_once()`).
You can then delete `_runner_session` and `defer_runner_close` entirely.
---
### 2. Replace `_handle_*` async generators that yield `None` with callbacks
Both `_handle_streaming_runner` and `_handle_non_streaming_runner` are async generators that:
- Orchestrate side effects on `event` and the runner
- Yield `None` just so `process` can interleave its own `yield`s
This pattern makes control flow harder to read. You can keep the exact external behavior by:
- Making them regular async functions
- Passing a simple “tick” callback that does `yield` from `process`
**Example for the non‑streaming handler**
**Before**
```python
async def _handle_non_streaming_runner(...) -> AsyncGenerator[None, None]:
    merged_chain: list = []
    has_intermediate_error = False
    async for output in run_third_party_agent(...):
        merged_chain.extend(output.chain.chain or [])
        if output.is_error:
            has_intermediate_error = True
        yield
    final_resp = runner.get_final_llm_resp()
    ...
    _set_non_stream_runner_result(...)
    yield
```
**After**
```python
async def _handle_non_streaming_runner(
    self,
    runner: "BaseAgentRunner",
    event: AstrMessageEvent,
    stream_to_general: bool,
    custom_error_message: str | None,
    on_tick: T.Callable[[], None],
) -> None:
    merged_chain: list = []
    has_intermediate_error = False
    async for output in run_third_party_agent(
        runner,
        stream_to_general=stream_to_general,
        custom_error_message=custom_error_message,
    ):
        merged_chain.extend(output.chain.chain or [])
        if output.is_error:
            has_intermediate_error = True
        on_tick()
    final_resp = runner.get_final_llm_resp()
    ...
    _set_non_stream_runner_result(
        event=event,
        chain=final_resp.result_chain.chain or [],
        is_error=is_runner_error,
    )
    on_tick()
Then `process` becomes:
```python
async def process(... ) -> AsyncGenerator[None, None]:
...
async def tick() -> None:
# one place to yield back to caller
yield
...
await self._handle_non_streaming_runner(
runner=runner,
event=event,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
on_tick=lambda: (yield), # or a small local helper
)
```
Similarly, `_handle_streaming_runner` can be converted to a normal async function that:
- Sets `event.set_result(...)` with a streaming generator
- Computes `stream_has_runner_error` and final result
- Calls `on_tick()` wherever you currently `yield`
This removes the “async generator that yields only for sequencing” pattern and makes error and lifecycle handling in each method easier to reason about.
---
### 3. Inline tiny error helpers where they are used
The current `_ThirdPartyRunnerOutput`, `_runner_result_content_type`, `_set_non_stream_runner_result`, and `_aggregate_runner_error` abstractions are small enough that they obscure the end‑to‑end error path.
You can keep `_ThirdPartyRunnerOutput` (it’s helpful for `run_third_party_agent`) but inline the other helpers into the two handler methods so the full error flow is visible in one place.
**Example simplification inside `_handle_non_streaming_runner`**
```python
is_runner_error = has_intermediate_error or final_resp.role == "err"
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
event.set_result(
    MessageEventResult(
        chain=final_resp.result_chain.chain or [],
        result_content_type=(
            ResultContentType.AGENT_RUNNER_ERROR
            if is_runner_error
            else ResultContentType.LLM_RESULT
        ),
    ),
)
```
This lets you delete `_runner_result_content_type`, `_set_non_stream_runner_result`, and `_aggregate_runner_error`, and makes the error semantics completely explicit at the point of use.
---
In combination, these changes:
- Keep all current functionality (runner reset, streaming vs non‑streaming, error tagging, and cleanup)
- Remove one context manager and the `defer_runner_close` callback
- Replace sequencing-only async generators with normal async functions + callbacks
- Inline tiny helpers where they hinder readability more than they help.
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@sourcery-ai review
Hey - I've reviewed your changes and they look great!
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@sourcery-ai review
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="267-276" />
<code_context>
+ streaming_started = True
</code_context>
<issue_to_address>
**issue (bug_risk):** If a streaming result is produced but the consumer never starts iterating the async stream, the runner may never be closed.
Because `streaming_started` shifts cleanup responsibility into the `finally` of `_stream_runner_chain()`, the runner is only closed when the consumer actually iterates the async stream. If the consumer abandons the stream before starting iteration (e.g. disconnects), `close_runner_once` is never called and resources can leak. Consider wrapping `set_async_stream` with logic that guarantees cleanup on cancellation/GC (e.g. a supervising task or a timeout/finalizer that closes the runner when the stream is never consumed), or adding an explicit cancellation hook on the event that always closes the runner when the stream goes unread.
</issue_to_address>
### 评论 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="201" />
<code_context>
+
+if sys.version_info >= (3, 12):
+ from typing import override
+else:
+ from typing_extensions import override
+
</code_context>
<issue_to_address>
**issue (complexity):** 可以考虑把非流式消费逻辑和最终结果/错误解析逻辑抽取到共享的辅助函数中,让两个分支复用它们,而不是在内联代码里重复控制流和状态处理。
在保留所有新行为的同时,你可以通过将“消费 runner + 解析最终结果”的逻辑集中到几个小的 helper 中,并在流式和非流式分支中复用它们,从而降低复杂度。
### 1. 抽取非流式 runner 消费逻辑
当前非流式分支内联了:
- 收集 `merged_chain`
- 追踪 `has_intermediate_error`
- 调用 `runner.get_final_llm_resp()`
- 决定要使用哪个 chain/内容类型
你可以把这些移到一个 helper 中,从而让 `process` 不需要直接管理这些细节:
```python
async def _consume_runner_non_streaming(
runner: "BaseAgentRunner",
stream_to_general: bool,
custom_error_message: str | None,
) -> tuple[list, bool, "LLMResp | None"]:
merged_chain: list = []
has_intermediate_error = False
async for output in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(output.chain.chain or [])
if output.is_error:
has_intermediate_error = True
final_resp = runner.get_final_llm_resp()
return merged_chain, has_intermediate_error, final_resp
```
然后 `process` 中的非流式分支就会简单很多:
```python
else:
merged_chain, has_intermediate_error, final_resp = (
await _consume_runner_non_streaming(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
)
yield # 保持现有节奏
chain, is_error, result_content_type = _resolve_final_result(
merged_chain=merged_chain,
final_resp=final_resp,
has_stream_error=has_intermediate_error,
)
_set_runner_error_extra(event, is_error)
event.set_result(
MessageEventResult(
chain=chain,
result_content_type=result_content_type,
),
)
yield
```
### 2. 最终结果解析的单一事实来源
目前流式和非流式分支都有各自独立的逻辑来决定:
- 是否把这次当作错误
- 使用哪个 chain(最终的 vs 合并的)
- 要设置什么 `ResultContentType`
你可以把这些汇总到一个 helper 里,在两个分支中复用:
```python
def _resolve_final_result(
merged_chain: list,
final_resp: "LLMResp | None",
has_stream_error: bool,
) -> tuple[list, bool, ResultContentType]:
if not final_resp or not final_resp.result_chain:
if merged_chain:
# 回退到流式过程中收集的 chain
is_error = has_stream_error
content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_error
else ResultContentType.LLM_RESULT
)
return merged_chain, is_error, content_type
# 完全没有结果
fallback_error_chain = MessageChain().message(
"Agent Runner did not return any result.",
)
return (
fallback_error_chain.chain or [],
True,
ResultContentType.AGENT_RUNNER_ERROR,
)
is_error = has_stream_error or final_resp.role == "err"
content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_error
else ResultContentType.LLM_RESULT
)
return final_resp.result_chain.chain or [], is_error, content_type
```
在非流式分支中的用法:见上文示例。
在流式分支中,当流结束后可以这样使用:
```python
if runner.done():
final_resp = runner.get_final_llm_resp()
# 如果你只做流式输出,这里的 merged_chain 就是 [],也可以像非流式 helper 那样顺便收集。
chain, is_error, result_content_type = _resolve_final_result(
merged_chain=[],
final_resp=final_resp,
has_stream_error=stream_has_runner_error,
)
_set_runner_error_extra(event, is_error)
event.set_result(
MessageEventResult(
chain=chain,
result_content_type=result_content_type
if result_content_type is not ResultContentType.LLM_RESULT
else ResultContentType.STREAMING_FINISH,
),
)
```
(你可以保留特殊的 `STREAMING_FINISH` 类型,只需像上面这样在需要时覆盖即可。)
这样既保持了现有语义(错误标记、回退逻辑、`AGENT_RUNNER_ERROR` 与 `LLM_RESULT`/`STREAMING_FINISH` 的区分),又能:
- 消除重复的“最终结果 vs 合并 chain”的逻辑
- 将错误/结束策略集中到 `_resolve_final_result` 中
- 让 `process` 更易读,每个分支都只需委托给小而聚焦的 helper,而不用在内联代码中操控多个标志和条件分支。
</issue_to_address>
### 评论 3
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="44" />
<code_context>
+ from typing_extensions import override
+
+
+class DeerFlowAgentRunner(BaseAgentRunner[TContext]):
+ """DeerFlow Agent Runner via LangGraph HTTP API."""
+
</code_context>
<issue_to_address>
**issue (complexity):** 可以考虑将配置解析、流累积和内容适配抽取成一些小的辅助类/模块,从 `DeerFlowAgentRunner` 中解耦这些职责,使其本身更精简、更易测试。
你可以通过从 `DeerFlowAgentRunner` 中抽出几个聚焦的协作对象,在不改变行为的前提下降低复杂度:
---
### 1. 从 `reset` 中抽离配置处理
`reset` 目前在做:请求绑定、配置解析以及 API 客户端生命周期管理。你可以把配置解析和签名构建移动到一个小的 dataclass 中:
```python
# deerflow_config.py
from dataclasses import dataclass
@dataclass(frozen=True)
class DeerFlowConfig:
api_base: str
api_key: str
auth_header: str
proxy: str
assistant_id: str
model_name: str
thinking_enabled: bool
plan_mode: bool
subagent_enabled: bool
max_concurrent_subagents: int
timeout: int
recursion_limit: int
@classmethod
def from_provider_config(cls, provider_config: dict, coerce_int) -> "DeerFlowConfig":
api_base = provider_config.get("deerflow_api_base", "http://127.0.0.1:2026")
if not isinstance(api_base, str) or not api_base.startswith(("http://", "https://")):
raise ValueError(
"DeerFlow API Base URL format is invalid. It must start with http:// or https://."
)
proxy = provider_config.get("proxy", "")
proxy = proxy.strip() if isinstance(proxy, str) else ""
return cls(
api_base=api_base,
api_key=provider_config.get("deerflow_api_key", ""),
auth_header=provider_config.get("deerflow_auth_header", ""),
proxy=proxy,
assistant_id=provider_config.get("deerflow_assistant_id", "lead_agent"),
model_name=provider_config.get("deerflow_model_name", ""),
thinking_enabled=bool(provider_config.get("deerflow_thinking_enabled", False)),
plan_mode=bool(provider_config.get("deerflow_plan_mode", False)),
subagent_enabled=bool(provider_config.get("deerflow_subagent_enabled", False)),
max_concurrent_subagents=coerce_int(
"deerflow_max_concurrent_subagents",
provider_config.get("deerflow_max_concurrent_subagents", 3),
default=3,
min_value=1,
),
timeout=coerce_int("timeout", provider_config.get("timeout", 300), default=300, min_value=1),
recursion_limit=coerce_int(
"deerflow_recursion_limit",
provider_config.get("deerflow_recursion_limit", 1000),
default=1000,
min_value=1,
),
)
```
这样 `reset` 就主要负责绑定和客户端复用:
```python
# in DeerFlowAgentRunner
from .deerflow_config import DeerFlowConfig
@override
async def reset(..., provider_config: dict, **kwargs: T.Any) -> None:
self.req = request
self.streaming = kwargs.get("streaming", False)
self.final_llm_resp = None
self._state = AgentState.IDLE
self.agent_hooks = agent_hooks
self.run_context = run_context
self.config = DeerFlowConfig.from_provider_config(provider_config, self._coerce_int_config)
new_client_signature = (
self.config.api_base,
self.config.api_key,
self.config.auth_header,
self.config.proxy,
)
old_client = getattr(self, "api_client", None)
old_signature = getattr(self, "_api_client_signature", None)
if (
isinstance(old_client, DeerFlowAPIClient)
and old_signature == new_client_signature
and not old_client.is_closed
):
self.api_client = old_client
return
if isinstance(old_client, DeerFlowAPIClient):
try:
await old_client.close()
except Exception as e:
logger.warning(f"Failed to close previous DeerFlow API client cleanly: {e}")
self.api_client = DeerFlowAPIClient(
api_base=self.config.api_base,
api_key=self.config.api_key,
auth_header=self.config.auth_header,
proxy=self.config.proxy,
)
self._api_client_signature = new_client_signature
# 后续的小改动:在各处统一使用 self.config.timeout / recursion_limit / 等等
```
这样可以立即从 runner 中移除一大块赋值逻辑,并将校验集中起来。
---
### 2. 将流状态和事件处理抽取到一个 helper 中
`_StreamState`、`_handle_values_event`、`_handle_message_event`、`_extract_new_messages_from_values`、`_remember_seen_message_id`、`_fingerprint_message`、`_resolve_final_output` 之间高度相关,可以封装进一个“stream accumulator”中,通过一个小的 API 对外暴露。
最小的抽取方式(保持行为不变,主要是复制/粘贴,并在需要时回调 runner):
```python
# deerflow_stream_accumulator.py
from dataclasses import dataclass, field
from collections import deque
import typing as T
import astrbot.core.message.components as Comp
from astrbot.core.message.message_event_result import MessageChain
from astrbot import logger
from .deerflow_stream_utils import (
build_task_failure_summary,
extract_ai_delta_from_event_data,
extract_clarification_from_event_data,
extract_latest_ai_message,
extract_latest_ai_text,
extract_latest_clarification_text,
extract_messages_from_values_data,
extract_task_failures_from_custom_event,
extract_text,
get_message_id,
)
@dataclass
class StreamState:
latest_text: str = ""
prev_text_for_streaming: str = ""
clarification_text: str = ""
task_failures: list[str] = field(default_factory=list)
seen_message_ids: set[str] = field(default_factory=set)
seen_message_order: deque[str] = field(default_factory=deque)
no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
baseline_initialized: bool = False
has_values_text: bool = False
run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
timed_out: bool = False
class DeerFlowStreamAccumulator:
def __init__(self, max_values_history: int, streaming: bool, timeout: int):
self._max_values_history = max_values_history
self._streaming = streaming
self._timeout = timeout
self.state = StreamState()
# `_fingerprint_message`, `_remember_seen_message_id`, `_extract_new_messages_from_values`
# 和 `_build_chain_from_ai_content` 可以几乎原样迁移到这里,
# 只是 `_build_chain_from_ai_content` 需要一个小的 hook 来构造组件。
def mark_timed_out(self) -> None:
self.state.timed_out = True
def handle_values_event(self, data: T.Any) -> list["AgentResponse"]:
...
# 复制现有逻辑,使用 `self.state` 而不是 `state`,
# 并使用 `self._max_values_history`、`self._streaming`。
def handle_message_event(self, data: T.Any) -> "AgentResponse | None":
...
# 复制现有逻辑,使用 `self.state` 和 `self._streaming`。
def handle_custom_event(self, data: T.Any) -> None:
self.state.task_failures.extend(extract_task_failures_from_custom_event(data))
def resolve_final_output(self, build_chain_from_ai_content) -> tuple[MessageChain, bool]:
state = self.state
failures_only = False
final_chain = MessageChain()
if state.clarification_text:
final_chain = MessageChain(chain=[Comp.Plain(state.clarification_text)])
else:
latest_ai_message = extract_latest_ai_message(state.run_values_messages)
if latest_ai_message:
final_chain = build_chain_from_ai_content(latest_ai_message.get("content"))
if not final_chain.chain and state.latest_text:
final_chain = MessageChain(chain=[Comp.Plain(state.latest_text)])
if not final_chain.chain:
failure_text = build_task_failure_summary(state.task_failures)
if failure_text:
final_chain = MessageChain(chain=[Comp.Plain(failure_text)])
failures_only = True
if not final_chain.chain:
logger.warning("DeerFlow returned no text content in stream events.")
final_chain = MessageChain(chain=[Comp.Plain("DeerFlow returned an empty response.")])
return final_chain, failures_only
```
然后 `_execute_deerflow_request` 就会变成一个更薄的编排外壳:
```python
# in DeerFlowAgentRunner
from .deerflow_stream_accumulator import DeerFlowStreamAccumulator
async def _execute_deerflow_request(self):
prompt = self.req.prompt or ""
session_id = self.req.session_id or f"deerflow-ephemeral-{uuid4()}"
image_urls = self.req.image_urls or []
system_prompt = self.req.system_prompt
thread_id = await self._ensure_thread_id(session_id)
payload = self._build_payload(
thread_id=thread_id,
prompt=prompt,
image_urls=image_urls,
system_prompt=system_prompt,
)
accumulator = DeerFlowStreamAccumulator(
max_values_history=self._MAX_VALUES_HISTORY,
streaming=self.streaming,
timeout=self.config.timeout, # 或 self.timeout
)
try:
async for event in self.api_client.stream_run(
thread_id=thread_id,
payload=payload,
timeout=self.config.timeout,
):
event_type = event.get("event")
data = event.get("data")
if event_type == "values":
for response in accumulator.handle_values_event(data):
yield response
elif event_type in {"messages-tuple", "messages", "message"}:
response = accumulator.handle_message_event(data)
if response:
yield response
elif event_type == "custom":
accumulator.handle_custom_event(data)
elif event_type == "error":
raise Exception(f"DeerFlow stream returned error event: {data}")
elif event_type == "end":
break
except (asyncio.TimeoutError, TimeoutError):
accumulator.mark_timed_out()
final_chain, failures_only = accumulator.resolve_final_output(self._build_chain_from_ai_content)
# 超时提示 + 流式最终 delta + 最终 LLMResponse 逻辑可以继续留在这里
...
```
这样就从 `DeerFlowAgentRunner` 中移除了大部分状态处理逻辑,并使流行为可以单独测试。
---
### 3. 可选:内容适配器模块
如果你还想要一个额外的“小胜利”,可以把 `_is_likely_base64_image`、`_image_component_from_url`、`_append_components_from_content`、`_build_chain_from_ai_content` 提取到 `deerflow_content_adapter.py` 模块中。例如:
```python
# deerflow_content_adapter.py
def build_chain_from_ai_content(content: T.Any) -> MessageChain:
components: list[Comp.BaseMessageComponent] = []
append_components_from_content(content, components)
if components:
return MessageChain(chain=components)
fallback_text = extract_text(content)
if fallback_text:
return MessageChain(chain=[Comp.Plain(fallback_text)])
return MessageChain()
```
然后在 runner 中只需要调用 `build_chain_from_ai_content`,而 `DeerFlowStreamAccumulator.resolve_final_output` 也可以依赖这个函数,而不是依赖 runner 的方法。
这些抽取在保持功能不变的同时,把“大一统”的类拆分成更小、可测试的单元,并降低了阅读 `DeerFlowAgentRunner` 的认知负担。
</issue_to_address>帮我变得更有用!请对每条评论点 👍 或 👎,我会根据你的反馈改进之后的代码审查。
Original comment in English
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="267-276" />
<code_context>
+ streaming_started = True
</code_context>
<issue_to_address>
**issue (bug_risk):** Runner may never be closed if a streaming result is produced but the consumer never starts iterating the async stream.
Because `streaming_started` flips the cleanup responsibility to `_stream_runner_chain()`’s `finally`, the runner is only closed if the consumer actually iterates the async stream. If the consumer abandons the stream (e.g., disconnects) before iterating, `close_runner_once` is never called and resources can leak. Consider wrapping `set_async_stream` with logic that guarantees runner cleanup on cancellation/GC (e.g., a supervising task or timeout/finalizer that closes the runner if the stream is never consumed), or adding an explicit cancellation hook on the event that always closes the runner when the stream isn’t read.
</issue_to_address>
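The suggested supervising task can be sketched in isolation. This is a minimal sketch with hypothetical names (`FakeRunner` and `guard_unconsumed_stream` are illustrative stand-ins, not AstrBot APIs):

```python
import asyncio


class FakeRunner:
    """Stand-in for BaseAgentRunner; only records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def guard_unconsumed_stream(
    runner: FakeRunner, consumed: asyncio.Event, grace: float
) -> None:
    """Close the runner if the stream is never consumed within `grace` seconds."""
    try:
        await asyncio.wait_for(consumed.wait(), timeout=grace)
    except (asyncio.TimeoutError, TimeoutError):
        await runner.close()


async def main() -> bool:
    runner = FakeRunner()
    consumed = asyncio.Event()  # the stream generator would set this on first iteration
    watchdog = asyncio.create_task(guard_unconsumed_stream(runner, consumed, grace=0.01))
    # Simulate a consumer that disconnects before ever iterating: never set `consumed`.
    await watchdog
    return runner.closed


print(asyncio.run(main()))  # → True: the runner is closed even though nothing consumed the stream
```

In the real pipeline, the first iteration of the stream would set the event and cancel the watchdog, so a consumed stream keeps its existing cleanup path.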
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="201" />
<code_context>
+
+if sys.version_info >= (3, 12):
+ from typing import override
+else:
+ from typing_extensions import override
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the non-streaming consumption logic and the final-result/error-resolution logic into shared helper functions so both branches reuse them instead of duplicating control flow and state handling inline.
You can keep all the new behavior but reduce the complexity by centralizing the “consume runner + resolve final result” logic into small helpers and reusing them in both streaming and non‑streaming branches.
### 1. Factor out non‑streaming runner consumption
The non‑streaming branch currently inlines:
- collecting `merged_chain`
- tracking `has_intermediate_error`
- calling `runner.get_final_llm_resp()`
- deciding which chain/content type to use
You can move that into a helper so `process` doesn’t have to manage this directly:
```python
async def _consume_runner_non_streaming(
runner: "BaseAgentRunner",
stream_to_general: bool,
custom_error_message: str | None,
) -> tuple[list, bool, "LLMResp | None"]:
merged_chain: list = []
has_intermediate_error = False
async for output in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(output.chain.chain or [])
if output.is_error:
has_intermediate_error = True
final_resp = runner.get_final_llm_resp()
return merged_chain, has_intermediate_error, final_resp
```
Then the non‑streaming branch in `process` becomes simpler:
```python
else:
merged_chain, has_intermediate_error, final_resp = (
await _consume_runner_non_streaming(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
)
yield # keep existing pacing
chain, is_error, result_content_type = _resolve_final_result(
merged_chain=merged_chain,
final_resp=final_resp,
has_stream_error=has_intermediate_error,
)
_set_runner_error_extra(event, is_error)
event.set_result(
MessageEventResult(
chain=chain,
result_content_type=result_content_type,
),
)
yield
```
### 2. Single source of truth for final result resolution
Right now streaming and non‑streaming branches have independent logic for:
- whether we treat this as an error
- which chain to use (final vs merged)
- what `ResultContentType` to set
You can consolidate that into one helper and reuse it in both branches:
```python
def _resolve_final_result(
merged_chain: list,
final_resp: "LLMResp | None",
has_stream_error: bool,
) -> tuple[list, bool, ResultContentType]:
if not final_resp or not final_resp.result_chain:
if merged_chain:
# fallback to streamed chain
is_error = has_stream_error
content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_error
else ResultContentType.LLM_RESULT
)
return merged_chain, is_error, content_type
# no result at all
fallback_error_chain = MessageChain().message(
"Agent Runner did not return any result.",
)
return (
fallback_error_chain.chain or [],
True,
ResultContentType.AGENT_RUNNER_ERROR,
)
is_error = has_stream_error or final_resp.role == "err"
content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_error
else ResultContentType.LLM_RESULT
)
return final_resp.result_chain.chain or [], is_error, content_type
```
Usage in non‑streaming: shown above.
Usage in streaming branch after the stream finishes:
```python
if runner.done():
final_resp = runner.get_final_llm_resp()
# merged_chain is [] here if you only stream, or you can optionally
# collect it similarly to the non-streaming helper.
chain, is_error, result_content_type = _resolve_final_result(
merged_chain=[],
final_resp=final_resp,
has_stream_error=stream_has_runner_error,
)
_set_runner_error_extra(event, is_error)
event.set_result(
MessageEventResult(
chain=chain,
result_content_type=result_content_type
if result_content_type is not ResultContentType.LLM_RESULT
else ResultContentType.STREAMING_FINISH,
),
)
```
(You can keep the special `STREAMING_FINISH` type, just override when needed as shown.)
This keeps all existing semantics (error flagging, fallbacks, `AGENT_RUNNER_ERROR` vs `LLM_RESULT`/`STREAMING_FINISH`) but:
- removes duplicated “final result vs merged chain” logic
- centralizes the error/finish policy in `_resolve_final_result`
- makes `process` much easier to follow, with each branch delegating to small, focused helpers instead of juggling multiple flags and conditionals inline.
</issue_to_address>
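To see the policy's three outcomes at a glance, the decision table can be exercised standalone. This is a sketch on stub types — `ResultContentType` and `FinalResp` below are minimal stand-ins, not AstrBot's real classes:

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ResultContentType(Enum):
    LLM_RESULT = auto()
    AGENT_RUNNER_ERROR = auto()


@dataclass
class FinalResp:
    role: str
    chain: list = field(default_factory=list)


def resolve_final_result(merged_chain, final_resp, has_stream_error):
    """Same decision table as the suggested _resolve_final_result, on stub types."""
    if not final_resp or not final_resp.chain:
        if merged_chain:  # fall back to the chain collected while streaming
            is_error = has_stream_error
        else:  # no result at all
            return (
                ["Agent Runner did not return any result."],
                True,
                ResultContentType.AGENT_RUNNER_ERROR,
            )
    else:
        is_error = has_stream_error or final_resp.role == "err"
        merged_chain = final_resp.chain
    content_type = (
        ResultContentType.AGENT_RUNNER_ERROR if is_error else ResultContentType.LLM_RESULT
    )
    return merged_chain, is_error, content_type


print(resolve_final_result([], None, False)[1])                        # → True (no result at all)
print(resolve_final_result(["hi"], None, False)[1])                    # → False (merged-chain fallback)
print(resolve_final_result([], FinalResp("err", ["boom"]), False)[1])  # → True (runner reported err)
```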
### Comment 3
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="44" />
<code_context>
+ from typing_extensions import override
+
+
+class DeerFlowAgentRunner(BaseAgentRunner[TContext]):
+ """DeerFlow Agent Runner via LangGraph HTTP API."""
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting configuration parsing, stream accumulation, and content adaptation into small helper classes/modules to decouple concerns from `DeerFlowAgentRunner` and make it slimmer and more testable.
You can reduce complexity by extracting a few focused collaborators from `DeerFlowAgentRunner` without changing behavior:
---
### 1. Extract configuration handling from `reset`
`reset` is currently doing: request wiring, config parsing, and API client lifecycle. You can move the config parsing + signature building into a small dataclass:
```python
# deerflow_config.py
from dataclasses import dataclass
@dataclass(frozen=True)
class DeerFlowConfig:
api_base: str
api_key: str
auth_header: str
proxy: str
assistant_id: str
model_name: str
thinking_enabled: bool
plan_mode: bool
subagent_enabled: bool
max_concurrent_subagents: int
timeout: int
recursion_limit: int
@classmethod
def from_provider_config(cls, provider_config: dict, coerce_int) -> "DeerFlowConfig":
api_base = provider_config.get("deerflow_api_base", "http://127.0.0.1:2026")
if not isinstance(api_base, str) or not api_base.startswith(("http://", "https://")):
raise ValueError(
"DeerFlow API Base URL format is invalid. It must start with http:// or https://."
)
proxy = provider_config.get("proxy", "")
proxy = proxy.strip() if isinstance(proxy, str) else ""
return cls(
api_base=api_base,
api_key=provider_config.get("deerflow_api_key", ""),
auth_header=provider_config.get("deerflow_auth_header", ""),
proxy=proxy,
assistant_id=provider_config.get("deerflow_assistant_id", "lead_agent"),
model_name=provider_config.get("deerflow_model_name", ""),
thinking_enabled=bool(provider_config.get("deerflow_thinking_enabled", False)),
plan_mode=bool(provider_config.get("deerflow_plan_mode", False)),
subagent_enabled=bool(provider_config.get("deerflow_subagent_enabled", False)),
max_concurrent_subagents=coerce_int(
"deerflow_max_concurrent_subagents",
provider_config.get("deerflow_max_concurrent_subagents", 3),
default=3,
min_value=1,
),
timeout=coerce_int("timeout", provider_config.get("timeout", 300), default=300, min_value=1),
recursion_limit=coerce_int(
"deerflow_recursion_limit",
provider_config.get("deerflow_recursion_limit", 1000),
default=1000,
min_value=1,
),
)
```
Then `reset` becomes mostly wiring + client reuse:
```python
# in DeerFlowAgentRunner
from .deerflow_config import DeerFlowConfig
@override
async def reset(..., provider_config: dict, **kwargs: T.Any) -> None:
self.req = request
self.streaming = kwargs.get("streaming", False)
self.final_llm_resp = None
self._state = AgentState.IDLE
self.agent_hooks = agent_hooks
self.run_context = run_context
self.config = DeerFlowConfig.from_provider_config(provider_config, self._coerce_int_config)
new_client_signature = (
self.config.api_base,
self.config.api_key,
self.config.auth_header,
self.config.proxy,
)
old_client = getattr(self, "api_client", None)
old_signature = getattr(self, "_api_client_signature", None)
if (
isinstance(old_client, DeerFlowAPIClient)
and old_signature == new_client_signature
and not old_client.is_closed
):
self.api_client = old_client
return
if isinstance(old_client, DeerFlowAPIClient):
try:
await old_client.close()
except Exception as e:
logger.warning(f"Failed to close previous DeerFlow API client cleanly: {e}")
self.api_client = DeerFlowAPIClient(
api_base=self.config.api_base,
api_key=self.config.api_key,
auth_header=self.config.auth_header,
proxy=self.config.proxy,
)
self._api_client_signature = new_client_signature
# small follow-up: use self.config.timeout / recursion_limit / etc. everywhere
```
This immediately removes a large chunk of assignments from the runner and centralizes validation.
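The `coerce_int` callable the dataclass takes is not shown in the review; a minimal sketch of what `_coerce_int_config` might look like (hypothetical — the runner's actual implementation may differ):

```python
import logging

logger = logging.getLogger(__name__)


def coerce_int(name: str, value, *, default: int, min_value: int) -> int:
    """Best-effort int coercion: fall back to `default` on bad input, clamp to `min_value`."""
    try:
        coerced = int(value)
    except (TypeError, ValueError):
        logger.warning("Invalid value for %s: %r; using default %d", name, value, default)
        return default
    return max(coerced, min_value)


print(coerce_int("timeout", "300", default=300, min_value=1))                # → 300
print(coerce_int("timeout", None, default=300, min_value=1))                 # → 300 (fallback)
print(coerce_int("deerflow_recursion_limit", 0, default=1000, min_value=1))  # → 1 (clamped)
```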
---
### 2. Extract stream state + event handling into a helper
The `_StreamState`, `_handle_values_event`, `_handle_message_event`, `_extract_new_messages_from_values`, `_remember_seen_message_id`, `_fingerprint_message`, `_resolve_final_output` are tightly related and can be encapsulated in a “stream accumulator” that exposes a small API.
Minimal extraction (behavior-preserving, mostly copy/paste + references back to runner where needed):
```python
# deerflow_stream_accumulator.py
from dataclasses import dataclass, field
from collections import deque
import typing as T
import astrbot.core.message.components as Comp
from astrbot.core.message.message_event_result import MessageChain
from astrbot import logger
from .deerflow_stream_utils import (
build_task_failure_summary,
extract_ai_delta_from_event_data,
extract_clarification_from_event_data,
extract_latest_ai_message,
extract_latest_ai_text,
extract_latest_clarification_text,
extract_messages_from_values_data,
extract_task_failures_from_custom_event,
extract_text,
get_message_id,
)
@dataclass
class StreamState:
latest_text: str = ""
prev_text_for_streaming: str = ""
clarification_text: str = ""
task_failures: list[str] = field(default_factory=list)
seen_message_ids: set[str] = field(default_factory=set)
seen_message_order: deque[str] = field(default_factory=deque)
no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
baseline_initialized: bool = False
has_values_text: bool = False
run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
timed_out: bool = False
class DeerFlowStreamAccumulator:
def __init__(self, max_values_history: int, streaming: bool, timeout: int):
self._max_values_history = max_values_history
self._streaming = streaming
self._timeout = timeout
self.state = StreamState()
# `_fingerprint_message`, `_remember_seen_message_id`, `_extract_new_messages_from_values`
# and `_build_chain_from_ai_content` can be moved here almost as-is, except that
# `_build_chain_from_ai_content` needs a small hook to construct components.
def mark_timed_out(self) -> None:
self.state.timed_out = True
def handle_values_event(self, data: T.Any) -> list["AgentResponse"]:
...
# copy existing logic, using `self.state` instead of `state`,
# and `self._max_values_history`, `self._streaming`.
def handle_message_event(self, data: T.Any) -> "AgentResponse | None":
...
# copy existing logic, using `self.state` and `self._streaming`.
def handle_custom_event(self, data: T.Any) -> None:
self.state.task_failures.extend(extract_task_failures_from_custom_event(data))
def resolve_final_output(self, build_chain_from_ai_content) -> tuple[MessageChain, bool]:
state = self.state
failures_only = False
final_chain = MessageChain()
if state.clarification_text:
final_chain = MessageChain(chain=[Comp.Plain(state.clarification_text)])
else:
latest_ai_message = extract_latest_ai_message(state.run_values_messages)
if latest_ai_message:
final_chain = build_chain_from_ai_content(latest_ai_message.get("content"))
if not final_chain.chain and state.latest_text:
final_chain = MessageChain(chain=[Comp.Plain(state.latest_text)])
if not final_chain.chain:
failure_text = build_task_failure_summary(state.task_failures)
if failure_text:
final_chain = MessageChain(chain=[Comp.Plain(failure_text)])
failures_only = True
if not final_chain.chain:
logger.warning("DeerFlow returned no text content in stream events.")
final_chain = MessageChain(chain=[Comp.Plain("DeerFlow returned an empty response.")])
return final_chain, failures_only
```
Then `_execute_deerflow_request` becomes a thinner orchestration shell:
```python
# in DeerFlowAgentRunner
from .deerflow_stream_accumulator import DeerFlowStreamAccumulator
async def _execute_deerflow_request(self):
prompt = self.req.prompt or ""
session_id = self.req.session_id or f"deerflow-ephemeral-{uuid4()}"
image_urls = self.req.image_urls or []
system_prompt = self.req.system_prompt
thread_id = await self._ensure_thread_id(session_id)
payload = self._build_payload(
thread_id=thread_id,
prompt=prompt,
image_urls=image_urls,
system_prompt=system_prompt,
)
accumulator = DeerFlowStreamAccumulator(
max_values_history=self._MAX_VALUES_HISTORY,
streaming=self.streaming,
timeout=self.config.timeout, # or self.timeout
)
try:
async for event in self.api_client.stream_run(
thread_id=thread_id,
payload=payload,
timeout=self.config.timeout,
):
event_type = event.get("event")
data = event.get("data")
if event_type == "values":
for response in accumulator.handle_values_event(data):
yield response
elif event_type in {"messages-tuple", "messages", "message"}:
response = accumulator.handle_message_event(data)
if response:
yield response
elif event_type == "custom":
accumulator.handle_custom_event(data)
elif event_type == "error":
raise Exception(f"DeerFlow stream returned error event: {data}")
elif event_type == "end":
break
except (asyncio.TimeoutError, TimeoutError):
accumulator.mark_timed_out()
final_chain, failures_only = accumulator.resolve_final_output(self._build_chain_from_ai_content)
# timeout note + streaming final delta + final LLMResponse logic can stay here
...
```
That removes most of the state juggling from `DeerFlowAgentRunner` and makes the stream behavior testable independently.
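Once extracted, the accumulator can be unit-tested with synthetic events and no HTTP layer at all. A sketch with a toy stand-in — `TinyAccumulator` mirrors the shape of the proposed `DeerFlowStreamAccumulator`, not its full behavior:

```python
class TinyAccumulator:
    """Toy accumulator mirroring the shape of the proposed stream accumulator."""

    def __init__(self) -> None:
        self.latest_text = ""
        self.task_failures: list[str] = []

    def handle_values_event(self, data: dict) -> None:
        text = data.get("text", "")
        if text:
            self.latest_text = text

    def handle_custom_event(self, data: dict) -> None:
        self.task_failures.extend(data.get("failures", []))

    def resolve_final_output(self) -> tuple[str, bool]:
        if self.latest_text:
            return self.latest_text, False
        if self.task_failures:
            return "; ".join(self.task_failures), True  # failures_only
        return "DeerFlow returned an empty response.", False


acc = TinyAccumulator()
acc.handle_values_event({"text": "partial"})
acc.handle_values_event({"text": "final answer"})
print(acc.resolve_final_output())  # → ('final answer', False)
```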
---
### 3. Optionally: content adapter module
If you want one more quick win, you can isolate `_is_likely_base64_image`, `_image_component_from_url`, `_append_components_from_content`, `_build_chain_from_ai_content` into a `deerflow_content_adapter.py` module. For example:
```python
# deerflow_content_adapter.py
def build_chain_from_ai_content(content: T.Any) -> MessageChain:
components: list[Comp.BaseMessageComponent] = []
append_components_from_content(content, components)
if components:
return MessageChain(chain=components)
fallback_text = extract_text(content)
if fallback_text:
return MessageChain(chain=[Comp.Plain(fallback_text)])
return MessageChain()
```
Then in the runner you just call `build_chain_from_ai_content`, and `DeerFlowStreamAccumulator.resolve_final_output` can depend on that function instead of a runner method.
These extractions keep functionality intact but break the monolith into smaller testable units and reduce cognitive load in `DeerFlowAgentRunner`.
</issue_to_address>
@sourcery-ai review
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="234-243" />
<code_context>
+ except asyncio.CancelledError:
+ # Let caller manage cancellation semantics.
+ raise
+ except Exception as e:
+ err_msg = self._format_exception(e)
+ logger.error(f"DeerFlow request failed: {err_msg}", exc_info=True)
+ self._transition_state(AgentState.ERROR)
+ err_chain = MessageChain().message(f"DeerFlow request failed: {err_msg}")
+ self.final_llm_resp = LLMResponse(
+ role="err",
+ completion_text=f"DeerFlow request failed: {err_msg}",
+ result_chain=err_chain,
+ )
+ yield AgentResponse(
+ type="err",
+ data=AgentResponseData(
</code_context>
<issue_to_address>
**issue (bug_risk):** The on_agent_done hook is never invoked on the error path, which may break invariants the hook is expected to uphold
In `_execute_deerflow_request`, `on_agent_done` runs after `self.final_llm_resp` is set, but on the error path in `step` you set `self.final_llm_resp`, transition the state to `ERROR`, and return directly without calling the hook. As a result, unlike the other runners, the cleanup/telemetry hooks do not run when DeerFlow fails. Consider refactoring the error path into a helper method that also calls `on_agent_done(self.run_context, self.final_llm_resp)` (e.g. inside a `try/except` wrapper), so the hook is notified consistently on both success and failure.
</issue_to_address>
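The fix can be reduced to a `try/finally` around the whole step so the done-hook fires on success and failure alike. A sketch with illustrative names — `Hooks.on_agent_done` mirrors the hook's role, not AstrBot's exact signature:

```python
import asyncio


class Hooks:
    """Records done-hook invocations so both paths can be verified."""

    def __init__(self) -> None:
        self.done_calls: list[str] = []

    async def on_agent_done(self, ctx, resp) -> None:
        self.done_calls.append(resp)


async def step(hooks: Hooks, fail: bool) -> str:
    final_resp = ""
    try:
        if fail:
            raise RuntimeError("boom")
        final_resp = "ok"
    except Exception as e:
        final_resp = f"err: {e}"  # error path still records a final response
    finally:
        # The hook fires on success AND failure, matching the other runners.
        await hooks.on_agent_done(None, final_resp)
    return final_resp


async def main() -> list[str]:
    hooks = Hooks()
    await step(hooks, fail=False)
    await step(hooks, fail=True)
    return hooks.done_calls


print(asyncio.run(main()))  # → ['ok', 'err: boom']
```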
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="153" />
<code_context>
+ logger.warning(f"Failed to close third-party runner cleanly: {e}")
class ThirdPartyAgentSubStage(Stage):
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting runner lifecycle management and the streaming vs. non-streaming handling into dedicated helper methods, so that the `process` method becomes a simpler dispatcher with clearer control flow and less coupling between flags and closures.
You can keep the newly added behavior while significantly simplifying `process` by pushing the runner lifecycle and the streaming/non-streaming branch logic down into helpers. This removes most of the local flags/closures and makes the control flow easier to follow.
### 1. Encapsulate the runner lifecycle/timeout
Instead of `runner_closed`, `streaming_started`, `stream_consumption_started`, and `stream_idle_close_task` plus the inner `close_runner_once` / `close_if_stream_never_consumed`, move this logic into a small helper object. That centralizes the logic in one place without reverting the current behavior:
```python
class _RunnerLifecycle:
def __init__(self, runner: BaseAgentRunner, event: AstrMessageEvent) -> None:
self._runner = runner
self._event = event
self._closed = False
self._stream_consumed = False
self._idle_task: asyncio.Task[None] | None = None
async def reset(self, *, req, astr_agent_ctx, provider_cfg, streaming: bool) -> None:
await self._runner.reset(
request=req,
run_context=AgentContextWrapper(
context=astr_agent_ctx,
tool_call_timeout=60,
),
agent_hooks=MAIN_AGENT_HOOKS,
provider_config=provider_cfg,
streaming=streaming,
)
async def close_once(self) -> None:
if self._closed:
return
self._closed = True
await _close_runner_if_supported(self._runner)
def mark_stream_started(self) -> None:
self._idle_task = asyncio.create_task(self._close_if_never_consumed())
def mark_stream_consumed(self) -> None:
self._stream_consumed = True
if self._idle_task and not self._idle_task.done():
self._idle_task.cancel()
async def finalize(self) -> None:
if self._idle_task and not self._idle_task.done():
self._idle_task.cancel()
# non-streaming or streaming never started
if not self._closed:
await self.close_once()
async def _close_if_never_consumed(self) -> None:
try:
await asyncio.sleep(STREAM_CONSUMPTION_CLOSE_TIMEOUT_SEC)
except asyncio.CancelledError:
return
if not self._stream_consumed:
logger.warning(
"Third-party runner stream was never consumed; "
"closing runner to avoid resource leak.",
)
await self.close_once()
```
Then in `process` you can drop most of the manual flags and simply use:
```python
lifecycle = _RunnerLifecycle(runner, event)
try:
await lifecycle.reset(
req=req,
astr_agent_ctx=astr_agent_ctx,
provider_cfg=self.prov_cfg,
streaming=streaming_response,
)
if streaming_response and not stream_to_general:
await self._handle_streaming_response(
lifecycle=lifecycle,
runner=runner,
event=event,
custom_error_message=custom_error_message,
)
else:
await self._handle_non_streaming_response(
lifecycle=lifecycle,
runner=runner,
event=event,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
finally:
await lifecycle.finalize()
```
This keeps the timeout semantics and "only close once" behavior but removes the interdependent booleans and `finally` branching from the stage itself.
### 2. Split streaming/non-streaming handlers
The nested `_stream_runner_chain` closure currently mixes consumption, error tracking, and closing. With the lifecycle helper, you can move streaming logic into a dedicated method that only cares about streaming and error propagation:
```python
async def _handle_streaming_response(
self,
*,
lifecycle: _RunnerLifecycle,
runner: BaseAgentRunner,
event: AstrMessageEvent,
custom_error_message: str | None,
) -> None:
stream_has_runner_error = False
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
nonlocal stream_has_runner_error
lifecycle.mark_stream_consumed()
try:
async for runner_output in run_third_party_agent(
runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
if runner_output.is_error:
stream_has_runner_error = True
_set_runner_error_extra(event, True)
yield runner_output.chain
finally:
await lifecycle.close_once()
event.set_result(
MessageEventResult()
.set_result_content_type(ResultContentType.STREAMING_RESULT)
.set_async_stream(_stream_runner_chain()),
)
lifecycle.mark_stream_started()
yield # keep existing yield behavior
if runner.done():
final_resp = runner.get_final_llm_resp()
if final_resp and final_resp.result_chain:
final_chain, is_runner_error, _ = _resolve_final_result(
merged_chain=[],
final_resp=final_resp,
has_intermediate_error=stream_has_runner_error,
)
_set_runner_error_extra(event, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_chain,
result_content_type=ResultContentType.STREAMING_FINISH,
),
)
```
Non-streaming can likewise be isolated and focused only on merge + finalize:
```python
async def _handle_non_streaming_response(
self,
*,
lifecycle: _RunnerLifecycle,
runner: BaseAgentRunner,
event: AstrMessageEvent,
stream_to_general: bool,
custom_error_message: str | None,
) -> None:
merged_chain: list = []
has_intermediate_error = False
async for output in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(output.chain.chain or [])
if output.is_error:
has_intermediate_error = True
yield
final_resp = runner.get_final_llm_resp()
final_chain, is_runner_error, result_content_type = _resolve_final_result(
merged_chain=merged_chain,
final_resp=final_resp,
has_intermediate_error=has_intermediate_error,
)
_set_runner_error_extra(event, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_chain,
result_content_type=result_content_type,
),
)
yield
```
This keeps all the new functionality (error propagation, idle close timeout, runner.close support, merged final result) but makes `process` mostly orchestration + branching, with lifecycle concerns factored out and fewer shared mutable flags/closures.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py
|
@sourcery-ai review |
Hey - I've found 1 issue, and left some high level feedback:
- In `_RunnerLifecycle.finalize`, the runner is only closed when `not self._stream_started`, which means non-streaming flows (where `stream_started` is never set) are handled, but streaming flows that complete normally rely on `_handle_streaming_response` to call `close_once`; for consistency and leak prevention, consider always calling `close_once()` in `finalize` when `self._closed` is still `False` rather than guarding on `stream_started`.
- The hard-coded `STREAM_CONSUMPTION_CLOSE_TIMEOUT_SEC = 30` used for closing unconsumed streams might need tuning in different deployments; consider wiring this through configuration or at least grouping similar timeouts (e.g. tool_call_timeout, DeerFlow timeouts) so operators can control them in one place.
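One way to act on the timeout feedback — grouping the related timeouts and sourcing them from configuration rather than module constants — could look like the following sketch. The config key names and defaults here are assumptions for illustration, not AstrBot's actual schema:

```python
from dataclasses import dataclass


@dataclass
class ThirdPartyRunnerTimeouts:
    """Groups the stage's timeouts so operators can tune them in one place."""

    tool_call_timeout_sec: float = 60.0
    stream_consumption_close_timeout_sec: float = 30.0

    @classmethod
    def from_config(cls, cfg: dict) -> "ThirdPartyRunnerTimeouts":
        # Fall back to the current hard-coded defaults when keys are absent.
        return cls(
            tool_call_timeout_sec=float(cfg.get("tool_call_timeout", 60)),
            stream_consumption_close_timeout_sec=float(
                cfg.get("stream_consumption_close_timeout", 30)
            ),
        )


# Example: an operator overrides only the stream idle-close timeout.
timeouts = ThirdPartyRunnerTimeouts.from_config(
    {"stream_consumption_close_timeout": 10}
)
```

The stage would then read `timeouts.stream_consumption_close_timeout_sec` instead of the module-level constant.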
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_RunnerLifecycle.finalize`, the runner is only closed when `not self._stream_started`, which means non-streaming flows (where `stream_started` is never set) are handled, but streaming flows that complete normally rely on `_handle_streaming_response` to call `close_once`; for consistency and leak prevention, consider always calling `close_once()` in `finalize` when `self._closed` is still `False` rather than guarding on `stream_started`.
- The hard-coded `STREAM_CONSUMPTION_CLOSE_TIMEOUT_SEC = 30` used for closing unconsumed streams might need tuning in different deployments; consider wiring this through configuration or at least grouping similar timeouts (e.g. tool_call_timeout, DeerFlow timeouts) so operators can control them in one place.
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="94" />
<code_context>
stream_to_general: bool = False,
custom_error_message: str | None = None,
-) -> AsyncGenerator[MessageChain | None, None]:
+) -> AsyncGenerator["_ThirdPartyRunnerOutput", None]:
"""
运行第三方 agent runner 并转换响应格式
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new third-party runner handling by returning simple tuples from `run_third_party_agent` and sharing a single helper for consuming responses and aggregating errors across streaming and non-streaming paths.
You can reduce the added complexity without changing behavior by:
1. **Dropping `_ThirdPartyRunnerOutput`** (use a simple tuple)
2. **Centralizing "consume runner + error aggregation" logic** for both streaming and non-streaming paths
### 1. Remove `_ThirdPartyRunnerOutput` indirection
`run_third_party_agent` always produces both `chain` and `is_error`, and every caller uses both. You can simplify by returning a tuple instead of a dataclass:
```python
# Before
async def run_third_party_agent(...) -> AsyncGenerator["_ThirdPartyRunnerOutput", None]:
...
if resp.type == "streaming_delta":
...
yield _ThirdPartyRunnerOutput(chain=resp.data["chain"], is_error=False)
elif resp.type == "llm_result":
...
yield _ThirdPartyRunnerOutput(chain=resp.data["chain"], is_error=False)
elif resp.type == "err":
yield _ThirdPartyRunnerOutput(chain=resp.data["chain"], is_error=True)
...
yield _ThirdPartyRunnerOutput(chain=MessageChain().message(err_msg), is_error=True)
@dataclass
class _ThirdPartyRunnerOutput:
chain: MessageChain
is_error: bool = False
```
```python
# After (same behavior, less indirection)
async def run_third_party_agent(
runner: "BaseAgentRunner",
stream_to_general: bool = False,
custom_error_message: str | None = None,
) -> AsyncGenerator[tuple[MessageChain, bool], None]:
try:
async for resp in runner.step_until_done(max_step=30): # type: ignore[misc]
if resp.type == "streaming_delta":
if stream_to_general:
continue
yield resp.data["chain"], False
elif resp.type == "llm_result":
if stream_to_general:
yield resp.data["chain"], False
elif resp.type == "err":
yield resp.data["chain"], True
except Exception as e:
logger.error(f"Third party agent runner error: {e}")
err_msg = custom_error_message or (
f"Error occurred during AI execution.\n"
f"Error Type: {type(e).__name__} (3rd party)\n"
f"Error Message: {str(e)}"
)
yield MessageChain().message(err_msg), True
```
Call sites become simpler and more readable:
```python
# _handle_streaming_response
async for chain, is_error in run_third_party_agent(...):
if is_error:
stream_has_runner_error = True
_set_runner_error_extra(event, True)
yield chain
# _handle_non_streaming_response
async for chain, is_error in run_third_party_agent(...):
merged_chain.extend(chain.chain or [])
if is_error:
has_intermediate_error = True
yield
```
### 2. Centralize "consume + aggregate error" logic
Both `_handle_streaming_response` and `_handle_non_streaming_response` are tracking "any error seen" and merging chains. You can move that into a small helper and call it from both paths, reducing duplication and making the control flow clearer.
For example:
```python
async def _consume_runner_and_aggregate(
runner: "BaseAgentRunner",
*,
stream_to_general: bool,
custom_error_message: str | None,
) -> tuple[list, bool]:
merged_chain: list = []
has_intermediate_error = False
async for chain, is_error in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(chain.chain or [])
if is_error:
has_intermediate_error = True
# let caller decide whether to yield to outer caller, so no yield here
return merged_chain, has_intermediate_error
```
Then you can simplify the non-streaming handler:
```python
async def _handle_non_streaming_response(
self,
*,
runner: "BaseAgentRunner",
event: AstrMessageEvent,
stream_to_general: bool,
custom_error_message: str | None,
) -> AsyncGenerator[None, None]:
merged_chain, has_intermediate_error = await _consume_runner_and_aggregate(
runner=runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
)
yield # keep existing yield point
final_resp = runner.get_final_llm_resp()
if not final_resp or not final_resp.result_chain:
if merged_chain:
logger.warning(
"Agent Runner returned no final response, fallback to streamed error/result chain."
)
else:
logger.warning("Agent Runner 未返回最终结果。")
final_chain, is_runner_error, result_content_type = _resolve_final_result(
merged_chain=merged_chain,
final_resp=final_resp,
has_intermediate_error=has_intermediate_error,
)
_set_runner_error_extra(event, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_chain,
result_content_type=result_content_type,
),
)
yield
```
And in the streaming handler you can still rely on the same error aggregation policy by only using `_resolve_final_result` once you have `stream_has_runner_error`, without duplicating merge/error logic.
These two changes keep all features (per-chunk error flags, final resolution policy, event extras, and runner cleanup) but remove one abstraction layer and centralize error/chain aggregation, which should make the control flow easier to follow.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py
|
@sourcery-ai review |
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="596-604" />
<code_context>
+ },
+ }
+
+ def _handle_values_event(
+ self,
+ data: T.Any,
+ state: _StreamState,
+ ) -> list[AgentResponse]:
+ responses: list[AgentResponse] = []
+ values_messages = extract_messages_from_values_data(data)
+ if not values_messages:
+ return responses
+
+ if not state.baseline_initialized:
</code_context>
<issue_to_address>
**issue (bug_risk):** The first `values` event is treated only as a baseline and its content is never surfaced, which drops a valid response when there is only a single values payload.
In `_handle_values_event`, when `state.baseline_initialized` is `False`, you initialize the baseline and return without updating `state.run_values_messages` or `state.latest_text`. If the backend puts all content into a single `values` event, no second event is ever processed, so `run_values_messages` stays empty, `latest_text` remains `""`, and `_resolve_final_output` falls back to "DeerFlow returned an empty response." This discards an otherwise valid response.
Consider either:
- using the first `values` event as both baseline and data (populating `run_values_messages` and `latest_text` before returning), or
- treating the messages as a baseline only when historical content can be distinguished from new content (if the API exposes that).
Either way, the response is preserved when only one `values` payload is emitted.
</issue_to_address>
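A minimal sketch of the first option — using the initial `values` event as both baseline and content — with simplified stand-in state (the field names are assumptions modeled on the review, not the actual `_StreamState`). The key change is falling through instead of returning early on the first event:

```python
from dataclasses import dataclass, field


@dataclass
class StreamState:
    baseline_initialized: bool = False
    run_values_messages: list = field(default_factory=list)
    latest_text: str = ""


def handle_values_event(messages: list[dict], state: StreamState) -> None:
    if not messages:
        return
    if not state.baseline_initialized:
        # Use the first payload as baseline AND data: record the baseline,
        # then fall through so its content is not discarded.
        state.baseline_initialized = True
    # A fuller implementation might slice off pre-run history here.
    state.run_values_messages = list(messages)
    texts = [m.get("content", "") for m in messages if m.get("content")]
    if texts:
        state.latest_text = texts[-1]


state = StreamState()
# Backend that emits everything in a single `values` event:
handle_values_event([{"role": "assistant", "content": "final answer"}], state)
```

Here a single-event stream still leaves `latest_text` populated, so the final-output resolution never hits the "empty response" fallback.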
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="131" />
<code_context>
+ yield MessageChain().message(err_msg), True
+
+
+async def _consume_runner_and_aggregate(
+ runner: "BaseAgentRunner",
+ *,
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new third-party runner control flow by inlining the aggregation and result-resolution logic, and replacing the lifecycle helper class with direct local handling plus an optional watchdog task.
You can significantly reduce the new control-flow complexity without losing any existing behavior by:
1. **Removing `_consume_runner_and_aggregate` and the multi-tuple state threading**
2. **Inlining the `_resolve_final_result` logic into each handler**
3. **Collapsing `_RunnerLifecycle` into a local `try/finally` with an optional watchdog task**
4. **Dropping the `_set_runner_error_extra` indirection**
Below are concrete, localized refactoring suggestions that preserve all current features.
---
### 1. Replace `_consume_runner_and_aggregate` with inline aggregation
Currently `_consume_runner_and_aggregate` threads `(chain, is_error, merged_chain, has_intermediate_error)` through the flow, while call sites only partially use those values.
You can keep `run_third_party_agent` unchanged (including the error flag) and let each handler maintain its own aggregation state locally.
```python
async def _handle_non_streaming_response(
self,
*,
runner: "BaseAgentRunner",
event: AstrMessageEvent,
stream_to_general: bool,
custom_error_message: str | None,
) -> AsyncGenerator[None, None]:
merged_chain: list = []
has_intermediate_error = False
async for chain, is_error in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(chain.chain or [])
if is_error:
has_intermediate_error = True
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, True)
yield
final_resp = runner.get_final_llm_resp()
# Inline the resolution (see next section)
if not final_resp or not final_resp.result_chain:
if merged_chain:
logger.warning(
"Agent Runner returned no final response, fallback to streamed error/result chain."
)
final_chain = merged_chain
is_runner_error = has_intermediate_error
result_content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_runner_error
else ResultContentType.LLM_RESULT
)
else:
logger.warning("Agent Runner 未返回最终结果。")
fallback_error_chain = MessageChain().message(
"Agent Runner did not return any result.",
)
final_chain = fallback_error_chain.chain or []
is_runner_error = True
result_content_type = ResultContentType.AGENT_RUNNER_ERROR
else:
is_runner_error = has_intermediate_error or final_resp.role == "err"
final_chain = final_resp.result_chain.chain or []
result_content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_runner_error
else ResultContentType.LLM_RESULT
)
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_chain,
result_content_type=result_content_type,
),
)
yield
```
For the streaming path, you can likewise avoid `_consume_runner_and_aggregate` and keep the state in one place:
```python
async def _handle_streaming_response(
self,
*,
lifecycle: _RunnerLifecycle, # or runner directly, see below
runner: "BaseAgentRunner",
event: AstrMessageEvent,
custom_error_message: str | None,
) -> AsyncGenerator[None, None]:
stream_has_runner_error = False
merged_chain: list = []
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
nonlocal merged_chain, stream_has_runner_error
lifecycle.mark_stream_consumed()
try:
async for chain, is_error in run_third_party_agent(
runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
merged_chain.extend(chain.chain or [])
if is_error:
stream_has_runner_error = True
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, True)
yield chain
finally:
await lifecycle.close_once()
event.set_result(
MessageEventResult()
.set_result_content_type(ResultContentType.STREAMING_RESULT)
.set_async_stream(_stream_runner_chain()),
)
lifecycle.mark_stream_started()
yield
if runner.done():
final_resp = runner.get_final_llm_resp()
if final_resp and final_resp.result_chain:
final_chain = final_resp.result_chain.chain or []
is_runner_error = stream_has_runner_error or final_resp.role == "err"
else:
# same fallback logic as non-streaming, reusing merged_chain
if merged_chain:
final_chain = merged_chain
is_runner_error = stream_has_runner_error
else:
fallback_error_chain = MessageChain().message(
"Agent Runner did not return any result.",
)
final_chain = fallback_error_chain.chain or []
is_runner_error = True
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_chain,
result_content_type=ResultContentType.STREAMING_FINISH,
),
)
```
This removes an entire layer of async generators and multi-value tuple threading, while keeping the merge aggregation and error tracking behavior.
---
### 2. Inline the `_resolve_final_result` logic
Once the aggregation is local, `_resolve_final_result` has only three outputs, and its inputs are already in scope at the call sites. Inlining it as above is easier to follow; alternatively, if you want to keep the helper, have it return a small dataclass or single object instead of an unpacked three-tuple.
If you want to keep the helper while simplifying its signature, a minimal change would be:
```python
@dataclass
class FinalResult:
chain: list
is_error: bool
content_type: ResultContentType
def _resolve_final_result(
merged_chain: list,
final_resp: "LLMResponse | None",
has_intermediate_error: bool,
) -> FinalResult:
...
```
This avoids relying on tuple positional order and makes call sites clearer.
---
### 3. Flatten `_RunnerLifecycle` into a local `try/finally` + optional watchdog
Most of what `_RunnerLifecycle` does is coordinate `_stream_started`, `_stream_consumed`, `_idle_task`, and `finalize`. A simpler pattern:
- Use `try/finally` in `process` so the runner is always closed.
- For the "stream never consumed" case, create a watchdog task only in the streaming branch, instead of encoding it in a general lifecycle object.
Here is an example simplification that preserves the "never consumed -> close after timeout" behavior:
```python
async def _close_runner_if_supported(runner: "BaseAgentRunner") -> None:
close_callable = getattr(runner, "close", None)
if not callable(close_callable):
return
try:
res = close_callable()
if inspect.isawaitable(res):
await res
except Exception as e:
logger.warning(f"Failed to close third-party runner cleanly: {e}")
async def process(...):
...
runner = ...
await runner.reset(...)
close_task: asyncio.Task[None] | None = None
try:
if streaming_response and not stream_to_general:
# watchdog in case nobody consumes the stream
async def _watchdog() -> None:
try:
await asyncio.sleep(self.stream_consumption_close_timeout_sec)
await _close_runner_if_supported(runner)
except asyncio.CancelledError:
return
close_task = asyncio.create_task(_watchdog())
async for _ in self._handle_streaming_response(
runner=runner,
event=event,
custom_error_message=custom_error_message,
watchdog_task=close_task, # let handler cancel when consumed
):
yield
else:
async for _ in self._handle_non_streaming_response(
runner=runner,
event=event,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
yield
finally:
if close_task and not close_task.done():
close_task.cancel()
await _close_runner_if_supported(runner)
```
然后,在 `_handle_streaming_response` 内部,在确认流实际被消费后调用 `watchdog_task.cancel()`:
```python
async def _handle_streaming_response(
self,
*,
runner: "BaseAgentRunner",
event: AstrMessageEvent,
custom_error_message: str | None,
watchdog_task: asyncio.Task[None] | None,
) -> AsyncGenerator[None, None]:
...
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
if watchdog_task and not watchdog_task.done():
watchdog_task.cancel()
try:
async for chain, is_error in run_third_party_agent(...):
...
yield chain
finally:
await _close_runner_if_supported(runner)
```
这样就可以完全移除 `_RunnerLifecycle`,同时保留:
- 如果流从未被消费,则在超时后自动关闭;
- 当流被消费并结束时,正确关闭;
- 非流式路径中的正常关闭。
---
### 4. 移除 `_set_runner_error_extra`
由于这是一个只在本地使用的一行函数,并且键名已经很清晰,它带来的命名开销大于价值。可以直接将:
```python
_set_runner_error_extra(event, is_runner_error)
```
替换为:
```python
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
```
这样在调用点就能更直观地看到设置的是哪个 extra。
---
这些改动在保留所有新行为(错误标记、合并链、流式结束结果、空闲超时关闭)的同时,压缩了控制流,使第三方阶段更易于理解和调试。
</issue_to_address>
### Comment 3
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="44" />
<code_context>
+ from typing_extensions import override
+
+
+class DeerFlowAgentRunner(BaseAgentRunner[TContext]):
+ """DeerFlow Agent Runner via LangGraph HTTP API."""
+
</code_context>
<issue_to_address>
**issue (complexity):** 建议将内容转换辅助函数、流状态/去重逻辑以及最终结果解析提取到一些小而专注的模块/类中,使 DeerFlowAgentRunner 更短、更易理解。
一些有针对性的抽取可以降低 `DeerFlowAgentRunner` 的认知负担,同时保持当前行为。
### 1. 抽取内容到组件的转换
`_image_component_from_url`、`_append_components_from_content`、`_build_chain_from_ai_content`、`_is_likely_base64_image` 和 `_build_user_content` 基本上都是纯辅助函数,不依赖 runner 状态(除了 `_is_likely_base64_image` 只在 `_build_user_content` 中使用)。将它们移入一个专门的辅助模块,可以让 runner 的主流程更易阅读与测试。
示例抽取:
```python
# deerflow_content_adapter.py
import base64
import typing as T
import astrbot.core.message.components as Comp
from astrbot.core.message.message_event_result import MessageChain
from .deerflow_stream_utils import extract_text
def is_likely_base64_image(value: str) -> bool:
if " " in value:
return False
compact = value.replace("\n", "").replace("\r", "")
if not compact or len(compact) < 32 or len(compact) % 4 != 0:
return False
base64_chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
if any(ch not in base64_chars for ch in compact):
return False
try:
base64.b64decode(compact, validate=True)
except Exception:
return False
return True
def build_user_content(prompt: str, image_urls: list[str]) -> T.Any:
# essentially your existing _build_user_content, using is_likely_base64_image
...
def image_component_from_url(url: T.Any) -> Comp.Image | None:
# your existing _image_component_from_url body
...
def append_components_from_content(
content: T.Any,
components: list[Comp.BaseMessageComponent],
) -> None:
# your existing _append_components_from_content body, using image_component_from_url
...
def build_chain_from_ai_content(content: T.Any) -> MessageChain:
components: list[Comp.BaseMessageComponent] = []
append_components_from_content(content, components)
if components:
return MessageChain(chain=components)
fallback_text = extract_text(content)
if fallback_text:
return MessageChain(chain=[Comp.Plain(fallback_text)])
return MessageChain()
```
然后在 runner 中:
```python
from .deerflow_content_adapter import (
build_user_content,
build_chain_from_ai_content,
)
def _build_messages(...):
...
messages.append({
"role": "user",
"content": build_user_content(prompt, image_urls),
})
...
def _resolve_final_output(...):
...
if latest_ai_message:
final_chain = build_chain_from_ai_content(latest_ai_message.get("content"))
...
```
这样可以在不改变行为的前提下,从 runner 中移除大约 150 行代码。
### 2. 封装流状态与去重逻辑
`_StreamState`、`_extract_new_messages_from_values`、`_remember_seen_message_id` 和 `_fingerprint_message` 共同构成了一个小型的去重引擎。将它们打包进一个 API 简单的类,可以让 `_execute_deerflow_request` 和 `_handle_values_event` 不那么隐晦。
示例:
```python
# deerflow_stream_state.py
import json, hashlib, typing as T
from collections import deque
from dataclasses import dataclass, field
from .deerflow_stream_utils import get_message_id
_MAX_VALUES_HISTORY = 200
@dataclass
class StreamState:
latest_text: str = ""
prev_text_for_streaming: str = ""
clarification_text: str = ""
task_failures: list[str] = field(default_factory=list)
seen_message_ids: set[str] = field(default_factory=set)
seen_message_order: deque[str] = field(default_factory=deque)
no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
baseline_initialized: bool = False
has_values_text: bool = False
run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
timed_out: bool = False
def _fingerprint(self, msg: dict[str, T.Any]) -> str:
try:
raw = json.dumps(msg, sort_keys=True, ensure_ascii=False, default=str)
except (TypeError, ValueError):
raw = repr(msg)
return hashlib.sha1(raw.encode("utf-8", errors="ignore")).hexdigest()
def remember_id(self, msg_id: str) -> None:
if not msg_id or msg_id in self.seen_message_ids:
return
self.seen_message_ids.add(msg_id)
self.seen_message_order.append(msg_id)
while len(self.seen_message_order) > _MAX_VALUES_HISTORY:
dropped = self.seen_message_order.popleft()
self.seen_message_ids.discard(dropped)
def extract_new_messages(self, values_messages: list[T.Any]) -> list[dict[str, T.Any]]:
new_messages: list[dict[str, T.Any]] = []
no_id_indexes_seen: set[int] = set()
for idx, msg in enumerate(values_messages):
if not isinstance(msg, dict):
continue
msg_id = get_message_id(msg)
if msg_id:
if msg_id in self.seen_message_ids:
continue
self.remember_id(msg_id)
new_messages.append(msg)
continue
no_id_indexes_seen.add(idx)
fp = self._fingerprint(msg)
if self.no_id_message_fingerprints.get(idx) == fp:
continue
self.no_id_message_fingerprints[idx] = fp
new_messages.append(msg)
# align no-id fingerprints with current payload shape
for idx in list(self.no_id_message_fingerprints.keys()):
if idx not in no_id_indexes_seen:
self.no_id_message_fingerprints.pop(idx, None)
return new_messages
```
然后在 runner 中:
```python
from .deerflow_stream_state import StreamState
def _handle_values_event(self, data: T.Any, state: StreamState) -> list[AgentResponse]:
...
if not state.baseline_initialized:
state.baseline_initialized = True
for idx, msg in enumerate(values_messages):
if not isinstance(msg, dict):
continue
msg_id = get_message_id(msg)
if msg_id:
state.remember_id(msg_id)
continue
state.no_id_message_fingerprints[idx] = state._fingerprint(msg)
return responses
new_messages = state.extract_new_messages(values_messages)
...
```
这样可以把去重逻辑集中在一个地方,并暴露清晰的 API(`extract_new_messages`、`remember_id`)。
### 3. 将最终结果解析拆分为内容与状态
`_resolve_final_output` 目前同时决策“展示什么文本/组件”和“是否失败(failures_only)”。你可以在保持语义不变的前提下拆分关注点,让每个函数更短。
示例:
```python
def _select_final_chain(self, state: _StreamState) -> MessageChain:
if state.clarification_text:
return MessageChain(chain=[Comp.Plain(state.clarification_text)])
latest_ai_message = extract_latest_ai_message(state.run_values_messages)
if latest_ai_message:
chain = self._build_chain_from_ai_content(latest_ai_message.get("content"))
if chain.chain:
return chain
if state.latest_text:
return MessageChain(chain=[Comp.Plain(state.latest_text)])
failure_text = build_task_failure_summary(state.task_failures)
if failure_text:
return MessageChain(chain=[Comp.Plain(failure_text)])
logger.warning("DeerFlow returned no text content in stream events.")
return MessageChain(chain=[Comp.Plain("DeerFlow returned an empty response.")])
def _was_failures_only(self, state: _StreamState, chain: MessageChain) -> bool:
# preserve your current semantics
failure_text = build_task_failure_summary(state.task_failures)
    return bool(failure_text and chain.chain and chain.chain[0].text == failure_text)
def _resolve_final_output(self, state: _StreamState) -> tuple[MessageChain, bool]:
final_chain = self._select_final_chain(state)
failures_only = self._was_failures_only(state, final_chain)
return final_chain, failures_only
```
这在保持当前行为的同时(可能需要稍微调整 `_was_failures_only` 来完全匹配现有的 `failures_only` 逻辑),让基于不同关注点的推理和测试更加直接。
仅仅这三处抽取,就能显著缩短 runner 的行数,并将彼此独立的关注点(内容解码、流状态/去重、结果解析)拆分开来,而不改变行为或回退 DeerFlow 集成。
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Original comment in English
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="596-604" />
<code_context>
+ },
+ }
+
+ def _handle_values_event(
+ self,
+ data: T.Any,
+ state: _StreamState,
+ ) -> list[AgentResponse]:
+ responses: list[AgentResponse] = []
+ values_messages = extract_messages_from_values_data(data)
+ if not values_messages:
+ return responses
+
+ if not state.baseline_initialized:
</code_context>
<issue_to_address>
**issue (bug_risk):** First `values` event is treated purely as baseline and its content is never surfaced, which can drop valid responses when there is only a single values payload.
In `_handle_values_event`, when `state.baseline_initialized` is `False`, you initialize the baseline and return without touching `state.run_values_messages` or `state.latest_text`. If the backend sends all content in a single `values` event, there is no second event to process, so `run_values_messages` stays empty, `latest_text` stays `""`, and `_resolve_final_output` falls back to "DeerFlow returned an empty response." This drops otherwise valid responses.
You could either:
- Use the first `values` event as both baseline and data (populate `run_values_messages` and `latest_text` before returning), or
- Only treat messages as baseline when you can distinguish historical from new content (if the API exposes that).
Either approach would preserve responses when only one `values` payload is emitted.
</issue_to_address>
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py" line_range="131" />
<code_context>
+ yield MessageChain().message(err_msg), True
+
+
+async def _consume_runner_and_aggregate(
+ runner: "BaseAgentRunner",
+ *,
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new third‑party runner control flow by inlining aggregation and result resolution logic, and replacing the lifecycle helper with straightforward local handling plus an optional watchdog task.
You can substantially reduce the new control‑flow complexity without losing any of the new behavior by:
1. **Removing `_consume_runner_and_aggregate` and threading of multiple tuple states**
2. **Inlining `_resolve_final_result` logic per handler**
3. **Reducing `_RunnerLifecycle` to a simple `try/finally` with an optional watchdog task**
4. **Dropping `_set_runner_error_extra` indirection**
Below are concrete, local refactors that keep all current features intact.
---
### 1. Inline aggregation instead of `_consume_runner_and_aggregate`
Right now you thread `(chain, is_error, merged_chain, has_intermediate_error)` via `_consume_runner_and_aggregate` and then only partially use those values at each call site.
You can keep `run_third_party_agent` as‑is (including the error flag) and let each handler maintain its own aggregation state locally.
```python
async def _handle_non_streaming_response(
self,
*,
runner: "BaseAgentRunner",
event: AstrMessageEvent,
stream_to_general: bool,
custom_error_message: str | None,
) -> AsyncGenerator[None, None]:
merged_chain: list = []
has_intermediate_error = False
async for chain, is_error in run_third_party_agent(
runner,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
merged_chain.extend(chain.chain or [])
if is_error:
has_intermediate_error = True
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, True)
yield
final_resp = runner.get_final_llm_resp()
# Inline the resolution (see next section)
if not final_resp or not final_resp.result_chain:
if merged_chain:
logger.warning(
"Agent Runner returned no final response, fallback to streamed error/result chain."
)
final_chain = merged_chain
is_runner_error = has_intermediate_error
result_content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_runner_error
else ResultContentType.LLM_RESULT
)
else:
logger.warning("Agent Runner 未返回最终结果。")
fallback_error_chain = MessageChain().message(
"Agent Runner did not return any result.",
)
final_chain = fallback_error_chain.chain or []
is_runner_error = True
result_content_type = ResultContentType.AGENT_RUNNER_ERROR
else:
is_runner_error = has_intermediate_error or final_resp.role == "err"
final_chain = final_resp.result_chain.chain or []
result_content_type = (
ResultContentType.AGENT_RUNNER_ERROR
if is_runner_error
else ResultContentType.LLM_RESULT
)
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_chain,
result_content_type=result_content_type,
),
)
yield
```
For the streaming path, you can similarly avoid `_consume_runner_and_aggregate` and keep the state in one place:
```python
async def _handle_streaming_response(
self,
*,
lifecycle: _RunnerLifecycle, # or runner directly, see below
runner: "BaseAgentRunner",
event: AstrMessageEvent,
custom_error_message: str | None,
) -> AsyncGenerator[None, None]:
stream_has_runner_error = False
merged_chain: list = []
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
nonlocal merged_chain, stream_has_runner_error
lifecycle.mark_stream_consumed()
try:
async for chain, is_error in run_third_party_agent(
runner,
stream_to_general=False,
custom_error_message=custom_error_message,
):
merged_chain.extend(chain.chain or [])
if is_error:
stream_has_runner_error = True
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, True)
yield chain
finally:
await lifecycle.close_once()
event.set_result(
MessageEventResult()
.set_result_content_type(ResultContentType.STREAMING_RESULT)
.set_async_stream(_stream_runner_chain()),
)
lifecycle.mark_stream_started()
yield
if runner.done():
final_resp = runner.get_final_llm_resp()
if final_resp and final_resp.result_chain:
final_chain = final_resp.result_chain.chain or []
is_runner_error = stream_has_runner_error or final_resp.role == "err"
else:
# same fallback logic as non-streaming, reusing merged_chain
if merged_chain:
final_chain = merged_chain
is_runner_error = stream_has_runner_error
else:
fallback_error_chain = MessageChain().message(
"Agent Runner did not return any result.",
)
final_chain = fallback_error_chain.chain or []
is_runner_error = True
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
event.set_result(
MessageEventResult(
chain=final_chain,
result_content_type=ResultContentType.STREAMING_FINISH,
),
)
```
This removes an entire async generator layer and the multi‑value tuple threading, while preserving merged aggregation and error tracking behavior.
---
### 2. Inline `_resolve_final_result` logic
Once aggregation is local, `_resolve_final_result` becomes a thin wrapper with three outputs whose inputs are already in scope. It’s easier to follow if you inline it as shown above, or at least make it return a small dataclass or a single object (if you prefer to keep a helper) instead of unpacking a triple.
Minimal alternative if you want to keep a helper but simplify the signature:
```python
@dataclass
class FinalResult:
chain: list
is_error: bool
content_type: ResultContentType
def _resolve_final_result(
merged_chain: list,
final_resp: "LLMResponse | None",
has_intermediate_error: bool,
) -> FinalResult:
...
```
This removes the need to track tuple positions and makes call sites clearer.
---
### 3. Flatten `_RunnerLifecycle` into local `try/finally` + optional watchdog
Most of `_RunnerLifecycle` is coordination between `_stream_started`, `_stream_consumed`, `_idle_task`, and `finalize`. A simpler pattern is:
- Use `try/finally` in `process` to always call `close_runner`.
- For the “stream never consumed” case, create a watchdog task only in the streaming branch, instead of encoding it into a general lifecycle object.
Example simplification that preserves the “never consumed -> timeout close” behavior:
```python
async def _close_runner_if_supported(runner: "BaseAgentRunner") -> None:
close_callable = getattr(runner, "close", None)
if not callable(close_callable):
return
try:
res = close_callable()
if inspect.isawaitable(res):
await res
except Exception as e:
logger.warning(f"Failed to close third-party runner cleanly: {e}")
async def process(...):
...
runner = ...
await runner.reset(...)
close_task: asyncio.Task[None] | None = None
try:
if streaming_response and not stream_to_general:
# watchdog in case nobody consumes the stream
async def _watchdog() -> None:
try:
await asyncio.sleep(self.stream_consumption_close_timeout_sec)
await _close_runner_if_supported(runner)
except asyncio.CancelledError:
return
close_task = asyncio.create_task(_watchdog())
async for _ in self._handle_streaming_response(
runner=runner,
event=event,
custom_error_message=custom_error_message,
watchdog_task=close_task, # let handler cancel when consumed
):
yield
else:
async for _ in self._handle_non_streaming_response(
runner=runner,
event=event,
stream_to_general=stream_to_general,
custom_error_message=custom_error_message,
):
yield
finally:
if close_task and not close_task.done():
close_task.cancel()
await _close_runner_if_supported(runner)
```
Then, inside `_handle_streaming_response`, call `watchdog_task.cancel()` once you know the stream is actually being consumed:
```python
async def _handle_streaming_response(
self,
*,
runner: "BaseAgentRunner",
event: AstrMessageEvent,
custom_error_message: str | None,
watchdog_task: asyncio.Task[None] | None,
) -> AsyncGenerator[None, None]:
...
async def _stream_runner_chain() -> AsyncGenerator[MessageChain, None]:
if watchdog_task and not watchdog_task.done():
watchdog_task.cancel()
try:
async for chain, is_error in run_third_party_agent(...):
...
yield chain
finally:
await _close_runner_if_supported(runner)
```
This removes `_RunnerLifecycle` entirely, while keeping:
- auto‑close after a timeout if stream is never consumed,
- proper closing when the stream is consumed and finished,
- clean non‑streaming closure.
---
### 4. Remove `_set_runner_error_extra`
Since it’s a one‑liner used locally and the key is already named, it adds more naming overhead than value. Replace:
```python
_set_runner_error_extra(event, is_runner_error)
```
with:
```python
event.set_extra(THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY, is_runner_error)
```
This also makes it obvious at the callsite what extra is being set.
---
These changes keep all the new behaviors (error flagging, merged chains, streaming finish result, idle close timeout) but collapse the control flow into fewer, linear functions, making the third‑party stage easier to reason about and debug.
</issue_to_address>
### Comment 3
<location path="astrbot/core/agent/runners/deerflow/deerflow_agent_runner.py" line_range="44" />
<code_context>
+ from typing_extensions import override
+
+
+class DeerFlowAgentRunner(BaseAgentRunner[TContext]):
+ """DeerFlow Agent Runner via LangGraph HTTP API."""
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the content-conversion helpers, streaming state/dedup logic, and final-result resolution into small focused modules/classes to make DeerFlowAgentRunner shorter and easier to understand.
A few targeted extractions would reduce the cognitive load of `DeerFlowAgentRunner` while preserving your current behavior.
### 1. Extract content-to-component transformation
All of `_image_component_from_url`, `_append_components_from_content`, `_build_chain_from_ai_content`, `_is_likely_base64_image`, and `_build_user_content` are pure-ish helpers that don’t need runner state (except `_is_likely_base64_image` being used only by `_build_user_content`). Moving them into a focused helper module makes the runner’s core flow easier to read and test.
Example extraction:
```python
# deerflow_content_adapter.py
import base64
import typing as T
import astrbot.core.message.components as Comp
from astrbot.core.message.message_event_result import MessageChain
from .deerflow_stream_utils import extract_text
def is_likely_base64_image(value: str) -> bool:
if " " in value:
return False
compact = value.replace("\n", "").replace("\r", "")
if not compact or len(compact) < 32 or len(compact) % 4 != 0:
return False
base64_chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
if any(ch not in base64_chars for ch in compact):
return False
try:
base64.b64decode(compact, validate=True)
except Exception:
return False
return True
def build_user_content(prompt: str, image_urls: list[str]) -> T.Any:
# essentially your existing _build_user_content, using is_likely_base64_image
...
def image_component_from_url(url: T.Any) -> Comp.Image | None:
# your existing _image_component_from_url body
...
def append_components_from_content(
content: T.Any,
components: list[Comp.BaseMessageComponent],
) -> None:
# your existing _append_components_from_content body, using image_component_from_url
...
def build_chain_from_ai_content(content: T.Any) -> MessageChain:
components: list[Comp.BaseMessageComponent] = []
append_components_from_content(content, components)
if components:
return MessageChain(chain=components)
fallback_text = extract_text(content)
if fallback_text:
return MessageChain(chain=[Comp.Plain(fallback_text)])
return MessageChain()
```
Then in the runner:
```python
from .deerflow_content_adapter import (
build_user_content,
build_chain_from_ai_content,
)
def _build_messages(...):
...
messages.append({
"role": "user",
"content": build_user_content(prompt, image_urls),
})
...
def _resolve_final_output(...):
...
if latest_ai_message:
final_chain = build_chain_from_ai_content(latest_ai_message.get("content"))
...
```
This removes ~150 lines from the runner without changing behavior.
### 2. Encapsulate streaming state and dedup logic
`_StreamState`, `_extract_new_messages_from_values`, `_remember_seen_message_id`, and `_fingerprint_message` work together as a mini dedup engine. Packaging this into a tiny class with a narrow API makes `_execute_deerflow_request` and `_handle_values_event` less subtle.
Example:
```python
# deerflow_stream_state.py
import json, hashlib, typing as T
from collections import deque
from dataclasses import dataclass, field
from .deerflow_stream_utils import get_message_id
_MAX_VALUES_HISTORY = 200
@dataclass
class StreamState:
latest_text: str = ""
prev_text_for_streaming: str = ""
clarification_text: str = ""
task_failures: list[str] = field(default_factory=list)
seen_message_ids: set[str] = field(default_factory=set)
seen_message_order: deque[str] = field(default_factory=deque)
no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
baseline_initialized: bool = False
has_values_text: bool = False
run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
timed_out: bool = False
def _fingerprint(self, msg: dict[str, T.Any]) -> str:
try:
raw = json.dumps(msg, sort_keys=True, ensure_ascii=False, default=str)
except (TypeError, ValueError):
raw = repr(msg)
return hashlib.sha1(raw.encode("utf-8", errors="ignore")).hexdigest()
def remember_id(self, msg_id: str) -> None:
if not msg_id or msg_id in self.seen_message_ids:
return
self.seen_message_ids.add(msg_id)
self.seen_message_order.append(msg_id)
while len(self.seen_message_order) > _MAX_VALUES_HISTORY:
dropped = self.seen_message_order.popleft()
self.seen_message_ids.discard(dropped)
def extract_new_messages(self, values_messages: list[T.Any]) -> list[dict[str, T.Any]]:
new_messages: list[dict[str, T.Any]] = []
no_id_indexes_seen: set[int] = set()
for idx, msg in enumerate(values_messages):
if not isinstance(msg, dict):
continue
msg_id = get_message_id(msg)
if msg_id:
if msg_id in self.seen_message_ids:
continue
self.remember_id(msg_id)
new_messages.append(msg)
continue
no_id_indexes_seen.add(idx)
fp = self._fingerprint(msg)
if self.no_id_message_fingerprints.get(idx) == fp:
continue
self.no_id_message_fingerprints[idx] = fp
new_messages.append(msg)
# align no-id fingerprints with current payload shape
for idx in list(self.no_id_message_fingerprints.keys()):
if idx not in no_id_indexes_seen:
self.no_id_message_fingerprints.pop(idx, None)
return new_messages
```
Then the runner:
```python
from .deerflow_stream_state import StreamState
def _handle_values_event(self, data: T.Any, state: StreamState) -> list[AgentResponse]:
...
if not state.baseline_initialized:
state.baseline_initialized = True
for idx, msg in enumerate(values_messages):
if not isinstance(msg, dict):
continue
msg_id = get_message_id(msg)
if msg_id:
state.remember_id(msg_id)
continue
state.no_id_message_fingerprints[idx] = state._fingerprint(msg)
return responses
new_messages = state.extract_new_messages(values_messages)
...
```
This keeps the dedup complexity in one place with a clear API (`extract_new_messages`, `remember_id`).
### 3. Split final result resolution into content vs status
`_resolve_final_output` currently decides both “what text/components to show” and “did we fail (failures_only)”. You can keep its semantics but separate the concerns so each function stays short.
Example:
```python
def _select_final_chain(self, state: _StreamState) -> MessageChain:
if state.clarification_text:
return MessageChain(chain=[Comp.Plain(state.clarification_text)])
latest_ai_message = extract_latest_ai_message(state.run_values_messages)
if latest_ai_message:
chain = self._build_chain_from_ai_content(latest_ai_message.get("content"))
if chain.chain:
return chain
if state.latest_text:
return MessageChain(chain=[Comp.Plain(state.latest_text)])
failure_text = build_task_failure_summary(state.task_failures)
if failure_text:
return MessageChain(chain=[Comp.Plain(failure_text)])
logger.warning("DeerFlow returned no text content in stream events.")
return MessageChain(chain=[Comp.Plain("DeerFlow returned an empty response.")])
def _was_failures_only(self, state: _StreamState, chain: MessageChain) -> bool:
# preserve your current semantics
failure_text = build_task_failure_summary(state.task_failures)
    return bool(failure_text and chain.chain and chain.chain[0].text == failure_text)
def _resolve_final_output(self, state: _StreamState) -> tuple[MessageChain, bool]:
final_chain = self._select_final_chain(state)
failures_only = self._was_failures_only(state, final_chain)
return final_chain, failures_only
```
This keeps the same behavior (you may need to tweak `_was_failures_only` to perfectly match your current `failures_only` logic) but makes reasoning and testing per concern more straightforward.
These three extractions alone cut down the runner’s line count and separate orthogonal concerns (content decoding, stream state/dedup, result resolution) without changing its behavior or reverting the DeerFlow integration.
</issue_to_address>

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
astrbot/core/pipeline/process_stage/method/agent_sub_stages/third_party.py
Outdated
Show resolved
Hide resolved
| from typing_extensions import override | ||
|
|
||
|
|
||
| class DeerFlowAgentRunner(BaseAgentRunner[TContext]): |
There was a problem hiding this comment.
issue (complexity): 建议将内容转换辅助函数、流状态/去重逻辑以及最终结果解析提取到一些小而专注的模块/类中,使 DeerFlowAgentRunner 更短、更易理解。
一些有针对性的抽取可以降低 DeerFlowAgentRunner 的认知负担,同时保持当前行为。
1. 抽取内容到组件的转换
_image_component_from_url、_append_components_from_content、_build_chain_from_ai_content、_is_likely_base64_image 和 _build_user_content 基本上都是纯辅助函数,不依赖 runner 状态(除了 _is_likely_base64_image 只在 _build_user_content 中使用)。将它们移入一个专门的辅助模块,可以让 runner 的主流程更易阅读与测试。
示例抽取:
# deerflow_content_adapter.py
import base64
import typing as T
import astrbot.core.message.components as Comp
from astrbot.core.message.message_event_result import MessageChain
from .deerflow_stream_utils import extract_text
def is_likely_base64_image(value: str) -> bool:
if " " in value:
return False
compact = value.replace("\n", "").replace("\r", "")
if not compact or len(compact) < 32 or len(compact) % 4 != 0:
return False
base64_chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
if any(ch not in base64_chars for ch in compact):
return False
try:
base64.b64decode(compact, validate=True)
except Exception:
return False
return True
def build_user_content(prompt: str, image_urls: list[str]) -> T.Any:
# essentially your existing _build_user_content, using is_likely_base64_image
...
def image_component_from_url(url: T.Any) -> Comp.Image | None:
# your existing _image_component_from_url body
...
def append_components_from_content(
content: T.Any,
components: list[Comp.BaseMessageComponent],
) -> None:
# your existing _append_components_from_content body, using image_component_from_url
...
def build_chain_from_ai_content(content: T.Any) -> MessageChain:
components: list[Comp.BaseMessageComponent] = []
append_components_from_content(content, components)
if components:
return MessageChain(chain=components)
fallback_text = extract_text(content)
if fallback_text:
return MessageChain(chain=[Comp.Plain(fallback_text)])
return MessageChain()然后在 runner 中:
from .deerflow_content_adapter import (
build_user_content,
build_chain_from_ai_content,
)
def _build_messages(...):
...
messages.append({
"role": "user",
"content": build_user_content(prompt, image_urls),
})
...
def _resolve_final_output(...):
...
if latest_ai_message:
final_chain = build_chain_from_ai_content(latest_ai_message.get("content"))
...这样可以在不改变行为的前提下,从 runner 中移除大约 150 行代码。
2. 封装流状态与去重逻辑
_StreamState、_extract_new_messages_from_values、_remember_seen_message_id 和 _fingerprint_message 共同构成了一个小型的去重引擎。将它们打包进一个 API 简单的类,可以让 _execute_deerflow_request 和 _handle_values_event 不那么隐晦。
示例:
# deerflow_stream_state.py
import json, hashlib, typing as T
from collections import deque
from dataclasses import dataclass, field
from .deerflow_stream_utils import get_message_id

_MAX_VALUES_HISTORY = 200

@dataclass
class StreamState:
    latest_text: str = ""
    prev_text_for_streaming: str = ""
    clarification_text: str = ""
    task_failures: list[str] = field(default_factory=list)
    seen_message_ids: set[str] = field(default_factory=set)
    seen_message_order: deque[str] = field(default_factory=deque)
    no_id_message_fingerprints: dict[int, str] = field(default_factory=dict)
    baseline_initialized: bool = False
    has_values_text: bool = False
    run_values_messages: list[dict[str, T.Any]] = field(default_factory=list)
    timed_out: bool = False

    def _fingerprint(self, msg: dict[str, T.Any]) -> str:
        try:
            raw = json.dumps(msg, sort_keys=True, ensure_ascii=False, default=str)
        except (TypeError, ValueError):
            raw = repr(msg)
        return hashlib.sha1(raw.encode("utf-8", errors="ignore")).hexdigest()

    def remember_id(self, msg_id: str) -> None:
        if not msg_id or msg_id in self.seen_message_ids:
            return
        self.seen_message_ids.add(msg_id)
        self.seen_message_order.append(msg_id)
        while len(self.seen_message_order) > _MAX_VALUES_HISTORY:
            dropped = self.seen_message_order.popleft()
            self.seen_message_ids.discard(dropped)

    def extract_new_messages(self, values_messages: list[T.Any]) -> list[dict[str, T.Any]]:
        new_messages: list[dict[str, T.Any]] = []
        no_id_indexes_seen: set[int] = set()
        for idx, msg in enumerate(values_messages):
            if not isinstance(msg, dict):
                continue
            msg_id = get_message_id(msg)
            if msg_id:
                if msg_id in self.seen_message_ids:
                    continue
                self.remember_id(msg_id)
                new_messages.append(msg)
                continue
            no_id_indexes_seen.add(idx)
            fp = self._fingerprint(msg)
            if self.no_id_message_fingerprints.get(idx) == fp:
                continue
            self.no_id_message_fingerprints[idx] = fp
            new_messages.append(msg)
        # align no-id fingerprints with current payload shape
        for idx in list(self.no_id_message_fingerprints.keys()):
            if idx not in no_id_indexes_seen:
                self.no_id_message_fingerprints.pop(idx, None)
        return new_messages

Then, in the runner:
from .deerflow_stream_state import StreamState

def _handle_values_event(self, data: T.Any, state: StreamState) -> list[AgentResponse]:
    ...
    if not state.baseline_initialized:
        state.baseline_initialized = True
        for idx, msg in enumerate(values_messages):
            if not isinstance(msg, dict):
                continue
            msg_id = get_message_id(msg)
            if msg_id:
                state.remember_id(msg_id)
                continue
            state.no_id_message_fingerprints[idx] = state._fingerprint(msg)
        return responses
    new_messages = state.extract_new_messages(values_messages)
    ...

This keeps the dedup logic in one place with a clear API (extract_new_messages, remember_id).
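To illustrate the dedup contract in isolation, here is a cut-down, runnable version of the same logic (get_message_id is stubbed to read msg["id"]; the field names mirror the sketch above but this is not the actual module):

```python
import hashlib
import json
from collections import deque

_MAX_VALUES_HISTORY = 200

def get_message_id(msg: dict) -> str:
    # stub for deerflow_stream_utils.get_message_id
    return str(msg.get("id") or "")

class MiniStreamState:
    """Cut-down dedup engine: id-based dedup with bounded history,
    plus per-index fingerprints for messages without an id."""

    def __init__(self) -> None:
        self.seen_message_ids: set[str] = set()
        self.seen_message_order: deque[str] = deque()
        self.no_id_message_fingerprints: dict[int, str] = {}

    def _fingerprint(self, msg: dict) -> str:
        raw = json.dumps(msg, sort_keys=True, ensure_ascii=False, default=str)
        return hashlib.sha1(raw.encode("utf-8", errors="ignore")).hexdigest()

    def remember_id(self, msg_id: str) -> None:
        if not msg_id or msg_id in self.seen_message_ids:
            return
        self.seen_message_ids.add(msg_id)
        self.seen_message_order.append(msg_id)
        while len(self.seen_message_order) > _MAX_VALUES_HISTORY:
            self.seen_message_ids.discard(self.seen_message_order.popleft())

    def extract_new_messages(self, values_messages: list) -> list[dict]:
        new_messages, no_id_indexes_seen = [], set()
        for idx, msg in enumerate(values_messages):
            if not isinstance(msg, dict):
                continue
            msg_id = get_message_id(msg)
            if msg_id:
                if msg_id in self.seen_message_ids:
                    continue
                self.remember_id(msg_id)
                new_messages.append(msg)
                continue
            no_id_indexes_seen.add(idx)
            fp = self._fingerprint(msg)
            if self.no_id_message_fingerprints.get(idx) == fp:
                continue
            self.no_id_message_fingerprints[idx] = fp
            new_messages.append(msg)
        # drop fingerprints for indexes absent from this snapshot
        for idx in list(self.no_id_message_fingerprints):
            if idx not in no_id_indexes_seen:
                self.no_id_message_fingerprints.pop(idx, None)
        return new_messages

state = MiniStreamState()
snapshot = [{"id": "a", "text": "hi"}, {"text": "draft"}]
print(len(state.extract_new_messages(snapshot)))  # 2: both messages are new
print(state.extract_new_messages(snapshot))       # []: id and fingerprint both deduped
print(state.extract_new_messages([{"id": "a", "text": "hi"}, {"text": "draft v2"}]))  # only the changed no-id message
```

The three calls show the two dedup paths: the second snapshot is fully suppressed, while a no-id message whose payload changes is surfaced again.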
3. Split final result resolution into content vs. status
_resolve_final_output currently decides both "what text/components to show" and "whether we failed (failures_only)". You can split these concerns without changing semantics, keeping each function short.
Example:
def _select_final_chain(self, state: _StreamState) -> MessageChain:
    if state.clarification_text:
        return MessageChain(chain=[Comp.Plain(state.clarification_text)])
    latest_ai_message = extract_latest_ai_message(state.run_values_messages)
    if latest_ai_message:
        chain = self._build_chain_from_ai_content(latest_ai_message.get("content"))
        if chain.chain:
            return chain
    if state.latest_text:
        return MessageChain(chain=[Comp.Plain(state.latest_text)])
    failure_text = build_task_failure_summary(state.task_failures)
    if failure_text:
        return MessageChain(chain=[Comp.Plain(failure_text)])
    logger.warning("DeerFlow returned no text content in stream events.")
    return MessageChain(chain=[Comp.Plain("DeerFlow returned an empty response.")])

def _was_failures_only(self, state: _StreamState, chain: MessageChain) -> bool:
    # preserve your current semantics; getattr guards non-Plain components
    failure_text = build_task_failure_summary(state.task_failures)
    return bool(
        failure_text
        and chain.chain
        and getattr(chain.chain[0], "text", None) == failure_text
    )

def _resolve_final_output(self, state: _StreamState) -> tuple[MessageChain, bool]:
    final_chain = self._select_final_chain(state)
    failures_only = self._was_failures_only(state, final_chain)
    return final_chain, failures_only

This preserves current behavior (you may need to tweak _was_failures_only to exactly match the existing failures_only logic) while making reasoning and testing per concern more straightforward.
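The content/status split can be exercised with minimal stand-ins (Plain, MessageChain, and the failure-summary format below are illustrative stubs, not the real astrbot types):

```python
from dataclasses import dataclass, field

@dataclass
class Plain:
    text: str

@dataclass
class MessageChain:
    chain: list = field(default_factory=list)

def build_task_failure_summary(failures: list[str]) -> str:
    # illustrative format; the real builder lives in deerflow_stream_utils
    return "Failed tasks:\n" + "\n".join(failures) if failures else ""

def select_final_chain(latest_text: str, failures: list[str]) -> MessageChain:
    # content concern: prefer real output text, fall back to the failure summary
    if latest_text:
        return MessageChain([Plain(latest_text)])
    failure_text = build_task_failure_summary(failures)
    if failure_text:
        return MessageChain([Plain(failure_text)])
    return MessageChain([Plain("DeerFlow returned an empty response.")])

def was_failures_only(failures: list[str], chain: MessageChain) -> bool:
    # status concern: True iff the chain is exactly the failure summary
    failure_text = build_task_failure_summary(failures)
    return bool(failure_text and chain.chain and chain.chain[0].text == failure_text)

print(was_failures_only(["task A failed"], select_final_chain("", ["task A failed"])))        # True
print(was_failures_only(["task A failed"], select_final_chain("all good", ["task A failed"])))  # False
```

Because each concern is a small pure function here, both branches of the failures_only decision can be asserted directly without streaming any events.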
These three extractions alone significantly shorten the runner and separate orthogonal concerns (content decoding, stream state/dedup, result resolution) without changing behavior or reverting the DeerFlow integration.


Change overview
This PR integrates DeerFlow as a configurable third-party Agent Runner and completes error semantics, lifecycle management, and stability handling for both the streaming and non-streaming paths.
Main changes
- Add DeerFlowAgentRunner, DeerFlowAPIClient, and deerflow_stream_utils, wiring in LangGraph SSE event parsing.
- Emit runner output as AgentResponse.
- Add an agent_runner provider option and fill in the config template/metadata and icon mapping.
- Add ResultContentType.AGENT_RUNNER_ERROR.
- Propagate error state downstream via THIRD_PARTY_RUNNER_ERROR_EXTRA_KEY so errors are not misjudged as ordinary LLM output.
- Enforce step_until_done(max_step) to avoid a potential infinite loop.
- Re-raise asyncio.CancelledError explicitly instead of swallowing it as an ordinary error.
- When no session_id is given, generate a unique ID per request to avoid thread pollution.
- Add a dedup fallback for messages without an id in values events.
- Bound run_values_messages/seen_message_ids to prevent unbounded memory growth in long sessions.
- Validate base64 strictly (b64decode(validate=True)).
- Initialize the ClientSession lazily, reducing overhead and leak risk when no requests are made.
- Pass the proxy configuration through to HTTP requests.
- Enforce the close() lifecycle; __del__ keeps only a fallback warning.
Verification
- uv run ruff format .
- uv run ruff check .
Notes
7e91cd22~98a48403).
Summary by Sourcery
Integrate the DeerFlow third-party agent runner and improve third-party runner error semantics, lifecycle management, and streaming/non-streaming stability.
New Features:
- Implement DeerFlowAgentRunner and integrate it as a configurable agent_runner provider, including console metadata, templates, and icons.
Bug Fixes:
- Propagate error state correctly via the AGENT_RUNNER_ERROR result type and event extras, avoiding misclassifying errors as normal LLM output.
Enhancements:
Documentation:
- Document the agent_runner configuration option.