[FEATURE] Support graph cycles (looping back to previous nodes via edge conditions) #2387

@yananym

Description

Problem Statement

Currently, Strands graphs are effectively DAGs. The code doesn't explicitly reject cycles, but _find_newly_ready_nodes only considers the destination nodes of outbound edges from the just-completed batch, so back-edges pointing to already-completed nodes never re-schedule them. There is no mechanism for a completed node to become "ready" again based on downstream state or edge conditions.

This makes it impossible to build common agentic patterns that require iteration:

  • ReAct loops: LLM calls tools, evaluates results, and decides whether to call more tools or finish
  • Iterative refinement: A generator produces output, an evaluator grades it, and routes back to the generator if quality is insufficient
  • Multi-turn data gathering: An agent collects information across multiple passes, looping back to a data-collection node until a completeness threshold is met
  • Retry with adaptation: A node fails or produces partial results, and the graph routes back to retry with modified context

The existing reset_on_revisit flag and max_node_executions limit suggest looping was anticipated in the design, but the scheduling logic doesn't actually support it.
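
To make the limitation concrete, here is a toy model of the behaviour described above. It is not the Strands implementation: `find_newly_ready` and the tuple-based edge list are hypothetical, and the "completed nodes are never reconsidered" rule is an assumption taken from the problem statement.

```python
def find_newly_ready(edges, completed, batch):
    """Toy readiness check (assumption, not Strands code).

    Looks only at destinations of the batch's outbound edges and drops
    anything already completed, so a back-edge can never fire.
    """
    candidates = [dst for src, dst in edges if src in batch]
    return [node for node in candidates if node not in completed]


# generator -> evaluator, plus a back-edge evaluator -> generator
edges = [("generator", "evaluator"), ("evaluator", "generator")]

completed = {"generator"}
first = find_newly_ready(edges, completed, batch={"generator"})

completed.add("evaluator")
second = find_newly_ready(edges, completed, batch={"evaluator"})

print(first)   # the forward edge schedules the evaluator
print(second)  # the back-edge target is already completed, so nothing is scheduled
```

In this model the second call returns an empty list, which is the "loop never happens" symptom this issue is about.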

Proposed Solution

Introduce first-class support for graph cycles through edge-condition-driven back-edges:

  1. Back-edge scheduling in _find_newly_ready_nodes

When a node completes and its outbound edge points to an already-completed node, evaluate the edge condition. If it returns True, re-schedule that node for execution:

  def _find_newly_ready_nodes(self, completed_batch: list[GraphNode]) -> list[GraphNode]:
      candidates = {edge.to_node for edge in self.edges if edge.from_node in completed_batch}

      newly_ready = []
      for node in candidates:
          if node in self.state.completed_nodes:
              # Back-edge: node was already completed, check if it should loop
              if self._should_revisit(node, completed_batch):
                  self._prepare_for_revisit(node)
                  newly_ready.append(node)
          elif self._is_node_ready_with_conditions(node, completed_batch):
              newly_ready.append(node)
      return newly_ready

  2. Node state reset on revisit

When a node is re-scheduled, clear its previous execution state (leveraging the existing reset_on_revisit mechanism and reset_executor_state()). Remove it from completed_nodes so dependency resolution works correctly for its downstream nodes.
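
The snippet in step 1 calls `_should_revisit` and `_prepare_for_revisit` without defining them. A minimal sketch of what they might do, using stand-in types rather than the real Graph classes (`ToyEdge`, `ToyState`, and both function bodies are assumptions for illustration):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class ToyEdge:
    from_node: str
    to_node: str
    # Condition receives the results dict; None means "always traverse".
    condition: Optional[Callable[[Dict[str, str]], bool]] = None


@dataclass
class ToyState:
    completed: Set[str] = field(default_factory=set)
    results: Dict[str, str] = field(default_factory=dict)


def should_revisit(node_id: str, batch: Set[str], edges: List[ToyEdge], state: ToyState) -> bool:
    """True if any edge from the just-completed batch into node_id passes its condition."""
    for edge in edges:
        if edge.to_node == node_id and edge.from_node in batch:
            if edge.condition is None or edge.condition(state.results):
                return True
    return False


def prepare_for_revisit(node_id: str, state: ToyState) -> None:
    """Drop the node's old result and un-complete it so it can be scheduled again."""
    state.results.pop(node_id, None)
    state.completed.discard(node_id)


edges = [
    ToyEdge("generator", "evaluator"),
    ToyEdge("evaluator", "generator",
            condition=lambda results: "needs_improvement" in results.get("evaluator", "")),
]
state = ToyState(
    completed={"generator", "evaluator"},
    results={"generator": "draft v1", "evaluator": "needs_improvement: too short"},
)

revisit = should_revisit("generator", {"evaluator"}, edges, state)
if revisit:
    prepare_for_revisit("generator", state)
```

Whether the previous result should be discarded or kept as context for the next iteration is a design question this sketch does not settle; it discards it only to keep the example small.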

  3. Loop termination via edge conditions with invocation_state

Edge conditions (enhanced by PR #2305 to receive invocation_state) serve as the loop exit mechanism:

  def should_loop_back(state: GraphState, *, invocation_state: dict[str, Any]) -> bool:
      """Edge condition on back-edge: return True to loop, False to exit."""
      results = state.results.get("evaluator")
      # Return a real bool (never None) when the evaluator has no result yet.
      return results is not None and "needs_improvement" in str(results.result)

  builder.add_edge("evaluator", "generator", condition=should_loop_back)
  # Forward the keyword arguments: invocation_state is a required keyword-only parameter.
  builder.add_edge("evaluator", "output_formatter", condition=lambda s, **kw: not should_loop_back(s, **kw))

  4. Safety: max_node_executions as recursion limit

The existing max_node_executions serves as the infinite-loop guard (analogous to LangGraph's recursion_limit). This counts total node executions across all iterations. Additionally, consider a per-node execution cap for finer control.
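
One way the global limit and an optional per-node cap could fit together is sketched below. `ExecutionBudget`, `max_total`, and `max_per_node` are hypothetical names, not existing Strands APIs.

```python
from collections import Counter
from typing import Optional


class ExecutionBudget:
    """Tracks total and per-node executions (illustrative only)."""

    def __init__(self, max_total: int, max_per_node: Optional[int] = None) -> None:
        self.max_total = max_total
        self.max_per_node = max_per_node
        self.counts: Counter = Counter()

    def try_consume(self, node_id: str) -> bool:
        """Record one execution of node_id, or return False if a limit is hit."""
        if sum(self.counts.values()) >= self.max_total:
            return False
        if self.max_per_node is not None and self.counts[node_id] >= self.max_per_node:
            return False
        self.counts[node_id] += 1
        return True


# A two-node loop that would otherwise run forever.
budget = ExecutionBudget(max_total=5, max_per_node=2)
executed = []
for node_id in ["llm", "tools"] * 3:
    if not budget.try_consume(node_id):
        break
    executed.append(node_id)

print(executed)
```

Here the per-node cap stops the loop before the global limit does: "llm" is refused on its third attempt even though only four of the five total executions have been used.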

  5. Status.SKIPPED for conditional bypass within loops

Introduce Status.SKIPPED (see #2240 / PR #2258) to allow hooks to bypass a node while still satisfying its downstream dependencies. Within a loop, a skipped node on iteration N is still eligible for execution on iteration N+1 after reset:

  # Iteration 1: step_b runs normally
  # Iteration 2: hook skips step_b (already gathered data), downstream still executes
  # Iteration 3: step_b executes again (state changed, hook allows it)

  6. Status.CANCELLED for branch termination

This is separate from SKIPPED and aligns with the TypeScript SDK's cancel behavior. Cancelled nodes terminate their branch; downstream nodes do NOT execute. Useful for aborting optional side-branches within a loop without killing the entire cycle.
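
The difference between the two proposed statuses comes down to whether a node counts as a satisfied dependency. A small sketch, assuming a downstream node needs all of its dependencies satisfied (`ToyStatus`, `SATISFYING`, and `downstream_ready` are hypothetical and do not mirror the real Status enum):

```python
from enum import Enum
from typing import Dict, Iterable


class ToyStatus(Enum):
    COMPLETED = "completed"
    SKIPPED = "skipped"
    CANCELLED = "cancelled"


# Statuses that satisfy a downstream dependency. CANCELLED is deliberately absent.
SATISFYING = {ToyStatus.COMPLETED, ToyStatus.SKIPPED}


def downstream_ready(dependencies: Iterable[str], statuses: Dict[str, ToyStatus]) -> bool:
    """A downstream node may run only if every dependency completed or was skipped."""
    return all(statuses.get(dep) in SATISFYING for dep in dependencies)


deps = ["step_a", "step_b"]

skipped_case = downstream_ready(
    deps, {"step_a": ToyStatus.COMPLETED, "step_b": ToyStatus.SKIPPED}
)
cancelled_case = downstream_ready(
    deps, {"step_a": ToyStatus.COMPLETED, "step_b": ToyStatus.CANCELLED}
)

print(skipped_case, cancelled_case)
```

With a skipped step_b the downstream node is still eligible; with a cancelled step_b the branch stops there.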

Use Case

  1. ReAct agent loop
builder.add_node(llm_agent, "llm")
builder.add_node(tool_executor, "tools")
builder.add_node(output_agent, "output")  # exit-edge target must be a registered node (output_agent is a placeholder)
builder.set_entry_point("llm")

# Forward edge: LLM → tools (when tool calls exist)
builder.add_edge("llm", "tools", condition=has_tool_calls)

# Back-edge: tools → LLM (always loop back after executing tools)
builder.add_edge("tools", "llm")

# Exit edge: LLM → output (when no tool calls)
builder.add_edge("llm", "output", condition=no_tool_calls)

# Cap total node executions so the loop cannot run forever
builder.set_max_node_executions(50)
graph = builder.build()

  2. Iterative refinement with quality gate
builder.add_node(writer_agent, "writer")
builder.add_node(reviewer_agent, "reviewer")
builder.add_node(publisher_agent, "publisher")
builder.set_entry_point("writer") 

builder.add_edge("writer", "reviewer")
builder.add_edge("reviewer", "writer", condition=needs_revision)  # Loop back
builder.add_edge("reviewer", "publisher", condition=approved)     # Exit loop
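
The back-edge and exit-edge conditions should be mutually exclusive, so that exactly one of them fires once the reviewer has run. A sketch of such a pair, tested against a stand-in state object (the "REVISE" prefix convention and `fake_state` are assumptions; the real conditions would receive a GraphState):

```python
from types import SimpleNamespace
from typing import Any, Optional


def _review_text(state: Any) -> Optional[str]:
    """Return the reviewer's output as text, or None if it has not run."""
    review = state.results.get("reviewer")
    return None if review is None else str(review.result)


def needs_revision(state: Any, **_: Any) -> bool:
    text = _review_text(state)
    return text is not None and text.startswith("REVISE")


def approved(state: Any, **_: Any) -> bool:
    text = _review_text(state)
    return text is not None and not text.startswith("REVISE")


def fake_state(review: Optional[str]) -> SimpleNamespace:
    """Stand-in for GraphState exposing only .results[node_id].result."""
    results = {} if review is None else {"reviewer": SimpleNamespace(result=review)}
    return SimpleNamespace(results=results)


for review in ["REVISE: tighten the intro", "Looks good, ship it", None]:
    state = fake_state(review)
    print(review, needs_revision(state, invocation_state={}), approved(state, invocation_state={}))
```

Both predicates return False when the reviewer has produced no result, so neither edge fires prematurely.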

  3. Multi-turn data collection with interrupt
builder.add_node(collector_agent, "collector")
builder.add_node(validator_agent, "validator")
builder.set_entry_point("collector")

builder.add_edge("collector", "validator")
builder.add_edge("validator", "collector", condition=incomplete)  # Loop back for more data

# collector uses interrupt() to ask user for input on each iteration
# Graph pauses, resumes with user response, collector completes, validator checks

  4. Conditional skip within a loop (requires the fix for #2240, "[BUG] cancel_node in BeforeNodeCallEvent raises RuntimeError that kills the entire graph on resume")
def skip_enrichment_if_cached(event: BeforeNodeCallEvent):
    if event.invocation_state.get("cache_hit"):
        event.cancel_node = "data already cached"  # SKIPPED, downstream continues
        
builder.add_node(fetcher, "fetch")
builder.add_node(enricher, "enrich")
builder.add_node(evaluator, "evaluate")
builder.set_entry_point("fetch")

builder.add_edge("fetch", "enrich")
builder.add_edge("enrich", "evaluate")
builder.add_edge("evaluate", "fetch", condition=needs_more_data)  # Loop back

Alternative Solutions

| Approach | Pros | Cons |
| --- | --- | --- |
| Edge-condition back-edges (proposed) | Declarative, composable, works with existing invocation_state infra, familiar graph semantics | Requires changes to node scheduling logic |
| Explicit Command-style routing (LangGraph approach) | Maximum flexibility, node controls routing | Mixes control flow into node logic, harder to visualize/reason about |
| Functional API with explicit while-loop | Simple, no framework changes needed | Loses graph benefits (parallelism, observability, interrupt/resume, session persistence) |
| Wrapper that re-invokes the graph | Works today | Loses accumulated state between iterations, no per-iteration checkpointing, interrupt/resume breaks |
