
feat: GraphAgent - directed-graph workflow orchestration for ADK #4581

@drahnreb


🔴 Required Information

Is your feature request related to a specific problem?

ADK provides SequentialAgent, ParallelAgent, and LoopAgent for linear, parallel, and iterative patterns but lacks a general-purpose directed-graph orchestrator. Users cannot express conditional branching, cycles, or arbitrary DAG topologies with the existing agents; they must manually wire async functions and state plumbing, losing observability, resumability, and YAML config support.

Describe the Solution You'd Like

Add GraphAgent, a directed-graph workflow engine for ADK that enables:

  • Nodes wrapping BaseAgent or async functions with typed state
  • Conditional and priority-based edge routing
  • Cyclic execution with max_iterations guard
  • GraphState with StateReducer (OVERWRITE, APPEND, MERGE) for deterministic state merges
  • Typed GraphEvent streaming with FULL, FINAL, NONE modes
  • Node lifecycle callbacks (before_node, after_node, on_edge_condition)
  • Rewind to prior node for correction workflows
  • Pause/resume via GraphAgentState (BaseAgentState)
  • OpenTelemetry tracing with configurable TelemetryConfig
  • Evaluation metrics (graph_path_match, state_contains_keys, node_execution_count)
  • CLI adk web graph visualization for GraphAgent topologies
  • YAML/JSON config support via GraphAgentConfig
  • Pattern nodes: DynamicNode (runtime dispatch), NestedGraphNode (hierarchical composition), DynamicParallelGroup (fan-out)
  • ParallelNodeGroup with WAIT_ALL/WAIT_ANY/WAIT_N join strategies and error policies
  • InterruptService for human-in-the-loop pause/resume, approval gates, cancellation
  • CheckpointService for state persistence with delta compression, locking, retention
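To make the routing and state bullets above concrete, here is a minimal, self-contained sketch of how reducer-based state merges, priority-ordered conditional edges, and a max_iterations guard could interact. It does not use the proposed GraphAgent API: `Reducer`, `reduce_state`, `next_node`, and `run_graph` are hypothetical stand-ins, and "higher priority is tried first" is an assumption.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Optional, Tuple


class Reducer(Enum):
    """Hypothetical stand-in for the proposal's StateReducer."""
    OVERWRITE = "overwrite"
    APPEND = "append"
    MERGE = "merge"


def reduce_state(state: Dict[str, Any], key: str, value: Any, reducer: Reducer) -> None:
    """Fold one node output into the shared state, in place."""
    if reducer is Reducer.OVERWRITE:
        state[key] = value
    elif reducer is Reducer.APPEND:
        state.setdefault(key, []).append(value)
    else:  # Reducer.MERGE: shallow dict merge
        state.setdefault(key, {}).update(value)


# An edge is (priority, target, condition). Assumption: higher priority wins.
Edge = Tuple[int, str, Callable[[Dict[str, Any]], bool]]
NodeFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def next_node(edges: List[Edge], state: Dict[str, Any]) -> Optional[str]:
    """Return the target of the highest-priority edge whose condition holds."""
    for _, target, condition in sorted(edges, key=lambda e: -e[0]):
        if condition(state):
            return target
    return None


def run_graph(nodes: Dict[str, NodeFn], edges: Dict[str, List[Edge]],
              reducers: Dict[str, Reducer], start: str, end: str,
              max_iterations: int = 10) -> Tuple[Dict[str, Any], List[str]]:
    """Walk the graph from start until end, a dead end, or the iteration cap."""
    state: Dict[str, Any] = {}
    path: List[str] = []
    current: Optional[str] = start
    for _ in range(max_iterations):  # guard against unbounded cycles
        path.append(current)
        for key, value in nodes[current](state).items():
            reduce_state(state, key, value, reducers.get(key, Reducer.OVERWRITE))
        if current == end:
            break
        current = next_node(edges.get(current, []), state)
        if current is None:
            break
    return state, path


# A draft/review cycle: review loops back to draft until two drafts exist.
NODES: Dict[str, NodeFn] = {
    "draft": lambda s: {"drafts": f"v{len(s.get('drafts', [])) + 1}"},
    "review": lambda s: {"approved": len(s["drafts"]) >= 2,
                         "meta": {"reviews": len(s["drafts"])}},
    "publish": lambda s: {"meta": {"published": True}},
}
EDGES: Dict[str, List[Edge]] = {
    "draft": [(0, "review", lambda s: True)],
    "review": [(10, "publish", lambda s: s["approved"]),
               (0, "draft", lambda s: True)],  # fallback edge closes the cycle
}
REDUCERS = {"drafts": Reducer.APPEND, "meta": Reducer.MERGE}

if __name__ == "__main__":
    final_state, visited = run_graph(NODES, EDGES, REDUCERS, "draft", "publish")
    print(visited)
    print(final_state)
```

In this sketch the `drafts` key accumulates across iterations (APPEND), `meta` is merged key by key (MERGE), and `approved` is simply replaced (the default OVERWRITE). Lowering `max_iterations` cuts the cycle short instead of looping forever.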

Impact on your work

This is the foundational orchestration primitive for building production agent workflows that go beyond linear pipelines: conditional routing, human oversight, parallel fan-out/fan-in, and crash recovery.

Willingness to contribute

Yes. The implementation is ready, split into 5 focused PRs for reviewability.


🟡 Recommended Information

Describe Alternatives You've Considered

  • LangGraph: External dependency with different state model; doesn't integrate with ADK's session/artifact/telemetry services.
  • Manual function-node wiring: Works but loses observability, resumability, YAML config, and CLI visualization.

Proposed API / Implementation

Split into 5 stacked PRs:

  1. Core GraphAgent: engine, state, routing, callbacks, CLI viz, telemetry, evaluation (~370 tests)
  2. Graph patterns: DynamicNode, NestedGraphNode, DynamicParallelGroup (~25 tests)
  3. Parallel execution: ParallelNodeGroup with join strategies (~52 tests)
  4. InterruptService: human-in-the-loop workflows (~175 tests)
  5. CheckpointService: state persistence with delta compression (~105 tests)

from google.adk.agents import LlmAgent
from google.adk.agents.graph import GraphAgent, GraphNode, GraphState

# Two ordinary ADK agents that the graph will wrap as nodes
agent_a = LlmAgent(name="researcher", model="gemini-2.0-flash")
agent_b = LlmAgent(name="writer", model="gemini-2.0-flash")

# Build a two-node graph: research -> write
graph = GraphAgent(name="pipeline")
graph.add_node(GraphNode(name="research", agent=agent_a))
graph.add_node(GraphNode(name="write", agent=agent_b))
graph.add_edge("research", "write")

# Mark the entry and exit nodes
graph.set_start("research")
graph.set_end("write")
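
The WAIT_ALL/WAIT_ANY/WAIT_N join strategies named for ParallelNodeGroup (PR 3) can be illustrated with plain asyncio. The following is only a sketch of the idea, not the proposed implementation: the `join` function, its string-valued `strategy` parameter, and the fail-fast error behavior are all assumptions made for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Branch = Callable[[], Awaitable[Any]]


async def join(branches: Dict[str, Branch], strategy: str = "WAIT_ALL",
               n: int = 1) -> Dict[str, Any]:
    """Run branches concurrently and return results once the join condition is met.

    Hypothetical helper: WAIT_ALL waits for every branch, WAIT_ANY for the
    first to finish, WAIT_N for the first n. Remaining branches are cancelled.
    """
    running = {asyncio.create_task(fn()): name for name, fn in branches.items()}
    needed = {"WAIT_ALL": len(running), "WAIT_ANY": 1, "WAIT_N": n}[strategy]
    results: Dict[str, Any] = {}
    pending = set(running)
    while pending and len(results) < needed:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            # Assumed error policy: fail fast, since result() re-raises
            # any exception from the branch.
            results[running[task]] = task.result()
    for task in pending:
        task.cancel()
    # Let cancelled branches finish unwinding before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return results


def branch(name: str, delay: float) -> Branch:
    """Build a fake branch that sleeps for `delay` seconds, then reports."""
    async def run() -> str:
        await asyncio.sleep(delay)
        return f"{name} done"
    return run


BRANCHES: Dict[str, Branch] = {
    "fast": branch("fast", 0.01),
    "medium": branch("medium", 0.05),
    "slow": branch("slow", 0.3),
}

if __name__ == "__main__":
    for strategy in ("WAIT_ANY", "WAIT_N", "WAIT_ALL"):
        print(strategy, asyncio.run(join(BRANCHES, strategy, n=2)))
```

Two branches that finish in the same scheduling round are both collected, so WAIT_ANY and WAIT_N may return more results than strictly required. A real implementation would also need to decide what happens to partial state written by cancelled branches.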

Additional Context

Source: src/google/adk/agents/graph/, src/google/adk/checkpoints/, telemetry extensions, CLI viz extension.
Total: ~727 tests across 25 test files, 26 sample agents, 6 design documents.
