Bug Description
When using CallbackHandler with LangGraph, trace names are non-deterministic on graph resume (e.g., after a human-in-the-loop interrupt). Sometimes the trace gets the correct compiled graph name (e.g., "my-agent"), sometimes it gets an empty string "".
Root Cause
In CallbackHandler.on_chain_start (lines 320-325), when the root chain starts (parent_run_id is None), propagate_attributes() is called without the trace_name parameter:
# Current code (simplified):
span_name = self.get_langchain_run_name(serialized, **kwargs)  # line 310
if parent_run_id is None:
    self._propagation_context_manager = propagate_attributes(
        user_id=...,
        session_id=...,
        tags=...,
        metadata=...,
        # trace_name is NOT passed here
    )
The propagate_attributes() API does support a trace_name parameter, but it's not being used.
Why This Causes Non-Deterministic Trace Names
LangGraph's Pregel.stream() calls on_chain_start with:
# langgraph/pregel/main.py
name=config.get("run_name", self.get_name()) # self.get_name() = compiled graph name
On the initial run, the first on_chain_start event comes from the root graph, so span_name correctly reflects the compiled graph name (e.g., "my-agent").
On resume (e.g., after HITL interrupt via Command(resume=...)), the graph may resume from an internal node. The first on_chain_start event can come from a subgraph node whose name is "", causing the trace to get an empty name.
Since trace_name is not propagated via propagate_attributes(), the trace name depends entirely on whichever on_chain_start fires first — which is non-deterministic on resume.
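The ordering dependence can be illustrated with a toy model that uses neither Langfuse nor LangGraph. The `ToyTracer` class and its "first event wins" rule are assumptions made purely for illustration; they are not the SDK's actual implementation.

```python
from typing import Optional


class ToyTracer:
    """Toy stand-in for a tracer; NOT the Langfuse implementation."""

    def __init__(self) -> None:
        self.trace_name: Optional[str] = None

    def on_chain_start(self, name: str) -> None:
        # Assumed rule: the first event to arrive fixes the trace name.
        if self.trace_name is None:
            self.trace_name = name


def trace_name_for(event_names: list[str]) -> Optional[str]:
    """Feed the events in order and report the resulting trace name."""
    tracer = ToyTracer()
    for name in event_names:
        tracer.on_chain_start(name)
    return tracer.trace_name


# Initial run: the root graph's event arrives first.
initial_run = trace_name_for(["my-agent", "my_node"])
# Resume: an unnamed internal event arrives before the root graph's event.
resumed_run = trace_name_for(["", "my-agent"])
print(repr(initial_run), repr(resumed_run))
```

Under this assumed rule the same graph yields two different trace names depending only on event order, which matches the symptom described above.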
Reproduction
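from langchain_core.messages import AIMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, START, MessagesState, StateGraph
from langgraph.types import Command, interrupt
from langfuse.langchain import CallbackHandler

# Build a graph with a HITL interrupt
def my_node(state):
    answer = interrupt("question?")
    return {"messages": [AIMessage(content=answer)]}

# Minimal wiring (assumed; any graph with an interrupt should behave the same)
builder = StateGraph(MessagesState)
builder.add_node("my_node", my_node)
builder.add_edge(START, "my_node")
builder.add_edge("my_node", END)
graph = builder.compile(checkpointer=MemorySaver(), name="my-agent")

handler = CallbackHandler()
# The checkpointer needs a thread_id to resume the same run
config = {"configurable": {"thread_id": "1"}, "callbacks": [handler]}

# Initial run: trace name = "my-agent" ✅
for chunk in graph.stream({"messages": [("user", "hi")]}, config=config):
    pass

# Resume: trace name is "" (non-deterministic) ❌
for chunk in graph.stream(Command(resume="yes"), config=config):
    pass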
Proposed Fix
Pass span_name as trace_name to propagate_attributes():
if parent_run_id is None:
    self._propagation_context_manager = propagate_attributes(
        trace_name=span_name,  # <-- add this
        user_id=parsed_trace_attributes.get("user_id", None),
        session_id=parsed_trace_attributes.get("session_id", None),
        tags=parsed_trace_attributes.get("tags", None),
        metadata=parsed_trace_attributes.get("metadata", None),
    )
This ensures the trace name is always set from the callback's name parameter, regardless of which internal node fires first on resume.
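One way to check the intended behavior in isolation is a reduced sketch with a recording stub. Here `propagate_attributes` is a stub that only records its keyword arguments, and `SketchHandler` is a hypothetical class containing just the root-chain branch; neither is the real Langfuse code.

```python
from typing import Any, Optional

recorded_calls: list[dict[str, Any]] = []


def propagate_attributes(**kwargs: Any) -> dict[str, Any]:
    """Stub standing in for the real API; it only records its arguments."""
    recorded_calls.append(kwargs)
    return kwargs


class SketchHandler:
    """Hypothetical reduced handler with only the root-chain branch."""

    def on_chain_start(
        self, span_name: str, parent_run_id: Optional[str] = None
    ) -> None:
        # Only the root chain (no parent run) sets trace-level attributes.
        if parent_run_id is None:
            self._propagation_context_manager = propagate_attributes(
                trace_name=span_name,
                user_id=None,
                session_id=None,
                tags=None,
                metadata=None,
            )


sketch_handler = SketchHandler()
sketch_handler.on_chain_start("my-agent")  # root chain
sketch_handler.on_chain_start("", parent_run_id="root-run-id")  # nested chain
```

In this sketch only the root call reaches the stub, so the recorded trace name comes from the root chain's name and the unnamed nested chain cannot overwrite it.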
Workaround
Setting run_name in the LangGraph config forces a consistent name:
config = {"run_name": "my-agent", "callbacks": [handler]}
# stream() is lazy, so the returned iterator has to be consumed
for chunk in graph.stream(Command(resume="yes"), config=config):
    pass
This works because LangGraph uses config.get("run_name", self.get_name()), so an explicit run_name overrides the non-deterministic behavior. However, users shouldn't need to duplicate the compiled graph name into the config.
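Until a fix lands, the duplication can be kept in one place with a small helper. This is a sketch: `with_run_name` and `FakeGraph` are hypothetical names, and the only assumption about the graph is that it exposes `get_name()`, as the LangGraph snippet above suggests.

```python
from typing import Any, Optional, Protocol


class Named(Protocol):
    """Anything exposing get_name(), e.g. a compiled graph (assumed)."""

    def get_name(self) -> str: ...


def with_run_name(
    graph: Named, config: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Return a copy of config with run_name defaulting to the graph's name."""
    merged = dict(config or {})
    # An explicit run_name from the caller is left untouched.
    merged.setdefault("run_name", graph.get_name())
    return merged


class FakeGraph:
    """Stand-in used only to exercise the helper without LangGraph."""

    def get_name(self) -> str:
        return "my-agent"


example_config = with_run_name(FakeGraph(), {"callbacks": []})
```

With a real compiled graph, the call would look like `graph.stream(Command(resume="yes"), config=with_run_name(graph, config))`.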
Environment
- langfuse SDK: 4.x (OTel-based)
- langgraph: 0.4.x
- Python: 3.12+
Related
- propagate_attributes() already supports trace_name; it's just not used in the callback handler
- _parse_langfuse_trace_attributes() parses langfuse_session_id, langfuse_user_id, langfuse_tags from metadata, but has no support for langfuse_trace_name either
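If metadata-based naming were added, the parsing step might look roughly like the sketch below. `parse_trace_name` is a hypothetical helper, not part of the SDK, and the actual signature and return shape of `_parse_langfuse_trace_attributes()` are not assumed here.

```python
from typing import Any, Optional


def parse_trace_name(metadata: Optional[dict[str, Any]]) -> Optional[str]:
    """Return metadata["langfuse_trace_name"] if it is a non-empty string."""
    if not metadata:
        return None
    value = metadata.get("langfuse_trace_name")
    # Ignore non-string and blank values so they never become a trace name.
    if isinstance(value, str) and value.strip():
        return value
    return None
```

Rejecting blank strings here would also keep an empty name from being propagated through this path.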