Description
There is a bug in the bigquery_agent_analytics_plugin where the trace_id for a single invocation splits partway through the turn.
Early lifecycle events (USER_MESSAGE_RECEIVED and INVOCATION_STARTING) are logged with one trace_id (the fallback to invocation_id), while subsequent events (starting at AGENT_STARTING and continuing through LLM and tool interactions) are logged under a different, newly generated trace_id.
Impact: This breaks downstream joins and observability.
A trace_id is meant to be the single, unbroken thread that ties a user's action to the database logs. Because the plugin falls back to the invocation_id for early events but uses a generated OTel trace ID for later ones, the events of one turn end up under two different IDs. You cannot simply GROUP BY trace_id to evaluate an entire turn; queries have to use invocation_id to stitch the two halves back together.
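To make the mechanism concrete, here is a minimal sketch of how a "use the OTel trace ID if there is one, otherwise fall back to invocation_id" rule would produce the split. It is not the plugin's actual code: `resolve_trace_id`, the sample IDs, and the assumption that no span is active for the first two events are all hypothetical, chosen only to reproduce the behavior described above.

```python
from typing import Optional


def resolve_trace_id(otel_trace_id: Optional[str], invocation_id: str) -> str:
    """Hypothetical resolver: prefer an active OTel trace ID, else fall back."""
    return otel_trace_id or invocation_id


# Hypothetical IDs for one invocation.
INVOCATION_ID = "inv-123"
OTEL_TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"

# (event_type, OTel trace ID visible when the event is logged).
# Assumption: no span is active until AGENT_STARTING.
turn = [
    ("USER_MESSAGE_RECEIVED", None),
    ("INVOCATION_STARTING", None),
    ("AGENT_STARTING", OTEL_TRACE_ID),
]

logged = [
    (event_type, resolve_trace_id(otel_id, INVOCATION_ID))
    for event_type, otel_id in turn
]
distinct_trace_ids = {trace_id for _, trace_id in logged}

for event_type, trace_id in logged:
    print(f"{event_type:<24} trace_id={trace_id}")
print("distinct trace_ids in one turn:", len(distinct_trace_ids))
```

Under these assumptions the first two events carry the invocation_id as their trace_id and the third carries the OTel ID, so a single turn yields two distinct trace_id values.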
You can verify the split in BigQuery by running the following query against the destination table:
WITH SequencedEvents AS (
  SELECT
    timestamp,
    session_id,
    invocation_id,
    event_type,
    trace_id,
    LEAD(event_type) OVER (PARTITION BY invocation_id ORDER BY timestamp ASC) AS next_event,
    LEAD(trace_id) OVER (PARTITION BY invocation_id ORDER BY timestamp ASC) AS next_trace_id
  FROM `YOUR_PROJECT.YOUR_DATASET.agent_events`
  WHERE event_type IN ('INVOCATION_STARTING', 'AGENT_STARTING')
)
SELECT *
FROM SequencedEvents
WHERE event_type = 'INVOCATION_STARTING'
  AND next_event = 'AGENT_STARTING'
  AND trace_id != next_trace_id
ORDER BY timestamp DESC
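
If the rows have already been exported or fetched into Python, the same check can be sketched without SQL. This is an illustrative helper, not part of the plugin: `find_split_invocations` and the sample rows are hypothetical, and each row is assumed to be a dict with at least the `invocation_id` and `trace_id` columns used in the query above.

```python
from collections import defaultdict


def find_split_invocations(rows):
    """Return {invocation_id: sorted trace_ids} for invocations that were
    logged under more than one trace_id."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["invocation_id"]].add(row["trace_id"])
    return {inv: sorted(ids) for inv, ids in seen.items() if len(ids) > 1}


# Hypothetical sample rows: inv-1 is split, inv-2 is consistent.
sample_rows = [
    {"invocation_id": "inv-1", "event_type": "INVOCATION_STARTING", "trace_id": "inv-1"},
    {"invocation_id": "inv-1", "event_type": "AGENT_STARTING", "trace_id": "abc123"},
    {"invocation_id": "inv-2", "event_type": "INVOCATION_STARTING", "trace_id": "def456"},
    {"invocation_id": "inv-2", "event_type": "AGENT_STARTING", "trace_id": "def456"},
]

split = find_split_invocations(sample_rows)
print(split)
```

Unlike the LEAD-based query, this flags any invocation with more than one trace_id, regardless of which pair of events the change falls between.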
