
Bug: rubric_based_final_response_quality_v1 passes empty developer_instructions to judge when agent makes zero tool calls #5593

@cthurston-clgx

Description

🔴 Required Information

Describe the Bug:

The rubric_based_final_response_quality_v1 evaluator fails to populate the <developer_instructions> section of the judge prompt when the agent's intermediate_data.invocation_events list is empty. This causes the LLM judge to receive an empty system prompt context, making it impossible to evaluate rubrics that reference the agent's developer instructions (system prompt).

The bug is in src/google/adk/evaluation/rubric_based_final_response_quality_v1.py (Lines 284–300):

developer_instructions = ""
# ...
app_details = actual_invocation.app_details
if app_details:
  if (
      isinstance(actual_invocation.intermediate_data, InvocationEvents)
      and actual_invocation.intermediate_data.invocation_events  # <-- BUG: False when list is empty
  ):
    developer_instructions = app_details.get_developer_instructions(
        agent_name=actual_invocation.intermediate_data.invocation_events[0].author
    )
  tool_declarations = get_tool_declarations_as_json_str(app_details)

`developer_instructions` is only populated when `invocation_events` is non-empty, because the code uses `invocation_events[0].author` to look up the agent name. When the agent correctly makes zero tool calls (e.g., declining an out-of-scope request), the list is empty, the condition evaluates to False, and the judge receives an empty `<developer_instructions>` block.
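The root cause is ordinary Python truthiness: an empty list is falsy, so the `and` chain short-circuits and the lookup never runs. A minimal, self-contained sketch of the guard (using hypothetical stand-in classes, not the real ADK types):

```python
from dataclasses import dataclass, field

@dataclass
class FakeEvent:
    # Stand-in for an ADK invocation event; only the author field matters here.
    author: str

@dataclass
class FakeInvocationEvents:
    # Stand-in for InvocationEvents; the list is empty when no tools were called.
    invocation_events: list = field(default_factory=list)

def resolve_instructions(intermediate_data, instructions_by_agent):
    """Mirrors the buggy guard: returns "" whenever the event list is empty."""
    developer_instructions = ""
    if (
        isinstance(intermediate_data, FakeInvocationEvents)
        and intermediate_data.invocation_events  # [] is falsy -> guard fails
    ):
        author = intermediate_data.invocation_events[0].author
        developer_instructions = instructions_by_agent.get(author, "")
    return developer_instructions

# With an event the lookup works; with zero tool calls it silently returns "".
assert resolve_instructions(
    FakeInvocationEvents([FakeEvent("my_agent")]), {"my_agent": "Only cooking."}
) == "Only cooking."
assert resolve_instructions(FakeInvocationEvents([]), {"my_agent": "Only cooking."}) == ""
```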

Steps to Reproduce:

  1. Create an agent with developer instructions that define scope boundaries (e.g., "only answer questions about topic X; decline everything else")
  2. Create an eval case where the user asks an out-of-scope question
  3. The agent correctly declines without calling any tools → invocation_events is []
  4. Add a rubric_based_final_response_quality_v1 rubric that references the developer instructions, e.g.:

    "The agent's developer instructions explicitly state that this type of request is out-of-scope. Score YES if the agent declined without calling tools."

  5. Run the evaluation
  6. The judge receives empty <developer_instructions> and scores the rubric as failing despite the agent behaving correctly

Expected Behavior:

`developer_instructions` should be populated from `app_details` regardless of whether `invocation_events` is empty. The agent name could be resolved via a fallback (e.g., the first/root agent name from `app_details.agent_details`).

The judge should receive the full system prompt in <developer_instructions> so it can evaluate rubrics that reference scope definitions, behavioral rules, or other instructions.

Observed Behavior:

The judge receives an empty <developer_instructions> block and responds:

"In the provided user_prompt, the <developer_instructions> are empty, and therefore do not explicitly state this limitation. [...] Since the condition for the request being 'out-of-scope' (as defined by this property) is not met by the provided user_prompt, the agent's decline is not considered 'correct' according to the property's criteria."

The rubric scores 0.0 even though:

  • The agent's actual system prompt does explicitly define the scope
  • The agent did behave correctly (declined without calling tools)
  • The companion rubric_based_tool_use_quality_v1 metric passes (score 1.0) for the same invocation

Environment Details:

  • ADK Library Version (pip show google-adk): 1.26.0 (the affected code path appears unchanged in the latest release)
  • Desktop OS: macOS
  • Python Version (python -V): 3.13

Model Information:

  • Are you using LiteLLM: Yes
  • Which model is being used: gemini-2.5-flash (as the agent under test and as the judge model)

🟡 Optional Information

Regression:

Unknown — this appears to be a logic oversight present since the rubric_based_final_response_quality_v1 evaluator was introduced. The condition likely exists because invocation_events[0].author is used to determine which agent's instructions to retrieve, with no fallback path for the zero-tool-call case.

Logs:

The judge's rationale from the evaluation report:

rubric_id: out_of_scope_response
score: 0.0
rationale: The property defines a request as "out-of-scope" if "The agent's developer
instructions (system prompt) explicitly state that [topic X] is out-of-scope". In the
provided `user_prompt`, the `<developer_instructions>` are empty, and therefore do not
explicitly state this limitation. Since the condition for the request being "out-of-scope"
(as defined by this property) is not met by the provided `user_prompt`, the agent's
decline is not considered "correct" according to the property's criteria.

Meanwhile the intermediate_data confirms zero tool calls:

"intermediate_data": {
  "invocation_events": []
}

And app_details.agent_details does contain the agent's instructions with explicit scope definitions.

Screenshots / Video:

N/A

Additional Context:

The rubric_based_tool_use_quality_v1 evaluator is not affected by this bug — it doesn't pass developer_instructions to the judge at all (it only passes tool_declarations). This creates an inconsistency where the tool-use rubric passes but the response-quality rubric fails for the exact same correct behavior.

Suggested Fix:

developer_instructions = ""
tool_declarations = "Agent has no tools."
response_steps = get_tool_calls_and_responses_as_json_str(
    actual_invocation.intermediate_data
)

app_details = actual_invocation.app_details
if app_details:
  # Determine agent name from invocation events if available,
  # otherwise fall back to the first (root) agent in app_details
  agent_name = None
  if (
      isinstance(actual_invocation.intermediate_data, InvocationEvents)
      and actual_invocation.intermediate_data.invocation_events
  ):
    agent_name = actual_invocation.intermediate_data.invocation_events[0].author
  elif app_details.agent_details:
    agent_name = next(iter(app_details.agent_details))

  if agent_name:
    developer_instructions = app_details.get_developer_instructions(
        agent_name=agent_name
    )
  tool_declarations = get_tool_declarations_as_json_str(app_details)
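The two-branch name resolution in the suggested fix can be exercised in isolation with stand-in types (hypothetical names; the real classes live in the ADK evaluation module):

```python
from dataclasses import dataclass, field

@dataclass
class StubEvent:
    # Hypothetical stand-in for an invocation event.
    author: str

@dataclass
class StubInvocationEvents:
    # Hypothetical stand-in for InvocationEvents.
    invocation_events: list = field(default_factory=list)

def resolve_agent_name(intermediate_data, agent_details: dict):
    """Prefer the first event's author; fall back to the first (root) agent."""
    if (
        isinstance(intermediate_data, StubInvocationEvents)
        and intermediate_data.invocation_events
    ):
        return intermediate_data.invocation_events[0].author
    if agent_details:
        return next(iter(agent_details))  # dicts preserve insertion order (3.7+)
    return None

agents = {"root_agent": "scope instructions...", "sub_agent": "..."}
# Events present: use the author of the first event.
assert resolve_agent_name(StubInvocationEvents([StubEvent("sub_agent")]), agents) == "sub_agent"
# Zero tool calls: fall back to the first registered agent instead of dropping the lookup.
assert resolve_agent_name(StubInvocationEvents([]), agents) == "root_agent"
```

The fallback relies on `agent_details` preserving insertion order, which holds for plain dicts in Python 3.7+; if the root agent is not guaranteed to be first, an explicit root-agent field would be safer.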

Minimal Reproduction Code:

from google.adk.evaluation import Invocation, InvocationEvents
from google.adk.evaluation.rubric_based_final_response_quality_v1 import (
    RubricBasedFinalResponseQualityV1,
)
from google.adk.evaluation.app_details import AppDetails, AgentDetails
from google.genai import types as genai_types

# Agent with explicit scope instructions
app_details = AppDetails(
    agent_details={
        "my_agent": AgentDetails(
            name="my_agent",
            instructions="You are a cooking assistant. Only answer questions about recipes and cooking. Decline all other requests as out-of-scope.",
            tool_declarations=[],
        )
    }
)

# Invocation where agent made ZERO tool calls (correctly declined out-of-scope request)
invocation = Invocation(
    user_content=genai_types.Content(
        parts=[genai_types.Part(text="What is the capital of France?")],
        role="user",
    ),
    final_response=genai_types.Content(
        parts=[genai_types.Part(text="I can only help with cooking and recipes.")],
        role="model",
    ),
    intermediate_data=InvocationEvents(invocation_events=[]),  # <-- empty!
    app_details=app_details,
)

# This rubric references developer instructions — but the judge will see them as empty
evaluator = RubricBasedFinalResponseQualityV1(
    rubrics=[{
        "rubric_id": "scope_check",
        "rubric_content": {
            "text_property": "The developer instructions define scope. Score YES if agent declined correctly."
        },
    }],
    judge_model="gemini-2.5-flash",
)

# BUG: evaluator passes developer_instructions="" to judge
result = evaluator.evaluate(invocation)
# Judge scores 0.0 because it can't see the instructions

How often has this issue occurred?:

  • Always (100%) — reproduces every time the agent makes zero tool calls and rubric_based_final_response_quality_v1 is used with rubrics referencing developer instructions.

Labels: eval ([Component] This issue is related to evaluation), request clarification ([Status] The maintainer needs clarification or more information from the author)
