From ae180ddcdc95ae3e3824ab7709973804fd987576 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 19 May 2026 22:43:04 +0000 Subject: [PATCH 1/2] Initial plan From 37f2b39f3125f00bbaf8767d27ee43d0dc3707c8 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 19 May 2026 22:49:04 +0000 Subject: [PATCH 2/2] Fix azure-ai-evaluation changelog validation for 1.16.8 Agent-Logs-Url: https://github.com/Azure/azure-sdk-for-python/sessions/bf25033d-f3b4-418f-8965-252d30472bd2 Co-authored-by: m7md7sien <16615690+m7md7sien@users.noreply.github.com> --- sdk/evaluation/azure-ai-evaluation/CHANGELOG.md | 8 -------- .../azure-ai-evaluation/azure/ai/evaluation/_version.py | 2 +- 2 files changed, 1 insertion(+), 9 deletions(-) diff --git a/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md b/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md index e923bd9dc187..0410ed460b7c 100644 --- a/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md +++ b/sdk/evaluation/azure-ai-evaluation/CHANGELOG.md @@ -2,16 +2,10 @@ ## 1.16.8 (Unreleased) -### Features Added - ### Breaking Changes - Updated `EVALUATOR_NAME_METRICS_MAPPINGS` so `document_retrieval` and `rouge_score` report single primary metrics (`document_retrieval`, `rouge`), with previous sub-metrics now represented in each evaluator's `*_properties` payload. -### Bugs Fixed - -### Other Changes - ## 1.16.7 (2026-05-07) ### Features Added @@ -23,8 +17,6 @@ - Added `skipped` to `ResultCount` and `skipped`/`errored` to `PerTestingCriteriaResult` typed contracts. - App Insights logging now forwards arbitrary evaluator-specific keys from each event's `properties` payload as a single `gen_ai.evaluation.properties` JSON attribute (carried inside `internal_properties`). Previously only the four red-team keys (`attack_success`, `attack_technique`, `attack_complexity`, `attack_success_threshold`) were forwarded; structured outputs such as rubric `dimension_scores` were silently dropped. Payloads larger than 7500 characters are replaced with a valid JSON marker (`{"truncated": true, "original_size_bytes": }`) so consumers can always `json.loads` the value. Non-dict `properties` payloads are now safely ignored instead of raising in the red-team forwarder. -### Breaking Changes - ### Bugs Fixed - `_TaskNavigationEfficiencyEvaluator` now accepts JSON-stringified `response` and `ground_truth` inputs (e.g., from data pipelines that serialize list/tuple inputs to strings). String inputs are parsed as JSON; on parse failure the original value is preserved so downstream validation surfaces the error as before. diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_version.py b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_version.py index 7f7138b3715f..f8c5d12c048c 100644 --- a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_version.py +++ b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_version.py @@ -3,4 +3,4 @@ # --------------------------------------------------------- # represents upcoming version -VERSION = "1.16.7" +VERSION = "1.16.8"