Conversation
Thank you for your contribution @ahmad-nader! We will review the pull request and get back to you soon.
@ahmad-nader please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

Contributor License Agreement

This Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
Pull request overview
This PR extends the evaluation-result conversion pipeline in azure-ai-evaluation so that nonstandard evaluator outputs prefixed with custom_... are preserved on AOAI evaluation results under a properties bag (while keeping well-known custom_score/custom_reason/custom_threshold/custom_label mapped to standard top-level fields).
Changes:
- Added logic to route nonstandard `custom_...` metric keys into a `properties` dict during `_update_metric_value` processing.
- Included `properties` in AOAI result object construction.
- Added a unit test case covering `custom_...` → `properties` behavior in AOAI conversion.
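The routing the review summarizes can be sketched roughly as follows. This is an illustrative sketch only: `route_custom_outputs` and `_WELL_KNOWN_CUSTOM_KEYS` are hypothetical names, not the actual `_evaluate.py` internals.

```python
# Hypothetical sketch of the custom_* routing described above; names are
# illustrative and do not match the real azure-ai-evaluation internals.
_WELL_KNOWN_CUSTOM_KEYS = {
    "custom_score": "score",
    "custom_reason": "reason",
    "custom_threshold": "threshold",
    "custom_label": "label",
}


def route_custom_outputs(metrics: dict) -> tuple[dict, dict]:
    """Split evaluator outputs into standard top-level fields and a properties bag."""
    standard, properties = {}, {}
    for key, value in metrics.items():
        if key in _WELL_KNOWN_CUSTOM_KEYS:
            # Well-known custom_* keys map onto standard AOAI result fields.
            standard[_WELL_KNOWN_CUSTOM_KEYS[key]] = value
        elif key.startswith("custom_"):
            # Nonstandard custom_* keys are preserved under properties, with
            # the custom_ prefix stripped (e.g. custom_observation_flag ->
            # properties["observation_flag"]).
            properties[key[len("custom_") :]] = value
        else:
            standard[key] = value
    return standard, properties
```

Under these assumptions, `route_custom_outputs({"custom_score": 0.9, "custom_observation_flag": True})` returns `({"score": 0.9}, {"observation_flag": True})`, matching the behavior the PR description illustrates.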
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py | Adds properties routing for nonstandard custom_... outputs and emits properties on AOAI result objects. |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/test_evaluate.py | Adds unit coverage asserting nonstandard custom_... keys land in AOAI result properties. |
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py
Update AOAI result conversion to retain non-standard evaluator fields in properties and align unit test fixtures with the new output contract.

Authored-by: GitHub Copilot Coding Agent v1
Model: GPT-5.4 (gpt-5.4)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py
Treat outputs with custom_score as custom evaluator results when deciding whether to emit AOAI properties. This preserves custom properties even when evaluator metadata looks builtin and keeps the conversion regression coverage aligned.

Authored-by: GitHub Copilot Coding Agent
Model: GPT-5.4
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py
Stop emitting explanation as a standalone AOAI result field. Preserve it through the custom property bag instead and align the focused conversion regression fixture and assertions.

Authored-by: GitHub Copilot Coding Agent
Model: GPT-5.4
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary

Adds support for routing nonstandard `custom_...` evaluator outputs into the AOAI result `properties` bag during evaluation result conversion.

What changed
- Added `properties` support to AOAI result construction
- Routed nonstandard `custom_...` keys into `properties`
- Kept `custom_score`, `custom_reason`, `custom_threshold`, and `custom_label` mapped to the normal AOAI top-level fields
- Extended `test_convert_results_to_aoai_evaluation_results`

Example
Input keys like:
- `outputs.friendly_evaluator_gh4y.custom_score`
- `outputs.friendly_evaluator_gh4y.custom_reason`
- `outputs.friendly_evaluator_gh4y.custom_threshold`
- `outputs.friendly_evaluator_gh4y.custom_label`
- `outputs.friendly_evaluator_gh4y.custom_observation_flag`

now produce an AOAI result where:
- `score`, `reason`, `threshold`, and `label` stay in their standard fields
- `custom_observation_flag` is emitted under `properties` as `observation_flag`

Validation
Ran: