feat: add GenAI evaluation OTel event support #1656
Open
anirudha wants to merge 1 commit into strands-agents:main
Evaluation frameworks like strands_evals need a way to export evaluation results as OpenTelemetry events so they can be visualized in any OTel-compatible backend (Datadog, Jaeger, Honeycomb, etc.). Currently there is no standard way to emit gen_ai.evaluation.result events on spans from within the SDK.
Related: open-telemetry/semantic-conventions#3398, #1633
This PR adds a lightweight evaluation telemetry API to strands.telemetry that follows the proposed gen_ai.evaluation.result OTel semantic convention. The API is opt-in: no telemetry is emitted unless the developer explicitly calls these functions.
Public API Changes
New exports from strands.telemetry:
None-valued fields are omitted from OTel attributes. A span that is None or not recording is silently skipped.
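The behavior described above (None-valued fields omitted, None or non-recording spans skipped) can be illustrated with a minimal, stdlib-only sketch. This is not the PR's implementation: the field names, the attribute keys (taken from my reading of the proposed gen_ai.evaluation.result convention), the function signature, and where the missing-name ValueError is raised are all assumptions. RecordingSpanStub is a hypothetical stand-in that mimics only the is_recording() and add_event() methods of an OpenTelemetry span.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

EVENT_NAME = "gen_ai.evaluation.result"


@dataclass
class EvaluationResult:
    """Hypothetical stand-in for the PR's dataclass; field names are assumed."""

    name: str
    score_value: Optional[float] = None
    score_label: Optional[str] = None
    explanation: Optional[str] = None
    response_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: a missing name is rejected at construction time.
        if not self.name:
            raise ValueError("evaluation name is required")

    def to_otel_attributes(self) -> Dict[str, Any]:
        # Attribute keys are assumed from the proposed semantic convention.
        candidates = {
            "gen_ai.evaluation.name": self.name,
            "gen_ai.evaluation.score.value": self.score_value,
            "gen_ai.evaluation.score.label": self.score_label,
            "gen_ai.evaluation.explanation": self.explanation,
            "gen_ai.response.id": self.response_id,
        }
        # None-valued fields are omitted from the attribute map.
        return {key: value for key, value in candidates.items() if value is not None}


def add_evaluation_event(span: Any, result: EvaluationResult) -> None:
    """Attach the evaluation event; skip None or non-recording spans."""
    if span is None or not span.is_recording():
        return
    span.add_event(EVENT_NAME, attributes=result.to_otel_attributes())


class RecordingSpanStub:
    """Hypothetical test double exposing only the two span methods used above."""

    def __init__(self, recording: bool = True) -> None:
        self.recording = recording
        self.events: List[Tuple[str, Dict[str, Any]]] = []

    def is_recording(self) -> bool:
        return self.recording

    def add_event(self, name: str, attributes: Optional[Dict[str, Any]] = None) -> None:
        self.events.append((name, dict(attributes or {})))


if __name__ == "__main__":
    span = RecordingSpanStub()
    add_evaluation_event(span, EvaluationResult(name="accuracy", score_value=0.9))
    print(span.events)
```

Passing a real span from the OpenTelemetry API in place of the stub should work the same way, since only is_recording() and add_event() are called on it.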
Use Cases
response_id
Related Issues
N/A — new feature
Documentation PR
N/A — docs update will follow separately
Type of Change
New feature
Testing
31 tests (25 unit + 6 property-based with Hypothesis):
EvaluationResult dataclass construction and to_otel_attributes() mapping
EvaluationEventEmitter.emit() span interaction
add_evaluation_event() convenience function equivalence
set_test_suite_context() / set_test_case_context() attribute correctness
Edge cases: None span, non-recording span, missing name ValueError
Public API export verification
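As an illustration of the kind of property the property-based tests might check, here is a rough sketch of a "None values never reach the attribute map" property. The PR uses Hypothesis; this sketch substitutes the stdlib random module so it runs without extra dependencies. The to_otel_attributes function, the field keys, and the check_none_omitted helper are hypothetical stand-ins, not the PR's code.

```python
import random
import string
from typing import Any, Dict, Optional

# Hypothetical attribute keys used only for this illustration.
FIELD_KEYS = (
    "gen_ai.evaluation.score.value",
    "gen_ai.evaluation.score.label",
    "gen_ai.evaluation.explanation",
    "gen_ai.response.id",
)


def to_otel_attributes(fields: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Hypothetical mapping under test: drop every None-valued field."""
    return {key: value for key, value in fields.items() if value is not None}


def random_fields(rng: random.Random) -> Dict[str, Optional[Any]]:
    """Generate a field map in which each optional value is randomly None."""
    fields: Dict[str, Optional[Any]] = {"gen_ai.evaluation.name": "example"}
    for key in FIELD_KEYS:
        if rng.random() < 0.5:
            fields[key] = None
        elif key.endswith("score.value"):
            fields[key] = rng.uniform(0.0, 1.0)
        else:
            fields[key] = "".join(rng.choice(string.ascii_letters) for _ in range(8))
    return fields


def check_none_omitted(trials: int = 200, seed: int = 0) -> int:
    """Run the property over random inputs; return the number of trials passed."""
    rng = random.Random(seed)
    for _ in range(trials):
        fields = random_fields(rng)
        attrs = to_otel_attributes(fields)
        # Property 1: no None value survives the mapping.
        assert all(value is not None for value in attrs.values())
        # Property 2: exactly the non-None keys are kept.
        assert set(attrs) == {k for k, v in fields.items() if v is not None}
        # Property 3: kept values are passed through unchanged.
        assert all(attrs[key] == fields[key] for key in attrs)
    return trials


if __name__ == "__main__":
    print(check_none_omitted())
```

With Hypothesis, the random_fields generator would typically be replaced by a strategy and the loop by a decorated test function.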
I ran hatch run prepare
Checklist