feat(ffe): add flag evaluation metrics E2E tests for Go #6410
Open
leoromanovsky wants to merge 13 commits into main from
Conversation
Add system tests validating the feature_flag.evaluations OTel metric emitted by dd-trace-go's OpenFeature provider.

- Enable DD_METRICS_OTEL_ENABLED and OTLP endpoint in FFE scenario
- 4 test cases: basic metric, count, different flags, error tags
- Update Go manifest for new test file
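As a rough sketch of how the aggregated-count case could be asserted, here is a hypothetical helper over captured metric payloads. The payload shape loosely mimics the agent's /api/v2/series format, and the function name, field names, and tag encoding are all assumptions for illustration, not the actual test code:

```python
# Hypothetical sketch: sum the points of every feature_flag.evaluations
# series carrying a given flag key across captured payloads. Payload shape
# and helper name are assumptions, not the real system-tests code.
def eval_count(payloads: list[dict], flag_key: str) -> float:
    total = 0.0
    for payload in payloads:
        for series in payload.get("series", []):
            if series.get("metric") != "feature_flag.evaluations":
                continue
            if f"feature_flag.key:{flag_key}" not in series.get("tags", []):
                continue
            total += sum(point["value"] for point in series.get("points", []))
    return total

# Simulated capture: one aggregated series covering 5 evaluations.
captured = [
    {"series": [{
        "metric": "feature_flag.evaluations",
        "tags": ["feature_flag.key:my-flag"],
        "points": [{"timestamp": 0, "value": 5.0}],
    }]},
]
assert eval_count(captured, "my-flag") >= 5
```

Because the OTel SDK aggregates evaluations into a single series, the assertion is on the summed point values rather than on a count of individual series.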
Contributor
Attribute dropped from dd-trace-go — always "Datadog", adds no value.
…test

The Go weblog was calling ofClient.Object() for all evaluations, ignoring the variationType field. This meant type conversion errors could never occur, unlike Python/Node.js, which dispatch to the type-specific methods (BooleanValue, StringValue, etc.). Fix the Go weblog to dispatch based on variationType, matching the behavior of other language weblogs.

Add Test_FFE_Eval_Metric_Type_Mismatch: configures a STRING flag but evaluates it as BOOLEAN, triggering a type conversion error that happens after the core evaluate() returns. This test would fail with the old evaluate()-level metric recording (which would see targeting_match / no error) and only passes when metrics are recorded via a Finally hook (which sees error / type_mismatch).
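To make the Finally-hook argument concrete, here is a hypothetical Python model of the dispatch described above. The real weblog code is Go, and every name in this sketch is illustrative only:

```python
# Hypothetical Python model of the variationType dispatch; the real weblog
# is Go, and all names here are illustrative only.
recorded: list[dict] = []

def evaluate_flag(configured_value, variation_type: str, default):
    # Core evaluation succeeds: targeting matched.
    value, reason, error = configured_value, "targeting_match", None
    try:
        # Type conversion happens afterwards, per the requested variationType.
        expected = {"BOOLEAN": bool, "STRING": str}[variation_type]
        if not isinstance(value, expected):
            raise TypeError("type_mismatch")
    except TypeError as exc:
        value, reason, error = default, "error", str(exc)
    finally:
        # Recording the metric here (a Finally hook) sees the final
        # reason/error, not the pre-conversion targeting_match.
        recorded.append({"reason": reason, "error": error})
    return value

# STRING-configured flag evaluated as BOOLEAN -> type_mismatch error.
assert evaluate_flag("on", "BOOLEAN", default=False) is False
assert recorded[-1] == {"reason": "error", "error": "type_mismatch"}
```

The key point the model captures: a metric recorded at the point where evaluate() returns would see targeting_match, while the Finally hook sees the post-conversion error.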
Add type annotations to module-level helper functions and move boolean default to keyword-only argument to satisfy ruff ANN001 and FBT002 rules.
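For instance, a module-level helper flagged by those rules would change roughly like this (the function itself is hypothetical, shown only to illustrate the ANN001/FBT002 fix):

```python
# Illustrative helper (name and body are assumptions): fully annotated per
# ruff ANN001, with the boolean default made keyword-only per FBT002.
def evaluate_flag(flag_key: str, *, default: bool = False) -> bool:
    """Return the flag's value, falling back to the keyword-only default."""
    return default

assert evaluate_flag("my-flag") is False
assert evaluate_flag("my-flag", default=True) is True
```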
Only Go supports flag evaluation metrics via OTel so far. Without this, the test file runs for all FFE-enabled languages and fails.
brettlangdon
approved these changes
Mar 3, 2026
Member
brettlangdon
left a comment
manifests/python.yml lgtm
nccatoni
reviewed
Mar 4, 2026
Replace hardcoded time.sleep(25) in each test setup with agent_interface_timeout=30 on the FFE scenario. The container shutdown flushes metrics; the timeout gives the agent time to receive and process them.
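The idea behind replacing a fixed sleep with a timeout can be illustrated with a generic poll-until-deadline helper. This is an analogy, not the system-tests implementation (agent_interface_timeout is handled by the framework itself); all names below are illustrative:

```python
import time

# Analogy for agent_interface_timeout: wait up to a deadline for a condition
# instead of always blocking for the worst case, as a fixed sleep does.
def wait_for(condition, timeout: float = 30.0, interval: float = 0.05) -> bool:
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()

# Simulate the agent having received the flushed metrics.
assert wait_for(lambda: True, timeout=1.0) is True
```

A fixed time.sleep(25) always pays the worst-case cost and still races if the flush is slower; a deadline returns as soon as the condition holds and fails cleanly when it never does.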
nccatoni
approved these changes
Mar 9, 2026
cbeauchesne
reviewed
Mar 9, 2026
Collaborator
cbeauchesne
left a comment
Framework usage: all good! But could you get a review from someone familiar with the tested feature?
Contributor
Author
Thanks, yes. I have asked FFE engineers to review as well before merging.
Assert that feature_flag.result.allocation_key tag is present with value "default-allocation" on successful flag evaluations.
Motivation
Per the RFC "Flag evaluations tracking for APM tracers" (Oleksii Shmalko, 2026-01-20): we want to collect a metric for flag evaluations to track usage of flags. The companion dd-trace-go PR implements the feature_flag.evaluations OTel metric in the OpenFeature provider. These system tests validate the end-to-end pipeline: flag evaluation in the Go weblog → OTel SDK aggregation → OTLP export to agent → agent forwards to backend (proxy) → system tests capture and assert.

Changes
- utils/_context/_scenarios/__init__.py: Added DD_METRICS_OTEL_ENABLED=true and OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://agent:4318/v1/metrics to the FFE scenario weblog env. The OTLP endpoint points directly to the agent container's OTLP receiver (port 4318), since the proxy does not have an OTLP listener in this scenario.
- tests/ffe/test_flag_eval_metrics.py: 4 E2E test classes:
  - Test_FFE_Eval_Metric_Basic: Verifies metric exists with correct tags (feature_flag.key, feature_flag.provider.name, feature_flag.result.variant, feature_flag.result.reason)
  - Test_FFE_Eval_Metric_Count: Evaluates same flag 5 times, verifies aggregated metric count >= 5
  - Test_FFE_Eval_Metric_Different_Flags: Evaluates two flags, verifies separate metric series per feature_flag.key
  - Test_FFE_Eval_Metric_Error: Evaluates non-existent flag, verifies feature_flag.result.reason=error and error.type=flag_not_found
- manifests/golang.yml: Added tests/ffe/test_flag_eval_metrics.py: v2.7.0-dev to enable the new tests for Go.

Decisions
DD_TRACE_AGENT_URL points to the proxy, but the proxy doesn't have an OTLP receiver. We set OTEL_EXPORTER_OTLP_METRICS_ENDPOINT directly to agent:4318/v1/metrics to bypass the proxy for metric export. The agent then forwards processed metrics to the proxy via /api/v2/series, where system tests capture them via interfaces.agent.get_metrics() filtering /api/v2/series. No proxy changes needed.

Local test evidence
System tests (all 17 FFE tests pass — 0 regressions)
Companion PR