feat: cap default sampling rate increases #4488
gh-worker-dd-mergequeue-cf854d[bot] merged 5 commits into main
Benchmarks
Benchmark execution time: 2026-03-10 17:53:57
Comparing candidate commit 411ee23 in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 155 metrics, 9 unstable metrics.
✅ Tests 🎉 All green! ❄️ No new flaky tests detected. 🔗 Commit SHA: 411ee23
genesor left a comment:
Great work! I have two questions, once they are answered I can approve the changes 😉
@codex review
Codex Review: Didn't find any major issues. 🎉
LGTM
7a80388 into main
When the trace-agent is restarted, it initially provides a sampling rate of 100%, which dramatically increases the number of traces sampled. A rate can jump suddenly from 0.1% to 100% and then drop back to 0.1% once the trace-agent computes the new sampling rate.
In particular, when the agent restarts, the payload buffering that waits for new container tags has been observed to breach its memory limit, so spans are sent without container tags.
This PR caps sampling rate increases at x2 every 1s, so a x10 increase completes in 3-4s:
1% -> 100% takes 7s
0.1% -> 100% takes 10s
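The ramp-up arithmetic can be checked with a small sketch. This is not the code from this PR: `nextRate`, `secondsToReach`, and `maxIncreaseFactor` are hypothetical names, and the sketch assumes that decreases apply immediately, that a non-positive current rate is treated as "no history", and that one capped step is taken per second.

```go
package main

// maxIncreaseFactor is the largest multiplicative jump allowed per step,
// mirroring the "x2 every 1s" cap (hypothetical constant name).
const maxIncreaseFactor = 2.0

// nextRate returns the rate to use for the next one-second step, given the
// rate currently in use and the target rate reported by the agent.
// Assumptions of this sketch: decreases apply immediately, and a
// non-positive current rate is treated as "no history" and jumps straight
// to the target, since doubling zero would never converge.
func nextRate(current, target float64) float64 {
	if current <= 0 || target <= current {
		return target
	}
	if capped := current * maxIncreaseFactor; capped < target {
		return capped
	}
	return target
}

// secondsToReach counts how many one-second steps are needed to ramp from
// one rate to another under the cap.
func secondsToReach(from, to float64) int {
	steps := 0
	for rate := from; rate < to; steps++ {
		rate = nextRate(rate, to)
	}
	return steps
}
```

Under these assumptions, `secondsToReach(0.01, 1.0)` gives 7 and `secondsToReach(0.001, 1.0)` gives 10, while a x10 increase such as `secondsToReach(0.1, 1.0)` takes 4 steps.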
Matching system-test: DataDog/system-tests#6412
Below is a before/after screenshot of the dd-trace-go implementation, with go_span_new using the PR tracer and go_spam_old using the latest release of dd-trace-go, and both applications generating 500 traces/s. Notice how the new code does not burst in throughput.
