Description
The Datadog Lambda extension causes a memory leak when DD_UNIVERSAL_INSTRUMENTATION=true is set. Max memory used increases with each warm invocation, and the growth rate is directly correlated with the Lambda response payload size. The Lambda function itself shows no memory leak when profiled — the issue is isolated to the extension process.
Environment
- Datadog Extension layer: arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Extension:92
- Go tracer: github.com/DataDog/dd-trace-go/contrib/aws/datadog-lambda-go/v2 v2.6.0
- Lambda handler wrapper: ddlambda.WrapHandler()
- DD_CAPTURE_LAMBDA_PAYLOAD=true
- DD_ENV=PROD
- DD_FLUSH_TO_LOG=true
- DD_LOGS_CONFIG_LOGS_NO_SSL=true
- DD_LOGS_CONFIG_USE_COMPRESSION=true
- DD_LOGS_CONFIG_USE_HTTP=true
- DD_LOGS_ENABLED=true
- DD_MERGE_XRAY_TRACES=true
- DD_PROXY_HTTP=***
- DD_PROXY_HTTPS=***
- DD_SERVICE=shop orders
- DD_SITE=datadoghq.eu
- DD_TRACE_ENABLED=true
- DD_TRACE_STARTUP_LOGS=false
- DD_UNIVERSAL_INSTRUMENTATION=true (set to false after applying workaround)
- DD_VERSION=v0.72.1
To Reproduce
- Deploy a Go Lambda with ddlambda.WrapHandler() and DD_UNIVERSAL_INSTRUMENTATION=true
- Configure the Lambda to return a large response payload (e.g. a paginated list, ~500KB+)
- Invoke the Lambda repeatedly in a warm execution environment
- Observe max memory used increasing with each invocation in CloudWatch metrics
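For reference, the large-payload handler from the repro steps can be sketched as below. Only the payload construction is shown; the ddlambda.WrapHandler() wiring and the Lambda runtime entry point are omitted, and the Order shape is a hypothetical stand-in for any paginated-list response:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Order is a hypothetical record used to pad the response payload.
type Order struct {
	ID     int    `json:"id"`
	Status string `json:"status"`
	Note   string `json:"note"`
}

// buildLargePayload builds a paginated-list style JSON body large enough
// (~500KB+) for the per-invocation memory growth to show up in the
// CloudWatch max memory used metric.
func buildLargePayload(n int) ([]byte, error) {
	filler := strings.Repeat("x", 100)
	orders := make([]Order, n)
	for i := range orders {
		orders[i] = Order{ID: i, Status: "shipped", Note: filler}
	}
	return json.Marshal(orders)
}

func main() {
	body, err := buildLargePayload(5000)
	if err != nil {
		panic(err)
	}
	fmt.Printf("payload size: %d bytes\n", len(body))
}
```

Returning this body from a wrapped handler and invoking it in a loop against a warm environment reproduces the growth; with small payloads the growth is much slower, which is what points at response buffering.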
Expected behavior
Memory used should remain stable across warm invocations.
Actual behavior
Max memory used grows monotonically across invocations. Growth rate scales with response payload size, suggesting the Runtime API proxy buffers response payloads without releasing them between invocations.
Workaround
Disabling DD_UNIVERSAL_INSTRUMENTATION stops the memory growth. We replicated the payload capture manually in the handler using tracer.SpanFromContext(ctx) and span.SetTag("function.request", ...) / span.SetTag("function.response", ...), which confirms that the leak lies in the extension's proxy buffering, not in the Go function itself.
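For reference, the manual capture looked roughly like the sketch below. The spanTagger interface and mapSpan type are abstractions introduced here so the snippet is self-contained; in the real handler the concrete span comes from tracer.SpanFromContext(ctx) and provides SetTag directly:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// spanTagger abstracts the only span method the helper needs
// (span.SetTag); in the handler, the concrete value is obtained via
// tracer.SpanFromContext(ctx).
type spanTagger interface {
	SetTag(key string, value interface{})
}

// captureLambdaPayload tags the active span with the serialized request
// and response, replicating what DD_CAPTURE_LAMBDA_PAYLOAD does via the
// extension, but entirely inside the function process.
func captureLambdaPayload(span spanTagger, req, resp interface{}) {
	if b, err := json.Marshal(req); err == nil {
		span.SetTag("function.request", string(b))
	}
	if b, err := json.Marshal(resp); err == nil {
		span.SetTag("function.response", string(b))
	}
}

// mapSpan is a stand-in spanTagger used only for demonstration.
type mapSpan map[string]interface{}

func (m mapSpan) SetTag(key string, value interface{}) { m[key] = value }

func main() {
	span := mapSpan{}
	captureLambdaPayload(span, map[string]int{"page": 1}, []int{1, 2, 3})
	fmt.Println(span["function.request"], span["function.response"])
	// → {"page":1} [1,2,3]
}
```

Because this keeps the payloads on the span tags without routing them through the Runtime API proxy, memory stays flat across warm invocations.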
Questions for the team
Is DD_SERVERLESS_FLUSH_STRATEGY intended to have any effect on the Runtime API proxy buffer lifecycle, or is it strictly limited to the telemetry pipeline?
Is there any existing mechanism (env var, extension API, or Lambda lifecycle hook) to force the proxy buffer to be released between warm invocations without disabling DD_UNIVERSAL_INSTRUMENTATION entirely?