Max memory used grows monotonically across invocations #1049

@adatob

Description

The Datadog Lambda extension appears to leak memory when DD_UNIVERSAL_INSTRUMENTATION=true is set. Max memory used increases with each warm invocation, and the growth rate correlates directly with the Lambda response payload size. The Lambda function itself shows no memory leak when profiled; the issue is isolated to the extension process.

Environment

  • Datadog Extension layer: arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Extension:92
  • Go tracer: github.com/DataDog/dd-trace-go/contrib/aws/datadog-lambda-go/v2 v2.6.0
  • Lambda handler wrapper: ddlambda.WrapHandler()
  • DD_CAPTURE_LAMBDA_PAYLOAD=true
  • DD_ENV=PROD
  • DD_FLUSH_TO_LOG=true
  • DD_LOGS_CONFIG_LOGS_NO_SSL=true
  • DD_LOGS_CONFIG_USE_COMPRESSION=true
  • DD_LOGS_CONFIG_USE_HTTP=true
  • DD_LOGS_ENABLED=true
  • DD_MERGE_XRAY_TRACES=true
  • DD_PROXY_HTTP=***
  • DD_PROXY_HTTPS=***
  • DD_SERVICE=shop orders
  • DD_SITE=datadoghq.eu
  • DD_TRACE_ENABLED=true
  • DD_TRACE_STARTUP_LOGS=false
  • DD_UNIVERSAL_INSTRUMENTATION=true (set to false after applying workaround)
  • DD_VERSION=v0.72.1

To Reproduce

  1. Deploy a Go Lambda with ddlambda.WrapHandler() and DD_UNIVERSAL_INSTRUMENTATION=true
  2. Configure the Lambda to return a large response payload (e.g. paginated list, ~500KB+)
  3. Invoke the Lambda repeatedly in a warm execution environment
  4. Observe max memory used increasing with each invocation in CloudWatch metrics
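The repro loop above can be scripted against the AWS CLI. This is a sketch only: the function name and the request payload are placeholders, and the `Max Memory Used` string comes from the standard Lambda REPORT log line.

```shell
#!/usr/bin/env sh
# Invoke the function repeatedly so it stays in the same warm execution
# environment, then inspect "Max Memory Used" in the REPORT log lines.
FUNCTION=my-go-lambda   # placeholder function name

for i in $(seq 1 50); do
  aws lambda invoke \
    --function-name "$FUNCTION" \
    --cli-binary-format raw-in-base64-out \
    --payload '{"page_size": 1000}' \
    /dev/null > /dev/null
done

# Each warm invocation emits a REPORT line; the memory figure should climb
# monotonically if the extension is buffering response payloads.
aws logs filter-log-events \
  --log-group-name "/aws/lambda/$FUNCTION" \
  --filter-pattern '"REPORT"' \
  --query 'events[].message' --output text |
  grep -o 'Max Memory Used: [0-9]* MB'
```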

Expected behavior

Max memory used should remain stable across warm invocations.

Actual behavior

Max memory used grows monotonically across invocations. Growth rate scales with response payload size, suggesting the Runtime API proxy buffers response payloads without releasing them between invocations.

Workaround

Disabling DD_UNIVERSAL_INSTRUMENTATION stops the memory growth. We replicated the payload capture manually in the handler using tracer.SpanFromContext(ctx) and span.SetTag("function.request", ...) / span.SetTag("function.response", ...), which confirms the leak is in the extension's proxy buffering, not in the Go function.
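The manual capture described above looks roughly like the following. This is a sketch, not the exact code from our handler: the handler signature and payload shapes are hypothetical, and the wrapper call in main uses the WrapFunction form from the datadog-lambda-go README (the environment above wires it via ddlambda.WrapHandler()).

```go
package main

import (
	"context"
	"encoding/json"

	ddlambda "github.com/DataDog/dd-trace-go/contrib/aws/datadog-lambda-go/v2"
	"github.com/DataDog/dd-trace-go/v2/ddtrace/tracer"
	"github.com/aws/aws-lambda-go/lambda"
)

// handler is a placeholder; real request/response types will differ.
func handler(ctx context.Context, req map[string]any) (map[string]any, error) {
	// Build the (potentially large) response payload as usual.
	resp := map[string]any{"items": []string{}}

	// Manually attach request/response payloads to the active span.
	// This replaces DD_CAPTURE_LAMBDA_PAYLOAD once
	// DD_UNIVERSAL_INSTRUMENTATION is disabled.
	if span, ok := tracer.SpanFromContext(ctx); ok {
		if b, err := json.Marshal(req); err == nil {
			span.SetTag("function.request", string(b))
		}
		if b, err := json.Marshal(resp); err == nil {
			span.SetTag("function.response", string(b))
		}
	}
	return resp, nil
}

func main() {
	lambda.Start(ddlambda.WrapFunction(handler, nil))
}
```

Because the payloads are serialized inside the function and handed straight to the tracer, nothing accumulates in the extension's Runtime API proxy between invocations.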

Questions for the team

  1. Is DD_SERVERLESS_FLUSH_STRATEGY intended to have any effect on the Runtime API proxy buffer lifecycle, or is it strictly limited to the telemetry pipeline?
  2. Is there any existing mechanism (env var, extension API, or Lambda lifecycle hook) to force the proxy buffer to be released between warm invocations without disabling DD_UNIVERSAL_INSTRUMENTATION entirely?
