High-level architecture of the Datadog Java APM agent. Start here to orient yourself in the codebase.
dd-trace-java is a Java agent that auto-instruments JVM applications at runtime via bytecode manipulation.
It is loaded at JVM startup via the -javaagent flag, intercepts class loading, and rewrites method bytecode
to inject tracing, security, profiling, and observability logic. No application code changes are required.
Ships ~120 integrations (~200 instrumentations) covering major frameworks (Spring, Servlet, gRPC, JDBC, Kafka, etc.) and supports multiple Datadog products through a single jar: Tracing, Profiling, Application Security (AppSec), IAST, CI Visibility, Dynamic Instrumentation, LLM Observability, Crash Tracking, Data Streams, Feature Flagging, and USM.
Communicates with a local Datadog Agent process (or directly with the Datadog intake APIs) to send collected telemetry.
- AgentBootstrap.premain(): JVM entry point. Runs on the application classloader with minimal logic: locates the agent jar, creates an isolated classloader, jumps to Agent.start(). Must remain tiny and side-effect-free.
- Agent.start(): Runs on the bootstrap classloader. Creates the agent classloader, reads configuration, determines which products are enabled, starts each subsystem on dedicated threads.
- AgentInstaller: Installs the ByteBuddy ClassFileTransformer that intercepts all class loading. Discovers all InstrumenterModule implementations via service loading, registers their type matchers and advice classes.
- Product subsystems start: Each enabled product is started via its own *System.start() method, receiving shared communication objects.
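The first step of the startup sequence can be pictured with a minimal sketch, assuming a much simpler agent than dd-trace-java. The class TinyBootstrap, the helper isolatedLoader, and the target class name example.AgentMain are hypothetical; only the premain signature and the java.lang.instrument and URLClassLoader APIs are standard JDK.

```java
import java.lang.instrument.Instrumentation;
import java.net.URL;
import java.net.URLClassLoader;

final class TinyBootstrap {
  /** Invoked by the JVM before main() when the jar is passed via -javaagent. */
  public static void premain(String agentArgs, Instrumentation inst) throws Exception {
    // Locate the jar this class was loaded from.
    URL agentJar = TinyBootstrap.class.getProtectionDomain().getCodeSource().getLocation();
    // Load the rest of the agent in a loader that cannot see application classes.
    ClassLoader isolated = isolatedLoader(agentJar);
    // Hypothetical entry point: example.AgentMain.start(Instrumentation).
    Class<?> agentMain = isolated.loadClass("example.AgentMain");
    agentMain.getMethod("start", Instrumentation.class).invoke(null, inst);
  }

  /** A null parent means only bootstrap classes are visible besides the given URLs. */
  static URLClassLoader isolatedLoader(URL... urls) {
    return new URLClassLoader(urls, null);
  }
}
```

Keeping the entry point this small limits what gets loaded on the application classloader before isolation is in place, which is the motivation the list above gives for keeping premain tiny.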
Main agent module. Produces the final shadow jar (dd-java-agent.jar) using a composite shadow jar
strategy. Each product module builds its own shadow jar, embedded as a nested directory inside the
main jar (inst/, profiling/, appsec/, iast/, debugger/, ci-visibility/, llm-obs/,
shared/, trace/, etc.). A dedicated sharedShadowJar bundles common transitive dependencies
(OkHttp, JCTools, LZ4, etc.) to avoid duplication across feature jars. All dependencies are relocated
under datadog. prefixes to prevent classpath conflicts. Class files inside feature jars are renamed
to .classdata to prevent unintended loading. See docs/how_to_work_with_gradle.md.
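As a rough illustration of the .classdata renaming, the sketch below shows a classloader that maps a class name to a renamed resource under a feature directory and defines the class from those bytes. ClassDataLoader and resourceName are hypothetical names, and reading from the system classpath is an assumption made to keep the example self-contained; the agent's own loader is not shown in this document.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ClassDataLoader extends ClassLoader {
  private final String prefix; // e.g. "inst" or "profiling"

  ClassDataLoader(String prefix, ClassLoader parent) {
    super(parent);
    this.prefix = prefix;
  }

  /** com.example.Foo under prefix "inst" becomes inst/com/example/Foo.classdata. */
  static String resourceName(String prefix, String className) {
    return prefix + "/" + className.replace('.', '/') + ".classdata";
  }

  @Override
  protected Class<?> findClass(String name) throws ClassNotFoundException {
    String resource = resourceName(prefix, name);
    // Assumption: the renamed class bytes are reachable as a system resource.
    try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
      if (in == null) {
        throw new ClassNotFoundException(name);
      }
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
      byte[] bytes = out.toByteArray();
      return defineClass(name, bytes, 0, bytes.length);
    } catch (IOException e) {
      throw new ClassNotFoundException(name, e);
    }
  }
}
```

Because ordinary classloaders only look for .class entries, files with the .classdata suffix stay invisible to them and can only be loaded by a loader that knows the naming rule.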
- src/: AgentBootstrap and AgentJar, the entry point loaded by -javaagent. Deliberately minimal.
- agent-bootstrap/: Classes on the bootstrap classloader: Agent (startup orchestrator), decorator base classes (HttpServerDecorator, DatabaseClientDecorator, etc.), and bootstrap-safe utilities. Visible to all classloaders, so instrumentation advice and helpers can use them directly. See docs/bootstrap_design_guidelines.md.
- agent-builder/: ByteBuddy integration layer. Class transformer pipeline: DDClassFileTransformer intercepts every class load, GlobalIgnoresMatcher applies early filtering, CombiningMatcher evaluates instrumentation matchers, SplittingTransformer applies matched transformations. The ignored_class_name.trie is a compiled trie built at build time that short-circuits matcher evaluation for known non-transformable classes (JVM internals, agent infrastructure, monitoring libraries, large framework packages). When a class is unexpectedly not instrumented, check the trie first.
- agent-tooling/: Instrumentation framework. Key types:
  - InstrumenterModule: Base class for all instrumentation modules. Declares a target system (Tracing, AppSec, IAST, Profiling, CiVisibility, USM, etc.) and one or more instrumentations.
  - Instrumenter: Type matching interface, with variants ForSingleType, ForKnownTypes, ForTypeHierarchy, ForBootstrap.
  - muzzle/: Build-time and runtime safety checks. Verifies that expected types and methods exist in the library version at runtime. If not, the instrumentation is silently skipped. See docs/how_instrumentations_work.md and docs/add_new_instrumentation.md.
- instrumentation/: All auto-instrumentations, organized as {framework}/{framework}-{minVersion}/. Nearly 200 framework directories. Each follows the same pattern: an InstrumenterModule declares the target system and integration name, one or more Instrumenter implementations select target types via matchers, advice classes inject bytecode via @Advice.OnMethodEnter / @Advice.OnMethodExit, and decorator/helper classes contain the actual product logic. Instrumentations are discovered via @AutoService(InstrumenterModule.class) (Java SPI) and validated by Muzzle at build time. See docs/how_instrumentations_work.md and docs/add_new_instrumentation.md for details.
- appsec/: Application Security. Entry point: AppSecSystem.start(). Runs the Datadog WAF to detect and block attacks in real time. Hooks into the gateway to intercept HTTP requests.
- agent-iast/: Interactive Application Security Testing. Entry point: IastSystem.start(). Performs taint tracking: marks user input as tainted, propagates taint through string operations, and reports when tainted data reaches dangerous sinks (SQL injection, XSS, command injection, etc.).
- agent-ci-visibility/: CI Visibility. Entry point: CiVisibilitySystem.start(). Instruments test frameworks (JUnit, TestNG, Gradle, Maven, Cucumber) to collect test results, code coverage, and performance metrics.
- agent-profiling/: Continuous Profiling. Entry point: ProfilingAgent. Collects CPU, memory, and wall-clock profiles using JFR or the Datadog native profiler (ddprof). Uploads profiles to the Datadog backend.
- agent-debugger/: Dynamic Instrumentation. Entry point: DebuggerAgent. Probes, snapshot capture, exception replay, code origin mapping. Driven by remote configuration.
- agent-llmobs/: LLM Observability. Entry point: LLMObsSystem.start(). Monitors LLM API calls (OpenAI, LangChain, etc.): token usage, model inference, evaluations.
- agent-crashtracking/: Crash Tracking. Detects JVM crashes and fatal exceptions, collects system metadata, and uploads crash reports to Datadog's error tracking intake.
- agent-otel/: OpenTelemetry compatibility shim. OtelTracerProvider, OtelSpan, OtelContext and other wrappers implement the OTel API by delegating to the Datadog tracer. Paired with instrumentations in instrumentation/opentelemetry/ that intercept OTel API calls and redirect them to shim instances.
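The early-filtering idea behind ignored_class_name.trie (see agent-builder/ in the list above) can be pictured with a small prefix trie. This is a minimal sketch, assuming a plain set of ignored package prefixes; PrefixTrie and its methods are hypothetical, and the real trie's file format and matching semantics are not described in this document.

```java
import java.util.HashMap;
import java.util.Map;

final class PrefixTrie {
  private static final class Node {
    final Map<Character, Node> next = new HashMap<>();
    boolean terminal; // true if a registered prefix ends at this node
  }

  private final Node root = new Node();

  /** Registers a prefix such as "java." or "sun.". */
  void addPrefix(String prefix) {
    Node node = root;
    for (int i = 0; i < prefix.length(); i++) {
      node = node.next.computeIfAbsent(prefix.charAt(i), c -> new Node());
    }
    node.terminal = true;
  }

  /** Returns true if className starts with any registered prefix. */
  boolean matches(String className) {
    Node node = root;
    for (int i = 0; i < className.length(); i++) {
      if (node.terminal) {
        return true; // a whole prefix has been consumed
      }
      node = node.next.get(className.charAt(i));
      if (node == null) {
        return false; // no registered prefix continues with this character
      }
    }
    return node.terminal;
  }
}
```

A lookup walks at most one node per character of the class name and stops as soon as a prefix matches or diverges, so a check like this can run before any more expensive type matcher is evaluated.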
Core tracing engine. Grew organically and now also hosts product-specific features that depend on
tight integration with span creation, interception, or serialization. New code should go in
products/ or components/ instead. Core tracing types:
- CoreTracer: Tracer implementation. Creates spans, manages sampling, drives the writer pipeline. Implements AgentTracer.TracerAPI.
- DDSpan / DDSpanContext: Concrete span and context implementations with Datadog-specific metadata.
- PendingTrace: Collects all spans in a trace. Flushes to the writer when the root span finishes.
- scopemanager/: ContinuableScopeManager, ContinuableScope, ScopeContinuation. Active span per thread, async context propagation via continuations.
- propagation/: Trace context propagation codecs: Datadog, W3C TraceContext, B3, Haystack, X-Ray.
- common/writer/: Writer pipeline. DDAgentWriter buffers traces and dispatches via PayloadDispatcherImpl to the Datadog Agent's /v0.4/traces endpoint. DDIntakeWriter for direct API submission. TraceProcessingWorker for async processing.
- common/sampling/: Sampling logic: RuleBasedTraceSampler, RateByServiceTraceSampler, SingleSpanSampler. Supports both head-based and rule-based sampling.
- tagprocessor/: Post-processing of span tags: peer service calculation, base service naming, query obfuscation, endpoint resolution.
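To make the idea of a propagation codec concrete, here is a minimal sketch of parsing and formatting a W3C Trace Context traceparent header. It assumes only version 00 of the header and ignores tracestate; the TraceParent class is hypothetical and is not the codec used in dd-trace-core.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TraceParent {
  // version "00", 32 hex trace id, 16 hex parent id, 2 hex flags
  private static final Pattern FORMAT =
      Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

  final String traceId;
  final String parentId;
  final boolean sampled;

  private TraceParent(String traceId, String parentId, boolean sampled) {
    this.traceId = traceId;
    this.parentId = parentId;
    this.sampled = sampled;
  }

  /** Returns null when the header is missing or malformed. */
  static TraceParent parse(String header) {
    if (header == null) {
      return null;
    }
    Matcher m = FORMAT.matcher(header.trim());
    if (!m.matches()) {
      return null;
    }
    String traceId = m.group(1);
    String parentId = m.group(2);
    // All-zero identifiers are invalid per the specification.
    if (traceId.matches("0+") || parentId.matches("0+")) {
      return null;
    }
    int flags = Integer.parseInt(m.group(3), 16);
    return new TraceParent(traceId, parentId, (flags & 0x01) != 0);
  }

  /** Serializes back to the header form, keeping only the sampled flag. */
  String format() {
    return "00-" + traceId + "-" + parentId + "-" + (sampled ? "01" : "00");
  }
}
```

Each codec in the list above plays the same two roles for its own wire format: extract a context from incoming carrier values, and inject one into outgoing values.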
Non-tracing code that also lives here due to organic growth:
- datastreams/: Data Streams Monitoring. Tracks message pipeline latency across Kafka, RabbitMQ, SQS, etc.
- civisibility/: CI Visibility trace interceptors and protocol adapters. Hooks into the trace completion pipeline to filter and reformat test spans for the CI Test Cycle intake.
- lambda/: AWS Lambda support. Coordinates span creation with the serverless extension, handling invocation start/end and trace context propagation.
- llmobs/: LLM Observability span mapper. Serializes LLM-specific spans (messages, tool calls) to the dedicated LLM Obs intake format.
Public API. Types application developers may use directly: Tracer, GlobalTracer, DDTags,
DDSpanTypes, Trace (annotation), ConfigDefaults. Also houses all configuration key constants
by domain: TracerConfig, GeneralConfig, AppSecConfig, ProfilingConfig, CiVisibilityConfig,
IastConfig, DebuggerConfig, etc.
Internal shared API across all agent modules (not public). Like dd-trace-core, grew organically
and now hosts interfaces for many products beyond tracing. New product APIs should go in
products/ or components/.
Core tracing abstractions:
- AgentTracer: Static tracer facade. Instrumentations call AgentTracer.startSpan(), AgentTracer.activateSpan(), etc.
- AgentSpan / AgentScope / AgentSpanContext: Internal span/scope/context interfaces.
- AgentPropagation: Context propagation interfaces (Getter, Setter) that instrumentations implement to inject/extract trace context from framework-specific carriers (HTTP headers, message properties, etc.).
- Config / InstrumenterConfig: Master configuration class and instrumenter-specific config, centralizing settings for all products. InstrumenterConfig is separated from Config due to GraalVM native-image constraints: in native-image builds, all bytecode instrumentation must be applied at build time (ahead-of-time compilation), so configuration that controls instrumentation decisions (which classes to instrument, which integrations to enable, resolver behavior, field injection flags) must be frozen into the native image binary. Runtime-only settings (agent endpoints, service names, sampling rates) remain in Config. See docs/add_new_configurations.md.
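The getter/setter pattern described for AgentPropagation can be sketched as follows. The interfaces CarrierSetter and CarrierGetter and the class PropagationSketch are hypothetical stand-ins, not the signatures in internal-api; the header names x-datadog-trace-id and x-datadog-parent-id are the ones used by Datadog-style propagation.

```java
final class PropagationSketch {
  /** Hypothetical carrier-specific writer, implemented once per framework. */
  interface CarrierSetter<C> {
    void set(C carrier, String key, String value);
  }

  /** Hypothetical carrier-specific reader, implemented once per framework. */
  interface CarrierGetter<C> {
    String get(C carrier, String key);
  }

  /** Writes the trace context into any carrier the setter understands. */
  static <C> void inject(long traceId, long spanId, C carrier, CarrierSetter<C> setter) {
    setter.set(carrier, "x-datadog-trace-id", Long.toUnsignedString(traceId));
    setter.set(carrier, "x-datadog-parent-id", Long.toUnsignedString(spanId));
  }

  /** Returns {traceId, parentId}, or null when either header is absent. */
  static <C> long[] extract(C carrier, CarrierGetter<C> getter) {
    String traceId = getter.get(carrier, "x-datadog-trace-id");
    String parentId = getter.get(carrier, "x-datadog-parent-id");
    if (traceId == null || parentId == null) {
      return null;
    }
    return new long[] {Long.parseUnsignedLong(traceId), Long.parseUnsignedLong(parentId)};
  }
}
```

With a plain map as the carrier, the setter is just (map, key, value) -> map.put(key, value). An HTTP client instrumentation would supply a setter that writes request headers instead, without the propagation logic needing to change.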
Cross-product abstractions:
- gateway/: Instrumentation Gateway, an event bus (InstrumentationGateway, SubscriptionService, Events, CallbackProvider, RequestContext) decoupling instrumentations from product modules. Primarily used by AppSec and IAST to hook into the HTTP request lifecycle without modifying instrumentations.
- cache/: Shared caching primitives (DDCache, FixedSizeCache, RadixTreeCache) used throughout the agent.
- naming/: Service and span operation naming schemas (v0, v1) for databases, messaging, cloud services, etc.
- telemetry/: Multi-product telemetry collection interfaces (MetricCollector, WafMetricCollector, LLMObsMetricCollector, etc.).
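The decoupling that the gateway provides can be illustrated with a very small typed event bus. This is a sketch under simplifying assumptions: MiniGateway and EventType are hypothetical, callbacks here return nothing, and the real gateway types listed above have richer signatures that this document does not spell out.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class MiniGateway {
  /** Typed event key; instances are compared by identity. */
  static final class EventType<T> {
    final String name;

    EventType(String name) {
      this.name = name;
    }
  }

  private final Map<EventType<?>, List<Consumer<?>>> subscribers = new ConcurrentHashMap<>();

  /** Product side: register interest in an event. */
  <T> void subscribe(EventType<T> type, Consumer<T> callback) {
    subscribers.computeIfAbsent(type, t -> new CopyOnWriteArrayList<>()).add(callback);
  }

  /** Instrumentation side: publish without knowing who listens. Returns the callback count. */
  @SuppressWarnings("unchecked")
  <T> int publish(EventType<T> type, T payload) {
    List<Consumer<?>> callbacks = subscribers.get(type);
    if (callbacks == null) {
      return 0;
    }
    for (Consumer<?> callback : callbacks) {
      // Safe: subscribe() only stores Consumer<T> under an EventType<T> key.
      ((Consumer<T>) callback).accept(payload);
    }
    return callbacks.size();
  }
}
```

An instrumentation would call publish when, say, a request starts; a product module that is enabled subscribes at startup, and a disabled product simply never registers, so publish becomes a cheap no-op.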
Product-specific APIs that also live here:
- iast/: IAST vulnerability detection interfaces: taint tracking (Taintable, IastContext), sink definitions for each vulnerability type (SQL injection, XSS, command injection, etc.), and call site instrumentation hooks. About 60 files.
- civisibility/: CI Visibility interfaces: test identification, code coverage, build/test event handlers, and CI-specific telemetry metrics. About 95 files.
- datastreams/: Data Streams Monitoring interfaces: pathway context, stats points, and schema registry integration.
- appsec/: AppSec interfaces: HTTP client request/response payloads for WAF analysis, RASP call sites.
- profiling/: Profiler integration: recording data, timing, and enablement interfaces.
- llmobs/: LLM Observability context.
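The taint tracking that the iast/ interfaces model can be pictured with a toy example: mark input as tainted, propagate the mark through concatenation, and check at a sink. This is a deliberately simplified sketch; TaintSketch is hypothetical, it tracks whole strings by identity rather than character ranges, and it holds strong references that a real implementation would need to avoid.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class TaintSketch {
  // Identity-based so that equal but unrelated strings are not confused.
  private final Set<Object> tainted = Collections.newSetFromMap(new IdentityHashMap<>());

  /** Source: returns a fresh copy of the user input, marked as tainted. */
  String source(String userInput) {
    String copy = new String(userInput); // new identity, so literals are never marked
    tainted.add(copy);
    return copy;
  }

  /** Propagation: the result is tainted if either operand is. */
  String concat(String a, String b) {
    String result = new String(a + b);
    if (tainted.contains(a) || tainted.contains(b)) {
      tainted.add(result);
    }
    return result;
  }

  /** Sink check: true means tainted data would reach, for example, a SQL call. */
  boolean isTainted(String value) {
    return tainted.contains(value);
  }
}
```

For example, concatenating a constant query prefix with a value obtained from source yields a string for which isTainted returns true, which is the point where a vulnerability would be reported.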
Low-level shared platform components. Not tied to any product, no external dependencies, bootstrap-safe:
- context: Immutable context propagation framework. Provides Context, ContextKey, and Propagator abstractions for storing and propagating key-value pairs across threads and carrier objects.
- environment: JVM and OS detection utilities. JavaVersion for version parsing, JavaVirtualMachine for JVM implementation detection (OpenJDK, Graal, J9), OperatingSystem for OS/architecture detection, and EnvironmentVariables / SystemProperties for safe access and mocking.
- json: Lightweight, dependency-free JSON serialization. JsonWriter for building JSON with a fluent API, JsonReader for streaming parsing.
- native-loader: Platform-aware native library loading with pluggable strategies. NativeLoader handles OS/architecture detection, resource extraction from JARs, and temp file management.
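The immutable-context idea from the context component can be sketched in a few lines: every write returns a new context and leaves the original untouched, which makes it safe to hand a context to another thread. Ctx and Ctx.Key are hypothetical names chosen for this illustration and do not reflect the component's actual API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class Ctx {
  /** Typed key; instances are compared by identity. */
  static final class Key<T> {
    final String name;

    Key(String name) {
      this.name = name;
    }
  }

  private static final Ctx ROOT = new Ctx(Collections.emptyMap());

  private final Map<Key<?>, Object> values;

  private Ctx(Map<Key<?>, Object> values) {
    this.values = values;
  }

  static Ctx root() {
    return ROOT;
  }

  /** Returns a new context with the extra entry; this instance is not modified. */
  <T> Ctx with(Key<T> key, T value) {
    Map<Key<?>, Object> copy = new HashMap<>(values);
    copy.put(key, value);
    return new Ctx(copy);
  }

  /** Returns the value stored under key, or null if absent. */
  @SuppressWarnings("unchecked")
  <T> T get(Key<T> key) {
    return (T) values.get(key);
  }
}
```

Copying the map on every write is the simplest way to get immutability; a production implementation would likely use a cheaper persistent structure.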
Self-contained product modules following a layered submodule pattern:
- {product}-api/: Public API interfaces, zero dependencies.
- {product}-bootstrap/: Data classes safe for the bootstrap classloader.
- {product}-lib/: Core implementation (shadow jar, excludes shared dependencies).
- {product}-agent/: Agent integration entry point (shadow jar).
Current products:
- metrics/: StatsD client and monitoring abstraction. Provides Monitoring interface with counters, timers, and histograms for internal agent metrics collection.
- feature-flagging/: Server-side feature flag evaluation driven by remote configuration. Implements the OpenFeature SDK, handles the Unified Feature Control (UFC) protocol, and tracks flag exposure per user/session.
HTTP transport to the Datadog Agent and intake APIs. SharedCommunicationObjects holds shared
OkHttpClient instances (Unix domain socket and named pipe support), agent URL, feature discovery,
and the configuration poller. All product modules receive this at startup.
Remote configuration client. DefaultConfigurationPoller periodically polls the Datadog Agent
for configuration updates (AppSec rules, debugger probes, sampling rates, feature flags).
Uses TUF (The Update Framework) for signature validation.
Agent telemetry. TelemetrySystem collects and reports which features are enabled,
which integrations loaded, performance metrics, and product-specific counters.
Each product registers periodic actions that collect domain-specific metrics.
Shared utilities, each in its own submodule:
- config-utils: ConfigProvider for reading and merging configuration from environment variables, system properties, properties files, and CI environment.
- container-utils: Parses container runtime information (Docker, Kubernetes, ECS).
- filesystem-utils: Permission-safe file existence checks that handle SecurityException.
- flare-utils: Tracer flare collection (TracerFlareService) that gathers diagnostics (logs, spans, system info) and sends them to Datadog for troubleshooting.
- queue-utils: High-performance lock-free queues (MpscArrayQueue, SpscArrayQueue) for inter-thread communication and span buffering.
- socket-utils: Socket factories (UnixDomainSocketFactory, NamedPipeSocket) for connecting to the local Datadog Agent via Unix sockets or named pipes.
- time-utils: Time source abstractions (TimeSource, ControllableTimeSource) for testable time handling and delay parsing.
- version-utils: Agent version string (VersionInfo.VERSION) read from packaged resources.
- test-utils: Testing utilities: @Flaky annotation, log capture, GC control, forked test configuration.
- test-agent-utils: Message decoders for parsing v04/v05 binary protocol frames in tests.
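The value of a controllable time source, as mentioned for time-utils, is easiest to see in a test. The sketch below is a minimal illustration under assumed names: TimeSketch, NanoClock, ManualClock, and Stopwatch are hypothetical and are not the TimeSource or ControllableTimeSource types themselves.

```java
import java.util.concurrent.TimeUnit;

final class TimeSketch {
  /** Hypothetical time abstraction that code depends on instead of System.nanoTime(). */
  interface NanoClock {
    long nanoTime();
  }

  /** Production binding to the real clock. */
  static final NanoClock SYSTEM = System::nanoTime;

  /** Test binding: time only moves when the test says so. */
  static final class ManualClock implements NanoClock {
    private long now;

    @Override
    public long nanoTime() {
      return now;
    }

    void advance(long amount, TimeUnit unit) {
      now += unit.toNanos(amount);
    }
  }

  /** Example consumer that measures elapsed time through the injected clock. */
  static final class Stopwatch {
    private final NanoClock clock;
    private final long start;

    Stopwatch(NanoClock clock) {
      this.clock = clock;
      this.start = clock.nanoTime();
    }

    long elapsedMillis() {
      return TimeUnit.NANOSECONDS.toMillis(clock.nanoTime() - start);
    }
  }
}
```

A test can create a ManualClock, start a Stopwatch, advance the clock by 250 milliseconds, and assert on the elapsed value without sleeping, which keeps timing-dependent tests fast and deterministic.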
Legacy OpenTracing compatibility library. Publishes a standalone JAR artifact (dd-trace-ot.jar)
that implements the io.opentracing.Tracer interface by wrapping the Datadog CoreTracer.
This is a pure library for manual instrumentation only — there is no auto-instrumentation or
bytecode advice.
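End-to-end smoke tests. Each boots a real application with the agent jar and verifies traces, spans, and product behavior. Covers Spring Boot, Play, Vert.x, Quarkus, WildFly, and more. Core test hierarchy (Groovy/Spock):
- ProcessManager: Base class. Spawns forked JVM processes with the agent via ProcessBuilder, captures stdout to log files, tears down on cleanup. assertNoErrorLogs() scans logs for errors.
- AbstractSmokeTest extends ProcessManager: Adds a mock Datadog Agent (TestHttpServer) receiving traces (v0.4/v0.5), telemetry, remote config, and EVP proxy requests. Polling helpers: waitForTraceCount, waitForSpan, waitForTelemetryFlat.
- AbstractServerSmokeTest extends AbstractSmokeTest: For HTTP server apps. Adds port management, waits for server port to open, verifies expected trace output.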
ProcessManager— Base. Spawns forked JVM processes with the agent viaProcessBuilder, captures stdout to log files, tears down on cleanup.assertNoErrorLogs()scans logs for errors.AbstractSmokeTestextendsProcessManager— Adds a mock Datadog Agent (TestHttpServer) receiving traces (v0.4/v0.5), telemetry, remote config, and EVP proxy requests. Polling helpers:waitForTraceCount,waitForSpan,waitForTelemetryFlat.AbstractServerSmokeTestextendsAbstractSmokeTest— For HTTP server apps. Adds port management, waits for server port to open, verifies expected trace output.