dlt_tracing

Build observability from dlt pipeline traces.

dlt_tracing is a lightweight utility that extends the native dlt experience by capturing pipeline trace information (last_trace) immediately after a pipeline run. These traces can be stored using a filesystem-backed dlt pipeline and later consumed for observability, BI reporting, or AI-driven anomaly detection.

Why dlt_tracing?

dlt pipelines already produce rich execution traces, but they are often underutilized.
This package makes it easy to:

Capture pipeline.last_trace after every run
Persist traces in a structured, queryable format
Build observability dashboards (e.g., Power BI)
Feed traces into AI agents for anomaly detection and insights
Keep the developer experience native and minimal

No decorators. No monkey-patching. Just a clean wrapper around pipeline.run().

Features

Drop-in replacement for pipeline.run()
Automatically collects dlt last_trace
Stores traces using dlt filesystem destination
Packaged as a pip-installable wheel
Safe for production pipelines: a failure in trace collection does not fail the pipeline run
Fully dlt-native behavior

Project Structure

dlt_tracing/
├── dlt_tracing/
│   ├── __init__.py
│   └── trace_pipeline.py
├── setup.py
└── README.md

Prerequisites

Python 3.9+
pip
Virtual environment (recommended)

Build the Wheel File

From the root of the project (where setup.py is located):

# Upgrade build tools
python -m pip install --upgrade pip setuptools wheel

# Build the wheel file
python setup.py bdist_wheel

Install the Wheel

Install from local path

pip install dist/dlt_tracing-0.1.0-py3-none-any.whl

Install from a shared location

pip install /path/to/dlt_tracing-0.1.0-py3-none-any.whl

All required dependencies will be installed automatically:

dlt[filesystem]
enlighten
duckdb
pandas
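
To confirm that the wheel and its dependencies were installed into the active environment, a short check with the standard library's importlib.metadata can help. This is a minimal sketch: the helper name installed_version is hypothetical, and the distribution names are assumed to match the wheel file name and the dependency list above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions taken from the wheel name and the list above.
for name in ("dlt_tracing", "dlt", "enlighten", "duckdb", "pandas"):
    print(name, installed_version(name) or "not installed")
```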

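Usage

Basic Example

import dlt
from dlt_tracing import trace_pipeline

# Sample resource so the example runs end to end; replace with your own source.
@dlt.resource(name="orders")
def orders():
    yield [{"id": 1, "amount": 10.5}, {"id": 2, "amount": 7.0}]

pipeline = dlt.pipeline(
    pipeline_name="orders_pipeline",
    destination="duckdb",
)

# Wrap the pipeline, then call run() as usual; traces are written under bucket_url.
load_info = trace_pipeline(
    pipeline,
    bucket_url="/logs",
    log_pipeline_name="dlthub_pipeline_logs",
    table_name="breaches_logs",
).run(orders())

print(load_info)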

How It Works

Your pipeline runs exactly as before
After completion, pipeline.last_trace is retrieved
Trace data is written using a filesystem-backed dlt pipeline
The original load_info is returned unchanged

If trace collection fails, the pipeline run still succeeds.
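
The flow described here can be sketched as a small wrapper. This is not the package's actual implementation: the TracedRun class and the jsonl_store helper are hypothetical names, and a plain JSON Lines file stands in for the filesystem-backed dlt pipeline so the sketch runs with the standard library only. The wrapper only assumes an object with a run() method and a last_trace attribute.

```python
import json
import logging
from pathlib import Path
from typing import Any, Callable

logger = logging.getLogger("trace_sketch")


class TracedRun:
    """Hypothetical wrapper: delegates run() and stores the trace afterwards."""

    def __init__(self, pipeline: Any, store: Callable[[Any], None]) -> None:
        self._pipeline = pipeline
        self._store = store

    def run(self, *args: Any, **kwargs: Any) -> Any:
        # Run the wrapped pipeline exactly as the caller would.
        load_info = self._pipeline.run(*args, **kwargs)
        # Read last_trace and hand it to the store. A tracing problem is
        # logged and swallowed so it cannot fail a successful run.
        try:
            self._store(self._pipeline.last_trace)
        except Exception:
            logger.exception("trace collection failed; run result is kept")
        # Return the original load_info unchanged.
        return load_info


def jsonl_store(directory: str) -> Callable[[Any], None]:
    """Stand-in store: appends one JSON line per trace to traces.jsonl."""

    def _store(trace: Any) -> None:
        path = Path(directory)
        path.mkdir(parents=True, exist_ok=True)
        with open(path / "traces.jsonl", "a", encoding="utf-8") as fh:
            # default=str keeps non-JSON values such as datetimes serializable.
            fh.write(json.dumps(trace, default=str) + "\n")

    return _store
```

Passing the store as a callable keeps storage under the caller's control: swapping the JSON Lines file for a dlt filesystem pipeline would only change the function given to TracedRun.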

Observability & AI Use Cases

Build Power BI dashboards for pipeline monitoring
Track pipeline health, latency, and failures
Centralize traces in cloud storage (ADLS, S3, local FS)
Feed traces into AI agents for:
    Anomaly detection
    Trend analysis
    Root-cause insights
    Predictive monitoring

This enables a shift from reactive monitoring to proactive, AI-driven observability.
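
As a simple starting point for anomaly detection on stored traces, run durations can be compared against their median. This is a minimal sketch under stated assumptions: the flag_slow_runs helper, the run_id and duration_s fields, and the factor of 3 are all hypothetical, and the records are made-up examples, not real trace output.

```python
from statistics import median
from typing import Dict, List


def flag_slow_runs(runs: List[Dict], factor: float = 3.0) -> List[Dict]:
    """Return runs whose duration exceeds `factor` times the median duration."""
    durations = [r["duration_s"] for r in runs]
    if not durations:
        return []
    threshold = factor * median(durations)
    return [r for r in runs if r["duration_s"] > threshold]


# Hypothetical records, one per stored trace.
runs = [
    {"run_id": "a", "duration_s": 12.0},
    {"run_id": "b", "duration_s": 11.5},
    {"run_id": "c", "duration_s": 13.1},
    {"run_id": "d", "duration_s": 95.0},
]
print([r["run_id"] for r in flag_slow_runs(runs)])  # → ['d']
```

The median is used here because a single very slow run shifts it far less than it would shift the mean, so the threshold stays meaningful even when outliers are present.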

Design Principles

Explicit over implicit
No hidden behavior
No dlt internals modification
Production-safe defaults
Developer-controlled storage and analysis

License

MIT License
