
dlt_tracing

Build observability from dlt pipeline traces.

dlt_tracing is a lightweight utility that extends the native dlt experience by capturing pipeline trace information (last_trace) immediately after a pipeline run. These traces can be stored using a filesystem-backed dlt pipeline and later consumed for observability, BI reporting, or AI-driven anomaly detection.


Why dlt_tracing?

dlt pipelines already produce rich execution traces, but these traces are often underutilized. This package makes it easy to:

  • Capture pipeline.last_trace after every run
  • Persist traces in a structured, queryable format
  • Build observability dashboards (e.g., Power BI)
  • Feed traces into AI agents for anomaly detection and insights
  • Keep the developer experience native and minimal

No decorators. No monkey-patching. Just a clean wrapper around pipeline.run().


Features

  • Drop-in replacement for pipeline.run()
  • Automatically collects dlt last_trace
  • Stores traces using dlt filesystem destination
  • Packaged as a pip-installable wheel
  • Safe for production pipelines
  • Fully dlt-native behavior

Project Structure

dlt_tracing/
├── dlt_tracing/
│   ├── __init__.py
│   └── trace_pipeline.py
├── setup.py
└── README.md


Prerequisites

  • Python 3.9+
  • pip
  • Virtual environment (recommended)

Build the Wheel File

From the root of the project (where setup.py is located):

# Upgrade build tools
python -m pip install --upgrade pip setuptools wheel

# Build the wheel file (alternatively, with the newer PyPA build frontend:
# pip install build && python -m build --wheel)
python setup.py bdist_wheel

Install the Wheel
Install from local path
pip install dist/dlt_tracing-0.1.0-py3-none-any.whl

Install from a shared location
pip install /path/to/dlt_tracing-0.1.0-py3-none-any.whl


All required dependencies will be installed automatically:

  • dlt[filesystem]
  • enlighten
  • duckdb
  • pandas

Usage
Basic Example
import dlt
from dlt_tracing import trace_pipeline

pipeline = dlt.pipeline(
    pipeline_name="orders_pipeline",
    destination="duckdb+parquet:///"
)

load_info = trace_pipeline(
    pipeline,
    bucket_url="/logs",
    log_pipeline_name="dlthub_pipeline_logs",
    table_name="breaches_logs"
).run(source)  # "source" is any dlt source or resource you already run today

print(load_info)
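Once traces are persisted, they can be queried back for reporting. Here is a minimal sketch using pandas; the file name, path, and JSONL layout below are illustrative assumptions, since the actual layout depends on how the filesystem destination is configured:

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in trace file; in practice this would be a file written by the
# filesystem-backed logging pipeline under bucket_url (e.g. /logs).
trace_file = Path(tempfile.mkdtemp()) / "breaches_logs.jsonl"
records = [
    {"pipeline_name": "orders_pipeline", "duration_s": 12.5},
    {"pipeline_name": "orders_pipeline", "duration_s": 14.1},
]
trace_file.write_text("\n".join(json.dumps(r) for r in records))

# Load the JSONL traces and aggregate run durations per pipeline.
traces = pd.read_json(trace_file, lines=True)
summary = traces.groupby("pipeline_name")["duration_s"].mean()
print(summary)
```

The same files can equally be queried with duckdb or loaded into Power BI, since they live on plain storage.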

How It Works

  1. Your pipeline runs exactly as before.
  2. After completion, pipeline.last_trace is retrieved.
  3. Trace data is written using a filesystem-backed dlt pipeline.
  4. The original load_info is returned unchanged.

If trace collection fails, the pipeline run still succeeds.
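The steps above can be sketched as a plain wrapper. This is illustrative only — the class name, the persistence callback, and the stub pipeline are assumptions, not the package's actual internals:

```python
class TracedPipeline:
    """Illustrative wrapper: run the pipeline, then capture last_trace."""

    def __init__(self, pipeline, persist_trace):
        self._pipeline = pipeline
        self._persist_trace = persist_trace  # e.g. writes via a filesystem pipeline

    def run(self, *args, **kwargs):
        # 1. Run the wrapped pipeline exactly as before.
        load_info = self._pipeline.run(*args, **kwargs)
        try:
            # 2.-3. Best-effort capture: a trace failure must not fail the run.
            self._persist_trace(getattr(self._pipeline, "last_trace", None))
        except Exception:
            pass
        # 4. Return the original load_info unchanged.
        return load_info


# Demo with a stub standing in for a real dlt pipeline.
class StubPipeline:
    last_trace = {"status": "completed"}

    def run(self, source):
        return f"load_info for {source}"


captured = []
traced = TracedPipeline(StubPipeline(), captured.append)
result = traced.run("orders")
print(result, captured)
```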


Observability & AI Use Cases

  • Build Power BI dashboards for pipeline monitoring
  • Track pipeline health, latency, and failures
  • Centralize traces in cloud storage (ADLS, S3, local FS)
  • Feed traces into AI agents for:
      ◦ Anomaly detection
      ◦ Trend analysis
      ◦ Root-cause insights
      ◦ Predictive monitoring

This enables a shift from reactive monitoring to proactive, AI-driven observability.
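As a concrete illustration of anomaly detection on stored traces, here is a toy example using robust median/MAD thresholding on run durations. The data and the thresholding choice are assumptions for demonstration, not part of the package:

```python
import pandas as pd

# Toy run-duration history; real values would come from stored traces.
runs = pd.DataFrame({
    "started_at": pd.date_range("2024-01-01", periods=8, freq="D"),
    "duration_s": [30, 32, 29, 31, 30, 33, 120, 31],  # one obvious outlier
})

# Flag runs deviating more than 3 scaled MADs from the median — a robust
# alternative to z-scores when the sample is small and contains outliers.
median = runs["duration_s"].median()
mad = (runs["duration_s"] - median).abs().median()
runs["anomaly"] = (runs["duration_s"] - median).abs() > 3 * 1.4826 * mad
print(runs[runs["anomaly"]])
```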

Design Principles

  • Explicit over implicit
  • No hidden behavior
  • No modification of dlt internals
  • Production-safe defaults
  • Developer-controlled storage and analysis

License

MIT License
