Extracts Airworthiness Directives (ADs) and Service Bulletins (SBs) from aviation compliance report PDFs using AWS Textract and Bedrock LLMs.
- PDF document processing via AWS Textract (OCR + layout, form and table analysis - AsyncAnalyzeDocument)
- LLM-based field extraction via AWS Bedrock (Claude and Nova models)
- Hybrid validation: LLM extraction with Textract fact-checking
- PDF coordinate enrichment: every extracted field linked to a page location and bounding box (not currently hooked up)
- Step Functions orchestration for scalable, parallel batch processing
- Configurable extraction profiles (accuracy vs. speed trade-offs) that define the Textract and LLM configuration
- Validation harness for ground-truth comparison and experiment tracking, including dashboard generation and per-run metrics
bluetail-poc/
├── src/ # Core application code
│ ├── bedrock/ # LLM integration via AWS Bedrock
│ ├── classifier/ # Report family classification (ADSI vs Status)
│ ├── extractor/ # Page processing orchestration, prompts & few-shot examples
│ ├── schemas/ # Canonical data models (records, enums, extraction envelope)
│ ├── textract/ # OCR & document analysis
│ ├── validator/ # Field matching & validation
│ ├── lambdas/ # Lambda handlers for workflow steps
│ ├── tasks/ # Task logic (reused by Lambda handlers)
│ ├── aws/ # AWS client management
│ └── common/ # Shared config, logging & utilities
├── config/ # Extraction profiles (YAML)
├── validation_harness/ # Ground truth testing & experiment tracking
├── web_viewer/ # Flask-based PDF visualization tool
├── terraform/ # AWS infrastructure as code
│ ├── live/dev/ # Environment configuration
│ └── modules/ # Reusable modules (S3, Lambda, Step Functions, etc.)
├── scripts/ # Utility scripts (local testing, deployment)
├── tests/ # pytest test suite
├── Dockerfile # Local testing via docker-compose
└── docker-compose.yml # Local development
| Module | Purpose |
|---|---|
| `bedrock/` | LLM integration via AWS Bedrock (model-agnostic invocation, response parsing, prompt caching) |
| `classifier/` | Report family classification (ADSI vs Status) using Bedrock |
| `extractor/` | Page processing orchestration with prompts, few-shot examples, and record auditing |
| `schemas/` | Canonical data models: record types, enums, extraction envelope (single source of truth) |
| `textract/` | AWS Textract wrapper for OCR, text detection, and bounding box extraction |
| `validator/` | Field matching and validation using fuzzy matching strategies |
| `lambdas/` | Lambda handlers (`s3_path_resolver`, `classify_report`, `run_textract`, `build_manifest`, `split_batches`, `aggregate_results`, `process_document`) |
| `tasks/` | Task logic reused by Lambda handlers (`process_document`) |
| `aws/` | Singleton AWS client management (S3, Textract, Bedrock) |
| `common/` | Shared configuration, logging, and utilities |
| Service | Purpose |
|---|---|
| S3 | Document storage (input PDFs, output results) |
| Textract | OCR and document analysis (async) |
| Bedrock | LLM inference (Claude Sonnet, Nova) |
| Step Functions | Workflow orchestration |
| EventBridge | S3 upload triggers |
| SQS | Job progress tracking (FIFO queue) |
S3 Upload ({client_id}/{job_id}/input/report.pdf)
│
▼
EventBridge Rule ──► Step Functions
│
▼
Step 1: Preprocess (Lambda)
└── Resolve S3 paths from input key
│
▼
Step 2: ClassifyReport (Lambda)
└── Determine report family (ADSI / Status)
│
▼
Step 3: RunTextract (Lambda)
└── Full document OCR + two-tier caching
│
▼
Step 4: BuildManifest (Lambda)
└── Page inventory + page count verification
│
▼
Step 5: SplitBatches (Lambda)
└── Batch pages for parallel processing
│
▼
Step 6: ProcessBatchesMap (Lambda, parallel per batch)
└── Bedrock extraction
│
▼
Step 7: AggregateResults (Lambda)
└── Merge batches + record auditing → extraction_result.json
│
▼
Results ──► S3 {client_id}/{job_id}/output/extraction_result.json
| Lambda | Purpose |
|---|---|
| `s3_path_resolver` | Resolves output paths from input S3 keys (Preprocess) |
| `classify_report` | Classifies report family (ADSI vs Status) using Bedrock |
| `run_textract` | Runs full-document OCR with two-tier caching (S3 + DynamoDB) |
| `build_manifest` | Builds a page inventory and validates page counts |
| `split_batches` | Splits pages into batches for parallel processing |
| `aggregate_results` | Merges batch outputs, validates coverage, and runs record auditing |
| `process_document` | Processes PDF pages with Bedrock extraction (Distributed Map) |
nmd-bluetail-poc-bucket/
└── {client_id}/{job_id}/
├── input/
│ └── report.pdf ← Triggers workflow
├── cached-results/
│ ├── classification.json ← Report family classification result
│ ├── textract_results/
│ │ ├── raw.json ← Full Textract response
│ │ └── extracted.txt ← Human-readable text
│ ├── page_manifest.json ← Page inventory + metadata
│ └── batches/ ← Per-batch outputs (intermediate)
└── output/
└── extraction_result.json ← Extracted ADs/SBs with coordinates
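Because of the `{client_id}/{job_id}` layout above, every output location can be derived from the input key alone. A minimal sketch of that derivation (a hypothetical helper, not the actual `s3_path_resolver` code):

```python
def resolve_job_paths(input_key: str) -> dict:
    """Derive a job's S3 locations from an input key shaped like
    '{client_id}/{job_id}/input/report.pdf' (layout from the bucket diagram)."""
    parts = input_key.split("/")
    if len(parts) < 4 or parts[2] != "input":
        raise ValueError(f"unexpected input key layout: {input_key}")
    client_id, job_id = parts[0], parts[1]
    prefix = f"{client_id}/{job_id}"
    return {
        "client_id": client_id,
        "job_id": job_id,
        "cached_results": f"{prefix}/cached-results/",
        "output": f"{prefix}/output/extraction_result.json",
    }
```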
Each pipeline step emits status events to an SQS FIFO queue so downstream consumers can track job progress in real time.
Queue: bluetail-job-progress.fifo (FIFO with MessageGroupId = job_id)
Event lifecycle for a successful job:
Each step emits exactly two events: STARTED → COMPLETED (or FAILED on error).
STARTED (Preprocess) seq=11
COMPLETED (Preprocess) seq=12
STARTED (ClassifyReport) seq=21
COMPLETED (ClassifyReport) seq=22
STARTED (Textract) seq=31
COMPLETED (Textract) seq=32
STARTED (BuildManifest) seq=41
COMPLETED (BuildManifest) seq=42
STARTED (SplitBatches) seq=51
COMPLETED (SplitBatches) seq=52
STARTED (ProcessBatches) seq=61 ← emitted by Step Functions Pass state
COMPLETED (ProcessBatches) seq=62 ← emitted by Step Functions Pass state
STARTED (AggregateResults) seq=71
COMPLETED (AggregateResults) seq=72
Sequence numbering: step_index * 10 + offset (STARTED=+1, COMPLETED=+2, FAILED=+9). E.g. Preprocess: STARTED=11, COMPLETED=12; ClassifyReport: STARTED=21, COMPLETED=22.
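The numbering scheme can be expressed directly from that formula:

```python
# Status offsets from the sequence-numbering rule above
OFFSETS = {"STARTED": 1, "COMPLETED": 2, "FAILED": 9}

def sequence_number(step_index: int, status: str) -> int:
    """sequence = step_index * 10 + status offset (step_index is 1-based)."""
    return step_index * 10 + OFFSETS[status]
```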
Message schema:

| Field | Type | Description |
|---|---|---|
| `job_id` | string | Unique job identifier (ULID) |
| `client_id` | string | Client/tenant identifier |
| `status` | string | `STARTED`, `COMPLETED`, or `FAILED` |
| `step_name` | string | Pipeline step name |
| `sequence` | int | Monotonic sequence for ordering/dedup |
| `timestamp` | string | ISO 8601 UTC |
| `step_index` | int or null | 1-based pipeline position |
| `total_steps` | int or null | Total pipeline steps (currently 7) |
| `message` | string or null | Human-readable summary |
| `details` | object or null | Step-specific context |
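An event conforming to this schema might look like the following (all values are illustrative, not real pipeline output):

```python
import json

# Illustrative progress event; field names follow the schema table above
event = {
    "job_id": "01JABCDEFGHJKMNPQRSTVWXYZ0",  # example ULID
    "client_id": "acme",
    "status": "COMPLETED",
    "step_name": "ClassifyReport",
    "sequence": 22,
    "timestamp": "2024-01-01T12:00:00Z",
    "step_index": 2,
    "total_steps": 7,
    "message": "Classified report family as ADSI",
    "details": {"report_family": "ADSI"},
}
body = json.dumps(event)  # serialized as the SQS message body
```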
Design notes:
- Dedup IDs (`{job_id}:{step_name}:{sequence}`) handle Step Functions retries
- `ProgressPublisher` is fire-and-forget: SQS failures never break the pipeline
- When `PROGRESS_QUEUE_URL` is unset (local dev), all emit calls are silent no-ops
- All emission uses `ProgressPublisher` from `src/common/progress.py` (all Lambda handlers share the same code path)
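The design notes above can be condensed into a minimal publisher sketch. This is a hypothetical shape, not the actual `src/common/progress.py` code; `StubSqs` stands in for a real boto3 SQS client:

```python
import os

class StubSqs:
    """In-memory stand-in for an SQS client (illustration only)."""
    def __init__(self):
        self.sent = []
    def send_message(self, **kwargs):
        self.sent.append(kwargs)

class ProgressPublisherSketch:
    """Sketch of a fire-and-forget publisher: no-op without a queue URL,
    never raises on send failure."""
    def __init__(self, sqs_client=None, queue_url=None):
        self.sqs = sqs_client
        self.queue_url = queue_url or os.environ.get("PROGRESS_QUEUE_URL")

    def emit(self, job_id: str, step_name: str, sequence: int, body: str):
        if not self.queue_url or self.sqs is None:
            return None  # local dev: silent no-op
        dedup_id = f"{job_id}:{step_name}:{sequence}"
        try:
            self.sqs.send_message(
                QueueUrl=self.queue_url,
                MessageBody=body,
                MessageGroupId=job_id,          # FIFO ordering per job
                MessageDeduplicationId=dedup_id,  # absorbs Step Functions retries
            )
        except Exception:
            return None  # fire-and-forget: SQS failures never break the pipeline
        return dedup_id
```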
- Python 3.12+
- `uv` package manager
- AWS CLI configured with a `BlueTail` profile
- Docker and Docker Compose (for local testing only)
- Terraform >= 1.5
A Makefile in the project root defines commands for project initialization, running code quality checks via pre-commit hooks, and deploying Terraform.
| Command | Description |
|---|---|
| `make init` | Install `uv`/`prek` utilities if missing, validate the Terraform version, run `uv sync --all-groups`, and install pre-commit hooks |
| `make check` | Run all checks (format, lint, typecheck, test) via pre-commit hooks |
| `make build-lambda-artifacts` | Build Lambda handler zips and the shared layer for Terraform |
| `make deploy-dev` | Build Lambda artifacts, then `terraform apply` in `terraform/live/dev` |
| `make clean` | Remove build artifacts, caches, and `__pycache__` directories |
```bash
# Initialize project (install tools, deps, pre-commit hooks)
make init

# Activate the Python virtual environment (required to run Python code locally)
source .venv/bin/activate
```

Unit tests cover the source code and are run with pytest:

```bash
# Run all tests
pytest tests/

# Run with coverage
pytest tests/ --cov=src --cov-report=term-missing
```

More comprehensive Terraform documentation can be found in this README.md; it explains the modifications you will want to make before deploying.
Assuming you have updated the Terraform S3 backend configuration and any Terraform variables you want:

```bash
# Initialize the Terraform providers; run this the first time and any time providers change
terraform -chdir=terraform/live/dev init

# Run a terraform plan to see what will be created
terraform -chdir=terraform/live/dev plan -var-file dev.tfvars

# Build the Lambda layer zipfiles (packages code the Lambdas use); must run before deploying,
# although `make deploy-dev` runs this automatically
make build-lambda-artifacts

# Build Lambda artifacts and deploy to AWS; this runs `terraform apply` under the hood
# Note: if you are using a *.tfvars file not named dev.tfvars, update the Makefile for this command
make deploy-dev

# Destroy deployed resources
terraform -chdir=terraform/live/dev destroy
```

Upload a PDF to the S3 input bucket to trigger the Step Functions workflow:

```bash
aws s3 cp report.pdf s3://nmd-bluetail-poc-bucket/{client_id}/{job_id}/input/report.pdf --profile <YOUR_AWS_PROFILE_NAME>
```

Monitor progress in the AWS Step Functions console.
Select a profile via the `EXTRACTION_PROFILE` environment variable or the Terraform `extraction_profile` variable. Profiles are defined in `config/extraction_profiles.yaml`.

Profiles make "extraction approaches" configurable: when deploying with Terraform, set the `EXTRACTION_PROFILE` environment variable to change your approach or test new ones, although the validation harness is better suited to A/B testing. The profile determines a configuration; the `EXTRACTION_PROFILE` environment variable (set in `.env` locally, or in Terraform for cloud deploys) determines which approach the process uses.
Example of a profile:

```yaml
profiles:
  default:
    name: "Sonnet Hybrid Batch 5"

    # Model settings
    model_id: "us.anthropic.claude-sonnet-4-5-20250929-v1:0"
    max_tokens: 16000
    max_tokens_multi_page: 32000
    temperature: 0.0

    # Extraction behavior
    type: hybrid
    batch_size: 5
    few_shot: true
    prompt_caching: true

    # Textract settings
    textract_features:
      - TABLES
      - FORMS
      - LAYOUT
```

| Profile | Type | Model | Batch Size | Features |
|---|---|---|---|---|
| `default` | hybrid | Claude Sonnet 4.5 | 5 | Few-shot + caching |
| `accurate` | hybrid | Claude Opus 4.5 | 3 | Few-shot + caching |
| `fast` | llm | Claude Haiku | 10 | No few-shot |
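Profile selection itself is simple: `EXTRACTION_PROFILE` names a key in the profiles mapping, falling back to `default`. A sketch under that assumption (profiles abridged and illustrative; not the actual loader):

```python
import os

# Abridged mirror of config/extraction_profiles.yaml (illustrative values)
PROFILES = {
    "default": {"type": "hybrid", "batch_size": 5, "few_shot": True},
    "fast": {"type": "llm", "batch_size": 10, "few_shot": False},
}

def select_profile(profiles: dict) -> dict:
    """Pick the profile named by EXTRACTION_PROFILE, defaulting to 'default'."""
    name = os.environ.get("EXTRACTION_PROFILE", "default")
    try:
        return profiles[name]
    except KeyError:
        raise ValueError(f"unknown extraction profile: {name}") from None
```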
Extraction Types:

| Type | Description | Notes |
|---|---|---|
| `llm` | Pure LLM visual extraction from PDF | |
| `hybrid` | LLM + Textract (Textract as a spell-checker for accuracy) | |
| `textract` | Textract-only parsing (no LLM) | Not really used in this project |
| Variable | Description | Default | Notes |
|---|---|---|---|
| `AWS_REGION` | AWS region | `us-west-2` | |
| `AWS_PROFILE` | AWS credentials profile (local only) | `BlueTail` | |
| `EXTRACTION_PROFILE` | Extraction profile name | `default` | |
| `APP_LOG_LEVEL` | Log level | `INFO` | |
| `APP_LOG_FORMAT` | Log format (`json` or `text`) | `json` | |
| `APP_LOGGER_PREFIX` | Logger name prefix | `src` | |
| `S3_ENDPOINT_URL` | S3 endpoint URL (LocalStack) | - | |
| `TEXTRACT_ENDPOINT_URL` | Textract endpoint URL (LocalStack) | - | |
| `BEDROCK_ENDPOINT_URL` | Bedrock endpoint URL (LocalStack) | - | Not supported in LocalStack |
| `TEXTRACT_INPUT_BUCKET` | S3 bucket for input PDFs | - | Only used for local testing (EventBridge determines this in cloud deploys) |
| `TEXTRACT_OUTPUT_BUCKET` | S3 bucket for output results | - | Only used for local testing (EventBridge determines this in cloud deploys) |
| `TEXTRACT_OPERATION_TYPE` | Textract operation type | `analyze_document` | `detect_text`, `analyze_document`, or `analyze_async` |
| `TEXTRACT_FEATURES` | Textract features | `TABLES,FORMS,LAYOUT` | Comma-separated |
| `PROGRESS_QUEUE_URL` | SQS FIFO queue URL for progress events | - | Unset disables progress tracking (safe for local dev) |
Textract results are cached in two layers to avoid repeated OCR work:
- Same-job cache: reuses Textract results from earlier steps in the same job
- Cross-job cache: reuses results across jobs via DynamoDB dedupe
Cache controls:
- `TEXTRACT_CACHE_VERSION` - cache invalidation version (bump to force refresh)
- `TEXTRACT_CACHE_TABLE_NAME` - DynamoDB table name (unset disables cross-job cache)
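A minimal sketch of how such a two-tier lookup could work, assuming the cache key incorporates the document, the Textract settings, and `TEXTRACT_CACHE_VERSION` (illustrative only, not the actual implementation):

```python
def cache_key(document_hash: str, version: str, features: list[str]) -> str:
    """Illustrative cross-job cache key: the same document with the same
    Textract features and cache version reuses the cached result; bumping
    TEXTRACT_CACHE_VERSION changes the key and forces a refresh."""
    return f"{document_hash}:{'+'.join(sorted(features))}:v{version}"

def lookup(job_cache: dict, cross_job_cache: dict, key: str):
    """Check the same-job cache first, then the cross-job (DynamoDB-style) cache."""
    if key in job_cache:
        return job_cache[key], "same-job"
    if key in cross_job_cache:
        return cross_job_cache[key], "cross-job"
    return None, "miss"
```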
Compare extraction approaches against ground truth for AD/SB extraction accuracy.
Note: the validation harness is a custom application and should not be assumed to be bug-free. Analysis of the data and careful inspection of results indicate it works correctly, but continued use would warrant more deliberate development or a dedicated tool.
Quick Start:
```bash
# Run extraction with all approaches in manifest
uv run python validation_harness/tracking_app/execute.py --folder 2023_camp_adsi --pages all

# Generate comparison dashboard
uv run python -m validation_harness.tracking_app.cli dashboard

# View results
open validation_harness/.tracking/reports/index.html
```

Options:
- `--folder` - Report folder name (e.g., `2023_camp_adsi`)
- `--pages` - `all`, a single page like `1`, a range `1-10`, or a list `1,3,5`
- `--approach` - Run only a specific approach by name
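The `--pages` grammar above can be parsed with a short helper (illustrative; not the harness's actual parser):

```python
def parse_pages(spec: str, total_pages: int) -> list[int]:
    """Parse a --pages value: 'all', a single page '1', a range '1-10',
    or a comma list '1,3,5' (1-based, as in the CLI examples)."""
    if spec == "all":
        return list(range(1, total_pages + 1))
    pages = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            pages.extend(range(int(start), int(end) + 1))
        else:
            pages.append(int(part))
    return pages
```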
Each validation folder contains:
- `manifest.yaml` - Extraction approaches to test
- `report.pdf` - Source PDF
- `ground_truth_results.json` - Expected results for comparison
See validation_harness/README.md for full documentation.
A web-based PDF viewer for visualizing extraction results with field-level highlighting.
This requires hooking up and running the `src/validation` module to map extracted results to locations in the PDF; it is not currently hooked up. You will need two JSON files: one with the locations in the PDF and one with the output results.
Features:
- Upload PDF reports and extraction JSON results
- Click entries to highlight fields in the PDF
- Color-coded highlights by field type
- Confidence indicators (High/Medium/Low)
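The High/Medium/Low indicator presumably buckets a numeric confidence score; a sketch with illustrative thresholds (the viewer's actual cutoffs may differ):

```python
def confidence_bucket(score: float) -> str:
    """Map a 0-1 confidence score to the viewer's High/Medium/Low labels.
    Thresholds here are illustrative assumptions, not the viewer's code."""
    if score >= 0.9:
        return "High"
    if score >= 0.7:
        return "Medium"
    return "Low"
```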
Quick Start:
```bash
cd web_viewer
docker-compose up --build
# Open http://localhost:5000
```

See web_viewer/README.md for full documentation.
```bash
mypy src/
```

| Issue | Solution |
|---|---|
| AWS credentials error | Check `~/.aws/credentials` has a `[BlueTail]` profile, or one matching `aws_profile` in your Terraform variables |
| Change model/settings | Edit `EXTRACTION_PROFILE` in `.env` or `extraction_profile` in Terraform variables |
| Import errors | Run `uv sync` and activate the venv: `source .venv/bin/activate` |
| Debug logs | Set `APP_LOG_LEVEL=DEBUG` in `.env` |
| Validation failed: terraform/modules/lambda | Ensure you have run `make build-lambda-artifacts` to build the Lambda layers |
