A generic flight tracking utility that builds a local SQLite database of completed flights, filtered by any combination of aircraft types and airline callsign prefixes. Works for any operator or aircraft type — configured out of the box for FedEx and DHL C208/C408 feeder operations as an example.
Uses four complementary data sources:
- adsb.lol — free live ADS-B feed, polled every 60 seconds
- OurAirports — free CC0 airport database for origin/destination resolution
- OpenFlights — free airline name database for schedule display
- FlightAware AeroAPI or AeroDataBox — optional historical enrichment (configurable, one key required)
Designed to run continuously on a free-tier cloud VM (Fly.io, Oracle Cloud, etc.) at near-zero cost.
Every 60 seconds, FlightScrapper fetches all airborne aircraft matching the configured type(s) (e.g. C208, C408) from the adsb.lol API, then filters by callsign prefix (e.g. BEZ, PCM).
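The second filtering step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes each aircraft entry returned by the feed is a dict that carries its callsign under a `flight` key (an assumption about the response shape), and the function names are hypothetical.

```python
from typing import Iterable, List


def matches_prefix(callsign: str, prefixes: Iterable[str]) -> bool:
    """Return True if the callsign starts with any configured prefix.

    An empty prefix list accepts every callsign.
    """
    prefixes = tuple(prefixes)
    if not prefixes:
        return True
    # Feeds often pad callsigns with trailing spaces, so normalise first.
    return callsign.strip().upper().startswith(prefixes)


def filter_aircraft(aircraft: List[dict], prefixes: Iterable[str]) -> List[dict]:
    """Keep only the entries whose callsign matches a prefix.

    The "flight" key is an assumed field name for the callsign.
    """
    prefixes = tuple(prefixes)
    return [ac for ac in aircraft if matches_prefix(ac.get("flight") or "", prefixes)]
```

Entries without a callsign are dropped whenever at least one prefix is configured, since there is nothing to match against.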
Each aircraft is tracked in memory. When an aircraft's altitude crosses 500ft upward, a takeoff is recorded and its position is snapped to the nearest airport in the OurAirports database. When the aircraft disappears from the feed below 3000ft for 5+ minutes, a landing is recorded and its last known position is snapped to the nearest airport.
If an aircraft is first seen already airborne but below 2000ft and its position snaps cleanly to a known airport, it is accepted as a near-takeoff join — handling cases where the poller first detects the aircraft just after liftoff (e.g. low-altitude ADS-B coverage areas like the Caribbean). Aircraft first seen above 2000ft are discarded as mid-flight joins.
A flight record is written to the database only when both origin and destination airports are successfully identified. Partial flights, unresolvable positions, and high-altitude mid-joins are silently discarded.
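The detection rules above can be sketched as a small state tracker. The threshold names match the settings documented in config.py; the `Track` class and function names are hypothetical, and treating a low first observation that does not snap to an airport as a discard is an assumption drawn from the "unresolvable positions" rule.

```python
from dataclasses import dataclass
from typing import Optional

TAKEOFF_ALTITUDE_FT = 500
NEAR_AIRPORT_ALT_FT = 2000
LANDING_ALTITUDE_FT = 3000
LANDING_TIMEOUT_SECONDS = 300


@dataclass
class Track:
    last_alt_ft: float
    last_seen: float        # Unix timestamp of the latest observation
    airborne: bool = False  # True once a takeoff has been recorded


def first_observation(alt_ft: float, snapped_airport: Optional[str]) -> str:
    """Classify the first time an aircraft is seen."""
    if alt_ft < TAKEOFF_ALTITUDE_FT:
        return "track"                # still low: wait for the 500 ft crossing
    if alt_ft < NEAR_AIRPORT_ALT_FT and snapped_airport:
        return "near_takeoff_join"    # just after liftoff, airport resolved
    return "discard"                  # mid-flight join or unresolvable position


def update(track: Track, alt_ft: float, now: float) -> bool:
    """Record a new observation; return True if it is an upward 500 ft crossing."""
    took_off = (not track.airborne) and (track.last_alt_ft < TAKEOFF_ALTITUDE_FT <= alt_ft)
    if took_off:
        track.airborne = True
    track.last_alt_ft = alt_ft
    track.last_seen = now
    return took_off


def has_landed(track: Track, now: float) -> bool:
    """True when the aircraft was last seen low and has been unseen long enough."""
    return (
        track.airborne
        and track.last_alt_ft < LANDING_ALTITUDE_FT
        and now - track.last_seen >= LANDING_TIMEOUT_SECONDS
    )
```

`has_landed` would be evaluated on every poll for aircraft that are missing from the current result set.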
Any new tail number observed flying under a matching callsign prefix is added to data/tails.json and its history is fetched immediately from the configured enrichment provider.
Once every 24 hours, FlightScrapper runs a batch job against the configured history provider. For each known tail number not fetched in the last 7 days, it retrieves completed flight history and merges it into the database.
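Selecting the tails that are due for a refresh might look like the sketch below. It assumes data/tails.json maps each tail number to the ISO 8601 timestamp of its last successful fetch, or to null if it has never been fetched; that layout and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

FA_SUPPRESS_DAYS = 7  # suppression window per tail


def tails_due(tails: Dict[str, Optional[str]], now: datetime) -> List[str]:
    """Return tail numbers never fetched, or last fetched before the cutoff."""
    cutoff = now - timedelta(days=FA_SUPPRESS_DAYS)
    due = []
    for tail, last_fetch in tails.items():
        if last_fetch is None or datetime.fromisoformat(last_fetch) < cutoff:
            due.append(tail)
    return sorted(due)


# Example with made-up tail numbers and timestamps.
now = datetime(2026, 3, 27, 9, 0, tzinfo=timezone.utc)
example = {
    "N960HL": None,
    "N123FE": "2026-03-26T09:00:00+00:00",
    "N456FX": "2026-03-10T09:00:00+00:00",
}
print(tails_due(example, now))
```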
Duplicate detection uses callsign + aircraft type + origin + destination + departure date. When a duplicate is found, missing fields are filled in from the other source (e.g. airport names from history, max altitude from live tracking). If origin/destination differs between sources, the history source values are treated as authoritative.
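The duplicate key and the fill-in step can be sketched as below, using the field names from the database schema. The function names are hypothetical, and the sketch covers only key construction and filling missing fields; it does not model the rule that history values override differing origin/destination.

```python
def dedup_key(flight: dict) -> tuple:
    """Build the duplicate-detection key for a flight record.

    departure_time is assumed to be a UTC ISO 8601 string, so its first
    ten characters are the departure date (YYYY-MM-DD).
    """
    return (
        flight["callsign"],
        flight["aircraft_type"],
        flight["origin_icao"],
        flight["dest_icao"],
        flight["departure_time"][:10],
    )


def merge(existing: dict, incoming: dict) -> dict:
    """Fill fields that are missing in `existing` from `incoming`."""
    merged = dict(existing)
    for field, value in incoming.items():
        if merged.get(field) is None and value is not None:
            merged[field] = value
    return merged
```

With this approach a live record keeps its own `max_alt_ft` while gaining airport names from the history record that shares its key.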
Historical enrichment is optional — if no API key is configured the system runs on adsb.lol alone.
Airport snapping uses the OurAirports CC0 dataset (~25,000 airports with ICAO codes). Priority rules:
- A small airport within 500m always wins
- Otherwise medium/large airports are preferred over small ones
- Within the same tier, the closest wins
- Fallback radius (3–10km) only considers medium/large airports
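The priority rules above can be sketched with a great-circle distance and a tiered lookup. This is an illustration under stated assumptions, not the project's implementation: airports are assumed to be dicts with `icao`, `type`, `lat`, and `lon` keys, where `type` uses the OurAirports values `small_airport`, `medium_airport`, and `large_airport`; the function names are hypothetical.

```python
import math
from typing import List, Optional


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def snap_to_airport(lat: float, lon: float, airports: List[dict],
                    primary_km: float = 3.0, fallback_km: float = 10.0) -> Optional[dict]:
    """Pick an airport for a position using the four priority rules."""
    ranked = sorted(
        ((haversine_km(lat, lon, a["lat"], a["lon"]), a) for a in airports),
        key=lambda pair: pair[0],
    )
    primary = [(d, a) for d, a in ranked if d <= primary_km]

    # Rule 1: a small airport within 500 m always wins.
    for d, a in primary:
        if a["type"] == "small_airport" and d <= 0.5:
            return a
    # Rules 2 and 3: prefer medium/large, closest first within a tier.
    bigger = [a for _, a in primary if a["type"] != "small_airport"]
    if bigger:
        return bigger[0]
    if primary:
        return primary[0][1]
    # Rule 4: the fallback radius only considers medium/large airports.
    for d, a in ranked:
        if d <= fallback_km and a["type"] != "small_airport":
            return a
    return None
```

A linear scan over every airport is simple but slow for frequent lookups; a spatial index or a coarse latitude/longitude pre-filter would be a natural optimisation.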
Airline names are resolved through a four-level fallback chain, tried in order:
- Hardcoded table of known operators (FedEx/DHL feeders)
- OpenFlights airlines.dat lookup by ICAO prefix (~5,800 airlines)
- Tail number suffix heuristic (FE/FX → FedEx feeder, HL → DHL feeder)
- Airline name stored from history provider responses
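The fallback chain can be sketched as a sequence of lookups that returns the first hit. The function name and the two lookup dicts are hypothetical stand-ins for the hardcoded operator table and the parsed OpenFlights data; only the tail suffix rules (FE/FX and HL) come from the list above.

```python
from typing import Dict, Optional


def resolve_airline(callsign: str, tail: str,
                    known_operators: Dict[str, str],
                    openflights: Dict[str, str],
                    stored_name: Optional[str] = None) -> Optional[str]:
    """Resolve an airline name, trying each level in order."""
    prefix = callsign[:3].upper()

    # 1. Hardcoded table of known operators, keyed by callsign prefix.
    if prefix in known_operators:
        return known_operators[prefix]
    # 2. OpenFlights lookup by ICAO prefix.
    if prefix in openflights:
        return openflights[prefix]
    # 3. Tail number suffix heuristic.
    suffix = tail[-2:].upper()
    if suffix in ("FE", "FX"):
        return "FedEx feeder"
    if suffix == "HL":
        return "DHL feeder"
    # 4. Airline name stored from a history provider response, if any.
    return stored_name
```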
| Field | Description |
|---|---|
| callsign | Flight callsign (e.g. BEZ321) |
| tail | Aircraft registration / tail number (e.g. N960HL) |
| aircraft_type | ICAO type designator (e.g. C208) |
| airline_name | Airline name from history provider |
| icao_hex | Mode-S transponder hex code (adsb source only) |
| origin_icao | Departure airport ICAO code |
| origin_name / origin_city / origin_region / origin_country | Departure airport details |
| origin_lat / origin_lon | Departure position coordinates |
| dest_icao | Arrival airport ICAO code |
| dest_name / dest_city / dest_region / dest_country | Arrival airport details |
| dest_lat / dest_lon | Arrival position coordinates |
| departure_time | Takeoff timestamp (UTC ISO 8601) |
| arrival_time | Landing timestamp (UTC ISO 8601) |
| duration_min | Flight duration in minutes |
| max_alt_ft | Maximum observed altitude in feet, rounded to nearest 1,000 (live tracking only) |
| flightaware_url | Link to validate the flight on FlightAware |
| source | adsb, flightaware, or aerodatabox |
| recorded_at | When the record was written |
- Python 3.9+
- Internet access (adsb.lol + OurAirports CSV on first run)
- FlightAware or AeroDataBox API key (optional, for historical enrichment)
git clone <repo-url>
cd FlightScrapper
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
All settings are in config.py:
| Setting | Description |
|---|---|
| AIRCRAFT_TYPES | List of ICAO type designators to track (e.g. ["C208", "C408"]). Must not be empty: the live feed is fetched by aircraft type, and callsign prefix filtering is applied on top of that result set |
| CALLSIGN_PREFIXES | Only track callsigns starting with these prefixes. Empty list = all callsigns |
| KNOWN_TAILS | Seed list of tail numbers to include in history batches from the start |
| Setting | Default | Description |
|---|---|---|
| POLL_INTERVAL_SECONDS | 60 | How often to poll adsb.lol |
| POLL_JITTER_SECONDS | 10 | Random ± jitter added to each poll interval |
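A jittered poll delay can be computed as in the sketch below. It assumes the jitter is drawn uniformly from the ± range, which the settings table does not specify; the function name is hypothetical.

```python
import random

POLL_INTERVAL_SECONDS = 60
POLL_JITTER_SECONDS = 10


def next_poll_delay(interval: float = POLL_INTERVAL_SECONDS,
                    jitter: float = POLL_JITTER_SECONDS,
                    rng=random) -> float:
    """Return the seconds to sleep before the next poll.

    The delay is the base interval plus a uniform offset in
    [-jitter, +jitter], clamped so it never goes negative.
    """
    return max(0.0, interval + rng.uniform(-jitter, jitter))
```

With the defaults, each delay falls between 50 and 70 seconds, which keeps polls from landing on a fixed schedule.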
| Setting | Default | Description |
|---|---|---|
| TAKEOFF_ALTITUDE_FT | 500 | Altitude threshold for normal takeoff detection |
| NEAR_AIRPORT_ALT_FT | 2000 | A first observation below this altitude that also snaps to an airport is accepted as a near-takeoff join |
| LANDING_ALTITUDE_FT | 3000 | Maximum last-seen altitude for a disappearance to count as a landing |
| LANDING_TIMEOUT_SECONDS | 300 | Seconds unseen before declaring an aircraft landed |
| SNAP_RADIUS_KM_PRIMARY | 3.0 | Primary airport snap radius in km |
| SNAP_RADIUS_KM_FALLBACK | 10.0 | Fallback snap radius in km, used if nothing is found within the primary radius |
Set HISTORY_PROVIDER in config.py to choose your enrichment source:
HISTORY_PROVIDER = "flightaware" # or "aerodatabox"
History fetches are suppressed per tail for 7 days after a successful fetch (configurable via FA_SUPPRESS_DAYS in tails.py). When a new tail is discovered by the live poller, its history is fetched immediately.
| Setting | Default | Description |
|---|---|---|
| FLIGHTAWARE_API_KEY | "" | AeroAPI key (or set env var FLIGHTAWARE_API_KEY) |
| FLIGHTAWARE_LOOKBACK_DAYS | 10 | Days of history to fetch per tail number (personal tier max) |
| FLIGHTAWARE_BATCH_INTERVAL_HOURS | 24 | How often to check for tails due for refresh |
Coverage: Global including Hawaii and Caribbean (own receiver network). Cost: $5/month free credit (~1,000 calls at $0.005/result set). Pay-as-you-go above that, no minimum.
Getting an API key:
- Sign up at flightaware.com/aeroapi/portal
- The personal tier provides $5/month free credit with no subscription required
| Setting | Default | Description |
|---|---|---|
| AERODATABOX_API_KEY | "" | RapidAPI key (or set env var AERODATABOX_API_KEY) |
| AERODATABOX_LOOKBACK_DAYS | 7 | Days of history to fetch per tail number |
| AERODATABOX_BATCH_INTERVAL_HOURS | 24 | How often to check for tails due for refresh |
Coverage: Good for US and Europe. Limited in Hawaii and Caribbean. Cost: Free tier (600 units/month) or $5/month Basic plan (6,000 units, ~1,000 calls).
Getting an API key:
- Sign up at rapidapi.com
- Subscribe to the AeroDataBox Basic plan
Setting API keys (never put them in config.py or commit to git):
# Add to ~/.zshrc or ~/.bashrc for persistence
export FLIGHTAWARE_API_KEY=your_key_here
# or
export AERODATABOX_API_KEY=your_key_here
source ~/.zshrc
adsb.lol requires no key; it is a free community ADS-B feed.
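On the Python side, picking up a key from the environment could look like the sketch below. The helper name is hypothetical, and letting the environment variable take precedence over a value in config.py is an assumption, not documented behaviour.

```python
import os


def load_api_key(env_name: str, config_value: str = "") -> str:
    """Return the key from the environment, falling back to the config value.

    An empty string means no key is configured, in which case
    historical enrichment would be skipped.
    """
    return os.environ.get(env_name) or config_value


key = load_api_key("FLIGHTAWARE_API_KEY")
history_enabled = bool(key)
```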
python main.py
On first run, the OurAirports airport database (~8MB CSV) is downloaded to data/airports.csv. This only happens once.
Example output:
FlightScrapper starting.
Aircraft types : C208, C408
Callsign filter: ['BEZ', 'PCM', ...]
Poll interval : 60s ± 10s
History source : FlightAware — enabled (batch every 24h)
Loaded 32700 airports.
Loaded 12 known tail numbers.
Resumed 3 in-flight aircraft from buffer.
[2026-03-27 09:53:49 UTC] poll #1 — 7 aircraft matching filters
Near-takeoff join: BEZ321 from TJSJ (San Juan) at 1200ft
New tail discovered: N960HL — fetching history...
History for N960HL: 7 fetched, 7 saved
[FlightAware] Starting batch for 8/12 tail(s) due for refresh...
[FA] Saved: BEZ2321 TJSJ->TFFJ (2026-03-24)
[FA] N960HL (1/8): 7 fetched, 7 saved, 0 already known
[FlightAware] Batch complete — 12 saved, 3 duplicates skipped.
Stop with Ctrl+C — active flights are saved to buffer.json and resumed on next start.
| File | Description |
|---|---|
| flights.db | SQLite database of completed flights |
| buffer.json | In-flight tracker state, persisted for crash resilience |
| data/airports.csv | OurAirports database, downloaded on first run |
| data/airlines.dat | OpenFlights airline database, downloaded on first schedule run |
| data/tails.json | Known tail numbers with last history fetch timestamps |
None of these are committed to git.
schedule.py reads the flights database and produces a weekly timetable — grouped by airline, broken down by day — in three output formats.
# Pretty-print to stdout (default if no flags given)
python3 schedule.py
# Write schedule.html (opens in any browser)
python3 schedule.py --html
# Write schedule.csv (import into Excel, Google Sheets, etc.)
python3 schedule.py --csv
# All three at once
python3 schedule.py --text --html --csv
# Custom output paths
python3 schedule.py --html --html-out reports/schedule.html \
--csv --csv-out reports/schedule.csv
Each output lists every recorded flight per airline per day of the week, sorted by departure time, with columns:
| Column | Description |
|---|---|
| Flight | Callsign (e.g. BEZ321) |
| From | Origin ICAO code + city |
| To | Destination ICAO code + city |
| Dep (UTC) | Departure time, rounded to nearest 5 min |
| Arr (UTC) | Arrival time, rounded to nearest 5 min |
| Dur | Flight duration in minutes |
| A/C | Aircraft type (C208, C408, …) |
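Rounding a timestamp to the nearest 5 minutes can be done as in the sketch below. The function name is hypothetical, and rounding exact midpoints (e.g. 13:02:30) upward is an assumption, since the column description does not say how ties are handled.

```python
from datetime import datetime, timedelta, timezone

STEP_SECONDS = 5 * 60


def round_to_5min(ts: datetime) -> datetime:
    """Round a datetime to the nearest 5-minute mark, midpoints rounding up."""
    hour_start = ts.replace(minute=0, second=0, microsecond=0)
    seconds = (ts - hour_start).total_seconds()
    rounded = int(seconds / STEP_SECONDS + 0.5) * STEP_SECONDS
    # Adding a timedelta handles rollover into the next hour or day.
    return hour_start + timedelta(seconds=rounded)


print(round_to_5min(datetime(2026, 3, 27, 9, 53, 49, tzinfo=timezone.utc)))
```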
Each airline gets a distinct colour from a rotating palette. Known operators have their network role displayed (e.g. FedEx feeder, DHL feeder). Unknown operators are resolved via OpenFlights or the airline name stored from the history provider.
| File | Description |
|---|---|
| schedule.html | Styled browser timetable, one card per airline |
| schedule.csv | Flat file with one row per flight-day, suitable for spreadsheet analysis |
Neither file is committed to git — regenerate them any time from the live database.
Use any SQLite client, or Datasette for a browser UI with built-in CSV export:
pip install datasette
datasette flights.db
Then open http://localhost:8001.
Example queries:
-- All flights, newest first
SELECT callsign, tail, aircraft_type, origin_icao, origin_city, dest_icao, dest_city,
departure_time, duration_min, source
FROM flights ORDER BY departure_time DESC;
-- Routes touching Puerto Rico (country code PR); add other country codes for wider Caribbean coverage
SELECT * FROM flights WHERE origin_country = 'PR' OR dest_country = 'PR'
ORDER BY departure_time DESC;
-- Most common routes
SELECT origin_icao, origin_city, dest_icao, dest_city, COUNT(*) as count
FROM flights
GROUP BY origin_icao, dest_icao
ORDER BY count DESC;
-- Flights by source
SELECT source, COUNT(*) as count FROM flights GROUP BY source;
-- Average duration per route
SELECT origin_icao, dest_icao, ROUND(AVG(duration_min)) as avg_min, COUNT(*) as flights
FROM flights
GROUP BY origin_icao, dest_icao
ORDER BY avg_min DESC;
| Source | Use | Cost | Key required |
|---|---|---|---|
| adsb.lol | Live ADS-B polling | Free | No |
| OurAirports | Airport database for O/D resolution | Free (CC0) | No |
| OpenFlights | Airline name lookup for schedule display | Free (ODbL) | No |
| FlightAware AeroAPI | Historical enrichment (recommended) | $5/month free credit | Yes |
| AeroDataBox via RapidAPI | Historical enrichment (alternative) | Free (600 units) or $5/month | Yes |
adsb.lol for live polling:
The key requirement was a native filter by ICAO aircraft type (e.g. C208) so we only receive relevant aircraft without downloading the entire global feed. adsb.lol is the only free community ADS-B source with a /v2/type/{aircraft_type} endpoint. Alternatives considered:
- adsb.fi — identical data quality and better-documented use policy, but no type filter endpoint; only supports lookup by hex, callsign, registration, or lat/lon radius
- ADSBexchange — no public API; would require scraping their web UI
- airplanes.live — no documented use policy for automated polling
- OpenSky Network — no native type filter; 400 API credits/day on free tier; historical data only via research account registration
FlightAware AeroAPI for historical enrichment (recommended): Provides global coverage including Hawaii and Caribbean via their own receiver network. Personal tier gives $5/month free credit (~1,000 calls at $0.005/result set) with no subscription commitment. Supports 10-day lookback on the personal tier.
AeroDataBox as alternative: Good US/Europe coverage at $5/month for 6,000 units (~1,000 calls). Limited in Hawaii and Caribbean. 7-day lookback on the basic tier.
Alternatives not chosen:
- FlightRadar24 API — no free tier; $9/month minimum
- Aviationstack — historical data requires paid plan starting at $49.99/month
- OpenSky Network — community feeders only; poor Hawaii/Caribbean coverage
- Flights first seen above 2000ft are discarded (no mid-flight join recovery)
- adsb.lol ADS-B coverage is sparse in Hawaii and some remote areas — history enrichment mitigates this
- adsb.lol may require an API key in future (see their documentation)
- FlightAware personal tier rate limit: 10 requests/minute
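One simple way to stay under a 10 requests/minute limit is to space calls at least 6 seconds apart. The class below is a sketch of that idea and is not taken from the project; the clock and sleep functions are injectable so the spacing logic can be exercised without waiting.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive calls."""

    def __init__(self, max_per_minute: int = 10,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_gap = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous call, None before the first

    def wait(self) -> None:
        """Block until at least min_gap seconds have passed since the last call."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_gap - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now
```

Calling `wait()` immediately before each history request keeps a batch of many tails within the limit, at the cost of a slower batch.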
To run on a free-tier cloud VM (Fly.io example):
# Install flyctl: https://fly.io/docs/hands-on/install-flyctl/
fly launch
fly volumes create flightscrapper_data --size 1
fly deploy
Ensure the persistent volume is mounted at /data and set DB_PATH, BUFFER_PATH, AIRPORTS_CSV_PATH, and TAILS_PATH in config.py to use /data/ as the base directory.
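As an illustration, the four path settings in config.py might be pointed at the volume like this. The setting names come from the paragraph above; the `DATA_DIR` helper, the exact file names under /data, and the use of plain strings are assumptions.

```python
from pathlib import PurePosixPath

# Base directory of the mounted persistent volume (assumed mount point).
DATA_DIR = PurePosixPath("/data")

DB_PATH = str(DATA_DIR / "flights.db")
BUFFER_PATH = str(DATA_DIR / "buffer.json")
AIRPORTS_CSV_PATH = str(DATA_DIR / "airports.csv")
TAILS_PATH = str(DATA_DIR / "tails.json")
```

Keeping every writable file under the volume means the database, the in-flight buffer, and the tail list survive redeploys.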
Set the API key as a Fly.io secret:
fly secrets set FLIGHTAWARE_API_KEY=your_key_here