Data for Canada

www.dataforcanada.org

Service Status

Check the current status of all our services here: 👉 status.dataforcanada.org

Mission

Data for Canada exists to bridge the gap between open data availability, resiliency, and usability. We curate, clean, and re-engineer high-value Canadian datasets into high-performance, analysis-ready formats for data engineers, researchers/scientists, developers, and systems.

The Problem

Canada produces a large volume of open data, from foundational road networks to federal census statistics and orthoimagery. However, these datasets are often locked in legacy formats, scattered across fragmented portals, or stored in structures that require significant engineering effort to normalize. For our target audience, the "time-to-insight" is often bottlenecked by data preparation.

Data Stability: Beyond technical barriers, open data can be ephemeral. Links break, portals are reorganized, and priorities shift, causing valuable datasets to vanish from the public web. This instability makes it risky to build long-term research or software directly on top of upstream data providers.

What Guides Us

We prioritize our work in a utilitarian manner, aiming to provide the greatest good for the greatest number of people. Our approach is guided by the principles of digital preservation and the need to keep public information accessible over the long term.

Our approach is guided by the following:

The Solution

We act as the transformation layer. We aggregate datasets with permissive licenses and process them into "digestible" standards optimized for modern downstream applications.

For Data Engineers, Researchers/Scientists, and Developers: Skip the cleaning phase. Access normalized, documented data ready for analysis.
For Systems: Standardized data structures designed to feed directly into pipelines, data warehouses, and downstream services.

Our Stewardship: Data for Canada takes ownership of the datasets we create, from start to finish. We ensure that data remains consistent and available, acting as a stable foundation for your work, and allowing for reliable analysis across time and space. By decoupling access from the original source, we ensure your pipelines don't break even if the upstream location changes or expires.
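As a rough illustration of what decoupling access from the original source can look like on the consumer side, the sketch below resolves a stable dataset identifier against an ordered list of candidate locations and returns the first one that can be read. Everything in it is hypothetical: the `CATALOG` mapping, the `resolve` helper, the dataset ID, and the `example.org` URLs are placeholders for illustration and do not describe Data for Canada's actual endpoints or tooling.

```python
from typing import Callable

# Hypothetical catalog: a stable dataset ID mapped to candidate locations,
# listed in order of preference. The URLs are placeholders, not real endpoints.
CATALOG: dict[str, list[str]] = {
    "example-dataset": [
        "https://primary.example.org/example-dataset.parquet",
        "https://mirror.example.org/example-dataset.parquet",
    ],
}


def resolve(
    dataset_id: str,
    fetch: Callable[[str], bytes],
    catalog: dict[str, list[str]] = CATALOG,
) -> tuple[str, bytes]:
    """Return (location, payload) from the first location that fetch() can read."""
    errors = []
    for url in catalog[dataset_id]:
        try:
            return url, fetch(url)
        except OSError as exc:
            # Record the failure and fall through to the next location.
            errors.append(f"{url}: {exc}")
    raise LookupError(f"no reachable location for {dataset_id!r}: {errors}")
```

The pipeline code only ever names the dataset ID; which location actually serves the bytes is a detail of the catalog. The `fetch` callable is injected so the same logic can be exercised with a stub in tests, or with a thin wrapper around `urllib.request.urlopen` in practice (its `URLError` is a subclass of `OSError`, so it would be caught by the handler above).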

Target Software Ecosystem

We adopt an open-source-first approach, while supporting proprietary solutions to the best of our ability to ensure maximum accessibility. We target the latest versions of the software listed below (e.g., modern GDAL/OGR) to take advantage of the newest improvements.

Our data is optimized for:

Category	Recommended Stack & Libraries
Core & Desktop	GDAL/OGR, QGIS, QField
Python & Data	GeoPandas, Lonboard, DuckDB, SedonaDB
Database	PostgreSQL with PostGIS and pg_mooncake extensions
Serving	GeoServer, Martin, ZOO-Project
Serverless	Cloudflare Workers, AWS Lambda, Google Cloud Run functions
Enterprise	ArcGIS Pro, ArcGIS Enterprise

Explore Sample Datasets

See our processing pipeline in action. View samples and documentation for our current priority processes:

Statistical Products: Census data and other quantitative datasets.
Foundation: Core geospatial layers including address points, road networks, and buildings.
Orthoimagery: High-resolution orthoimagery.

High-Level Overview

Note: The data sources in the diagram below are prioritized from left to right, reflecting our current focus on processing high-value statistical, foundational, and orthoimagery datasets first.

Get Involved

We are actively looking for members and partners who want to help scale the Canadian open data ecosystem. Whether you represent an institution or are an individual developer, there are several ways to support the mission.

Infrastructure Support & Global Mirroring

To safeguard against data loss and improve global access speeds, we are seeking Global Infrastructure Partners to help host and distribute our datasets.

Selective Mirroring: We are looking for academic institutions, research organizations, or infrastructure partners worldwide interested in hosting mirrors of specific, high-value dataset subsets (e.g., Census, foundation, and orthoimagery data).
Build Infrastructure: We are seeking partners to host our build and ETL infrastructure. If you have available compute resources (High-CPU/RAM instances or specialized CI/CD runners) to help accelerate our data transformation layer, please contact us.

Contributing & Feedback

We welcome any type of feedback, from bug reports in our transformation logic to suggestions on data schema improvements. For technical feedback or to contribute to specific data pipelines, please visit the relevant repositories:

Join the conversation at #dataforcanada:matrix.org to chat, or provide feedback directly on the GitHub repositories listed above.

License

This project is licensed under the MIT License.
