- Haixin Liu
- Hanghai Li
This project analyzes MTA daily ridership trends in New York City to understand how different transit services have recovered since COVID-19. Our Streamlit dashboard compares subway, bus, LIRR, and Metro-North ridership over time and uses NYC COVID-19 case data as additional context.
- How have different MTA services recovered since COVID-19?
- How do weekday and weekend ridership patterns differ?
- How do changes in COVID-19 cases relate to changes in transit ridership?
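As a rough illustration of the weekday versus weekend question, the comparison can be sketched with the standard library alone. This is a minimal sketch, not the dashboard's actual code: the function name `weekday_weekend_means` and the sample ridership values are hypothetical and made up for the example.

```python
from datetime import date
from statistics import mean


def weekday_weekend_means(rows):
    """Return (weekday_mean, weekend_mean) for an iterable of (date, ridership) pairs.

    Hypothetical helper for illustration; raises statistics.StatisticsError
    if either group is empty.
    """
    rows = list(rows)
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    weekday = [riders for day, riders in rows if day.weekday() < 5]
    weekend = [riders for day, riders in rows if day.weekday() >= 5]
    return mean(weekday), mean(weekend)


# Made-up sample values, not taken from the MTA dataset.
sample = [
    (date(2023, 3, 6), 3_600_000),   # Monday
    (date(2023, 3, 7), 3_800_000),   # Tuesday
    (date(2023, 3, 11), 2_300_000),  # Saturday
    (date(2023, 3, 12), 1_900_000),  # Sunday
]

weekday_avg, weekend_avg = weekday_weekend_means(sample)
print(f"weekday mean: {weekday_avg:,.0f}, weekend mean: {weekend_avg:,.0f}")
```

The same split could be applied per service (subway, bus, LIRR, Metro-North) to compare how each one's weekend share has changed over time.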
- MTA Daily Ridership Data
  https://data.ny.gov/Transportation/MTA-Daily-Ridership-Data-Beginning-2020/vxuj-8kew
- NYC COVID-19 Daily Cases
  https://data.cityofnewyork.us/Health/Coronavirus-Data/rc75-m7u3
Both datasets are loaded into BigQuery for the Streamlit app:
- sipa-adv-c-bouncing-penguin.mta_data.daily_ridership
- sipa-adv-c-bouncing-penguin.mta_data.nyc_covid_cases
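To check that the tables loaded correctly, a quick preview query can help. The sketch below builds the fully qualified table IDs from the names listed above. The helper names (`table_id`, `preview_sql`, `preview`) are hypothetical, and `preview` assumes the `google-cloud-bigquery` package is installed and that credentials with access to the project are configured. The query uses `SELECT *` because the column names are not listed here.

```python
PROJECT = "sipa-adv-c-bouncing-penguin"
DATASET = "mta_data"


def table_id(table: str) -> str:
    """Return the fully qualified BigQuery table ID (project.dataset.table)."""
    return f"{PROJECT}.{DATASET}.{table}"


def preview_sql(table: str, limit: int = 5) -> str:
    """Build a small preview query; int() keeps the limit numeric."""
    return f"SELECT * FROM `{table_id(table)}` LIMIT {int(limit)}"


def preview(table: str, limit: int = 5) -> list:
    """Run the preview query and return the rows as dictionaries.

    Assumption: google-cloud-bigquery is installed and credentials are set up.
    The import is local so the helpers above work without the package.
    """
    from google.cloud import bigquery

    client = bigquery.Client(project=PROJECT)
    rows = client.query(preview_sql(table, limit)).result()
    return [dict(row.items()) for row in rows]


print(preview_sql("daily_ridership"))
print(preview_sql("nyc_covid_cases"))
```

Calling `preview("daily_ridership")` would then return the first few rows, provided the environment is authenticated against the project.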
- streamlit_app.py - homepage and project introduction
- pages/1_MTA_Ridership.py - main MTA ridership analysis
- pages/2_Second_Dataset.py - NYC COVID-19 context page
- utils.py - helper functions for cleaning and plotting
- validation.py - Pandera schema validation
- tests/ - unit tests for utility and validation code
- load_data_to_bq.py - script for loading both datasets into BigQuery
- LAB_10_WRITEUP.md - Lab 10 notes on data loading and performance
- Clone this repository:
  git clone https://github.com/advanced-computing/bouncing-penguin.git
- Create a virtual environment:
  python -m venv .venv
- Activate the virtual environment:
  - Mac/Linux:
    source .venv/bin/activate
  - Windows:
    .venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Load the BigQuery tables:
  python load_data_to_bq.py --dataset all
Run the Streamlit app locally:
streamlit run streamlit_app.py

You can still open mta_ridership_project.ipynb in Jupyter Notebook or VS Code for notebook-based exploration.
Lab 10 documentation lives in LAB_10_WRITEUP.md.