GPU Price Scraper & Dashboard
🎯 Objective
Build a Python-based web scraper that collects GPU pricing data across multiple cloud providers and online marketplaces, saves it into a CSV file, and displays the results on a simple web page dashboard.
🔹 Data Sources
Target Sites (Cloud Providers + Marketplaces)
The intern should check real, public-facing pricing pages such as:
- AWS EC2 Pricing (NVIDIA A100, V100, T4, H100 instances)
- Google Cloud GPU Pricing
- Azure GPU Pricing
- Paperspace / CoreWeave (GPU Cloud)
- Optional – Retail Sites (for hardware GPUs)
🔹 Technical Tasks
1. Web Scraper (Python)
- Use requests + BeautifulSoup (or selenium if needed).
- Extract GPU type, price/hour, region (if cloud), or price/unit (if retail).
- Normalize field names (e.g., provider, gpu_model, price_usd, unit).
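The normalization step could look something like the minimal sketch below. It assumes the scraper has already pulled raw strings out of a page and that the price is quoted in USD. The helper name normalize_row and its argument shapes are hypothetical; only the four field names come from the task list above.

```python
import re

# Matches the first number in a price string such as "$3.06 per hour" or "1,599.00".
_PRICE_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def normalize_row(provider, gpu_text, price_text, unit):
    """Return one record that uses the normalized field names.

    Hypothetical helper: the arguments are an assumption about what
    a scraper would hand over, not a fixed interface.
    """
    match = _PRICE_RE.search(price_text)
    if match is None:
        raise ValueError(f"no price found in {price_text!r}")
    return {
        "provider": provider.strip(),
        "gpu_model": gpu_text.strip(),
        # Assumes the scraped price is already in USD.
        "price_usd": float(match.group().replace(",", "")),
        "unit": unit,  # e.g. "hour" for cloud rental, "unit" for retail
    }


# Placeholder input for illustration only; not real pricing data.
row = normalize_row("ExampleCloud", " A100 ", "$3.06 per hour", "hour")
print(row)
```

Keeping this step separate from the page-fetching code means each provider's parser only has to produce raw strings, and the output format stays the same across providers.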
2. CSV Output
- Write the results to gpu_prices.csv.
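A minimal sketch of the CSV step, using only the standard library. The exact schema is not spelled out here, so the column list below is an assumption that reuses the normalized field names from the scraper task. write_prices is a hypothetical helper name.

```python
import csv

# Assumed column order; taken from the normalized field names, not a confirmed schema.
FIELDNAMES = ["provider", "gpu_model", "price_usd", "unit"]


def write_prices(rows, path="gpu_prices.csv"):
    """Write normalized rows to a CSV file with a header line."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


# Placeholder row for illustration only; not real pricing data.
sample_rows = [
    {"provider": "ExampleCloud", "gpu_model": "A100", "price_usd": 1.23, "unit": "hour"},
]
write_prices(sample_rows)
```

Opening the file in "w" mode overwrites the previous run, which matches a "latest snapshot" file.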
3. Web Page Dashboard
- Build a simple Flask (or Streamlit) app.
- Load gpu_prices.csv and display its contents.
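The load-and-display part of the dashboard can be sketched without any web framework, so it runs with the standard library alone. The functions load_rows and render_table are hypothetical names, and the in-memory CSV stands in for gpu_prices.csv. A real app would hand the result to Flask or use Streamlit's own table widgets instead.

```python
import csv
import html
import io


def load_rows(handle):
    """Read CSV rows from an open text file into a list of dicts."""
    return list(csv.DictReader(handle))


def render_table(rows):
    """Render rows as a plain HTML table, escaping every cell."""
    if not rows:
        return "<p>No prices scraped yet.</p>"
    columns = list(rows[0].keys())
    head = "".join(f"<th>{html.escape(col)}</th>" for col in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(row[col])}</td>" for col in columns) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# In-memory stand-in for gpu_prices.csv; the row is a placeholder, not real pricing.
sample_csv = io.StringIO("provider,gpu_model,price_usd,unit\nExampleCloud,A100,1.23,hour\n")
page = render_table(load_rows(sample_csv))
print(page)
```

In a Flask app, a view function could return this string directly. With Streamlit, passing the rows to st.dataframe would replace render_table entirely.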
🔹 Challenges for the Intern
- Dynamic pages → some sites require scraping HTML, others expose structured JSON.
- Currency normalization → convert all prices to USD if multiple currencies appear.
- Different units → hourly cloud rental vs one-time retail purchase.
- Automation → script should be runnable daily (via cron or Databricks job).
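Two of these challenges, currency normalization and mixed units, can be sketched as below. The rate table is an assumption made up for illustration and is not current market data. The names ASSUMED_USD_RATES, to_usd, and comparable are hypothetical.

```python
# Placeholder exchange rates (USD per one unit of the currency).
# These numbers are invented for illustration; a real run would need
# current rates from a source of your choosing.
ASSUMED_USD_RATES = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25}


def to_usd(amount, currency):
    """Convert an amount to USD using the assumed rate table."""
    try:
        rate = ASSUMED_USD_RATES[currency.upper()]
    except KeyError:
        raise ValueError(f"no rate configured for {currency!r}") from None
    return round(amount * rate, 4)


def comparable(rows, unit):
    """Keep only rows priced in the same unit, so hourly rental
    is never compared directly with a one-time purchase."""
    return [row for row in rows if row["unit"] == unit]


# Placeholder rows for illustration only; not real pricing data.
rows = [
    {"provider": "ExampleCloud", "price_usd": to_usd(2.0, "eur"), "unit": "hour"},
    {"provider": "ExampleRetail", "price_usd": to_usd(1000, "USD"), "unit": "unit"},
]
hourly = comparable(rows, "hour")
print(hourly)
```

Raising on an unknown currency, rather than silently passing the number through, keeps unconverted prices out of the price_usd column.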
🔹 Deliverables
- Python Scraper: gpu_scraper.py
  - Configurable list of providers & URLs.
  - Outputs gpu_prices.csv.
- Web Dashboard: app.py (Flask or Streamlit)
  - Table of scraped GPU prices.
  - Bar chart comparing providers.
- README.md
  - How to run the scraper.
  - How to launch the dashboard.
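One possible shape for the configurable provider list in gpu_scraper.py is sketched below. Everything here is an assumption: the ProviderConfig class, its fields, and the select_providers helper are hypothetical, and the URLs are example.com placeholders, not real pricing pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    url: str
    unit: str  # "hour" for cloud rental, "unit" for retail purchase


# Hypothetical entries: the URLs are placeholders on example.com,
# not the providers' real pricing pages.
PROVIDERS = [
    ProviderConfig("ExampleCloud", "https://example.com/cloud/gpu-pricing", "hour"),
    ProviderConfig("ExampleRetail", "https://example.com/store/gpus", "unit"),
]


def select_providers(names=None):
    """Return all configured providers, or only those named (case-insensitive)."""
    if names is None:
        return list(PROVIDERS)
    wanted = {name.lower() for name in names}
    return [p for p in PROVIDERS if p.name.lower() in wanted]


for provider in select_providers(["examplecloud"]):
    print(provider.name, provider.url)
```

Keeping the list as data makes adding a provider a one-line change, and the same structure could later be loaded from a JSON or YAML file.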
🔹 Stretch Goals (Optional)
- Store results historically (append to CSV) → analyze price trends over time.
- Deploy the dashboard to Heroku or Render, or rebuild it with Databricks SQL dashboards.
- Add alerts: flag when GPU prices drop below a threshold.
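The history and alert ideas could be sketched roughly as follows. The added date column, the helper names append_history and price_alerts, and the threshold value are all assumptions for illustration. An in-memory buffer stands in for a file opened in append mode.

```python
import csv
import io
from datetime import date

# Assumed history layout: the normalized fields plus a run date.
HISTORY_FIELDS = ["date", "provider", "gpu_model", "price_usd", "unit"]


def append_history(handle, rows, run_date):
    """Append one dated line per row; write the header only if the file is empty."""
    writer = csv.DictWriter(handle, fieldnames=HISTORY_FIELDS)
    if handle.tell() == 0:
        writer.writeheader()
    for row in rows:
        writer.writerow({"date": run_date.isoformat(), **row})


def price_alerts(rows, threshold_usd):
    """Return the rows whose price is below the threshold."""
    return [row for row in rows if float(row["price_usd"]) < threshold_usd]


# Placeholder rows for illustration only; not real pricing data.
rows = [
    {"provider": "ExampleCloud", "gpu_model": "A100", "price_usd": 1.23, "unit": "hour"},
    {"provider": "ExampleCloud", "gpu_model": "H100", "price_usd": 4.56, "unit": "hour"},
]
history = io.StringIO()
append_history(history, rows, date(2024, 1, 1))
append_history(history, rows, date(2024, 1, 2))
alerts = price_alerts(rows, threshold_usd=2.00)
print(history.getvalue())
print(alerts)
```

With a real file, opening it with open(path, "a", newline="") gives the same behavior across daily runs. A threshold only makes sense among rows that share a unit, so hourly and retail prices would need separate thresholds.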