A production-style multi-source feedback analytics platform built for product managers. Aggregates real app store reviews, runs hybrid sentiment analysis, detects trending issues, and generates stakeholder-ready PDF reports — all from a clean Streamlit dashboard.
Main dashboard — 400 real Play Store reviews analysed across 6 tabs with KPI overview
- 4 data sources — Google Play Store (live API), Apple App Store (RSS), CSV/Excel upload, Mock data
- Hybrid sentiment analysis — VADER + TextBlob ensemble with per-review confidence scores
- Trend detection — Daily sentiment time series, anomaly detection, keyword spike analysis
- Issue prioritization — Weighted priority scoring with urgency flagging and category clustering
- 6-tab Streamlit dashboard — Trends, Issues, Reviews, Categories, PDF Report, Competitor Benchmark
- PDF report generation — Professional weekly report with 4 embedded charts via ReportLab
- Competitor benchmarking — Side-by-side sentiment comparison against any rival app
feedback_intelligence/
├── app.py # Streamlit dashboard (entry point)
├── config.py # Central configuration — thresholds, keywords, app IDs
├── requirements.txt
├── data/
│   └── samples/
│       └── sample_reviews.csv # Sample CSV for testing the upload feature
├── reports/ # Generated PDF reports saved here
└── src/
    ├── models.py # Shared data models (Feedback, Issue, TrendPoint)
    ├── pipeline.py # Orchestrator — wires all modules together
    ├── ingestion/
    │   ├── playstore.py # Google Play Store scraper (google-play-scraper)
    │   ├── appstore.py # Apple App Store RSS/JSON fetcher
    │   ├── csv_importer.py # Flexible CSV/Excel importer (auto column detection)
    │   └── mock_data.py # Realistic synthetic review generator
    ├── processing/
    │   ├── analyzer.py # Hybrid sentiment engine (VADER + TextBlob)
    │   └── categorizer.py # Auto-categorization, urgency detection, priority scoring
    ├── intelligence/
    │   └── trend_detector.py # Daily trends, z-score anomaly, keyword spikes, issue clustering
    └── actions/
        └── reports.py # PDF report generator with embedded matplotlib charts
git clone <your-repo-url>
cd feedback_intelligence
python -m venv venv
# Windows
venv\Scripts\activate
# Mac / Linux
source venv/bin/activate
pip install -r requirements.txt
streamlit run app.py
Open http://localhost:8501 in your browser.
| Control | Description |
|---|---|
| Mock Data | Generates 50–500 synthetic reviews — no API needed, great for demos |
| Google Play Store | Enter comma-separated app IDs e.g. com.whatsapp, com.snapchat.android |
| ⚔️ Competitor App ID | Enter a rival app ID to compare side-by-side in the Benchmark tab |
| Apple App Store | Enter numeric App IDs e.g. 310633997 for Spotify |
| Upload CSV / Excel | Upload your own survey or feedback export |
| Run Analysis | Triggers data fetching + full sentiment + priority pipeline |
| Filters | Filter by date range, source, sentiment label, or urgency flag |
| Tab | What it shows |
|---|---|
| 📉 Trends | Sentiment over time line chart, daily sentiment mix, daily review volume |
| 🔥 Issues | Recurring problems ranked by priority score with sample reviews |
| 💬 Reviews | Full sortable table — date, source, rating, sentiment, score, category, priority |
| 🗂️ Categories | Pie chart by category, avg sentiment per category, source breakdown |
| 📄 Report | One-click professional PDF with 4 embedded charts |
| ⚔️ Benchmark | Side-by-side competitor comparison — KPIs, sentiment dist, rating dist, winner banner |
Trends tab — Daily sentiment time series (Jan–Mar), sentiment mix stacked bar, and review volume
The trend engine computes:
- Daily sentiment aggregates — average compound score per day across all reviews
- Sentiment direction — slope-based detection (improving / declining / stable)
- Z-score anomaly detection — flags days where review volume spikes beyond 2 standard deviations
- Keyword spike detection — identifies words that appear significantly more in recent reviews vs baseline
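Two of the computations listed above can be sketched with the standard library alone. This is a minimal illustration, not the code in trend_detector.py: the function names, the use of a population standard deviation, the least-squares slope, and the 0.01 slope tolerance are all assumptions made for the example.

```python
from statistics import mean, pstdev


def volume_anomalies(daily_counts, z_threshold=2.0):
    """Return indices of days whose review volume z-score exceeds the threshold."""
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    if sigma == 0:
        return []  # a flat series has no outliers
    return [i for i, c in enumerate(daily_counts) if (c - mu) / sigma > z_threshold]


def sentiment_direction(daily_scores, tolerance=0.01):
    """Classify the least-squares slope of the daily average sentiment."""
    n = len(daily_scores)
    if n < 2:
        return "stable"
    x_mean = (n - 1) / 2
    y_mean = mean(daily_scores)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(daily_scores))
    den = sum((x - x_mean) ** 2 for x in range(n))
    slope = num / den
    if slope > tolerance:
        return "improving"
    if slope < -tolerance:
        return "declining"
    return "stable"


# Hypothetical data: nine ordinary days followed by a volume spike.
counts = [20, 22, 19, 21, 20, 23, 18, 21, 20, 95]
print(volume_anomalies(counts))
print(sentiment_direction([0.10, 0.05, -0.02, -0.10, -0.15]))
```

With very short series a z-score threshold of 2 is hard to exceed, so a detector like this is most useful once several weeks of daily counts are available.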
Issues tab — Top issues ranked by priority score with frequency, avg sentiment, and sample reviews
Each review is scored using a weighted formula:
priority = 0.4 × |negative_sentiment|
+ 0.1 × rating_penalty (lower stars → higher priority)
+ 0.2 × recency_weight (recent reviews weighted more)
Reviews containing urgent keywords (crash, bug, broken, error, freeze, refund…) with a low star rating receive a 1.5× priority boost and are flagged as urgent.
Issues are then clustered by category (App Crash, Performance, Login/Auth, Data Loss, etc.) and ranked by their aggregate priority score.
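The scoring rule can be sketched as follows, using only the three weighted terms and the 1.5× boost stated above. How rating_penalty and recency_weight are normalized, the 30-day recency window, and the "2 stars or fewer" cutoff for a low rating are assumptions for illustration; the real definitions live in categorizer.py and config.py.

```python
import re
from datetime import date

# Subset of the urgent keywords named above; the full list is in config.py.
URGENT_KEYWORDS = {"crash", "bug", "broken", "error", "freeze", "refund"}


def priority_score(text, rating, sentiment_score, review_date, today,
                   low_rating_cutoff=2, recency_window_days=30):
    """Return (priority, urgent) for one review. Normalizations are assumed."""
    # Only the negative part of the sentiment score contributes.
    negative = abs(min(sentiment_score, 0.0))
    # Assumed scaling: 1 star -> 1.0, 5 stars -> 0.0.
    rating_penalty = (5 - rating) / 4
    # Assumed linear decay from 1.0 (today) to 0.0 (window or older).
    age_days = (today - review_date).days
    recency_weight = min(1.0, max(0.0, 1 - age_days / recency_window_days))

    score = 0.4 * negative + 0.1 * rating_penalty + 0.2 * recency_weight

    words = set(re.findall(r"[a-z]+", text.lower()))
    urgent = bool(words & URGENT_KEYWORDS) and rating <= low_rating_cutoff
    if urgent:
        score *= 1.5  # urgency boost
    return round(score, 3), urgent


today = date(2024, 3, 1)
print(priority_score("App keeps crashing, total crash after update", 1, -0.8, today, today))
print(priority_score("Great app, love it", 5, 0.7, date(2024, 2, 15), today))
```

Matching on whole words keeps "crashing" from counting unless "crash" also appears; a stemmed or substring match would be a reasonable alternative if recall matters more than precision.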
Benchmark tab — Snapchat vs Instagram: sentiment distribution, star ratings, avg sentiment score, winner banner
Enter your app ID and a competitor's Play Store ID in the sidebar. The Benchmark tab fetches up to 200 live reviews for the competitor and renders:
- Side-by-side KPI metrics (total reviews, positive %, negative %, avg sentiment, urgent count)
- Grouped sentiment distribution bar chart
- Star rating distribution comparison (1★ to 5★)
- Average sentiment score bar with the winner highlighted
- A summary banner announcing which app wins on sentiment
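The side-by-side metrics and the winner banner reduce to a small aggregation over already-analysed reviews. The sketch below assumes each review is a dict with sentiment_label, sentiment_score, and urgent keys, and that the winner is decided by average sentiment alone; both are illustrative assumptions rather than the dashboard's actual data model.

```python
from statistics import mean


def kpis(reviews):
    """Aggregate the benchmark KPIs for one app's analysed reviews."""
    total = len(reviews)
    if total == 0:
        return {"total": 0, "positive_pct": 0.0, "negative_pct": 0.0,
                "avg_sentiment": 0.0, "urgent": 0}
    labels = [r["sentiment_label"] for r in reviews]
    return {
        "total": total,
        "positive_pct": 100 * labels.count("positive") / total,
        "negative_pct": 100 * labels.count("negative") / total,
        "avg_sentiment": mean(r["sentiment_score"] for r in reviews),
        "urgent": sum(1 for r in reviews if r["urgent"]),
    }


def winner(name_a, kpis_a, name_b, kpis_b):
    """Pick the app with the higher average sentiment, or report a tie."""
    if kpis_a["avg_sentiment"] == kpis_b["avg_sentiment"]:
        return "tie"
    return name_a if kpis_a["avg_sentiment"] > kpis_b["avg_sentiment"] else name_b


# Hypothetical analysed reviews for two apps.
ours = [
    {"sentiment_label": "positive", "sentiment_score": 0.6, "urgent": False},
    {"sentiment_label": "negative", "sentiment_score": -0.4, "urgent": True},
    {"sentiment_label": "positive", "sentiment_score": 0.5, "urgent": False},
    {"sentiment_label": "neutral", "sentiment_score": 0.0, "urgent": False},
]
rival = [
    {"sentiment_label": "negative", "sentiment_score": -0.7, "urgent": True},
    {"sentiment_label": "negative", "sentiment_score": -0.2, "urgent": False},
    {"sentiment_label": "positive", "sentiment_score": 0.3, "urgent": False},
    {"sentiment_label": "neutral", "sentiment_score": 0.02, "urgent": False},
]
ours_kpis, rival_kpis = kpis(ours), kpis(rival)
print(ours_kpis)
print(winner("our_app", ours_kpis, "rival_app", rival_kpis))
```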
The system uses a two-stage ensemble:
- VADER (Valence Aware Dictionary and sEntiment Reasoner) — rule-based, runs on every review, fast
- TextBlob — polarity + subjectivity scoring, used to cross-validate VADER
Both analyzers vote on the label. The winner is chosen by weighted confidence score.
Output per review:
| Field | Description |
|---|---|
| sentiment_label | positive / neutral / negative |
| confidence | 0.0 to 1.0 — how certain the ensemble is |
| sentiment_score | Compound score from -1.0 to +1.0 |
Thresholds: positive ≥ 0.05 · neutral strictly between -0.05 and 0.05 · negative ≤ -0.05
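The thresholding and voting step might look like the sketch below. It takes the two raw scores as plain floats so it runs without either library installed. The 0.6 / 0.4 analyzer weights, the way confidence is derived from the winning share of the vote, and the choice to report the VADER compound as sentiment_score are assumptions for illustration, not the behaviour of analyzer.py.

```python
POSITIVE_THRESHOLD = 0.05
NEGATIVE_THRESHOLD = -0.05


def label_for(score):
    """Map a score in [-1, 1] to a sentiment label using the thresholds above."""
    if score >= POSITIVE_THRESHOLD:
        return "positive"
    if score <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"


def ensemble(vader_compound, textblob_polarity,
             vader_weight=0.6, textblob_weight=0.4):
    """Weighted vote between the two analyzers (weights are assumed values)."""
    votes = {}
    for score, weight in ((vader_compound, vader_weight),
                          (textblob_polarity, textblob_weight)):
        label = label_for(score)
        votes[label] = votes.get(label, 0.0) + weight
    winning_label = max(votes, key=votes.get)
    # Confidence here is the winning label's share of the total vote weight.
    confidence = votes[winning_label] / (vader_weight + textblob_weight)
    return {
        "sentiment_label": winning_label,
        "confidence": round(confidence, 2),
        "sentiment_score": vader_compound,
    }


print(ensemble(0.62, 0.40))   # both analyzers agree
print(ensemble(-0.30, 0.02))  # VADER says negative, TextBlob says neutral
```

In practice the two inputs would come from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` in vaderSentiment and `TextBlob(text).sentiment.polarity` in TextBlob.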
The weekly PDF report is generated with ReportLab + Matplotlib and includes:
- Executive summary with KPI table (total reviews, positive %, negative %, avg rating, urgent count)
- Fig 1 — Sentiment distribution donut chart
- Fig 2 — Daily sentiment trend line chart
- Fig 3 — Top issues by priority score (horizontal bar)
- Fig 4 — Reviews by source and sentiment (grouped bar)
- Source breakdown table
- Critical issues table (category, mentions, priority, trend)
- Recommended actions
Minimum requirement: a text / review column. All other columns are auto-detected by name.
| Standard Field | Accepted Column Names |
|---|---|
| text (required) | text, review, feedback, comment, content, body, message |
| rating | rating, score, stars, star_rating |
| date | date, timestamp, created_at, submitted, review_date |
| author | author, user, username, reviewer, customer |
| title | title, subject, headline |
| app_id | app_id, app, product, product_id |
Sample file: data/samples/sample_reviews.csv
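Column auto-detection of this kind can be sketched as a lookup from normalized header names to the standard fields in the table above. The alias lists are taken from that table; the function name, the case-insensitive exact matching, and the inline sample data are assumptions, and csv_importer.py may match more loosely.

```python
import csv
import io

COLUMN_ALIASES = {
    "text": ["text", "review", "feedback", "comment", "content", "body", "message"],
    "rating": ["rating", "score", "stars", "star_rating"],
    "date": ["date", "timestamp", "created_at", "submitted", "review_date"],
    "author": ["author", "user", "username", "reviewer", "customer"],
    "title": ["title", "subject", "headline"],
    "app_id": ["app_id", "app", "product", "product_id"],
}


def detect_columns(header):
    """Map standard field names to the matching column names in the header."""
    # Normalize "Star Rating" / "star_rating" / " STAR_RATING " to one form.
    normalized = {name.strip().lower().replace(" ", "_"): name for name in header}
    mapping = {}
    for field, aliases in COLUMN_ALIASES.items():
        for alias in aliases:
            if alias in normalized:
                mapping[field] = normalized[alias]
                break
    if "text" not in mapping:
        raise ValueError("no text / review column found")
    return mapping


# Hypothetical export with non-standard header names.
sample = "Review,Stars,Created_At,Username\nLove the new update,5,2024-03-01,ana\n"
reader = csv.DictReader(io.StringIO(sample))
mapping = detect_columns(reader.fieldnames)
rows = [{field: row[col] for field, col in mapping.items()} for row in reader]
print(mapping)
print(rows[0])
```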
All thresholds, keywords, and defaults are in one place:
# Default Play Store apps to fetch
PLAYSTORE_APP_IDS = ["com.yourapp"]
# Tune sentiment thresholds
SENTIMENT_POSITIVE_THRESHOLD = 0.05
SENTIMENT_NEGATIVE_THRESHOLD = -0.05
# Add your own urgent keywords
URGENT_KEYWORDS = {"crash", "bug", "broken", "error", "freeze", "refund", ...}
# Add or edit feedback categories
CATEGORY_KEYWORDS = {
    "My Category": ["keyword1", "keyword2"],
    ...
}
# Number of top issues shown in PDF report
REPORT_TOP_ISSUES = 10
Run this from the project root to verify the core pipeline works without the dashboard:
python -c "
from src.ingestion.mock_data import generate_mock_reviews
from src.processing.analyzer import get_engine
from src.processing.categorizer import enrich_feedback
reviews = generate_mock_reviews(10)
engine = get_engine()
reviews = engine.analyze_batch(reviews)
reviews = enrich_feedback(reviews)
for r in reviews:
    print(r.sentiment_label, r.priority_score, '|', r.text[:60])
"
| Criteria | Status | Evidence |
|---|---|---|
| Multi-Source Integration (2+ sources) | ✅ | Play Store + App Store + CSV + Mock — 4 sources |
| Sentiment Analysis with confidence scores | ✅ | VADER + TextBlob ensemble, per-review confidence |
| Trend Detection | ✅ | Daily time series, slope detection, z-score anomaly, keyword spikes |
| Issue Prioritization | ✅ | Weighted formula, urgency boost, category clustering |
| Streamlit Dashboard with filters | ✅ | 6 tabs, date / source / sentiment / urgency filters |
| PDF Reports with charts and issues | ✅ | 4 embedded matplotlib charts, KPI + issues tables |
| Modular code structure | ✅ | 12 separate modules, clean ingestion/processing/intelligence/actions split |
| Error handling | ✅ | try/except in all fetchers, ImportError guards, graceful empty returns |
| Documentation | ✅ | This README — install guide, CSV format, config, arch, smoke test |
- Word clouds — Visual keyword frequency per sentiment bucket
- Email alerts — SMTP notification when urgent count spikes above threshold
- Auto-responses — Use OpenAI API to draft replies to negative reviews
- Slack integration — Post urgent reviews to a Slack channel in real time
- Multi-language support — Add langdetect + googletrans before analysis
- Scheduled reports — Auto-generate and email PDF every Monday morning
| Layer | Technology |
|---|---|
| Dashboard | Streamlit |
| Sentiment (fast) | VADER (vaderSentiment) |
| Sentiment (accuracy) | TextBlob |
| Visualization | Plotly (dashboard) + Matplotlib (PDF charts) |
| PDF Generation | ReportLab |
| Data Processing | Pandas, NumPy |
| Play Store | google-play-scraper |
| App Store | Apple RSS/JSON public API |
| Utilities | python-dateutil, openpyxl |
NITHIN R BHARADWAJ