A real-time web application that classifies text into six emotions using TF-IDF and Logistic Regression. Built with Streamlit, the app provides predictions, confidence scores, and performance metrics.
The dataset includes three CSV files:
- `training.csv`
- `test.csv`
- `validation.csv`

Each file contains:
- `text`: the user's message
- `label`: an integer from 0 to 5 representing the emotion

Emotion Mapping:
| Label | Emotion |
|---|---|
| 0 | 😢 Sadness |
| 1 | 😊 Joy |
| 2 | ❤️ Love |
| 3 | 😠 Anger |
| 4 | 😨 Fear |
| 5 | 😲 Surprise |
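
As an illustration of the file layout and the label mapping above, here is a minimal sketch that reads rows with `text` and `label` columns and translates each label into its emotion name. The `read_rows` helper and the in-memory sample are hypothetical and not taken from the project code.

```python
import csv
import io

# Label-to-emotion mapping, copied from the table above.
EMOTIONS = {0: "Sadness", 1: "Joy", 2: "Love", 3: "Anger", 4: "Fear", 5: "Surprise"}


def read_rows(handle):
    """Yield (text, emotion_name) pairs from a CSV with `text` and `label` columns."""
    for row in csv.DictReader(handle):
        label = int(row["label"])
        if label not in EMOTIONS:
            raise ValueError(f"unexpected label: {label}")
        yield row["text"], EMOTIONS[label]


# Hypothetical two-row sample standing in for one of the CSV files.
sample = io.StringIO("text,label\ni feel great today,1\ni am so scared,4\n")
rows = list(read_rows(sample))
```

The same function should work on a real file by passing `open("training.csv", newline="", encoding="utf-8")` instead of the in-memory sample.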

Preprocessing:
- Lowercasing
- Punctuation removal
- Extra whitespace cleanup
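
The three preprocessing steps could be sketched as follows. The function name `clean_text` is an assumption, and the project's actual cleaning code may differ in detail (for example in how it treats non-ASCII punctuation).

```python
import re
import string

# Translation table that deletes ASCII punctuation characters.
# Assumption: non-ASCII punctuation such as curly quotes is left untouched.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = text.translate(_PUNCT_TABLE)
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text("  I'm SO   happy!!  ")
```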

Vectorization:
- TF-IDF with 5000 features
- Removes English stopwords
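
A vectorizer with these settings can be configured in scikit-learn roughly as shown below. The three example sentences are made up for illustration; any other parameters are left at their defaults, which may not match the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Cap the vocabulary at 5000 terms and drop scikit-learn's built-in English stopwords.
vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")

# Hypothetical sample messages, not taken from the dataset.
docs = [
    "i feel so happy today",
    "the news left everyone angry",
    "i am afraid of the dark",
]

# Rows are documents, columns are TF-IDF weighted vocabulary terms.
X = vectorizer.fit_transform(docs)
```

With a corpus this small the vocabulary is far below the 5000-term cap; the limit only takes effect on larger training sets.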

Model:
- `LogisticRegression(max_iter=1000)` from scikit-learn
- Trained on the `training.csv` set
- Evaluated on the `test.csv` set
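
One way to combine the vectorizer and classifier is a scikit-learn pipeline, sketched here on a tiny made-up training set with only three of the six labels. Whether the project uses a pipeline or keeps the two objects separate is an assumption; the confidence score is taken from `predict_proba`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the real model is trained on training.csv.
train_texts = [
    "i feel so sad and lonely",
    "this is the happiest day of my life",
    "i am furious about what happened",
    "i feel miserable and heartbroken",
    "we had a wonderful joyful evening",
    "that makes me really angry",
]
train_labels = [0, 1, 3, 0, 1, 3]

model = make_pipeline(
    TfidfVectorizer(max_features=5000, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)

# Class probabilities for one message; the largest one serves as the confidence score.
probabilities = model.predict_proba(["i am so sad today"])[0]
predicted_label = model.classes_[probabilities.argmax()]
confidence = probabilities.max()
```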

Evaluation Metrics:
- Accuracy score
- Classification report
- Confusion matrix (visualized via Plotly)
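
The three metrics can be computed with `sklearn.metrics` as in the sketch below. The `y_true` and `y_pred` lists are invented for illustration and are not results from this project.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

EMOTION_NAMES = ["Sadness", "Joy", "Love", "Anger", "Fear", "Surprise"]
LABELS = list(range(6))

# Hypothetical ground-truth and predicted labels, not real model output.
y_true = [0, 1, 1, 3, 4, 5, 2, 0]
y_pred = [0, 1, 2, 3, 4, 5, 2, 1]

accuracy = accuracy_score(y_true, y_pred)

# Per-class precision, recall, and F1 as a printable text report.
report = classification_report(
    y_true, y_pred, labels=LABELS, target_names=EMOTION_NAMES, zero_division=0
)

# 6x6 matrix: rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_true, y_pred, labels=LABELS)
```

The `cm` array could then be handed to a heatmap function such as `plotly.express.imshow` for the visualization; the exact plotting call used by the app is not shown here.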

Streamlit UI:
- Styled layout
- Real-time predictions
- Example sentence testing
- Confidence bar display
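
A stripped-down version of the prediction view might look like the sketch below. The helper names `top_emotion` and `render`, the widget labels, and the idea of passing in a `predict_proba` callable are all assumptions made for illustration; styling and example sentences are omitted.

```python
from typing import Callable, Sequence, Tuple

EMOTIONS = ["Sadness", "Joy", "Love", "Anger", "Fear", "Surprise"]


def top_emotion(probabilities: Sequence[float]) -> Tuple[str, float]:
    """Return the most likely emotion name and its probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return EMOTIONS[best], float(probabilities[best])


def render(predict_proba: Callable[[str], Sequence[float]]) -> None:
    """Draw a text box, the predicted emotion, and a confidence bar."""
    # Imported lazily so top_emotion can be used without Streamlit installed.
    import streamlit as st

    st.title("Emotion Classifier")
    text = st.text_area("Enter a message")
    if text.strip():
        emotion, confidence = top_emotion(predict_proba(text))
        st.write(f"Predicted emotion: {emotion} ({confidence:.0%})")
        st.progress(confidence)  # accepts a float between 0.0 and 1.0
```

Because Streamlit reruns the script on every input change, calling `render` at module level with the trained model's probability function is enough to get updated predictions as the text changes.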

Install the dependencies using pip:
pip install -r requirements.txt