This project demonstrates sentiment analysis using DistilBERT, a smaller and faster version of the BERT model. The goal is to analyze text and classify the sentiment into three categories: Positive, Neutral, or Negative.
- Sentiment Analysis: The task of identifying the sentiment expressed in a piece of text (positive, neutral, or negative).
- DistilBERT: A pre-trained transformer model that is smaller, faster, and more efficient than the original BERT, while still delivering strong performance for text classification tasks.
- Fine-tuning: The process of adapting a pre-trained model to a specific task (like sentiment analysis) by training it on labeled data.
- Python: The primary programming language used for model training and evaluation.
- Transformers Library: A popular library by Hugging Face that provides pre-trained transformer models (like BERT, DistilBERT, etc.) for NLP tasks.
- Gradio: A Python library that enables the creation of user-friendly interfaces for machine learning models, allowing users to easily input text and receive sentiment predictions.
The model is trained on the Sp1786/multiclass-sentiment-analysis-dataset dataset from the Hugging Face Hub, which pairs text samples with corresponding sentiment labels. Each example is classified as positive, neutral, or negative.
Fine-Tuning DistilBERT:
We start with a pre-trained DistilBERT model, which has already been trained on a large corpus of text. We then fine-tune it on the labeled sentiment dataset to adapt it to this specific task.
Training & Evaluation:
The model is trained with a supervised learning approach, where the sentiment label of each piece of text is known in advance. After training, the model's performance is evaluated using metrics such as accuracy and F1 score.
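Both metrics are straightforward to compute with scikit-learn. A small worked example on hypothetical labels and predictions (the label encoding 0 = negative, 1 = neutral, 2 = positive is an assumption):

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical gold labels and model predictions for the three classes:
# 0 = negative, 1 = neutral, 2 = positive
y_true = [2, 0, 1, 2, 1, 0, 2, 1]
y_pred = [2, 0, 1, 1, 1, 0, 2, 2]

accuracy = accuracy_score(y_true, y_pred)             # fraction of correct predictions
macro_f1 = f1_score(y_true, y_pred, average="macro")  # F1 averaged over the 3 classes

print(f"Accuracy: {accuracy:.3f}")  # 6 of 8 correct -> 0.750
print(f"Macro F1: {macro_f1:.3f}")
```

Macro-averaged F1 weights the three classes equally, which matters when the sentiment classes are imbalanced.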
User Interface:
A simple and intuitive Gradio interface is built to allow users to input text and receive a sentiment prediction. This makes the model accessible to users who may not be familiar with coding or machine learning.
- User-Friendly Interface: Easily input text and get real-time sentiment predictions.
- Fast & Efficient: Thanks to the use of DistilBERT, the model is optimized for both performance and speed.
- High Accuracy: Fine-tuning DistilBERT on labeled sentiment data yields accurate classification results.
