prrattyushh/Study-Hour-Performance-Predictor

📊 Study Hour Performance Predictor

Project Overview

This project implements a Simple Linear Regression model using Python and NumPy to predict a student's final marks based on the number of hours they spend studying.

This model serves as a practical introduction to the fundamental concepts of machine learning, including data standardization, implementing the Gradient Descent algorithm from scratch, and real-time prediction using the final trained weights.

Key Features

Custom Gradient Descent Implementation: The model's weights (slope and intercept) are learned using the Gradient Descent optimization algorithm, executed over 50 epochs.

Feature Scaling: The input feature (StudyHours) is standardized with StandardScaler so that Gradient Descent converges efficiently.

Training Visualization: Includes a real-time animation (using matplotlib.animation) to visualize the regression line converging on the data and the corresponding reduction in Mean Squared Error (MSE) loss during training.

Interactive Prediction: After training, the script enters an interactive mode, allowing the user to input new study hour values and receive an immediate prediction of the potential marks.
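The features above can be sketched end to end in a few lines of NumPy. This is a minimal illustration, not the notebook's actual code: the data is a randomly generated stand-in for Studyhours.csv, the learning rate of 0.1 is an assumption, and the standardization is written by hand using the same formula StandardScaler applies (subtract the mean, divide by the population standard deviation).

```python
import numpy as np

# Hypothetical stand-in data; the real project loads Studyhours.csv instead.
rng = np.random.default_rng(0)
hours = rng.uniform(1.0, 10.0, size=50)
marks = 30.0 + 6.0 * hours + rng.normal(0.0, 2.0, size=50)

# Standardize the feature: zero mean, unit (population) standard deviation.
mu, sigma = hours.mean(), hours.std()
x_scaled = (hours - mu) / sigma

# Design matrix with a bias column, so theta = [intercept, slope].
X = np.column_stack([np.ones_like(x_scaled), x_scaled])
theta = np.zeros(2)

learning_rate = 0.1  # assumed value, not taken from the project
epochs = 50          # matches the epoch count stated in this README
history_loss = []

for _ in range(epochs):
    residual = X @ theta - marks
    history_loss.append(np.mean(residual ** 2))  # MSE before this update
    gradient = (2.0 / len(marks)) * (X.T @ residual)
    theta -= learning_rate * gradient


def predict(raw_hours):
    """Predict marks for raw (unscaled) study hours using the trained theta."""
    return theta[0] + theta[1] * (raw_hours - mu) / sigma
```

Because the feature is standardized, the two columns of the design matrix are uncorrelated and on the same scale, which is why a single fixed learning rate can bring both weights close to their final values within 50 epochs in this sketch.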

Results

After 50 epochs of training, the model converged to a low loss and yields the following prediction equation, where $\theta_0$ is the intercept and $\theta_1$ is the slope applied to the standardized study hours:

$$\text{Marks} = \theta_0 + \theta_1 \cdot (\text{Scaled Study Hours})$$

Final Mean Squared Error (MSE) Loss: not yet recorded in this README; it is the last entry of the history_loss array after training.

Final Weights ($\theta$): not yet recorded in this README; they are the values of the theta array after the final epoch.

Once converted back to the original StudyHours scale, this equation gives the fitted regression line, which approximates the least-squares best linear fit to the data.
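The conversion back to the original scale can be written out explicitly. Assuming $x$ denotes the raw study hours and $\mu$ and $\sigma$ are the mean and standard deviation that the scaler learned from the training data (these three symbols are introduced here for illustration), substituting the scaled value into the equation above gives:

$$\text{Marks} = \theta_0 + \theta_1 \cdot \frac{x - \mu}{\sigma} = \left(\theta_0 - \frac{\theta_1 \mu}{\sigma}\right) + \frac{\theta_1}{\sigma} \cdot x$$

So on the original scale the slope is $\theta_1 / \sigma$ marks per study hour and the intercept is $\theta_0 - \theta_1 \mu / \sigma$.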

Data

The model is trained on the synthetic dataset Studyhours.csv.

| Feature | Description |
| --- | --- |
| StudyHours | The input feature (X) |
| Marks | The target variable (y) |

Dependencies

This project requires the following Python libraries:

numpy (for numerical operations and matrix math)

pandas (for data loading and handling)

matplotlib (for plotting and animation)

scikit-learn (imported as sklearn; specifically StandardScaler for preprocessing)

You can install these dependencies using pip:

pip install numpy pandas matplotlib scikit-learn

How to Run

Environment: Run the code in a Jupyter environment like Google Colab.

Upload Data: Ensure both the Colab notebook (.ipynb file) and the data file (Studyhours.csv) are in the same directory or uploaded to the Colab environment.

Execute Cells: Run all cells in the notebook.

Interactive Mode: After the Gradient Descent training loop completes, the script will prompt you for input:

Enter the number of study hours to predict marks (or type 'quit' to exit):

Predict: Enter a number (e.g., 6.0) to see the predicted marks. Type quit to continue to the animated plots.
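The interactive step can be sketched as a small loop. This is an assumption about how such a prompt might be structured, not the notebook's actual code: `run_prompt`, its `read` and `write` parameters, the message wording for invalid input, and the placeholder model `10.0 * h` are all hypothetical. The reader and writer are injectable so the loop can be exercised with scripted input instead of a live keyboard.

```python
def run_prompt(predict, read=input, write=print):
    """Ask for study hours until the user types 'quit'."""
    prompt = ("Enter the number of study hours to predict marks "
              "(or type 'quit' to exit): ")
    while True:
        text = read(prompt).strip()
        if text.lower() == "quit":
            break
        try:
            hours = float(text)
        except ValueError:
            # Non-numeric input: report it and ask again.
            write("Please enter a number or 'quit'.")
            continue
        write(f"Predicted marks: {predict(hours):.2f}")


# Scripted demonstration with a placeholder model (not the trained one).
scripted = iter(["6.0", "abc", "quit"])
outputs = []
run_prompt(lambda h: 10.0 * h, read=lambda _: next(scripted), write=outputs.append)
```

In a notebook, calling `run_prompt(predict)` with the trained prediction function and the default `read` and `write` would give the keyboard-driven behaviour described in the steps above.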
