A machine learning project designed to predict the likelihood of converting educational leads. It helps institutions target prospective students more effectively, improving marketing strategies and enrollment rates.

surajwate/Education-Lead-Conversion-Model

Lead Scoring Prediction App

This repository contains the code for a Lead Scoring Prediction application. The application uses a logistic regression model to predict how likely a lead is to convert, given its input features. The project covers data preprocessing, model training, and a Streamlit-based web application for making predictions.

Overview

The Lead Scoring Prediction App is designed to help businesses predict the likelihood of a lead converting based on various features such as time spent on the website, lead origin, lead source, and notable activities. The model was trained using logistic regression with Recursive Feature Elimination (RFE) to select the most important features. The application is built using Streamlit, allowing users to input feature values and get predictions directly from the trained model.

Features

  • Data Preprocessing: Handles missing values, encodes categorical variables, scales numerical features, and selects important features using RFE.
  • Model Training: Trains a logistic regression model using the processed features.
  • Streamlit Web App: Provides a user-friendly interface for making predictions based on user input.
  • Logging: Tracks the process of data cleaning, preprocessing, and model training for easy debugging and monitoring.
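
To illustrate the logging feature, here is a minimal sketch of a logger factory built on Python's standard logging module. The function name get_logger, the logger name, and the message format are assumptions made for illustration; the repository's logging_utils.py may be organized differently.

```python
import logging
import sys


def get_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes timestamped messages to stdout.

    Hypothetical helper: the name and format are illustrative only.
    """
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Attach a handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s | %(name)s | %(levelname)s | %(message)s")
        )
        logger.addHandler(handler)
    return logger


# One logger per pipeline stage makes it easy to see where a message came from.
log = get_logger("data_cleaning")
log.info("Dropped %d columns with too many missing values", 3)
```

Calling get_logger with the same name again returns the same logger object, so each stage of the pipeline can request its logger without coordinating setup.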

Installation

To run this project locally, follow these steps:

  1. Clone the repository:

    git clone https://github.com/surajwate/Education-Lead-Conversion-Model.git
    cd Education-Lead-Conversion-Model
  2. Install dependencies using Poetry:

    poetry install
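  3. Activate the virtual environment (in Poetry 2.x, poetry shell is provided by the separate poetry-plugin-shell plugin; poetry run <command> works without it):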

    poetry shell
  4. Download or prepare the dataset:

    • Place the Leads.csv file in the appropriate directory or modify the code to point to your dataset.
  5. Run the Streamlit app:

    streamlit run streamlit_app/app.py

Project Structure

├── data_preprocessing.py         # Preprocessing script
├── data_cleaning.py              # Data cleaning script
├── log_reg_model.py              # Logistic regression model training script
├── log_reg_rfe_model.py          # Logistic regression with RFE model training script
├── logging_utils.py              # Logging utilities
├── model_dispatcher.py           # Model dispatcher script for easy model selection
├── model_utils.py                # Utility functions for model evaluation
├── streamlit_app/
│   └── app.py                    # Streamlit application script
├── models/                       # Directory containing saved models and preprocessing objects
├── README.md                     # Project documentation
└── requirements.txt              # Dependencies list (auto-generated by Poetry)
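
A model dispatcher is typically a small registry that maps a model name to an estimator. The sketch below shows one possible shape for it, assuming scikit-learn. The names MODEL_FACTORIES and get_model, the registry keys, and the hyperparameters are all assumptions; they are not taken from the repository's model_dispatcher.py.

```python
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Hypothetical registry. Keys mirror the training script names listed above;
# the hyperparameters (max_iter, n_features_to_select) are illustrative guesses.
MODEL_FACTORIES = {
    "log_reg": lambda: LogisticRegression(max_iter=1000),
    "log_reg_rfe": lambda: RFE(
        estimator=LogisticRegression(max_iter=1000),
        n_features_to_select=15,
    ),
}


def get_model(model_type: str):
    """Return a fresh, unfitted estimator for the given model type."""
    try:
        return MODEL_FACTORIES[model_type]()
    except KeyError:
        raise ValueError(
            f"Unknown model_type {model_type!r}; expected one of {sorted(MODEL_FACTORIES)}"
        ) from None


model = get_model("log_reg_rfe")
print(type(model).__name__)
```

Storing factories rather than estimator instances means every call returns a new unfitted object, so one training run cannot leak fitted state into another.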

Usage

Training the Model

Train the model by running main.py with the model type and the mode:

python main.py --model_type log_reg_rfe --mode train_full

This script will clean the data, preprocess it, select the most important features using RFE, and train a logistic regression model. The model, along with the preprocessing objects (like the scaler and encoder), will be saved in the models/ directory.
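
For readers curious how the two flags in the command above might be parsed, here is a minimal argparse sketch. It is not the repository's main.py: the function name build_parser, the defaults, and the list of model type choices (inferred from the script names log_reg_model.py and log_reg_rfe_model.py) are assumptions. Only train_full is known as a mode value, so the sketch does not restrict that flag.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags shown in the training command."""
    parser = argparse.ArgumentParser(description="Train a lead scoring model.")
    parser.add_argument(
        "--model_type",
        default="log_reg_rfe",
        choices=["log_reg", "log_reg_rfe"],  # assumed from the script names
        help="Which model to train.",
    )
    parser.add_argument(
        "--mode",
        default="train_full",
        help="Training mode; other accepted values are not documented here.",
    )
    return parser


# Parse the same arguments as the command above, passed explicitly
# so the sketch runs without touching sys.argv.
args = build_parser().parse_args(
    ["--model_type", "log_reg_rfe", "--mode", "train_full"]
)
print(args.model_type, args.mode)
```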

Running the Web App

After training the model, you can run the Streamlit app to make predictions:

streamlit run streamlit_app/app.py

Making Predictions

  • Open the local URL provided by Streamlit in your browser.
  • Input the feature values in the provided fields.
  • Click the Predict button to see the prediction result.
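
Behind the Predict button, a logistic regression model turns the feature values into a probability by applying the sigmoid function to a weighted sum. The sketch below shows that calculation in plain Python. The intercept, the coefficients, and the feature names are hypothetical placeholders; the real values come from the trained model.

```python
import math

# Hypothetical intercept and coefficients; a trained model supplies the real ones.
INTERCEPT = -1.0
COEFFICIENTS = {
    "total_time_spent_on_website": 1.1,  # scaled numeric feature
    "lead_origin_lead_add_form": 2.0,    # one-hot encoded flag
}


def conversion_probability(features: dict) -> float:
    """Apply the logistic (sigmoid) function to the weighted feature sum."""
    # Features missing from the input are treated as 0.
    z = INTERCEPT + sum(
        weight * features.get(name, 0.0) for name, weight in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


lead = {"total_time_spent_on_website": 0.5, "lead_origin_lead_add_form": 1.0}
p = conversion_probability(lead)
print(f"Conversion probability: {p:.2f}")
```

A probability above a chosen threshold (0.5 is a common default) would be reported as a likely conversion.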

Model and Preprocessing Details

Preprocessing

The preprocessing steps include:

  • Handling missing values by imputing or dropping columns as necessary.
  • Encoding categorical variables using one-hot encoding.
  • Scaling numerical features to standardize the data.
  • Selecting important features using Recursive Feature Elimination (RFE).
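
The first three steps above can be sketched with scikit-learn's ColumnTransformer. This is an illustration under stated assumptions, not the repository's data_preprocessing.py: the column names and toy values are made up, and the imputation strategies (median and most frequent) are one reasonable choice among several. The RFE step is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with hypothetical column names; the real Leads.csv has more columns.
df = pd.DataFrame({
    "time_on_site": [120.0, np.nan, 900.0, 45.0],
    "lead_source": ["Google", "Direct", np.nan, "Google"],
})

# Numeric columns: fill gaps with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric, ["time_on_site"]),
    ("cat", categorical, ["lead_source"]),
])

X = preprocess.fit_transform(df)
print(X.shape)
```

Setting handle_unknown="ignore" lets the fitted encoder accept categories at prediction time that it never saw during training, which matters when the web app receives free-form input.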

Model

The model is a logistic regression classifier trained on the features selected by Recursive Feature Elimination (RFE). RFE repeatedly fits the model and discards the weakest features until only the top contributors to the prediction remain.
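
The following sketch shows RFE with logistic regression on synthetic data. The dataset comes from make_classification rather than Leads.csv, and the number of features to keep (4) is an arbitrary assumption; the repository's actual setting is not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the preprocessed lead data.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4, random_state=0
)

# RFE refits the estimator and drops the weakest feature each round
# until only n_features_to_select remain.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=4)
selector.fit(X, y)

# Train the final classifier on the selected columns only.
X_selected = selector.transform(X)
model = LogisticRegression(max_iter=1000).fit(X_selected, y)

# Column 1 of predict_proba is the probability of the positive class.
proba = model.predict_proba(X_selected[:3])[:, 1]
kept = [i for i, keep in enumerate(selector.support_) if keep]
print("kept feature indices:", kept)
print("conversion probabilities:", proba.round(3))
```

The boolean mask selector.support_ records which columns survived, so the same selection can be applied to new leads before scoring them.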

License

This project is licensed under the MIT License - see the LICENSE file for details.
