GitHub - muskan-110/Churn-app

Customer Churn Prediction using Machine Learning

An end-to-end Machine Learning project that predicts whether a telecom customer is likely to churn or stay, built using XGBoost for model training and Streamlit for a fully interactive web interface.

🌐 Live Demo: https://churn-app-3umtgwznfthd2epx8fyu88.streamlit.app/

🏗️ Project Overview

This project demonstrates a complete ML pipeline, from data preprocessing and feature encoding through model training and evaluation to deployment as a Streamlit dashboard.

It aims to help telecom companies identify customers who are likely to discontinue services and take proactive retention actions.

🧩 Key Features

Interactive Streamlit Frontend – Clean, modern UI for easy data input and churn prediction
XGBoost Model – Gradient-boosted tree classifier, chosen for the best accuracy among the models compared
Label Encoding for Categorical Variables – The same category-to-integer mapping is used at training and inference time
Imbalanced Data Handling – Applied SMOTE (Synthetic Minority Oversampling Technique) to balance churn vs. non-churn classes
Modular Codebase – Easy to maintain, modify, and extend

📊 Data Processing & Model Training

All data cleaning, preprocessing, and model training steps are documented in the Colab Notebook. Here’s a summary of what was done:

1️⃣ Data Cleaning

Handled missing values in TotalCharges

Converted data types where necessary (e.g., numeric conversion of strings)

Removed redundant or highly correlated columns
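
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy rows, every column name other than `TotalCharges`, the median fill strategy, and the choice of `customerID` as the redundant column are all assumptions.

```python
import pandas as pd

# Toy rows standing in for the telecom data; all values are made up.
raw = pd.DataFrame({
    "customerID": ["A1", "B2", "C3", "D4"],              # hypothetical identifier column
    "tenure": [0, 12, 24, 5],
    "MonthlyCharges": [29.85, 56.95, 53.85, 70.70],
    "TotalCharges": [" ", "683.4", "1292.4", "353.5"],   # blank string = missing value
    "Churn": ["No", "No", "Yes", "Yes"],
})

df = raw.copy()

# Convert strings to numbers; anything unparseable (such as " ") becomes NaN.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

# Fill the missing values. The median is one possible choice (assumption).
df["TotalCharges"] = df["TotalCharges"].fillna(df["TotalCharges"].median())

# Drop a column that carries no predictive signal (assumed to be the ID here).
df = df.drop(columns=["customerID"])

print(df.dtypes)
```

Dropping the affected rows instead of filling them would be an equally reasonable choice when only a handful of values are missing.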

2️⃣ Feature Engineering

Encoded categorical features using LabelEncoder

Created derived features such as tenure groups

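Normalized numeric values to a common scale (this mainly benefits scale-sensitive models; tree-based models such as XGBoost are largely insensitive to feature scaling)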
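
As a rough sketch of the encoding and tenure-grouping steps above: the column names `Contract` and `InternetService`, the sample values, and the bin edges for the tenure groups are assumptions chosen for illustration, not taken from the notebook.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy rows; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "One year", "Two year", "Month-to-month"],
    "InternetService": ["DSL", "Fiber optic", "No", "DSL"],
    "tenure": [1, 14, 40, 70],
})

# Fit one LabelEncoder per categorical column and keep each fitted encoder,
# so the identical mapping can be reused at inference time.
label_encoders = {}
for col in ["Contract", "InternetService"]:
    encoder = LabelEncoder()
    df[col] = encoder.fit_transform(df[col])
    label_encoders[col] = encoder

# Derived feature: bucket tenure (in months) into groups.
# The bin edges below are hypothetical.
df["tenure_group"] = pd.cut(
    df["tenure"],
    bins=[0, 12, 24, 48, 72],
    labels=["0-12", "13-24", "25-48", "49-72"],
    include_lowest=True,
)

print(df)
```

`LabelEncoder` assigns integers in sorted order of the class labels, so the mapping depends only on the set of categories seen during fitting.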

3️⃣ Handling Imbalanced Data

Used SMOTE to oversample the minority (churn) class and achieve a balanced class distribution
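
To illustrate the idea behind SMOTE, the simplified NumPy sketch below creates synthetic minority rows by interpolating between a minority sample and one of its nearest minority neighbours. It is not a library implementation and not the notebook's code: the function `smote_sketch` and the toy data are hypothetical, and in practice a library implementation such as `SMOTE` from imbalanced-learn would typically be used (the source does not say which one was).

```python
import numpy as np

def smote_sketch(X_minority, n_new, k=2, seed=0):
    """Simplified SMOTE: interpolate between minority rows and their neighbours.

    Requires at least two minority rows.
    """
    rng = np.random.default_rng(seed)
    X_minority = np.asarray(X_minority, dtype=float)
    n = len(X_minority)
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority samples.
    diffs = X_minority[:, None, :] - X_minority[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)                  # a row is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]    # k nearest neighbours per row

    synthetic = np.empty((n_new, X_minority.shape[1]))
    for i in range(n_new):
        base = rng.integers(n)                       # pick a random minority row
        neighbour = neighbours[base, rng.integers(k)]
        gap = rng.random()                           # position on the segment, in [0, 1)
        synthetic[i] = X_minority[base] + gap * (X_minority[neighbour] - X_minority[base])
    return synthetic

# Toy features [tenure, MonthlyCharges]; values are made up.
X_stay = np.array([[60, 20.0], [48, 25.5], [72, 19.9], [36, 45.0],
                   [55, 30.0], [24, 50.0], [66, 24.0], [40, 35.0]])
X_churn = np.array([[2, 80.0], [5, 95.5], [1, 70.0]])

# Generate enough synthetic churn rows to match the majority class size.
X_new = smote_sketch(X_churn, n_new=len(X_stay) - len(X_churn))
X_balanced = np.vstack([X_stay, X_churn, X_new])
y_balanced = np.array([0] * len(X_stay) + [1] * (len(X_churn) + len(X_new)))

print(np.bincount(y_balanced))
```

Oversampling is normally applied to the training split only, so that the test set keeps the original class distribution.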

4️⃣ Model Selection

Trained multiple models (Decision Tree, Random Forest, XGBoost)

Selected XGBoost because it achieved the best accuracy of the three
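
A comparison like the one above might look roughly like this. The synthetic data from `make_classification`, the hyperparameters, and the train/test split are assumptions, so the scores say nothing about the project's actual results. XGBoost is added only when the `xgboost` package is installed, which keeps the sketch runnable without it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn data (roughly 70% stay, 30% churn).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.7, 0.3], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

candidates = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
try:
    from xgboost import XGBClassifier
    candidates["XGBoost"] = XGBClassifier(n_estimators=100, random_state=42)
except ImportError:
    pass  # xgboost not installed; compare the remaining models only

# Train each candidate and record its accuracy on the held-out split.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best_name = max(scores, key=scores.get)
print(scores, "->", best_name)
```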

5️⃣ Model Evaluation

Evaluated using:

Accuracy

Precision / Recall / F1-score

Confusion Matrix
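
The metrics listed above can be computed with scikit-learn as in the sketch below. The labels and predictions are hand-written toy values (1 = churn, 0 = stay), not results from this project.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Toy ground truth and predictions: 1 = churn, 0 = stay.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)     # share of correct predictions
precision = precision_score(y_true, y_pred)   # of predicted churners, how many churned
recall = recall_score(y_true, y_pred)         # of actual churners, how many were caught
f1 = f1_score(y_true, y_pred)                 # harmonic mean of precision and recall

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)

print(accuracy, precision, recall, f1)
print(cm)
```

On imbalanced data such as churn, accuracy alone can look good even when few churners are detected, which is why recall and the confusion matrix are worth reading alongside it.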

6️⃣ Model Export

Saved final model and label encoders as:

model_xgboost.pkl

label_encoders.pkl

These are later used by the Streamlit app for real-time predictions.
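
Saving and reloading the two artifacts could be done along these lines. This sketch assumes the standard `pickle` module (the source only gives the `.pkl` file names), uses a small decision tree as a stand-in for the trained XGBoost model, and writes to a temporary directory; the feature layout is hypothetical.

```python
import os
import pickle
import tempfile

from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Stand-ins for the real artifacts (assumptions for illustration).
label_encoders = {
    "Contract": LabelEncoder().fit(["Month-to-month", "One year", "Two year"]),
}
X = [[0, 1], [1, 12], [2, 40], [0, 3]]   # [encoded Contract, tenure]
y = [1, 0, 0, 1]
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# Write both objects to disk under the file names used by the project.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "model_xgboost.pkl")
encoders_path = os.path.join(out_dir, "label_encoders.pkl")

with open(model_path, "wb") as f:
    pickle.dump(model, f)
with open(encoders_path, "wb") as f:
    pickle.dump(label_encoders, f)

# Later (for example at app start-up), load them back.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)
with open(encoders_path, "rb") as f:
    loaded_encoders = pickle.load(f)

print(loaded_model.predict(X))
```

Pickle files should only be loaded from trusted sources, and the library versions used for loading should match those used for saving.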

🎨 Streamlit Frontend

The app allows users to input customer details and get instant predictions.
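
Behind the form, each submission has to be converted into the same encoded layout the model was trained on. The helper below is a hypothetical sketch of that step; `encode_customer`, the feature names, and the feature order are assumptions, not the app's actual code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

def encode_customer(raw, label_encoders, feature_order):
    """Turn one customer's raw form values into a one-row, model-ready DataFrame."""
    row = {}
    for name in feature_order:
        value = raw[name]
        if name in label_encoders:
            # Reuse the training-time mapping; an unseen category raises ValueError.
            value = int(label_encoders[name].transform([value])[0])
        row[name] = value
    return pd.DataFrame([row], columns=feature_order)

# Stand-in for the encoders loaded from label_encoders.pkl (assumption).
label_encoders = {
    "Contract": LabelEncoder().fit(["Month-to-month", "One year", "Two year"]),
}
features = ["Contract", "tenure", "MonthlyCharges"]

# Values as they might arrive from the input form.
customer = {"Contract": "Two year", "tenure": 30, "MonthlyCharges": 64.5}
X_input = encode_customer(customer, label_encoders, features)

print(X_input)
```

In a Streamlit app, the `customer` dictionary could be filled from widgets such as `st.selectbox` and `st.number_input`, with the loaded model's `predict` called on `X_input` when an `st.button` is pressed.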

⚙️ Tech Stack

Frontend -> Streamlit
Backend / Model -> XGBoost, Scikit-learn
Data Handling -> Pandas, NumPy
Visualization -> Matplotlib, Seaborn
Deployment -> Streamlit Cloud
Environment -> Python 3.10+
