Customer Churn Prediction Model

Customer churn is a major challenge for telecom businesses. Using machine learning, I built a classification model that predicts whether a customer is likely to leave (churn) based on their service usage, tenure, and demographics.

Goal:
Predict customer churn for a telecom company using Machine Learning.

Tech Stack:

Python
Pandas & NumPy (data wrangling)
Scikit-learn (modeling & evaluation)
Matplotlib & Seaborn (visualization)

🔍 Project Overview

Dataset: 5,000+ customer records
Approach: Data cleaning → EDA → Feature Engineering → Model Training → Evaluation
Result: Achieved 85% accuracy with a tuned Random Forest classifier.

📂 Workflow

flowchart TD
A[Data Collection] --> B[Data Cleaning]
B --> C[Exploratory Data Analysis]
C --> D[Feature Engineering]
D --> E[Model Training]
E --> F[Model Evaluation]
F --> G[Deployment/Insights]

🛠️ Steps

1. Data Cleaning

Handled missing values in TotalCharges.
Encoded categorical variables (e.g., Gender, Contract type).
Standardized numerical features.

import pandas as pd

# Load dataset
df = pd.read_csv("telecom_churn.csv")

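# Convert TotalCharges to numeric; unparseable entries (e.g. blank strings) become NaN
df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce')
# Fill the resulting gaps with the median (plain assignment avoids the chained
# inplace fillna that newer pandas versions warn about)
df['TotalCharges'] = df['TotalCharges'].fillna(df['TotalCharges'].median())

# Encode categorical features, leaving the Churn target untouched so that
# later steps can still reference df["Churn"]
features = pd.get_dummies(df.drop(columns='Churn'), drop_first=True)
df = features.join(df['Churn'])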
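The standardization step listed above does not appear in the snippet. A minimal sketch of how it might look with scikit-learn's StandardScaler follows; the `standardize` helper, the toy values, and the `tenure` and `MonthlyCharges` column names are assumptions for illustration, not taken from the project's dataset.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def standardize(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy of df with the given columns scaled to zero mean and unit variance."""
    out = df.copy()
    scaler = StandardScaler()
    out[columns] = scaler.fit_transform(out[columns])
    return out


# Toy data standing in for the real file; values and column names are made up.
toy = pd.DataFrame({
    "tenure": [1, 12, 24, 60],
    "MonthlyCharges": [29.85, 56.95, 53.85, 99.65],
    "TotalCharges": [29.85, 683.40, 1292.40, 5979.00],
})

scaled = standardize(toy, ["tenure", "MonthlyCharges", "TotalCharges"])
print(scaled.round(2))
```

In a train/test workflow the scaler would normally be fit on the training split only and then applied to the test split, so that test statistics do not leak into training.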

2. Exploratory Data Analysis (EDA)

Churn rate was ~26% overall.
Churn was higher among customers with month-to-month contracts.
Longer-tenure customers were less likely to churn.

import seaborn as sns
import matplotlib.pyplot as plt

sns.countplot(x="Churn", data=df)
plt.title("Churn Distribution")
plt.show()

📊 Insight: Customers on shorter contracts who also made multiple support calls were more likely to leave.
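Segment-level findings like the contract observation above can be checked with a simple group-by. The sketch below is illustrative only: it assumes a raw (not yet one-hot encoded) frame with a `Contract` column and a `Churn` column holding "Yes"/"No" labels, and the `churn_rate_by` helper and toy rows are hypothetical.

```python
import pandas as pd


def churn_rate_by(df: pd.DataFrame, group_col: str,
                  target_col: str = "Churn", positive: str = "Yes") -> pd.Series:
    """Share of churned customers within each value of group_col, highest first."""
    churned = df[target_col].eq(positive)  # boolean Series: True where the customer churned
    return churned.groupby(df[group_col]).mean().sort_values(ascending=False)


# Toy rows for illustration; not taken from the project's dataset.
toy = pd.DataFrame({
    "Contract": ["Month-to-month"] * 4 + ["One year"] * 2 + ["Two year"] * 2,
    "Churn": ["Yes", "Yes", "Yes", "No", "Yes", "No", "No", "No"],
})

rates = churn_rate_by(toy, "Contract")
print(rates)
```

The same helper could be pointed at a binned tenure column to inspect the tenure trend.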

3. Model Training & Tuning

from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

# Split data
X = df.drop("Churn", axis=1)
y = df["Churn"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)

# Evaluate
y_pred = rf.predict(X_test)
print(classification_report(y_test, y_pred))
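The snippet above fits a Random Forest with default hyperparameters, while the reported result is attributed to a tuned model. One way the tuning step could be done with the imported GridSearchCV is sketched below. The parameter grid, the `tune_random_forest` helper, and the synthetic data are assumptions for illustration; the grid actually used in the project is not stated.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split


def tune_random_forest(X_train, y_train, param_grid=None, cv=3):
    """Grid-search a Random Forest with cross-validation and return the fitted search."""
    if param_grid is None:
        # Illustrative grid only; not the values used in the project.
        param_grid = {
            "n_estimators": [50, 100],
            "max_depth": [None, 5],
            "min_samples_leaf": [1, 5],
        }
    search = GridSearchCV(
        RandomForestClassifier(random_state=42),
        param_grid=param_grid,
        cv=cv,
        scoring="accuracy",
    )
    search.fit(X_train, y_train)
    return search


# Synthetic, imbalanced data standing in for the encoded churn features.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, weights=[0.74, 0.26], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

search = tune_random_forest(X_tr, y_tr)
print("Best parameters:", search.best_params_)
print("Held-out accuracy:", round(search.score(X_te, y_te), 3))
```

Because GridSearchCV refits the best configuration on the full training split by default, `search.best_estimator_` can be passed straight to `classification_report` in place of `rf`.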

Best Result:

Accuracy: 85%
Precision: 82%
Recall: 80%
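To put these numbers in context, the short calculation below derives the F1 score implied by the reported precision and recall, and the accuracy a trivial "nobody churns" baseline would reach at the ~26% churn rate seen in EDA. It assumes precision and recall refer to the same class (the write-up does not say which) and that the test split has roughly the overall churn rate.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.82, 0.80   # reported above
churn_rate = 0.26                # approximate overall churn rate from EDA

f1 = f1_from(precision, recall)
# Always predicting the majority class is correct for the larger share of customers.
baseline_accuracy = max(churn_rate, 1 - churn_rate)

print(f"F1: {f1:.3f}")                                               # F1: 0.810
print(f"Majority-class baseline accuracy: {baseline_accuracy:.2f}")  # 0.74
```

Under these assumptions, the 85% accuracy is an improvement of about 11 percentage points over the majority-class baseline.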

4. Outcome

✅ Business Value:

Helps identify at-risk customers before they leave.
Enables targeted retention campaigns, which can reduce churn and increase revenue.
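As a sketch of how at-risk customers might be surfaced for a retention campaign, the example below ranks customers by predicted churn probability using `predict_proba`. The `rank_at_risk` helper, the synthetic features, and the choice of `1` as the churn label are assumptions for illustration; with string labels, `positive_label` would be set to the churn value instead.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def rank_at_risk(model, X: pd.DataFrame, positive_label=1, top_n: int = 5) -> pd.Series:
    """Return the top_n rows of X with the highest predicted churn probability."""
    # Locate the probability column that corresponds to the churn class.
    positive_idx = list(model.classes_).index(positive_label)
    proba = model.predict_proba(X)[:, positive_idx]
    scores = pd.Series(proba, index=X.index, name="churn_probability")
    return scores.nlargest(top_n)


# Synthetic stand-in for encoded customer features; not the project's data.
features, labels = make_classification(n_samples=200, n_features=6, random_state=0)
X_demo = pd.DataFrame(features, columns=[f"f{i}" for i in range(6)])
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_demo, labels)

at_risk = rank_at_risk(model, X_demo, positive_label=1, top_n=5)
print(at_risk)
```

In practice the ranking would be computed on current customers rather than on the training rows, and the cut-off would depend on campaign budget.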

🚀 Links

GitHub: View Code
Demo: See Live

✨ This project highlights my skills in data cleaning, exploratory analysis, and building ML models that solve real-world business problems.
