Customer Churn Prediction Project
Group C
Reporting Week: 29-05-2021 to 04-06-2021
Nehalkumar Jesadiya c0793769
Rohit Nanawati c0796684
Shah Razzakh Mohammed c0794302
Sharjeel Ahmed c0806695
Faculty Supervisor: Dr. Parisa Naraei
Machine Learning Lifecycle
It has the following stages:
1. Business Problem
2. Data Sourcing & ETL
3. Exploratory Data Analysis
4. Data Preparation
5. Model Training and Selection
6. Deployment and monitoring
Completed Tasks in Previous Week
We obtained a dataset with 7043 records and 21 features. We performed data wrangling
and initiated the EDA (Exploratory Data Analysis) to enable:
• A proactive retention strategy as opposed to a reactive one.
• Insights on subscribers' churn behavior.
• A move from business rules-based campaigns to analytics-led campaigns.
• A reduction in marketing spend.
We plan to proceed further with the EDA to gain more insights from the data and use
data visualizations to find the dependencies of 'Churn' on the features.
Completed Tasks in This Week
• This week, we completed data cleaning and began exploratory data analysis on the
dataset we had received.
• We removed the 'Customer ID' field from our dataset because it only serves as a
unique identifier and has no bearing on our investigation.
• The DataFrame's own index already identifies each row, so the field is redundant.
• Surprisingly, our dataset has no missing values.
• As a result, no missing-value processing was required.
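The cleaning steps above can be sketched in pandas as follows. This is a minimal sketch, not the team's actual notebook: the toy rows and the column names (customerID, tenure, MonthlyCharges, Churn) are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for the churn dataset; rows and column names are assumptions.
df = pd.DataFrame({
    "customerID": ["0001-A", "0002-B", "0003-C"],
    "tenure": [1, 34, 2],
    "MonthlyCharges": [29.85, 56.95, 53.85],
    "Churn": ["No", "No", "Yes"],
})

# Drop the identifier: it is unique per row and carries no predictive signal.
# The default integer index of the DataFrame still identifies each row.
cleaned = df.drop(columns=["customerID"])

# Count missing values per column; all zeros means no imputation is needed.
missing_per_column = cleaned.isna().sum()
print(missing_per_column)
```

Note that isna() only detects true missing markers (NaN/None); blank strings in a text column would need a separate check.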
The Outcomes
When we started looking into the data, we discovered that:
• 'Senior Citizen' is a categorical feature, so its 25%-50%-75% quartile summary is
not meaningful.
• About 75% of customers have been with the company for less than 55 months.
• The average monthly fee is USD 64.76, with 25% of customers paying
more than USD 89.85 per month.
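As a rough illustration of the summary statistics above, the sketch below computes quartiles only for the continuous columns and summarizes the 0/1 flag by class share instead. The toy values and the column names (SeniorCitizen, tenure, MonthlyCharges) are assumptions, so the numbers will not match the report.

```python
import pandas as pd

# Toy data; values and column names are assumptions for illustration.
df = pd.DataFrame({
    "SeniorCitizen": [0, 0, 1, 0, 1, 0],
    "tenure": [1, 34, 2, 45, 8, 72],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 99.65, 89.10],
})

# Quartiles (25%, 50%, 75%) are meaningful for continuous columns only.
numeric_summary = df[["tenure", "MonthlyCharges"]].describe()

# A 0/1 flag is better summarized by the share of each class.
senior_share = (
    df["SeniorCitizen"].map({0: "No", 1: "Yes"}).value_counts(normalize=True)
)

print(numeric_summary.loc[["mean", "25%", "50%", "75%"]])
print(senior_share)
```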
On analyzing the ‘Churn’, we reached the following conclusion:
• Data is highly imbalanced, ratio = 73:27
• So, we analyze the other features separately for each target value to get
some insights.
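One way to check the class ratio and compare features per target value is sketched below. The toy data and column names are assumptions; on the real dataset the class shares would correspond to the 73:27 ratio reported above.

```python
import pandas as pd

# Toy data; values and column names are assumptions for illustration.
df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 72, 5, 60],
    "MonthlyCharges": [70.0, 56.0, 80.0, 42.0, 99.0, 89.0, 75.0, 25.0],
    "Churn": ["Yes", "No", "Yes", "No", "No", "No", "No", "No"],
})

# Share of each target class; a large gap signals an imbalanced dataset.
class_share = df["Churn"].value_counts(normalize=True)

# Summarize the numeric features separately for each target value.
per_class_mean = df.groupby("Churn")[["tenure", "MonthlyCharges"]].mean()

print(class_share)
print(per_class_mean)
```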
Post Analyzation
Difficulties Encountered in Reporting Week
The analysis so far has revealed only trivial links between the features and churn,
although it has helped us understand the customer base, the nature of the customers, and
their preferences and behavior to some extent. We could not detect any specific pattern
in the monthly charges. These are the problems we need to solve.
We are planning to answer the following questions for the service operator:
1. What is the pattern in the monthly charges?
2. How should newly acquired customers be analyzed?
3. Which features have concrete links to our target variable, i.e. churn?
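One possible way to approach the first two questions is to compare churn rates across monthly-charge bands and across tenure segments. The sketch below is only an illustration: the band edges, the 6-month cutoff for a "new" customer, the toy data, and the column names are all assumptions, not choices made by the team.

```python
import pandas as pd

# Toy data; values and column names are assumptions for illustration.
df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 72, 5, 60],
    "MonthlyCharges": [70.0, 56.0, 80.0, 42.0, 99.0, 89.0, 75.0, 25.0],
    "Churn": ["Yes", "No", "Yes", "No", "No", "No", "No", "No"],
})
churned = df["Churn"].eq("Yes")

# Question 1: churn rate per monthly-charge band (band edges are assumptions).
charge_band = pd.cut(
    df["MonthlyCharges"], bins=[0, 50, 80, 120], labels=["low", "mid", "high"]
)
churn_by_band = churned.groupby(charge_band, observed=False).mean()

# Question 2: churn rate for newly acquired vs. established customers
# (the 6-month cutoff is an assumption).
segment = df["tenure"].le(6).map({True: "new", False: "established"})
churn_by_segment = churned.groupby(segment).mean()

print(churn_by_band)
print(churn_by_segment)
```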
Tasks to Be Completed in Next Week
For next week, our team has planned EDA (Exploratory Data Analysis) with the help
of data visualization, so we can figure out the reasons behind customer churn.
1. Perform the EDA.
2. Detect the links between churn and the features.
3. Use data visualization to extract the relation between churn and each feature.
4. Use the extracted relations to perform feature extraction.
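As a starting point for linking churn to a categorical feature, a row-normalized cross-tabulation gives the churn rate per category, which can then be plotted. This is a sketch under assumptions: the Contract column and the toy values are hypothetical examples, not results from the project.

```python
import pandas as pd

# Toy data; the Contract column and its values are assumptions.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Two year", "Month-to-month",
                 "One year", "Month-to-month", "Two year"],
    "Churn": ["Yes", "No", "Yes", "No", "No", "No"],
})

# Row-normalized crosstab: each row sums to 1, so the "Yes" column
# is the churn rate within that contract type.
table = pd.crosstab(df["Contract"], df["Churn"], normalize="index")
churn_rate = table["Yes"]

print(churn_rate.sort_values(ascending=False))
```

With matplotlib installed, churn_rate.plot(kind="bar") would turn the same table into a bar chart for the report.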
Thanks