A Comparison of Machine Learning Algorithms for Customer Churn Prediction
Abstract— Today's fiercely competitive business environment has given significant importance to customer churn, a term used for the loss of customers, which poses a significant challenge to organizations across various industries. To mitigate revenue loss and sustain growth, companies are increasingly turning to machine learning (ML) algorithms for customer churn prediction. This review paper provides a concise examination of ML algorithms' role in predicting customer churn, a pivotal concern for businesses seeking to sustain growth and profitability. The review begins by underlining the significance of customer churn in today's competitive landscape, highlighting the impact of data-driven approaches in this context. The paper then explores various ML algorithms suitable for churn prediction and compares the results to find the most suitable algorithm for a few real-world scenarios, namely telecommunication, banking and e-commerce. The review found that Decision Tree Classification, Random Forest Classification, AdaBoost and XGBoost Classification algorithms were optimal for churn prediction. Additionally, the review covers the implementation of the findings in a churn prediction application.

Index Terms— Machine Learning, Churn Prediction, Data-driven Approaches, Gradient Boosting Algorithms, Customer Churn

I. INTRODUCTION
Contemporary businesses have large volumes of data to work with and grow from. Every human activity in this age generates data, from smartwatch readings to the choice of turning on a ceiling fan at home. What matters, however, is how companies handle this data. Data, being the new oil, has numerous uses, which only need to be uncovered with innovative analytics, insightful interpretation, and strategic application to unlock its full potential for driving business growth and societal advancement.

It is widely recognized that retaining a current customer is more cost-effective than acquiring a new one [2, 5]. One of the key metrics in this regard is customer churn. In simple words, customer churn refers to the portion of a customer base that stops engaging with a company's products or services within a specified timeframe. Thus, when predicting a customer’s future with respect to a company, the event that they cease their involvement with any of the company’s products or services is called churn.

Companies may create their own datasets to keep track of customers who “churned” or stopped using their products or services. Traditional statistical methods have long been used in the analysis of churn. However, advances in computing technology have brought a rapid increase in the use of ML algorithms for churn prediction. These algorithms enable businesses not only to identify churn patterns but also to use predictive analytics for more proactive and targeted retention efforts. Additionally, the scalability and adaptability of ML models make them valuable in handling vast and complex datasets, providing businesses with a competitive edge in customer retention strategies. Deep Learning models are also available. However, owing to their high computational power requirements, as well as longer model training time, it was decided not to include them in this comparative study, as all other models were comparatively less demanding.

This paper intends to discover the impact of various ML algorithms on real-world scenarios. It compares the accuracy and the time required for each of the nine algorithms to classify a new data item. The analysis made by this study will be utilized in a churn predictor application.

II. LITERATURE REVIEW
While looking for ML algorithms to apply to the real-world scenarios, this study considered two factors: first, the accuracy of the model, and second, the time taken by the model to train. The latter was of equal importance, as the ultimate intent was the creation of a client-centered portal for predicting customer churn.

Recent years have witnessed the use of Decision Tree (DT) based algorithms, as well as Ensemble Learning methods, for churn prediction [1]. Decision Tree algorithms
III. ML CLASSIFICATION MODELS
Based on the review of ML algorithms for classification, the following models were chosen to classify the datasets with:
1. Logistic Regression (LR)
2. Random Forest Classification (RF)
3. Support Vector Machines (SVM)
4. AdaBoost (ADAB)
5. XGBoost (XGB)
6. Decision Tree Classification (DT)
7. Naïve Bayes Classification (NB)
8. K-Nearest Neighbors (KNN) Classification
9. A basic artificial neural network (ANN)

A basic artificial neural network was also created to classify the datasets to compare the accuracy score as well as the time it takes to classify data on a deep learning model, both of which were key factors in the choice of models for the end-user application.

IV. DATA PREPROCESSING
The customer churn datasets have columns like Price, Geography, Tenure and so on. These columns are expected to

4.4 Class Imbalance: In some cases, some variables are imbalanced. These variables will be balanced using the OverSampling method, so the size of the minority values is increased to a size similar to the majority before balancing.

4.5 Feature Selection: In the final stage of data preprocessing, the task is to choose the most suitable features that serve as indicators for churn.

V. DATASETS
This review analysed the aforementioned ML algorithms on the following datasets
A. Telecom Company Dataset
Telecom companies today have to keep up with huge competition, as a lot of companies have sprung up, providing services and programs at prices which aim to capture the price-sensitive consumer. They need to be aware of the patterns of modern-day consumers and adapt their strategies in order to stay afloat in this dynamic industry.
The Telecom company dataset [16] has 7043 rows and 21 columns of customer data about their usage of the company’s
Authorized licensed use limited to: BANGALORE INSTITUTE OF TECHNOLOGY. Downloaded on March 05,2025 at 04:04:15 UTC from IEEE Xplore. Restrictions apply.
2023 6th International Conference on Advances in Science and Technology (ICAST)
phone and internet services. This dataset includes features (columns) like “PhoneService”, “InternetService”, “StreamingTV”, “StreamingMovies” etc., which describe the services a customer has subscribed to from the company. The only irrelevant data column in this dataset was the “customerID” column, hence it was dropped.

B. Bank Customer Dataset
Banks benefit from understanding the factors that influence a client's decision to depart from the company. Churn prevention allows banks to develop loyalty programs and retention campaigns to keep as many customers as possible.
This dataset [17] has 10000 rows and 18 columns: RowNumber, CustomerId, Surname, CreditScore, Geography, Gender, Age, Tenure, Balance, NumOfProducts, HasCrCard, IsActiveMember, EstimatedSalary, Exited, Complain, Satisfaction Score, Card Type, Point Earn. The variables RowNumber, CustomerId and Surname will be dropped, as these will not be useful for model training.

Below are the accuracy and runtime comparisons for the 3 datasets, and their analysis.

The aforementioned selection of ML algorithms performed as expected on this dataset.
1. Naïve Bayes and KNN Classification took the least time to train but provided low accuracies on testing.
2. A similar result was observed for Logistic Regression, in terms of training time and accuracy.
3. Gradient Boosting algorithms (AdaBoost and XGBoost) did take slightly more time to train (Fig. 2); however, they provided good accuracy scores (Fig. 1).
4. From TABLE I, it is evident that the best performer out of the ML algorithms is Random Forest.
5. Even though the ANN provided the highest accuracy, it took a comparatively large amount of time to train, which reinforced our prior assumptions about its performance.

Figure 2 shows the model training time of different algorithms on the telecom company dataset.
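The accuracy and training-time comparison discussed above can be illustrated with a minimal sketch. It is not the authors' code: the data is a synthetic stand-in generated with scikit-learn's make_classification, the hyperparameters are library defaults or arbitrary choices, MLPClassifier stands in for the basic ANN, and XGBoost is left out so that the sketch depends on scikit-learn only. The helper name benchmark is hypothetical.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def benchmark(models, X_train, X_test, y_train, y_test):
    """Return {name: (test accuracy, training time in seconds)}."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)  # only the fit call is timed
        train_seconds = time.perf_counter() - start
        accuracy = accuracy_score(y_test, model.predict(X_test))
        results[name] = (accuracy, train_seconds)
    return results


# Assumption: a synthetic, already-numeric table stands in for a churn dataset.
X, y = make_classification(
    n_samples=1000, n_features=15, weights=[0.75, 0.25], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Abbreviations follow the model list in Section III; settings are illustrative.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(),
    "ADAB": AdaBoostClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "ANN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
}

results = benchmark(models, X_train, X_test, y_train, y_test)
# Print models from highest to lowest test accuracy.
for name, (accuracy, seconds) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:5s} accuracy={accuracy:.3f} train_time={seconds:.4f}s")
```

The numbers printed by this sketch depend on the synthetic data and the machine it runs on, so they should not be read as a reproduction of the reported figures or tables.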
When compared with the accuracy scores on the Telecom Churn dataset (Fig. 1), the selection of ML algorithms has shown a similar pattern of performance, as observed in Fig. 3, with the exception of Logistic Regression. This pattern isn’t observed in the next dataset.

Fig. 3. Accuracy scores of various algorithms on Bank Churn dataset.

The performance of ML algorithms (TABLE III) on the E-commerce dataset was comparatively different from that on the Telecom dataset and the Bank dataset.
1. While Decision Tree Classification couldn’t provide a good accuracy score on the previous datasets, it gave an impressive 93% accuracy score on this dataset.
2. Random Forest Classification and Gradient Boosting Algorithms (AdaBoost and XGBoost) displayed consistent levels of accuracy when compared to their previous performances.
3. Support Vector Machines (SVM) showed an improved accuracy when compared to its performance on the Bank dataset.
4. A tremendous improvement was recorded by Naïve Bayes and KNN Classification (Fig. 5).
5. Logistic Regression, which wasn’t among the top models for the earlier cases, showcased an exceptional improvement in this scenario.
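All of the comparisons above were made on data that had passed through the class-balancing step named in 4.4. The paper does not say which oversampling implementation was used, so the following is only a minimal sketch of plain random oversampling, assuming NumPy arrays and scikit-learn's resample utility. The function name random_oversample and the toy data are hypothetical.

```python
import numpy as np
from sklearn.utils import resample


def random_oversample(X, y, random_state=0):
    """Duplicate minority-class rows (sampling with replacement) until
    every class has as many rows as the largest class."""
    X = np.asarray(X)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for cls, count in zip(classes, counts):
        X_cls = X[y == cls]
        if count < target:
            # Draw the missing rows from the existing rows of this class.
            extra = resample(
                X_cls, replace=True, n_samples=target - count,
                random_state=random_state,
            )
            X_cls = np.vstack([X_cls, extra])
        X_parts.append(X_cls)
        y_parts.append(np.full(len(X_cls), cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Toy data: 8 retained customers (label 0) and 2 churned customers (label 1).
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)

X_bal, y_bal = random_oversample(X, y)
print(np.bincount(y_bal))  # both classes now have 8 rows
```

In practice such a step is usually applied to the training split only, so that duplicated rows do not also appear in the test data and inflate the accuracy score.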
Overall, the ML algorithms performed at par with the ANN (DL) model (Fig. 6).

Once the data is entered, the application moves on to the final result, providing the churn status as shown in Fig. 8, as well as the probability that churn occurs.
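How a trained classifier can yield both a churn status and a churn probability is sketched below. This is an illustration rather than the application's actual code: it assumes a scikit-learn model trained on synthetic data in which label 1 means churn, and the function name churn_report and the 0.5 threshold are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Assumption: synthetic numeric features stand in for the entered customer data.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)


def churn_report(model, customer_features, threshold=0.5):
    """Return (status, probability of churn) for one customer's feature vector."""
    # predict_proba columns are ordered like model.classes_;
    # pick the column that corresponds to label 1 (churn).
    churn_column = list(model.classes_).index(1)
    probability = float(model.predict_proba([customer_features])[0][churn_column])
    status = "Churn" if probability >= threshold else "No churn"
    return status, probability


status, probability = churn_report(model, X[0])
print(f"{status} (probability of churn: {probability:.2f})")
```

Reporting the probability alongside the status lets the user of such a portal judge how close a customer is to the decision threshold instead of seeing only a yes or no answer.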
used in today’s world. This would help in the realization of better and faster predictions, for Customer churn as well as for many other purposes. DL models would even enable the client to use many more parameters for calculation, making predictions as real as possible.
In the meantime, there could be studies for optimization of DL models to meet the time limitations as well as performance benchmarks.

REFERENCES
[1] Wang, Xing & Nguyen, Khang & Nguyen, Binh. (2020). Churn Prediction using Ensemble Learning. 56-60. 10.1145/3380688.3380710.
[2] A. De Caigny, K. Coussement, and K.W. De Bock. 2018. A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees. European Journal of Operational Research 269, 2 (2018), 760–772.
[3] Hu, Xin & Yang, Yanfei & Chen, Lanhua & Zhu, Siru. (2020). Research on a Customer Churn Combination Prediction Model Based on Decision Tree and Neural Network. 129-132. 10.1109/ICCCBDA49378.2020.9095611.
[4] Martínez-García, M. et al. (2023). Learning Logistic Regression with Unknown Features. In IEEE CAI 2023, pp. 298-299. doi: 10.1109/CAI54212.2023.00133.
[5] K. Sandhya Rani, Shaik Thaslima, N.G.L. Prasanna, R. Vindhya and P. Srilakshmi, Analysis of Customer Churn Prediction in Telecom Industry Using Logistic Regression (June 10, 2021). International Journal of Innovative Research in Computer Science & Technology (IJIRCST) ISSN: 2347-5552, Volume-9, Issue-4, July 2021.
[6] Hassonah, M. A. et al. (2019). Churn Prediction: KNN vs. Decision Trees. In Sixth HCT ITT 2019, pp. 182-186. doi: 10.1109/ITT48889.2019.9075077.
[7] Feng, L. (2022). Customer Churn Prediction: Borderline-SMOTE and Random Forest. In IEEE ICPICS 2022, pp. 803-807. doi: 10.1109/ICPICS55264.2022.9873702.
[8] Zhang, J., & Dong, Y. (2022). Customer Loss Identification and Factor Analysis in Mobile Operators with XGBoost. In 2022 NetCIT.
[9] Wu, X., & Meng, S. (2016). E-commerce Customer Churn Prediction with Enhanced SMOTE and AdaBoost. In 2016 ICSSSM.
[10] I. Ullah, B. Raza, A. K. Malik, M. Imran, S. U. Islam and S. W. Kim, "A Churn Prediction Model Using Random Forest: Analysis of Machine Learning Techniques for Churn Prediction and Factor Identification in Telecom Sector," in IEEE Access, vol. 7, pp. 60134-60149, 2019, doi: 10.1109/ACCESS.2019.2914999.
[11] A. Alamsyah and N. Salma, "A Comparative Study of Employee Churn Prediction Model," 2018 4th International Conference on Science and Technology (ICST), Yogyakarta, Indonesia, 2018, pp. 1-4, doi: 10.1109/ICSTC.2018.8528586.
[12] K. Gupta, A. Hardikar, D. Gupta and S. Loonkar, "Forecasting Customer Churn in the Telecommunications Industry," 2022 IEEE Bombay Section Signature Conference (IBSSC), Mumbai, India, 2022, pp. 1-5, doi: 10.1109/IBSSC56953.2022.10037334.
[13] A. Raj and D. Vetrithangam, "Machine Learning and Deep Learning technique used in Customer Churn Prediction: - A Review," 2023 International Conference on Computational Intelligence and Sustainable Engineering Solutions (CISES), Greater Noida, India, 2023, pp. 139-144, doi: 10.1109/CISES58720.2023.10183530.
[14] Y. Y. Win and C. G. Vung, "Churn Prediction Models Using Gradient Boosted Tree and Random Forest Classifiers," 2023 IEEE Conference on Computer Applications (ICCA), Yangon, Myanmar, 2023, pp. 271-275, doi: 10.1109/ICCA51723.2023.10181933.
[15] D. Dasari and P. S. Varma, "Employing Various Data Cleaning Techniques to Achieve Better Data Quality using Python," 2022 6th International Conference on Electronics, Communication and Aerospace Technology, Coimbatore, India, 2022, pp. 1379-1383, doi: 10.1109/ICECA55336.2022.10009079.
[16] Rodan, Ali & Faris, Hossam & Al-sakran, Jamal & Al-Kadi, Omar. (2014). A Support Vector Machine Approach for Churn Prediction in Telecom Industry. International journal on information.
[17] D. T. Barus, R. Elfarizy, F. Masri and P. H. Gunawan, "Parallel Programming of Churn Prediction Using Gaussian Naïve Bayes," 2020 8th International Conference on Information and Communication Technology (ICoICT), Yogyakarta, Indonesia, 2020, pp. 1-4, doi: 10.1109/ICoICT49345.2020.9166319.
[18] https://www.kaggle.com/datasets/blastchar/telco-customer-churn
[19] https://www.kaggle.com/datasets/radheshyamkollipara/bank-customer-churn
[20] https://www.kaggle.com/datasets/ankitverma2010/ecommerce-customer-churn-analysis-and-prediction