Algorithms 17 00231
Article
Prediction of Customer Churn Behavior in the
Telecommunication Industry Using Machine Learning Models
Victor Chang 1, * , Karl Hall 2 , Qianwen Ariel Xu 1 , Folakemi Ololade Amao 2 , Meghana Ashok Ganatra 1
and Vladlena Benson 1
1 Department of Operations and Information Management, Aston Business School, Aston University,
Birmingham B4 7ET, UK; [email protected] (M.A.G.); [email protected] (V.B.)
2 School of Computing, Engineering and Digital Technologies, Teesside University,
Middlesbrough TS1 3BX, UK; [email protected] (K.H.); [email protected] (F.O.A.)
* Correspondence: [email protected] or [email protected]
Abstract: Customer churn is a significant concern, and the telecommunications industry has the
largest annual churn rate of any major industry at over 30%. This study examines the use of ensem-
ble learning models to analyze and forecast customer churn in the telecommunications business.
Accurate churn forecasting is essential for successful client retention initiatives to combat regular
customer churn. We used innovative and improved machine learning methods, including Decision
Trees, Boosted Trees, and Random Forests, to enhance model interpretability and prediction accuracy.
The models were trained and evaluated systematically by using a large dataset. The Random Forest
model performed best, with 91.66% predictive accuracy, 82.2% precision, and 81.8% recall. Our results
highlight how well the model can identify possible churners with the help of explainable AI (XAI)
techniques, allowing for focused and timely intervention strategies. To improve the transparency of
the decisions made by the classifier, this study also employs explainable artificial intelligence methods
such as LIME and SHAP to illustrate the results of the customer churn prediction model. Our results
demonstrate how crucial it is for customer relationship managers to implement strong analytical
tools to reduce attrition and promote long-term economic viability in fiercely competitive market-
places. This study indicates that ensemble learning models have strategic implications for improving
Citation: Chang, V.; Hall, K.; Xu, Q.A.; consumer loyalty and organizational profitability in addition to confirming their performance.
Amao, F.O.; Ganatra, M.A.; Benson, V.
Prediction of Customer Churn Keywords: customer churn prediction; machine learning; explainable AI; ensemble learning; predictive
Behavior in the Telecommunication analytics
Industry Using Machine Learning
Models. Algorithms 2024, 17, 231.
https://doi.org/10.3390/a17060231
1. Introduction

Mobile phones account for over 75% of all potential phone calls worldwide, and the mobile telephone market is one of the most rapidly growing segments of the telecom sector. As in every other competitive market, the means of competition have shifted from customer acquisition to
customer retention. In the telecom industry, churn refers to a company losing customers
to other service providers [3]. Like many businesses that work with long-term clientele,
telecom firms utilize customer churn research as one of their primary business indicators [4].
According to a survey presented by [5], 30–35% of clients leave their telecom company
annually post-COVID. This churn rate may continue to rise with market growth or the
emergence of new, big telecom players in the future. In addition, for cellular corporations,
client acquisition costs can be comparable to 5–10 times the amount spent on customer
retention or satisfaction costs [6].
Therefore, it is essential to learn the reasons why customers leave to reduce the harm
churn has on a business’s bottom line. Predictive marketing analytics and churn analysis
can help identify factors influencing consumers’ voluntary churn by using advanced
machine learning algorithms [7]. By investigating the existing studies, it was discovered
that ensemble learning, including classical machine learning algorithms like Random Forest,
Decision Trees, and Naïve Bayes [8–10], and deep learning methods, such as particle-
classification-optimization-based BP networks [11] and Deep-BP-ANN [12], have been
employed to improve the accuracy of churn prediction.
However, the models should not only be accurate in predicting churn [9] but also comprehensible: they should provide reasons for churning so that experts can validate their results and check that they predict intuitively and correctly. If a company had understandable and transparent models to work with, it would better understand what was causing churn and how to enhance customer satisfaction to boost retention.
As a result, this research aims to construct an accurate, efficient, responsible, and
explainable prediction model for customer attrition in the telecom industry using ensemble
learning methods. Various data mining methods like Decision trees, Random Forests,
and Logistic Regression were utilized by experts to build the predictive model. The
performance of the models was evaluated using the accuracy measures, area under the
curve, and sensitivity and specificity measures. In addition, this study also aims to improve
the interpretability of customer churn predictive models through the use of LIME and
SHAP to provide decision-makers with an overall explanation of the factors affecting
the customer’s decision to churn, as well as a specific analysis for every single customer.
Following this brief introduction to customer churn in the telecom business, the existing literature is reviewed and summarized in detail.
The remainder of this paper is structured as follows: Section 2 focuses on reviewing
the contribution of predictive marketing to Customer Relationship Management, different
machine learning-based churn analysis models for the telecom industry, as well as explainable AI. In Section 3, the most acceptable attributes are determined, and the techniques used in this study are introduced. Section 4 is concerned with the research analysis
and findings. Finally, the conclusion and several alternate interpretations are presented in
Section 5.
2. Literature Review
The current section is divided into three distinct components. The first section will
introduce Customer Relationship Management (CRM) with its core ideas. The next part
discusses the most well-known and significant ensemble learning models in CRM, which
were also employed in the model design phase of this examination. Finally, the third
segment discusses current research on the importance of explainability and transparency
of the methods used in this regard.
the purpose of extracting insights from social media data. Random Forest outperformed
the other models used with 98.46% accuracy and K-means helped to realize the eWoM
communication balance.
While gaining new clients is necessary for a company to develop, customer retention
should not be disregarded. Companies employing CRM strategies should take particular
care to analyze the factors contributing most to improving customer retention rates. Ac-
cording to [21], such key factors involve the interplay between relationship management
practices, such as customer trust, employee commitment, and conflict handling. However,
it is also noted that more research needs to be conducted in these areas. Predictive analytics
not only enhances segmentation techniques but also enables the implementation of churn
analysis, which predicts the likelihood of customers leaving the company for competitors.
This information helps the company take proactive measures to prevent customer
churn. Predictive analytics algorithms can estimate the likelihood of customers switching
to competitors and identify the factors that contribute to that probability.
these ensemble models with more classical models such as Naïve Bayes. From their results,
they found ensemble classifiers performed favorably, with Random Forest returning the
highest accuracy of 91.66% and all other ensemble classifiers returning results above 90%.
Researchers [19] used a particle classification optimization-based BP network for tele-
com customer churn prediction. The PBCCP algorithm is based on particle classification
optimization and particle fitness calculation. In this case, particles refer to vectors contain-
ing the thresholds and weights within a BP neural network. The particles are classified
into categories using their fitness values and are updated using distinct equations. They
found that increasing the number of layers can improve the performance of the algorithm
at the cost of training time. As a result of this, they opted to use one hidden layer in the
neural network. They used a balanced dataset made up of 50% churn customers and 50%
non-churn customers, resulting in the PBCCP network returning an overall accuracy of
73.3%. By comparison, the PSO-BP network returned an overall accuracy of 69.6% and the
BP model had 63.6%.
Similarly, researchers [9] devised the Logit Leaf model, an ensemble model using
aspects of Logistic Regression and Decision Trees, and compared their results with standard
Decision Tree, Logistic Model Tree, Logistic Regression, and Random Forest models. The
AUC performance criteria were one of the metrics used to evaluate performance—a metric
commonly used for evaluating the performance of binary classification systems, such as
customer churn prediction. In this regard, the Logit Leaf model performed the best, slightly
better than Random Forest. The same was true when the models were evaluated using the
TDL (10%) performance criteria. The Logit Leaf model was also more efficient at making
predictions, taking less time than the other models used. When real-time predictions are
required, this can be a crucial factor when deciding which model to use for a given problem.
Finally, scientists [10] applied gravitational search algorithms to perform effective
feature selection, resulting in a dimensionality reduction in the dataset used for customer
churn prediction. After this, they applied a selection of ML models for comparison: Logistic
Regression, Naïve Bayes, Support Vector Machine, Decision Trees, Random Forest, and
Extra Tree Classifiers and boosting algorithms such as Adaboost, XGBoost, and CatBoost.
They found that, when comparing the models used, the boosting algorithms performed the
best, with CatBoost achieving the highest accuracy, recall, and precision with scores of 81.8%,
82.2%, and 81.2%, respectively. When comparing their AUC scores, boosting algorithms
performed the best again, with XGBoost and Adaboost scoring the highest at 84%.
One of the most important applications of ML with regard to CRM is to give companies
the ability to predict customer churn. This is particularly important given that retaining
existing customers is significantly more valuable than acquiring new customers, as more
resources are needed for new customer acquisition. It follows, therefore, that customer
retention is one of the key priorities of the CRM strategy [32]. The ability to harness ML to
significantly improve customer churn rates is of particular importance.
In this study, we will make use of predictive machine learning models to identify clients who are likely to churn after categorizing current customers using explainable machine learning.
3. Research Methodology
The telecommunications sector has long struggled with churn. This research aims to create and deploy a cost-effective system for predicting client churn in the telecommunications sector. Addressing this issue is expected to yield a deeper comprehension of churning customers, enabling the identification of such customers and providing a foundation for future initiatives aimed at reducing the sector's churn rate. This section discusses both the research approach selected and the machine learning models adopted.
3.1. The CRISP-DM Model

The CRoss-Industry Standard Process for Data Mining (CRISP-DM) model is referred to as a standardized way of obtaining a good process via data mining across businesses and industries where data and modeling are a priority [44]. Researchers advise that, after twenty years of developing CRISP-DM, the emphasis is on data science, and the methodologies should accommodate the need for data release, data architecting, data simulation, and data acquisition [45]. Business processes and demands can be centered based on the data-driven approach. In other words, we can check work progress, evaluate our outputs, and make decisions in real time. By doing so, our efficiency and accuracy in our tasks can be significantly improved.

Thus, CRISP-DM models are an apparent methodological way of directing the research's procedure. The process diagram of the CRISP-DM model is depicted in Figure 1 below.
We first define a specific business problem, i.e., customer churn in the telecommunication industry, and set explicit goals to mitigate this problem through predictive modeling.
Following the CRISP-DM framework, we collected and preprocessed customer data from
the telecom industry, focusing on the characteristics that may indicate customer churn.
Throughout the modeling phase, we applied various data mining techniques such as Deci-
sion Trees, Random Forests, and Logistic Regression. After modeling, we evaluated the
performance of the models using rigorous metrics such as accuracy and area under the
curve to ensure that they meet the operational requirements of the business environment.
Finally, we translated the findings from the models into actionable strategies aimed at
customer retention.
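The modeling and evaluation phases described above can be sketched with scikit-learn. The outline below is illustrative only: synthetic data stands in for the Maven Analytics telecom dataset, and the model settings are assumptions rather than the paper's actual configuration.

```python
# Illustrative sketch of the CRISP-DM modeling/evaluation phases with
# scikit-learn. Synthetic data stands in for the telecom churn dataset;
# hyperparameters are assumptions, not the paper's settings.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# ~7043 samples with a roughly 2:1 non-churner:churner imbalance
X, y = make_classification(n_samples=7043, n_features=20,
                           weights=[0.67], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

With the real dataset, `X` and `y` would instead be derived from the churn label and customer features described in the next subsection.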
3.2. Dataset
The dataset used in this study is a publicly available, large dataset. The aim was to
find a sufficiently large and recent dataset regarding churn within the telecommunication
industry. The dataset in question was selected based on meeting certain search criteria—it
could meet a representative size and was presumed to contain relevant explanatory vari-
ables such as demographics. Finally, a Telecom Customer Churn dataset from the Maven
Analytics website platform was selected. It consists of customer activity data (features),
along with a churn label specifying whether a customer canceled the subscription, which
we used to develop predictive models. The dataset consists of 7043 rows and 38 attributes.
The attributes consist of different pieces of information about the individual customer,
including the status of the customer’s subscription, which is categorized as churn and
not churn.
A high-quality dataset is required for further analysis; therefore, this study examined
each variable in the dataset for missing values. In order to select a predictive model with
optimal predictive accuracy, the research data was divided into two groups, the Train (75%) and Test (25%) datasets, taking into consideration the approximate 2:1 ratio of non-churners to churners based on Wei and Chiu's [31] research. After testing data quality and
selecting variables, none of the observations or rows were deleted.
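As a sketch, such a 75/25 split that preserves the class ratio in both partitions can be performed with a stratified split; the variable names and stand-in data below are illustrative.

```python
# Stratified 75/25 train/test split that preserves the ~2:1
# non-churner:churner ratio in both partitions (illustrative data).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(7043, 5))                   # stand-in feature matrix
y = rng.choice([0, 1], size=7043, p=[2/3, 1/3])  # ~2:1 non-churner:churner

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

print(y_train.mean(), y_test.mean())  # churner share is similar in both parts
```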
                Observed 0    Observed 1
Estimated 0         TN            FN
Estimated 1         FP            TP
The four boxes in the classification table are all assigned a name: FN, FP, TP, and TN.
• TN stands for true negative. Here, the customers are observed as not being churners,
and the model has also classified the customers as non-churners.
• FP stands for false positive. Here, customers are observed as being non-churners, but
the model has classified the customers as churners.
• FN stands for false negative. Customers are observed as being churners, but the model
has classified the customers as non-churners.
• TP stands for true positive. Here, customers are both observed and classified as churners.
The Confusion Matrix is not only a visually advantageous way of measuring the
model’s ability to classify correctly. Several calculations can also be made based on the four
values. These calculations are all targets for more specific evaluations of the model and can,
therefore, be used to identify the model’s strengths and weaknesses.
Accuracy = (TP + TN)/(TP + TN + FP + FN). (1)
Error rate: As opposed to accuracy, the error rate is the proportion of incorrectly classified observations out of all customers classified. The error rate is calculated as follows:

Error rate = (FP + FN)/(TP + TN + FP + FN). (2)
Specificity: This is the true negative rate. The rate, thus, indicates how large a pro-
portion of the customers estimated as non-churners are correctly classified. Specificity is
determined as follows:
Specificity = TN/(TN + FP). (3)
Sensitivity: This is the true positive rate. The rate indicates what percentage of the
customers were estimated as churners and classified correctly. Sensitivity is calculated
as below:
Sensitivity = TP/(TP + FN). (4)
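These four measures follow directly from the confusion-matrix counts. The helper below is an illustrative sketch (not code from the paper), applied to the Random Forest counts reported in the results section (TP = 670, TN = 641, FP = 109, FN = 88).

```python
def confusion_metrics(tp, tn, fp, fn):
    """Evaluation measures (1)-(4) computed from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,      # Eq. (1)
        "error_rate": (fp + fn) / total,    # Eq. (2), complement of accuracy
        "specificity": tn / (tn + fp),      # Eq. (3), true negative rate
        "sensitivity": tp / (tp + fn),      # Eq. (4), true positive rate
    }

# Random Forest counts from the results section:
m = confusion_metrics(tp=670, tn=641, fp=109, fn=88)
print(round(m["accuracy"], 4))  # 0.8694, i.e., the reported 86.94%
```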
Figure 2. ROC curves.

The ROC curve in the graph to the left indicates a model that correctly predicts 50% of the time and, therefore, also indicates random classification. The graph in the middle indicates a model with improved predictive ability but that still leads to misclassification. In the graph to the right, the ROC curve indicates that the model classifies close to perfect, as the curve is very close to the upper left corner.

AUC stands for area under the curve and measures the area under the Receiver Operating Characteristic (ROC) curve. It is one way to obtain a more accurate measurement of the model's performance. A perfect model will have an AUC of 1, after which the model's quality decreases as the AUC value decreases.

3.5. Explainable AI Techniques

Due to the black-box nature of machine learning algorithms, the operating principles of the algorithms are difficult to understand and cannot be easily explained to decision-makers in the telecom industry, especially if they do not have a computer science background. As a result, although ML algorithms excel in terms of accuracy, they may not gain the trust of decision-makers or users or be accepted in real-world enterprise management. To address this issue, after evaluating the classifiers, this paper introduces the concept of explainable AI and illustrates the results of the best customer churn prediction models through two visualization tools, LIME and SHAP.

First, Local Interpretable Model-Agnostic Explanations (LIMEs) are employed to optimize the local interpretability of the Decision Tree and Random Forest classifiers. LIME emphasizes training locally interpretable models that may be used to explain specific predictions and help decision-makers understand why a particular class was predicted for a certain instance [46].

However, the purely local character of the LIME explanation, which only predicts how the present prediction will change in response to minute changes in the input values, is one of its limitations. As a result, rather than serving as an interpretation of why the forecast was made in the first place, it acts more like a sensitivity study. Therefore, this study also employs another explanation technique, SHAP, to show the overall importance of the factors.

The SHapley Additive exPlanation (SHAP) is a method of interpreting the results of any ML model by attributing an importance score to each feature in the data [48]. In this study, summary plots are drawn to allow visualization of the importance of features and their influence on prediction. The features are ranked according to the sum of the SHAP values over all samples. The colors indicate the magnitude of the feature values, i.e., blue indicates a low feature value and red indicates a high feature value. The plot uses the SHAP values to show the impact distribution of each feature as well.
From the over 4601 customer samples used in this research study, the average number of referrals is 1.95, with an SD of 2.96, which means that, on average, each customer referred
this telecom company to two friends. The average download volume in gigabytes to the
end of the second quarter is 26.12 GB, and the customer’s total charges for additional data
downloads for the same quarter is 8.94, with an SD of 28.62. Table 2 also shows details
of the customers’ charge history, where the average values are 3042.59, 2.16, 8.94, and
888.91 for total charges, total refunds, total extra data charges, and total long-distance
charges, respectively.
The bar chart presented below was computed to examine the distribution of the target variable across the customer service call variable. It can be observed from the chart that the dataset is unbalanced, with almost twice as many non-churning samples as churning samples. See Figure 3.

Figure 3. Bar plot for customer churn distribution.

To examine the inter-correlation among the features, a correlation matrix was computed and displayed in the form of a heatmap to indicate the pairwise relationship among the call activities and features of the telecom customers. See Figure 4.

Figure 4. Heatmap of Correlation Matrix.
Model                  FP     TP     FN     TN
Random Forest          109    670     88    641
Decision Tree          157    614    144    593
Naïve Bayes            133    587    171    617
KNN Model              115    440    318    635
Logistic Regression    195    584    174    555

Figure 5. Confusion Matrix values for the five algorithms.
A clustered bar chart was drawn to illustrate the performance of these algorithms in terms of these four metrics. Random Forest appeared to have the highest number of TP and TN and the lowest number of FN and FP, indicating that it may be the best-performing model of the five. It can be observed that 641 customers from the test dataset are correctly classified as non-churners, 109 of the customers who are non-churners are incorrectly classified as churners, 88 of the customers that are churners are misclassified as non-churners, and 670 of the customers are classified correctly as churners. The prediction accuracy of the Random Forest algorithm was computed to be equal to 0.8694. This implies that 86.94% of the customers are correctly classified by the Random Forest model. Random Forest is effective in identifying both customers at risk of churn (TP) and those unlikely to churn (TN), while avoiding misclassifying non-churning customers as churners, which can lead to customer dissatisfaction, and misclassifying churning customers as non-churning, which means missed opportunities to take action to retain them, leading to loss of profit.
In terms of the TP metric, Decision Tree was the second-best choice after Random Forest, indicating that it was excellent at identifying customers at risk of churn (TP: 614). However, its relatively lower TN value of 593 compared to the other algorithms indicates that it performs poorly in identifying non-churning customers. Furthermore, its relatively low FN value (144) and high FP value (157) confirm that the Decision Tree has a tendency to classify customers as those at risk of churn. Therefore, telecom companies may want to carefully assess the accuracy of the algorithm's positive predictions before taking any retention action.
Naïve Bayes and Logistic Regression performed similarly, with moderate levels of performance: high numbers of true negatives (NB: 617; LR: 555) and true positives (NB: 587; LR: 584), but also high numbers of false negatives (NB: 171; LR: 174) and false positives (NB: 133; LR: 195). This suggests that these algorithms can effectively identify customers who are likely to churn (TP) and those who are unlikely to churn (TN), but they may also misclassify some customers as either those who will not churn (FN) or those who will churn (FP).
The KNN model is the worst performer of all the algorithms. Although it performs
well in identifying non-churning customers, as evidenced by the high number of TNs
(635 customers), it performs rather poorly in identifying customers at risk of churning (TP:
440 customers). In addition, it has a high number of FN (318 customers) and a relatively
low number of FP (115 customers) compared to the other models, which could indicate
a tendency to misclassify positive instances (i.e., customers who will churn) as negative
(i.e., customers who will not churn). This could potentially lead to missed opportunities for
telecom companies to take retention action.
The area under the Receiver Operating Characteristic (ROC) curve is a metric widely used to assess the overall performance of binary classification models. The results of AUC-ROC for all algorithms tested in this study are shown in Figure 6. Random Forest had the highest score of 0.95, followed closely by Naïve Bayes, with a score of 0.88. These two models were the most effective at predicting customer churn in the telecom industry in this study. Logistic Regression also performed well, with an AUC score of 0.84, indicating that it is a reliable algorithm for predicting customer churn. KNN and Decision Tree had lower AUC scores of 0.81 and 0.8, respectively, suggesting that they may be less accurate in distinguishing between the customers at risk of churn and customers who will not churn. See Figure 6.
(a) Logistic Regression Model. (b) KNN Model. (c) Naïve Bayes Model.
Figure 8. Local explanation for (a) the first sample (upper) and (b) the second sample (lower) of the Random Forest classifier.
However, the LIME explanation only predicts how the present prediction will change
in response to minute changes in the input values. Therefore, this study also employs
another explanation technique, SHAP, to show the overall importance of the factors. The
results are shown in the following subsection.
4.4.2. SHapley Additive exPlanation (SHAP)
The summary plots for the Random Forest classifier computed using the SHAP tech-
nique are shown in Figure 9. In terms of the overall importance of the features, ‘Contract’,
‘Number of Referrals’, ‘Tenure in Months’, ‘Monthly Charge’, and ‘Online Security’ are the
five factors that contribute most to the prediction outcomes of the Random Forest classifier.
To be specific, high values of ‘Contract’, ‘Number of Referrals’, ‘Tenure in Months’, and
‘Online Security’ increase the possibility of customer churn predicted by the classifier,
while a low value of ‘Monthly Charge’ increases that possibility.
In addition to providing a global explanation, SHAP values can also be used to explore
the prediction made for an individual sample. Therefore, this paper computes the SHAP
explanation for the same two samples as investigated using LIME in Section 4.4.1, and
conducts a comparison.
According to Figure 10, the most significant factors for the first customer sample that
push the prediction toward churning are ‘Payment Method’, ‘Age’, and ‘Monthly Charge’,
and the ones that push the prediction toward not churning are ‘Contract’, ‘Online Secu-
rity’, ‘City’, and ‘Number of Referrals’. Although ‘Age’ is not included in the top 10 groups
of LIME, LIME and SHAP are able to corroborate each other’s judgment about the signif-
icance and direction of the factors for the first sample. For the second customer sample, as
shown in Figure 10, SHAP shows that ‘Number of Referrals’, ‘Contract’, ‘Number of
Dependents’, and ‘Monthly Charge’ are the most significant factors that push the predic-
tion toward churning, which is consistent with the LIME explanation as well.
Figure 10. SHAP explanation for (a) the first sample (upper) and (b) the second sample (lower) of
the Random Forest classifier.
In summary, the model explanations produced by the LIME and SHAP techniques
are consistent with each other. These interpretation figures show decision-makers in the
telecom industry how customer-related factors affect customer retention, so they can
understand how the AI arrives at its predictions even without understanding its underlying
algorithms. Such transparency also strengthens the churn model's role as an alert system
for organizations, helping them to spend their retention money effectively.
Author Contributions: Conceptualization, V.C. and F.O.A.; methodology, V.C. and F.O.A.; software,
V.C., F.O.A. and M.A.G.; validation, V.C. and Q.A.X.; formal analysis, V.C. and K.H.; investigation,
V.C.; resources, V.C.; data curation, F.O.A. and M.A.G.; writing—original draft preparation, V.C. and
F.O.A.; writing—review and editing, V.C., K.H., Q.A.X. and V.B.; visualization, K.H., Q.A.X., F.O.A.
and M.A.G.; supervision, V.C.; project administration, V.C.; funding acquisition, V.C. All authors
have read and agreed to the published version of the manuscript.
Funding: This research is partly supported by VC Research (VCR 0000183) for Prof. Chang.
Data Availability Statement: The authors do not own the data.
Acknowledgments: The authors thank the reviewers and the editor for their reviews.
Conflicts of Interest: The authors declare no conflicts of interest.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.