AI vs Traditional Fraud Detection Cost-Effectiveness
Business practice and various industry reports all show that automobile insurance
fraud is very common, which is why effective fraud detection is so important. In
our study, we investigate whether today’s widespread AI-based fraud detection
methods are more effective from a financial (cost-effectiveness) point of view than
methods based on traditional statistical-econometric tools. Based on our results, we
came to the unexpected conclusion that the current AI-based automobile insurance
fraud detection methods tested on a real database found in the literature are less
cost-effective than traditional statistical-econometric methods.
1. Introduction
The consequences of insurance fraud have a serious impact on the insurance sector.
Fraud creates distrust of the industry, causes economic damage and affects the
overall cost of living. The Insurance Information Institute (III) in the USA (III 2021)
reports that the total cost of insurance fraud in the USA between 2015 and 2019
amounted to between USD 38 billion and USD 83 billion per year. This means that
the average American family has an additional expenditure on insurance fraud
between USD 800 and USD 1,400 a year. The Association of British Insurers (ABI)
highlights that in 2020 the value of fraudulent claims in the UK was GBP 1.1 billion
(ABI 2021). Looking specifically at automobile insurance fraud, 7–10 per cent of
insurance policies in the USA and Western Europe, 10–20 per cent in the Central
and Eastern European regions, and 18–20 per cent in China are affected (ABI 2021;
III 2019).
* The papers in this issue contain the views of the authors which are not necessarily the same as the official
views of the Magyar Nemzeti Bank.
Botond Benedek: Babeş-Bolyai University, Cluj-Napoca, Romania, Assistant Professor.
Email: [Link]@[Link]
Bálint Zsolt Nagy: Babeş-Bolyai University, Cluj-Napoca, Romania, Associate Professor.
Email: [Link]@[Link]
The research was supported by the Hungarian Academy of Sciences under scientific research grant number
86/12/2022/HTMT.
The first version of the Hungarian manuscript was received on 2 November 2022.
DOI: [Link]
Botond Benedek – Bálint Zsolt Nagy
After a review of the relevant literature (Section 2), our theoretical framework is
presented (Section 3) in detail, together with the calculation method for cost savings
proposed by Benedek et al. (forthcoming). In Section 4, we focus on the selected
fraud detection methods and their cost-effectiveness, comparing traditional
statistical and machine-learning-based fraud detection methods, and present the
results of our detailed sensitivity analysis prepared using heatmaps. In the final
section, the conclusions are drawn.
Traditional versus AI-Based Fraud Detection
Belhadji et al. (2000) presented an “expert system” that assists insurance company
employees in decision-making. The tool is not directly applicable to a specific insurer
because the parameters used are derived from calculations based on industry data,
but it was an important step towards the data mining and artificial-intelligence-
based fraud detection models that are prevalent today.
The novel approach (discrete choice model) presented by Artís et al. (1999; 2002)
tested the effect of the characteristics of the insured and the circumstances of the
accident on the probability of committing fraud. In addition, these studies also
focused on the problem of misclassification. Due to the nature of the model used
and the characteristics of the real automobile insurance data series, fraudulent
claims had to be overweighted in the estimation. This paved the way for the
examination of asymmetric data series (such as automobile insurance fraud) using
various overweighting or underweighting techniques. In parallel, Viaene et al. (2002)
compared the performance of different fraud detection methods. The authors of
the study used only indicators for property damage, as these are the only ones
available at an early stage of the assessment process.
After Artís et al. (1999; 2002) opened the door to oversampling or undersampling
techniques and Viaene et al. (2002) introduced the use of early stage indicators,
several authors presented some form of classification method based on
oversampling or undersampling (especially for property damage). For example,
Pérez et al. (2005) compared the performance of their consolidated tree algorithm
with that of the well-known C4.5 algorithm on an oversampled real automobile
insurance database. Bermúdez et al. (2008) proposed an asymmetric logit model
that was able to handle unbalanced data sets. A few years later, two new
approaches were proposed for undersampling the majority class to
improve the performance of classifiers on unbalanced datasets. In the first approach,
Sundarkumar et al. (2015) proposed the one-class support vector machine (OCSVM)-
based undersampling, while in the second approach Sundarkumar – Ravi (2015)
proposed the combined use of k-nearest neighbour (KNN) and OCSVM.
Šubelj et al. (2011) presented a novel expert system using social network analysis
to identify groups of fraudsters, rather than a few isolated cases of automobile
insurance fraud. Farquad et al. (2012) used a modified active-learning-based
approach in order to construct “if..., then” type rules from a support vector machine
“black box” for customer relationship management. Gepp et al. (2012) compared
the decision tree, survival analysis and discriminant analysis methodology with the
logistic regression used by Wilson (2009). The novelty of the approach proposed
by Tao et al. (2012) was that each insurance claim could be classified into two
categories (lawful and fraudulent) with two different probabilities at the same time.
The first economic and financial AI applications appeared in the field of corporate
bankruptcy prediction models: a combination of logistic regressions and factor
analysis was used by Hámori (2001), whose model had a classification accuracy
of 95.3 per cent. Virág – Kristóf (2005) applied a neural-network-based model for
bankruptcy prediction, using the advantage offered by multiple neural layers (4) and
the backpropagation algorithm. The accuracy of the results obtained with neural
networks exceeded the results obtained with linear discriminant analysis and logistic
regression by a few percentage points. Virág and Nyitrai (2013) were the first to
apply the support vector machine (SVM) method to data from Hungarian companies.
Using different kernel functions, they achieved 5 per cent better performance
with SVM than with neural networks. Virág and Nyitrai (2014) compared the
performance of ensemble methods, AdaBoost and bootstrap aggregating, using
C4.5 decision trees with data from nearly a thousand Hungarian companies between
2001 and 2012. Their results showed that bootstrap aggregating performed better,
although only very slightly better than AdaBoost. Among the more recent applications, we
mention the study by Ágoston (2022), which applies SVM, bootstrap aggregating
and random forest algorithms to bankruptcy prediction using a sample of firms in
the Budapest and Pécs urban regions. Based on the accuracy of the out-of-sample
classification indicators, the random forest seems to be the winner.
Among the AI studies outside bankruptcy forecasting but within economics, the
following are also worth mentioning: Muraközy (2018) argues that machine learning,
which focuses on prediction, and econometrics, which studies causal relations,
are not substitutes but rather complementary empirical disciplines. Farkas et al.
(2020) discuss the potential applications of machine learning in agriculture. The
application of AI can also be seen in the fields of business economics (management,
marketing): Benedek (1999) analyses the efficiency of marketing actions using
statistical and data mining methods, while Danyi (2019) looks at the likely effects
of artificial intelligence in pricing policies and strategies. Bánkúty-Balog (2020)
assesses the geo-economic impacts of the spread of AI in Hungary in the context
of international competitiveness. Finally, Csillag et al. (2022) used machine-learning-
based structural topic modelling (STM) to evaluate the prevalence of environmental
topics in the media.
Table 1
Binary classifier confusion matrix and performance indicators used in the evaluation

                                  Predicted value
                                  Fraudulent claim        Lawful claim
Actual value   Fraudulent claim   True positive (TP)      False negative (FN)
               Lawful claim       False positive (FP)     True negative (TN)

Performance indicators
Sensitivity (TPR):                $\frac{TP}{TP + FN}$
Specificity (TNR):                $\frac{TN}{FP + TN}$
Precision (PPV):                  $\frac{TP}{TP + FP}$
Negative predictive value (NPV):  $\frac{TN}{FN + TN}$
Estimation accuracy (ACC):        $\frac{TP + TN}{TP + FP + TN + FN}$
F-score:                          $\frac{(1 + \beta^2) \cdot TPR \cdot PPV}{\beta^2 \cdot TPR + PPV}$

Note: In the case of F-score, β is a coefficient to adjust the relative importance of precision and
sensitivity.
1 The methodology and theory of confusion matrices can be traced back to the work of Green – Swets (1966).
Various performance indicators can be derived from the confusion matrix. The most
widely used measures of classifier performance are accuracy (ACC), sensitivity (TPR),
specificity (TNR) and F-score. However, these measures also have their limitations,
especially on asymmetric data sets such as automobile insurance fraud. A detailed
description of each performance indicator and a discussion of possible limitations
can be found in the work of Benedek et al. (forthcoming).
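As a minimal sketch of how the indicators in Table 1 follow from the four cells of the confusion matrix, the Python snippet below computes them for a hypothetical portfolio. The class name, function names and claim counts are illustrative assumptions and are not taken from any of the studies discussed here. The F-score is computed only for β = 1, the case in which precision and sensitivity receive equal weight.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # fraudulent claims flagged as fraudulent
    fn: int  # fraudulent claims classified as lawful
    fp: int  # lawful claims flagged as fraudulent
    tn: int  # lawful claims classified as lawful


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def indicators(cm):
    """Compute the Table 1 indicators (F-score for beta = 1 only)."""
    tpr = ratio(cm.tp, cm.tp + cm.fn)
    ppv = ratio(cm.tp, cm.tp + cm.fp)
    return {
        "TPR": tpr,
        "TNR": ratio(cm.tn, cm.fp + cm.tn),
        "PPV": ppv,
        "NPV": ratio(cm.tn, cm.fn + cm.tn),
        "ACC": ratio(cm.tp + cm.tn, cm.tp + cm.fp + cm.tn + cm.fn),
        "F1": ratio(2 * tpr * ppv, tpr + ppv),
    }


# Hypothetical portfolio: 10,000 claims, 1,000 of them fraudulent.
example = ConfusionMatrix(tp=100, fn=900, fp=50, tn=8950)
result = indicators(example)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In this hypothetical case the accuracy is 0.905 even though only 10 per cent of the fraudulent claims are detected, which illustrates why accuracy alone can be misleading on asymmetric data sets.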
However, from a business perspective, one possible way to overcome all the
problems with performance indicators is to quantify the operating costs of individual
classifiers rather than looking at the performance of different classifiers. This
approach allows for easy comparability and can take into account the costs of
the various types of misclassification. In addition, most insurers consider it more important
to minimise the costs of the detection process than to minimise the error rate of
the classifier.
To quantify the cost savings of a (semi-)automated fraud detection system, two key
factors need to be considered: (1) the cost of continued use of the systems; and
(2) the cost of operating the alternative system. Part of the cost of the ongoing use
of the systems is the cost of the manpower needed to carry out the new tasks of
the fraud analysis department. However, the most important item here is the cost
arising from false signalling by the system. If a lawful claim is deemed fraudulent by
the system, the insurer pays for the unnecessary investigation (because the system
only flags a potential fraudulent claim, but this has to be verified and proven by an
expert). Likewise, if a fraudulent claim is deemed lawful by the system, the insurer
pays the fraudster. Considering the large number of claims processed by insurers,
the costs of false signalling by the system can be very significant. In determining
the operational costs of an alternative system, Phua et al. (2004) suggest that the
alternative where the insurer takes no action to verify the legitimacy of claims and
simply pays out all claims should be considered. Thus, the approach to quantifying
the cost savings of any system (CSDM – cost saving of the decision method) given
by equation (1) proposed by Phua et al. (2004) is as follows:
CSDM = NA - (MC + FAC + NC + HC) (1)
where NA is the “no action cost”, i.e. the cost of the alternative where the insurer
takes no action to verify the legitimacy of the claims. Furthermore, the misses cost
(MC), false alarms cost (FAC), normals cost (NC) and hits cost (HC) are as follows:
MC = NFN * ACA;
FAC = NFP * (ACI + ACA);
NC = NTN * ACA;
HC = NTP * ACI,
where NFN is the number of false negative cases, NFP is the number of false positive
cases, NTN is the number of true negative cases, NTP is the number of true positive
cases, ACA is the average claim amount and ACI is the average cost per investigation.
Viaene et al. (2007) did not define the cost savings of a system, but its operating
costs (OC) given by equation (2); however, the way of defining the inputs is the
same as presented by Phua et al. (2004).
OC = MC + FAC + NC + HC (2)
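The cost components above can be sketched in a few lines of Python. This is an illustrative reading of the Phua et al. (2004) approach under stated assumptions, not a definitive implementation. It assumes that a false alarm costs one investigation plus the claim payment, that the cost saving is the no action cost minus the operating cost, and that the no action cost equals the total number of claims multiplied by the average claim amount. The function name and the claim counts are hypothetical, while the two monetary values are the survey averages reported later in this study.

```python
def phua_costs(n_tp, n_fn, n_fp, n_tn, aca, aci):
    """Operating cost and cost saving in the spirit of Phua et al. (2004).

    aca: average claim amount; aci: average cost per investigation.
    """
    mc = n_fn * aca           # misses: the fraudster is paid
    fac = n_fp * (aci + aca)  # false alarms: investigation plus payment (assumption)
    nc = n_tn * aca           # normals: lawful claims are paid
    hc = n_tp * aci           # hits: only the investigation is paid
    oc = mc + fac + nc + hc   # operating cost, equation (2)
    # Assumption: "no action" means paying every claim at the average amount.
    na = (n_tp + n_fn + n_fp + n_tn) * aca
    return {"MC": mc, "FAC": fac, "NC": nc, "HC": hc,
            "OC": oc, "NA": na, "CSDM": na - oc}


# Hypothetical confusion matrix; ACA = USD 2,420 and ACI = USD 145.
costs = phua_costs(n_tp=100, n_fn=900, n_fp=50, n_tn=8950, aca=2420, aci=145)
for name, value in costs.items():
    print(f"{name}: {value:,}")
```

Under these assumptions the saving reduces to NTP * (ACA - ACI) - NFP * ACI, so a method is cost-saving only if the payouts it avoids exceed what it spends investigating lawful claims.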
What is important from a business perspective is that in both cases the authors work
under the assumption that true negative (TN) cases do not impose an additional
cost (i.e. the additional cost of a true negative case is 0) for insurers, since in these
cases it is about the normal claims process. However, during our interviews,2
industry experts highlighted that in practice these true negative cases also have
an additional cost. There is a similar discrepancy between business practice and the
literature when it comes to calculating the costs of true positive cases. According
to the literature, in true positive cases, the insurer does not pay the insured, i.e.
the only costs incurred are those related to the investigation. In business practice,
however, the situation is different. As several previous studies showed (e.g. Derrig
– Ostaszewski 1995; Weisberg – Derrig 1998), the vast majority of automobile
insurance fraud consists of so-called build-up3 claims. Our interviewees pointed
out that, contrary to the literature, in practice it is rare for an insurer to completely
refuse to pay. They usually offer less than the amount requested for identified build-
up claims. There are many reasons for this, such as the lengthy and costly court
process or negative marketing.
In view of the differences between the literature and the business practice
described above, we recommend the calculation method proposed by Benedek et
al. (forthcoming), given by equation (3), to determine the real costs of detecting
automobile insurance fraud:
2 We conducted three in-depth interviews with Romanian insurance company executives and experts from
multinational insurance companies on automobile insurance fraud. A 22-question questionnaire was
then prepared and sent to all partner institutions of UNSAR (the National Association of Insurance and
Reinsurance Companies in Romania).
3 Cases where the insured or the professional repairer claims more than the actual cost of the repair.
4 In this paper, we use the approach presented by Phua et al. (2004), but the costing method we propose
would also work perfectly well if we used the operating costs of an alternative system instead of the “no
action cost”.
where AAC denotes the average administrative cost and ASCIFC the average savings
in the case of identified fraudulent claims.
4. Results
The logic behind the initially selected 24 journal articles and 3 conference papers
was twofold. On the one hand, only the studies indexed by the Web of Science
were considered, and on the other hand, we also kept in mind that we wanted to
compare the performance of models using a traditional statistical-econometric
approach from 1999–2012 with the performance of AI-based models from 2012–
2022 tested on real data sets. However, some of the 27 studies identified were
purely theoretical and offered no concrete fraud identification method. The authors
of other studies (e.g. Pathak et al. 2005; Padmaja et al. 2007; Bhowmik 2011; Xu et
al. 2011; Karamizadeh – Zolfagharifar 2016; Badriyah et al. 2018) conducted their
research without using real company datasets. Finally, there were several studies
in which the authors did not present the confusion matrix, so for these studies we
were not able to determine the inputs necessary for our costing method.
Taking into account the above limitations, there are only 12 studies left in our
sample with all the data needed to determine the cost-saving potential of each
model. In the 12 articles, the authors propose and compare a total of 35 different
methods, the full list of which can be found in Table 4 in the Appendix.
As the percentages of fraudulent claims in the analysed studies are different, the
sizes of the databases are very different, and, moreover, 2 of the 7 databases used
are from the United States, 1 from Canada, 2 from Spain, 1 from Russia and 1
from Slovenia, we first built a general framework where we assume that an insurer
processes 10,000 claims, of which 10 per cent are fraudulent. Metavariables such as
the average cost per investigation or the average claim amount were determined on
the basis of the questionnaire survey mentioned earlier. The questionnaire was fully
completed by five Romanian insurance companies with a combined market share of
nearly 70 per cent. In this study, we used a market-share-weighted average of the
values provided by the five insurers. They showed an average cost per investigation
of USD 145, an average claim amount of USD 2,420, an average saving of USD 485
for identified fraudulent claims, and an average administrative cost of USD 12.
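One way to place confusion matrices from databases of different sizes and fraud rates into such a common framework is sketched below. The sketch assumes that each method's sensitivity and specificity carry over unchanged to the standardised portfolio; the text does not spell out the exact rescaling procedure, so this is an illustrative assumption. The function name and the input counts are hypothetical.

```python
def standardise(tp, fn, fp, tn, n_claims=10_000, fraud_rate=0.10):
    """Rescale a confusion matrix to n_claims claims with the given fraud rate.

    Assumption: sensitivity (TPR) and specificity (TNR) are preserved.
    Returns expected (possibly fractional) counts.
    """
    tpr = tp / (tp + fn)
    tnr = tn / (fp + tn)
    frauds = n_claims * fraud_rate
    lawful = n_claims - frauds
    new_tp = tpr * frauds
    new_tn = tnr * lawful
    return {"TP": new_tp, "FN": frauds - new_tp,
            "FP": lawful - new_tn, "TN": new_tn}


# Hypothetical published matrix: 2,000 claims, 5 per cent fraudulent.
scaled = standardise(tp=60, fn=40, fp=100, tn=1800)
for cell, count in scaled.items():
    print(f"{cell}: {count:.1f}")
```

The rescaled counts can then be combined with the metavariables above, so that differences in cost reflect the classifiers rather than the databases they were tested on.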
Table 2 summarises the cost-saving potential of the 35 methods for three different
scenarios. Rows 2 to 7 of the table show the input parameters of the given scenario.
These are the input meta-parameters whose values come from industry experts
and which are always constant for each classical statistical or AI-based method.
Row 8 is the most important row, as it contains the output, which is obtained by
combining the meta-parameters with the algorithm-specific parameters. That
is, the final operating cost of an algorithm is the sum, over the categories
defined by the confusion matrix (e.g. false positive, false negative), of the number
of claims in each category multiplied by the constant value of the meta-parameter
(average cost per investigation, average claim amount) associated with that category. In economic
terms, row 8 shows how many of the 35 methods had a higher operating cost
than that of the alternative, i.e. if the insurer did not investigate the validity of the
claims and simply paid out the claims received. Counter-intuitively, the best-case
scenario here is the one with the highest rate of fraudulent claims, since in this case
even a less efficient method can achieve higher cost savings.
Table 2
Cost-effectiveness of methods used to identify fraudulent claims

                                                          Most likely scenario   Worst-case scenario   Best-case scenario
                                                          35 models              35 models             35 models
Proportion of fraudulent claims (%)                       10                     5                     20
Average claim amount (USD)                                2,420                  2,420                 2,420
Average cost per investigation (USD)                      145                    193                   97
Average administrative cost (USD)                         12                     12                    12
Average savings for identified fraudulent claims (USD)    485                    315                   1,213
Number of methods with an operating cost higher
than the “no action cost”                                 27                     31                    0

Note: For the worst and best case scenarios, we used the extreme values provided by the insurance
companies.
The data summarised in Table 2 clearly illustrate the importance of
the proposed cost savings calculation method from a business perspective. The
cost-saving calculation method proposed by Phua et al. (2004) classifies almost
all models as cost-effective: 94.28 per cent in the most likely scenario and 68.57
per cent in the worst-case scenario. By contrast, our proposed method (which takes
into account the costs incurred in the real fraud detection process) classifies only
22.85 per cent of the models as cost-effective in the most likely scenario and only
11.42 per cent in the worst-case scenario.
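The shares quoted for the proposed method can be checked directly against the last row of Table 2. The short Python sketch below (variable names are illustrative) converts the number of methods whose operating cost exceeds the “no action cost” into the share of cost-effective methods.

```python
N_METHODS = 35

# Last row of Table 2: methods with an operating cost above the "no action cost".
above_no_action = {"most likely": 27, "worst case": 31, "best case": 0}

# Share (%) of methods that remain cost-effective in each scenario.
share = {
    scenario: 100 * (N_METHODS - count) / N_METHODS
    for scenario, count in above_no_action.items()
}

for scenario, pct in share.items():
    print(f"{scenario}: {N_METHODS - above_no_action[scenario]} of {N_METHODS} "
          f"methods ({pct:.4f} per cent)")
```

The counts correspond to 8 and 4 cost-effective methods out of 35, i.e. about 22.857 and 11.429 per cent, which match the reported 22.85 and 11.42 per cent when truncated to two decimals.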
In order to take into account as much as possible the specific characteristics of the
fraud detection methods and to perform the meta-analysis with a wide range of
input parameters, we ran 3 different simulations to investigate the performance of
the methods and created heat maps to visualize the results.
In the first simulation, a fixed investigation cost of USD 145 was assumed, while the
percentage of fraudulent claims and the average savings in the case of identified
fraudulent claims were varied. This approach can be very useful for insurance
companies that work with a fixed cost per investigation (for example, by hiring
a specialised external company to carry out the investigation and paying a pre-
determined price for each claim), as they can easily decide which method is the
most efficient for them in the given market circumstances. For example, if an
insurance company is unable to use the fraud detection methods proposed by
Tao et al. (2012) or Bermúdez et al. (2008) because it does not have the input
parameters necessary to apply the model, but operates in a market with a high
percentage of fraudulent claims and low average savings in the case of identified
fraudulent claims, the multinomial logit model proposed by Artís et al. (1999)
may be an optimal choice (Figure 1), as it performs almost as well as the method
proposed by Tao et al. (2012). Likewise, any insurance company can easily choose
the most appropriate method based on the percentage of fraudulent claims and
the average savings in the case of fraudulent claims. For companies operating in
a market with a low percentage of fraudulent claims and low average savings, the
method proposed by Zelenkov (2019) seems to be better than the one proposed
by Sundarkumar et al. (2015), see Figure 2.
Figure 1
Cost-saving ability of the models proposed by Tao et al. (2012) and Artís et al. (1999)
on a heat map
[Figure 1: two heat-map panels, one per model. Axes: fraud rate, average cost saving, rank (1–30).]
Note: The cost-saving ability of the linear discriminant analysis model proposed by Tao et al. (2012) and
the multinomial logit model proposed by Artís et al. (1999) compared to the cost-saving ability of the 35
models analysed under different scenarios.
Figure 2
Cost-saving ability of the models proposed by Zelenkov (2019) and Sundarkumar et al.
(2015) on a heat map
[Figure 2: two heat-map panels, one per model. Axes: fraud rate, average cost saving, rank (1–30).]
Note: The cost-saving ability of the example-dependent cost-sensitive AdaBoost (EDAB.C1) model
proposed by Zelenkov (2019) and the support vector machine model proposed by Sundarkumar et al.
(2015) compared to the cost-saving ability of the 35 models analysed under different scenarios.
For the second simulation, the savings from identified fraudulent claims were held
constant (USD 485) and the cost of investigation and the percentage of fraudulent
claims were varied. In the third simulation, the percentage of fraudulent claims
was held constant (10%) and the cost of investigation and the average savings in
the case of identified fraudulent claims were varied.
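The logic of such a grid simulation can be sketched as follows for the second setting, in which the fraud rate and the investigation cost vary. This is a simplified illustration only: it uses the cost saving of Phua et al. (2004) as a stand-in rather than the costing method applied in this study, and the two methods, described by their sensitivity and specificity, are hypothetical. The grid values are the scenario values from Table 2.

```python
ACA = 2420.0       # average claim amount (USD), from the survey
N_CLAIMS = 10_000  # size of the standardised portfolio

# Hypothetical methods described by (sensitivity, specificity).
METHODS = {"A (hypothetical)": (0.90, 0.50), "B (hypothetical)": (0.50, 0.99)}
FRAUD_RATES = [0.05, 0.10, 0.20]
INVESTIGATION_COSTS = [97.0, 145.0, 193.0]


def cost_saving(tpr, tnr, fraud_rate, aci):
    """Stand-in cost saving: no action cost minus operating cost."""
    frauds = N_CLAIMS * fraud_rate
    lawful = N_CLAIMS - frauds
    tp, fn = tpr * frauds, (1 - tpr) * frauds
    tn, fp = tnr * lawful, (1 - tnr) * lawful
    operating_cost = fn * ACA + fp * (aci + ACA) + tn * ACA + tp * aci
    return N_CLAIMS * ACA - operating_cost


def rank_grid():
    """For every grid cell, list the methods from best to worst cost saving."""
    grid = {}
    for rate in FRAUD_RATES:
        for aci in INVESTIGATION_COSTS:
            grid[(rate, aci)] = sorted(
                METHODS,
                key=lambda name: cost_saving(*METHODS[name], rate, aci),
                reverse=True,
            )
    return grid


grid = rank_grid()
for (rate, aci), ranking in grid.items():
    print(f"fraud rate {rate:.0%}, investigation cost {aci:.0f}: {ranking[0]}")
```

Each cell of the resulting grid corresponds to one cell of a heat map: colouring the cell by the rank of a given method shows in which market conditions that method is preferable.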
Perhaps the most interesting question in the study is whether AI-based detection
methods are significantly more cost-effective than traditional statistical-econometric
tools.
Obviously, AI and traditional statistical econometric methods are all parts of the
same discipline generically called data science, and as such, the boundary between
them is rather subjective and fluid, especially given the dynamic evolution of AI that
is taking place before our eyes. For example, most machine learning courses start
with the methodology of linear and logistic regression, which is also part of any
standard econometrics curriculum. However, in our study, the following distinction
was made: Any method developed after the emergence of the AI terminology in
the literature was considered an AI or machine learning method. Therefore, e.g.
linear and logistic regression as well as linear discriminant analysis were classified
in the traditional category (since they do not require big data or neural nets) while
genetic algorithms, neural nets, etc. were classified in the AI category.
As a first step, the differences in average cost savings between these two groups
of methods were calculated, and the statistical significance of the differences was
tested using the Mann–Whitney non-parametric test. These comparisons were
performed on a wide range of combinations of input parameters (10,780 in total),
resulting in a synthetic cross-tabulation between the average cost per investigation and the
average savings of the identified fraudulent claims.
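For a single combination of input parameters, the group comparison can be sketched as below. The snippet implements the Mann–Whitney U statistic directly, with a two-sided p-value from the normal approximation and without tie or continuity correction; this is a simplification, and for samples this small an exact test would normally be preferred. The function name and the cost-saving figures are hypothetical and serve only to show the mechanics.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value by normal approximation).

    Simplified sketch: no tie correction and no continuity correction.
    """
    n1, n2 = len(x), len(y)
    # U counts the pairs in which the x value exceeds the y value (ties count 0.5).
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p_value


# Hypothetical cost savings (USD thousand) for one parameter combination.
traditional = [5.0, 6.0, 7.0, 8.0, 9.0]
ai_based = [1.0, 2.0, 3.0, 4.0, 5.5]

u_stat, p_val = mann_whitney_u(traditional, ai_based)
mean_diff = sum(traditional) / len(traditional) - sum(ai_based) / len(ai_based)
print(f"mean difference: {mean_diff:.2f}, U = {u_stat}, p = {p_val:.4f}")
```

Repeating this for every parameter combination gives one difference and one significance flag per cell of the cross-tabulation.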
Table 5 in the Appendix clearly shows that the average cost savings for the vast
majority of combinations are higher for traditional statistical methods5 (the
differences are positive and significant) than for AI-based methods, and we
concluded that, surprisingly, there is no justification for insurance companies to
invest heavily in AI-based fraud detection algorithms at this stage. This does not
mean, of course, that these companies do not need software support in their
operations, only that in most cases the traditional statistical software is sufficient.
5 Although it is not the purpose of this study to examine the implementation costs of traditional statistical
and AI methods, it is highly likely that the cost implications of traditional methods in this area are also lower,
which further supports the conclusions observed in Table 5 in the Appendix.
5. Conclusions
In our research, we pointed out that there is a lack of literature examining the
cost-effectiveness of methods for detecting automobile insurance fraud. Moreover,
in the case of emerging markets, there is a complete lack of literature on the
detection of automobile insurance fraud. Therefore, in this study, we applied
the method proposed by Benedek et al. (forthcoming) to correctly calculate the
cost-saving potential of automobile insurance fraud identification. The proposed
method takes into account all costs incurred in a real fraud detection process
(with particular emphasis on the fact that in the case of a fraudulent or partially
fraudulent claim, the insurer will usually not deny payment completely but offer
partial compensation).
The most important limitation of the research, which is also an opportunity for
further development, is that the input parameters in the meta-analysis are based on
previous algorithms trained and tested on different datasets. The really convincing
proof would be to run the same algorithms one by one on the same sample.
References
Badriyah, T. – Rahmaniah, L. – Syarif, I. (2018): Nearest neighbour and statistics method based
for detecting fraud in auto insurance. International Conference on Applied Engineering
(ICAE), Batam, Indonesia, pp. 1–5. [Link]
Belhadji, E.B. – Dionne, G. – Tarkhani, F. (2000): A Model for the Detection of Insurance Fraud.
The Geneva Papers on Risk and Insurance - Issues and Practice, 25(4): 517–538. https://
[Link]/10.1111/1468-0440.00080
Benedek, B. – Ciumas, C. – Nagy, B.Z. (2022): Automobile insurance fraud detection in the
age of big data – a systematic and comprehensive literature review. Journal of Financial
Regulation and Compliance, 30(4): 503–523. [Link]
Bhowmik, R. (2011): Detecting auto insurance fraud by data mining techniques. Journal of
Emerging Trends in Computing and Information Sciences, 2(4): 156–162.
Derrig, R.A. – Ostaszewski, K.M. (1995): Fuzzy techniques of pattern recognition in risk
and claim classification. Journal of Risk and Insurance, 62(3): 447–482. [Link]
org/10.2307/253819
Farquad, M.A. – Ravi, V. – Raju, S.B. (2012): Analytical CRM in banking and finance using
SVM: a modified active learning-based rule extraction approach. International Journal of
Electronic Customer Relationship Management, 6(1): 48–73. [Link]
IJECRM.2012.046470
Green, D.M. – Swets, J.A. (1966): Signal detection theory and psychophysics (1 ed., Vol. 1).
New York: Wiley.
He, H. – Bai, Y. – Garcia, E. – Li, S. (2008): ADASYN: Adaptive synthetic sampling approach
for imbalanced learning. IEEE International Joint Conference on Neural Networks (IEEE
World Congress On Computational Intelligence), Hong Kong, pp. 1322–1328. [Link]
org/10.1109/IJCNN.2008.4633969
III (2019): Insurance Fact Book. Insurance Information Institute. [Link]
III (2021): Background on: Insurance fraud. Insurance Information Institute. [Link]
org/article/background-on-insurance-fraud. Downloaded: 20 November 2021.
Karamizadeh, F. – Zolfagharifar, S.A. (2016): Using the Clustering Algorithms and Rule-based
of Data Mining to Identify Affecting Factors in the Profit and Loss of Third Party Insurance,
Insurance Company Auto. Indian Journal of Science and Technology, 9(7): 1–9. https://
[Link]/10.17485/ijst/2016/v9i7/87846
LII (2023): Insurance Fraud. Legal Information Institute, Cornell Law School. [Link]
[Link]/wex/insurance_fraud. Downloaded: 26 April 2023
Li, Y. – Yan, C. – Liu, W. – Li, M. (2018): A principle component analysis-based random forest
with the potential nearest neighbor method for automobile insurance fraud identification.
Applied Soft Computing, 70(September): 1000–1009. [Link]
asoc.2017.07.027
Nian, K. – Zhang, H. – Tayal, A. – Coleman, T. – Li, Y. (2016): Auto insurance fraud detection
using unsupervised spectral ranking for anomaly. The Journal of Finance and Data Science,
2(1): 58–75. [Link]
Padmaja, T.M. – Dhulipalla, N. – Bapi, R.S. – Krishna, P.R. (2007): Unbalanced data
classification using extreme outlier elimination and sampling techniques for fraud
detection. 15th International Conference on Advanced Computing and Communications
(ADCOM 2007), Guwahati, India, pp. 511–516. [Link]
Shaeiri, Z. – Kazemitabar, S.J. (2020): Fast unsupervised automobile insurance fraud detection
based on spectral ranking of anomalies. International Journal of Engineering, 33(7): 1240–
1248. [Link]
Šubelj, L. – Furlan, Š. – Bajec, M. (2011): An expert system for detecting automobile insurance
fraud using social network analysis. Expert Systems with Applications, 38(1): 1039–1052.
[Link]
Subudhi, S. – Panigrahi, S. (2017): Use of optimized Fuzzy C-Means clustering and supervised
classifiers for automobile insurance fraud detection. Journal of King Saud University
– Computer and Information Sciences, 32(5): 568–575. [Link]
jksuci.2017.09.010
Sundarkumar, G.G. – Ravi, V. (2015): A novel hybrid undersampling method for mining
unbalanced datasets in banking and insurance. Engineering Applications of Artificial
Intelligence, 37(January): 368–377. [Link]
Virág, M. – Nyitrai, T. (2013): Application of support vector machines on the basis of the
first Hungarian bankruptcy model. Society and Economy, 35(2): 227–248. [Link]
org/10.1556/SocEc.35.2013.2.6
Wang, Y. – Xu, W. (2018): Leveraging deep learning with LDA-based text analytics to detect
automobile insurance fraud. Decision Support Systems, 105(January): 87–95. [Link]
org/10.1016/[Link].2017.11.001
Weisberg, H. – Derrig, R. (1991): Fraud and Automobile Insurance: A Report on Bodily Injury
Liability Claims in Massachusetts. Journal of Insurance Regulation, 9(4): 497–541.
Wilson, J.H. (2009): An analytical approach to detecting insurance fraud using logistic
regression. Journal of Finance and Accountancy, 85(150): 1–15.
Xu, W. – Wang, S. – Zhang, D. – Yang, B. (2011): Random rough subspace based neural
network ensemble for insurance fraud detection. Fourth International Joint Conference on
Computational Sciences and Optimization, Kunming and Lijiang City, China, pp. 1276–1280.
[Link]
Yan, C. – Li, Y. (2015): The Identification Algorithm and Model Construction of Automobile
Insurance Fraud Based on Data Mining. Fifth International Conference on Instrumentation
and Measurement, Computer, Communication and Control (IMCCC), Qinhuangdao, China,
pp. 1922–1928. [Link]
Appendix
Table 3
Spearman’s rank correlation coefficients between rankings based on different
parameters
| | Total savings | Sensitivity | Specificity | Precision | Negative predictive value | Estimation accuracy | F-score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Total savings | 1.000 | | | | | | |
| Sensitivity | 0.069 (0.731) | 1.000 | | | | | |
| Specificity | 0.831 (49.41)*** | –0.346 (–2.57)** | 1.000 | | | | |
| Precision | 0.924 (24.56)*** | 0.047 (0.48) | 0.871 (19.15)*** | 1.000 | | | |
| Negative predictive value | 0.254 (2.47) | 0.951 (33.51)*** | –0.028 (–0.41) | 0.252 (2.78)** | 1.000 | | |
| Estimation accuracy | 0.947 (98.34)*** | –0.081 (–0.62) | 0.957 (38.93)*** | 0.942 (25.87)*** | 0.135 (1.57) | 1.000 | |
| F-score | 0.828 (19.11)*** | 0.278 (4.01)*** | 0.599 (6.85)*** | 0.792 (15.68)*** | 0.616 (6.29)*** | 0.732 (11.03)*** | 1.000 |
Note: The formula used to determine the negative predictive value is: TN/(FN+TN). Student t-statistics in
parentheses. *Significant at 10% level; **Significant at 5% level; ***Significant at 1% level.
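The classification measures ranked in Table 3 can all be derived from the four cells of a confusion matrix. The following minimal sketch illustrates this. The negative predictive value follows the formula given in the note above; the remaining measures use their standard textbook definitions, which is an assumption, since this appendix does not restate them (in particular, the F-score is assumed to be the balanced F1 score). The class name and the claim counts are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # fraudulent claims correctly flagged
    fp: int  # legitimate claims wrongly flagged
    tn: int  # legitimate claims correctly passed
    fn: int  # fraudulent claims missed

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def negative_predictive_value(self) -> float:
        # Formula from the note to Table 3: TN / (FN + TN)
        return self.tn / (self.fn + self.tn)

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total

    def f_score(self) -> float:
        # Assumed to be F1: harmonic mean of precision and sensitivity
        p, r = self.precision(), self.sensitivity()
        return 2 * p * r / (p + r)


# Hypothetical counts for 1,000 claims, 100 of them fraudulent
cm = ConfusionMatrix(tp=60, fp=90, tn=810, fn=40)
print(f"sensitivity={cm.sensitivity():.3f}, specificity={cm.specificity():.3f}")
print(f"precision={cm.precision():.3f}, NPV={cm.negative_predictive_value():.3f}")
print(f"accuracy={cm.accuracy():.3f}, F-score={cm.f_score():.3f}")
```

In this illustrative case, a method with high specificity and accuracy still has a precision of only 0.4, which shows why rankings based on different measures need not agree.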
Table 4
The 35 fraud detection methods tested and their sensitivity and specificity
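| Author | Method | Sensitivity | Specificity |
| --- | --- | --- | --- |
| Artís et al. (1999) | multinomial logit model | 0.6614 | 0.9065 |
| Artís et al. (1999) | nested multinomial logit model | 0.3209 | 0.8132 |
| Belhadji et al. (2000) | probit regression – threshold 10% | 0.6940 | 0.9145 |
| Belhadji et al. (2000) | probit regression – threshold 20% | 0.5373 | 0.9596 |
| Artís et al. (2002) | logit regression with omission error | 0.7793 | 0.6994 |
| Artís et al. (2002) | logit regression without omission error | 0.7703 | 0.7094 |
| Bermúdez et al. (2008) | Bayesian skewed logit model | 0.8515 | 0.9968 |
| Bermúdez et al. (2008) | standard logit and Bayesian logit models | 0.8515 | 0.6043 |
| Wilson (2009) | logit regression | 0.5918 | 0.8163 |
| Šubelj et al. (2011) | social network analysis | 0.8913 | 0.8667 |
| Tao et al. (2012) | linear discriminant analysis | 0.7392 | 0.9738 |
| Tao et al. (2012) | quadratic discriminant analysis | 0.7933 | 0.9767 |
| Tao et al. (2012) | naive Bayesian | 0.8351 | 0.9815 |
| Farquad et al. (2012) | MALBA (logistic) – 1,000 extra instances | 0.8838 | 0.5534 |
| Farquad et al. (2012) | MALBA (normal) – 1,000 extra instances | 0.8811 | 0.5588 |
| Farquad et al. (2012) | ALBA – 1,000 extra instances | 0.8784 | 0.5656 |
| Farquad et al. (2012) | MALBA – 1,000 extra instances | 0.8848 | 0.5560 |
| Sundarkumar et al. (2015) | decision tree | 0.9552 | 0.5658 |
| Sundarkumar et al. (2015) | multi-layer perceptron | 0.4859 | 0.7889 |
| Sundarkumar et al. (2015) | support vector machine | 0.9400 | 0.5639 |
| Sundarkumar et al. (2015) | probabilistic neural network | 0.9173 | 0.5533 |
| Sundarkumar et al. (2015) | group method of data handling | 0.7362 | 0.7148 |
| Sundarkumar – Ravi (2015) | probabilistic neural network | 0.8750 | 0.5894 |
| Sundarkumar – Ravi (2015) | multi-layer perceptron | 0.6458 | 0.7189 |
| Sundarkumar – Ravi (2015) | decision tree | 0.9074 | 0.5869 |
| Sundarkumar – Ravi (2015) | group method of data handling | 0.5686 | 0.8020 |
| Sundarkumar – Ravi (2015) | support vector machine | 0.9189 | 0.5839 |
| Subudhi – Panigrahi (2017) | GAFCM – DT | 0.6625 | 0.8765 |
| Subudhi – Panigrahi (2017) | GAFCM – SVM | 0.6970 | 0.8471 |
| Subudhi – Panigrahi (2017) | GAFCM – MLP | 0.6107 | 0.8400 |
| Subudhi – Panigrahi (2017) | GAFCM – GMDH | 0.5727 | 0.7976 |
| Zelenkov (2019) | example-dependent cost-sensitive Ada-Boost (EDAB.C1) | 0.2510 | 0.9301 |
| Zelenkov (2019) | example-dependent cost-sensitive Ada-Boost (EDAB.C2) | 0.5900 | 0.7327 |
| Zelenkov (2019) | example-dependent cost-sensitive Ada-Boost (EDAB.C2-ROC) | 0.4477 | 0.8050 |
| Zelenkov (2019) | example-dependent cost-sensitive Ada-Boost (EDAB.C3) | 0.2510 | 0.9301 |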
Note: Traditional statistical-econometric models are indicated in bold.
Table 5
Average cost savings differences between traditional statistical and AI-based identification methods
ASCIFC (rows) \ ACI (columns) 100 110 120 130 140 150 160 170 180 190 200
160 73,100 87,426 101,753 116,079 130,405 144,732 159,058 173,384 187,711 202,037 216,363
(46)*** (45)*** (47)*** (48)*** (50)*** (51)*** (51)*** (51)*** (53)*** (53)*** (53)***
180 73,283 87,610 101,936 116,262 130,589 144,915 159,241 173,568 187,894 202,221 216,547
(45)*** (45)*** (46)*** (47)*** (48)*** (50)*** (51)*** (51)*** (51)*** (53)*** (53)***
200 73,467 87,793 102,120 116,446 130,772 145,099 159,425 173,751 188,078 202,404 216,730
(46)*** (46)*** (45)*** (46)*** (47)*** (48)*** (50)*** (51)*** (49)*** (51)*** (51)***
220 73,650 87,977 102,303 116,629 130,956 145,282 159,608 173,935 188,261 202,588 216,914
(44)*** (46)*** (46)*** (45)*** (46)*** (47)*** (48)*** (50)*** (51)*** (50)*** (51)***
240 73,834 88,160 102,487 116,813 131,139 145,466 159,792 174,118 188,445 202,771 217,097
(49)*** (46)*** (46)*** (45)*** (45)*** (47)*** (47)*** (48)*** (50)*** (50)*** (51)***
260 74,017 88,344 102,670 116,996 131,323 145,649 159,975 174,302 188,628 202,955 217,281
(49)*** (43)*** (47)*** (46)*** (45)*** (45)*** (47)*** (47.5)*** (48)*** (50)*** (50)***
280 74,201 88,527 102,853 117,180 131,506 145,833 160,159 174,485 188,812 203,138 217,464
(44)*** (49)*** (46)*** (46)*** (45)*** (45)*** (46)*** (47)*** (48)*** (48)*** (50)***
300 74,384 88,711 103,037 117,363 131,690 146,016 160,342 174,669 188,995 203,322 217,648
(42)*** (46.5)*** (43)*** (47)*** (46)*** (45)*** (45)*** (46)*** (47)*** (48)*** (48)***
320 74,568 88,894 103,220 117,547 131,873 146,200 160,526 174,852 189,179 203,505 217,831
(41)*** (48)*** (47)*** (46)*** (45)*** (46)*** (45)*** (45)*** (46)*** (47)*** (48)***
340 74,751 89,078 103,404 117,730 132,057 146,383 160,709 175,036 189,362 203,689 218,015
(42)*** (44)*** (47)*** (43)*** (46)*** (46)*** (45)*** (45)*** (45)*** (47)*** (47)***
360 74,935 89,261 103,587 117,914 132,240 146,567 160,893 175,219 189,546 203,872 218,198
(43)*** (42)*** (49)*** (46)*** (46)*** (45)*** (46)*** (45)*** (45)*** (46)*** (47)***
380 75,118 89,445 103,771 118,097 132,424 146,750 161,076 175,403 189,729 204,056 218,382
(47)*** (41)*** (46)*** (49)*** (44)*** (46)*** (46)*** (46)*** (45)*** (45)*** (46)***
400 75,302 89,628 103,954 118,281 132,607 146,934 161,260 175,586 189,913 204,239 218,565
(50)*** (42)*** (44)*** (46.5)*** (44)*** (46)*** (46)*** (46)*** (45)*** (45)*** (45)***
420 75,485 89,812 104,138 118,464 132,791 147,117 161,443 175,770 190,096 204,423 218,749
(51)*** (43)*** (42)*** (49)*** (49)*** (44)*** (47)*** (46)*** (46)*** (45)*** (45)***
440 75,669 89,995 104,321 118,648 132,974 147,301 161,627 175,953 190,280 204,606 218,932
(54)*** (43)*** (41)*** (45)*** (47)*** (44)*** (46)*** (46)*** (46)*** (46)*** (45)***
460 75,852 90,179 104,505 118,831 133,158 147,484 161,810 176,137 190,463 204,790 219,116
(60)*** (47)*** (41)*** (44)*** (49)*** (47)*** (44)*** (47)*** (45)*** (46)*** (45)***
480 76,036 90,362 104,688 119,015 133,341 147,668 161,994 176,320 190,647 204,973 219,299
(61)*** (50)*** (42)*** (42)*** (48)*** (49)*** (44)*** (46)*** (45)*** (46)*** (46)***
500 76,219 90,546 104,872 119,198 133,525 147,851 162,177 176,504 190,830 205,157 219,483
(61)*** (52)*** (43)*** (41)*** (44)*** (46.5)*** (47)*** (44)*** (47)*** (45)*** (46)***
520 76,403 90,729 105,055 119,382 133,708 148,035 162,361 176,687 191,014 205,340 219,666
(62)*** (53)*** (45)*** (41)*** (44)*** (49)*** (49)*** (43)*** (46)*** (47)*** (46)***
540 76,586 90,913 105,239 119,565 133,892 148,218 162,544 176,871 191,197 205,523 219,850
(65)*** (57)*** (47)*** (42)*** (42)*** (47.5)*** (47)*** (46)*** (44)*** (47)*** (45)***
560 76,770 91,096 105,422 119,749 134,075 148,402 162,728 177,054 191,381 205,707 220,033
(66)*** (60)*** (50)*** (43)*** (41)*** (44)*** (48)*** (49)*** (43)*** (46)*** (47)***
580 76,953 91,280 105,606 119,932 134,259 148,585 162,911 177,238 191,564 205,890 220,217
(73)*** (60)*** (51)*** (44)*** (41)*** (44)*** (49)*** (47)*** (45)*** (44)*** (47)***
600 77,137 91,463 105,789 120,116 134,442 148,769 163,095 177,421 191,748 206,074 220,400
(73)*** (61)*** (51.5)*** (45)*** (42)*** (42)*** (46)*** (46.5)*** (47)*** (43)*** (46)***
620 77,320 91,647 105,973 120,299 134,626 148,952 163,278 177,605 191,931 206,257 220,584
(76)** (62)*** (54)*** (47)*** (42)*** (41)*** (44)*** (49)*** (49)*** (44)*** (44)***
640 77,504 91,830 106,156 120,483 134,809 149,136 163,462 177,788 192,115 206,441 220,767
(77)** (65)*** (60)*** (50)*** (43)*** (41)*** (44)*** (48)*** (47)*** (47)*** (43)***
660 77,687 92,014 106,340 120,666 134,993 149,319 163,645 177,972 192,298 206,624 220,951
(82)** (66)*** (60)*** (51)*** (43)*** (41)*** (42)*** (45)*** (46)*** (49)*** (44)***
680 77,871 92,197 106,523 120,850 135,176 149,503 163,829 178,155 192,482 206,808 221,134
(87)** (68)*** (60)*** (52)*** (46.5)*** (42)*** (41)*** (44)*** (49)*** (47)*** (47)***
700 78,054 92,381 106,707 121,033 135,360 149,686 164,012 178,339 192,665 206,991 221,318
(90)** (73)*** (61)*** (54)*** (47)*** (43)*** (41)*** (44)*** (48)*** (46.5)*** (49)***
Note: ASCIFC: average savings for identified fraudulent claims; ACI: average cost per investigation.
Mann-Whitney U-statistics in parentheses. *Significant at 10% level; **Significant at 5% level;
***Significant at 1% level.
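The Mann-Whitney U-statistics reported in Table 5 compare the savings of two independent groups of methods without assuming normality. The sketch below shows one common way to compute the statistic from rank sums, using midranks for ties (which is how half-integer values such as 46.5 can arise). It is an illustration under stated assumptions and not the authors' code: the function name and the savings figures are hypothetical, and the appendix does not state which of the two U values (or which tie correction) was reported.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    # Pool both samples, remembering group membership (0 = x, 1 = y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    u_y = n_x * n_y - u_x
    return u_x, u_y


# Hypothetical savings per method, for illustration only
traditional = [120.0, 95.0, 150.0]
ai_based = [80.0, 60.0, 110.0]
u_trad, u_ai = mann_whitney_u(traditional, ai_based)
print(u_trad, u_ai)  # the smaller of the two is often the reported statistic
```

Here U_x counts the pairs in which a value from the first sample exceeds a value from the second, with ties counted as one half, so the two statistics always sum to the product of the sample sizes.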
Factors contributing to differences in model cost-effectiveness include the type of detection method used, the specific input parameters available, and the operational costs associated with each model. The chosen methodology affects how models are ranked in terms of cost-saving potential. Real-world costs, such as administrative and investigation costs, also play a significant role in the cost-effectiveness of fraud detection methods.
Future research should explore hybrid models that combine elements of both AI-based and traditional methods to enhance cost-effectiveness and adaptability under various market conditions. Additionally, studies could focus on developing AI algorithms that require fewer specific input parameters, making them accessible to a wider range of companies. Research should also address the gap in the literature concerning cost-effectiveness in emerging markets.
Challenges in implementing AI-based fraud detection algorithms include high investment costs and potentially lower cost-effectiveness compared to traditional methods. Additionally, AI models require large datasets and specific data inputs, which may not be available to all insurance companies. This can limit the applicability and effectiveness of AI methods in certain settings or markets.
The availability of input parameters significantly affects which fraud detection methods an insurance company can use. Some methods require specific data inputs, such as accident characteristics and reports, which may not be available to all insurers. Each insurance company therefore needs to evaluate the input parameters available to it in order to determine which fraud detection methods are most efficient under the given market conditions.
An insurance company might use simulations to test different fraud detection methods under varied scenarios, for example by altering the percentage of fraudulent claims and the average savings for identified fraudulent claims while keeping investigation costs fixed. This strategy allows the company to visualise cost-effectiveness through heat maps and to select the most efficient method under current market conditions and available input parameters.
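A scenario simulation of this kind can be sketched as follows. The net savings formula used here (savings on correctly identified fraudulent claims minus investigation costs for every flagged claim) is a simplifying assumption made for illustration and is not necessarily the study's exact cost-saving calculation. The function names, the portfolio size of 10,000 claims and the parameter grids are likewise hypothetical; the sensitivity and specificity values are taken from Table 4 (Wilson 2009 and the decision tree of Sundarkumar et al. 2015).

```python
def net_savings(sens, spec, n_claims, fraud_share, ascifc, aci):
    """Assumed model: savings on true positives minus the cost of investigating every flagged claim."""
    fraud = n_claims * fraud_share
    legit = n_claims - fraud
    tp = sens * fraud          # fraudulent claims flagged
    fp = (1 - spec) * legit    # legitimate claims flagged
    return tp * ascifc - (tp + fp) * aci


# (sensitivity, specificity) pairs from Table 4
METHODS = {
    "logit regression": (0.5918, 0.8163),
    "decision tree": (0.9552, 0.5658),
}


def best_method_grid(fraud_shares, ascifc_values, aci=150.0, n_claims=10_000):
    """For each scenario, return the method with the highest net savings (ACI held fixed)."""
    grid = {}
    for share in fraud_shares:
        for ascifc in ascifc_values:
            scores = {
                name: net_savings(sens, spec, n_claims, share, ascifc, aci)
                for name, (sens, spec) in METHODS.items()
            }
            grid[(share, ascifc)] = max(scores, key=scores.get)
    return grid


grid = best_method_grid(fraud_shares=[0.05, 0.10, 0.20], ascifc_values=[500, 1000, 2000])
for (share, ascifc), winner in sorted(grid.items()):
    print(f"fraud share {share:.0%}, ASCIFC {ascifc}: {winner}")
```

Under these assumptions, the high-specificity method tends to win when savings per identified claim are low, while the high-sensitivity method wins when they are high. The resulting grid is the kind of data a heat map would display.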
The boundary between AI and traditional statistical methods is fluid, which can blur the categorisation of methodologies in the evaluation. AI methods are generally those developed after the term AI came into use, while traditional methods involve techniques, such as linear regression, that do not require big data. This distinction shapes the cost-effectiveness analysis and affects decisions about which methodology offers better savings or fits a specific company's needs.
The average savings for identified fraudulent claims vary significantly between detection methods and scenarios. For example, savings can range from USD 315 to USD 1,213 depending on the method and on specific conditions such as the proportion of fraudulent claims and the average cost per investigation. These variations underscore the importance of context-specific analysis when selecting fraud detection methods.
The primary difference in cost-effectiveness between traditional statistical methods and AI-based methods lies in their average cost savings. According to the study, traditional statistical methods generally offer higher average cost savings than AI-based methods. The meta-analysis shows that 22.85% of the models are considered cost-effective under the proposed cost-saving calculation method, compared to 94.28% under the method proposed by Phua et al. (2004). Moreover, heavy investment in AI-based methods by insurance companies is currently not justified, because these methods lack significant cost-saving advantages.
The method proposed by Phua et al. (2004) classifies almost all models as cost-effective, whereas the newly proposed method takes into account the costs incurred in the real fraud detection process and classifies only 22.85% of the models as cost-effective in the most likely scenario, and only 11.42% in the worst-case scenario. The discrepancy between the two methods arises because they treat real operational costs differently.
Traditional statistical methods might still be preferred over AI-based methods due to their generally higher cost-effectiveness and lower implementation costs. The study finds that average cost savings are higher for traditional methods in the majority of scenarios. Furthermore, AI-based methods may not offer sufficient cost-saving advantages to justify the investment needed for their deployment, making traditional methods more practical and financially viable for many insurance companies.