Financial and Economic Review, Vol. 22 Issue 2, June 2023, 77–98.

Traditional versus AI-Based Fraud Detection:
Cost Efficiency in the Field of Automobile Insurance*

Botond Benedek – Bálint Zsolt Nagy

Business practice and various industry reports all show that automobile insurance
fraud is very common, which is why effective fraud detection is so important. In
our study, we investigate whether today’s widespread AI-based fraud detection
methods are more effective from a financial (cost-effectiveness) point of view than
methods based on traditional statistical-econometric tools. Based on our results, we
came to the unexpected conclusion that the current AI-based automobile insurance
fraud detection methods tested on a real database found in the literature are less
cost-effective than traditional statistical-econometric methods.

Journal of Economic Literature (JEL) codes: G22, C14, C45


Keywords: automobile insurance, insurance fraud, fraud detection, cost-sensitive
decision-making, data mining

1. Introduction

The consequences of insurance fraud have a serious impact on the insurance sector.
Fraud creates distrust of the industry, causes economic damage and affects the
overall cost of living. The Insurance Information Institute (III) in the USA (III 2021)
reports that the total cost of insurance fraud in the USA between 2015 and 2019
amounted to between USD 38 billion and USD 83 billion per year. This means that
the average American family incurs additional expenditure of between USD 800 and
USD 1,400 a year because of insurance fraud. The Association of British Insurers (ABI)
highlights that in 2020 the value of fraudulent claims in the UK was GBP 1.1 billion
(ABI 2021). Looking specifically at automobile insurance fraud, 7–10 per cent of
insurance policies in the USA and Western Europe, 10–20 per cent in the Central
and Eastern European regions, and 18–20 per cent in China are affected (ABI 2021;
III 2019).

* The papers in this issue contain the views of the authors which are not necessarily the same as the official
views of the Magyar Nemzeti Bank.
Botond Benedek: Babeş-Bolyai University, Cluj-Napoca, Romania, Assistant Professor.
Email: [Link]@[Link]
Bálint Zsolt Nagy: Babeş-Bolyai University, Cluj-Napoca, Romania, Associate Professor.
Email: [Link]@[Link]
The research was supported by the Hungarian Academy of Sciences under scientific research grant number
86/12/2022/HTMT.
The first version of the Hungarian manuscript was received on 2 November 2022.
DOI: [Link]

There is therefore no doubt that the identification of insurance fraud is an
economically very important area of investigation. In our study, 24 academic
journal articles and 3 conference proceedings on the detection of automobile
insurance fraud indexed by the Web of Science database between 1990 and
2022 were analysed. The small number of such studies suggests that this area of
research is still very much
underdeveloped. There is an extensive literature on classical statistical-econometric
fraud identification methods as well as on models based on artificial intelligence
(AI) and machine learning, but there is a lack of systematic comparison and a lack
of research on the cost-effectiveness of fraud identification. The literature in
the Hungarian language is also incomplete in this area, and there is no generally
accepted definition of insurance fraud, nor of fraudulent automobile insurance
claims.

Therefore, this study aims to contribute to the development of our knowledge on
automobile insurance fraud identification in three areas:

• After a comprehensive analysis of the international and Hungarian literature,
we argue that the performance of any fraud detection system should be judged
in terms of its cost-effectiveness. In other disciplines where the use of artificial
intelligence (AI) has become widespread, such as healthcare, these cost-effective
approaches have already become dominant (Lee et al. 2017; Hill et al. 2020).

• Considering the spread of numerous AI models available in the literature, we
believe that there is a pressing need for a systematic meta-analysis that can
present a ranking of these models and compare them in terms of their financial
performance. Not least, we could not find any study that examined whether
today’s widespread AI-based methods are financially more efficient (in terms of
cost-effectiveness) than the methods based on traditional statistical-econometric
tools.

• Finally, we would like to contribute to the (quasi non-existent) literature in
Hungarian on the subject, at least with generally acceptable definitions that will
make it clear to the reader what insurance fraud or a fraudulent automobile
insurance claim is.

After a review of the relevant literature (Section 2), our theoretical framework is
presented (Section 3) in detail, together with the calculation method for cost savings
proposed by Benedek et al. (forthcoming). In Section 4, we focus on the selected
fraud detection methods and their cost-effectiveness, comparing traditional
statistical and machine-learning-based fraud detection methods, and present the
results of our detailed sensitivity analysis prepared using heatmaps. In the final
section, the conclusions are drawn.

2. Overview of the literature

We begin our literature review by defining what is meant by insurance fraud
and fraudulent automobile insurance claim. As defined by the Legal Information
Institute (LII 2023) and under Massachusetts Regulation (MR) (which are the most
widely accepted sources in the English-language literature), insurance fraud is any
act done with the intent to obtain a fraudulent payment from an insurer. Police
and prosecutors generally distinguish between two forms of insurance fraud:
(1) intentional damage to the insured property (hard fraud) and (2) forgery of
documents (soft fraud). Hard fraud, the less common of the two forms, occurs when
the perpetrator intentionally destroys property with the aim of
later obtaining compensation for the damage. Soft fraud occurs when the contracting
party exaggerates an otherwise legitimate claim, or makes untrue
statements and/or conceals certain conditions and circumstances. If we look
specifically at the automobile insurance market, a fraudulent claim is one where the
insured (1) makes a claim for an accident that did not happen; (2) makes multiple
claims for a single accident; (3) submits a claim for damage other than that resulting from
the automobile accident; (4) falsely reports lost wages/medical treatment costs
for injuries; or (5) reports higher car repair costs than the repair actually cost (LII
2023; MR 1993).

2.1. International literature


In one of the earliest studies, Weisberg and Derrig (1991) listed potential fraud
indicators (red flags) according to their relative frequency. In this study, 18 objective
characteristics (out of 65 possible characteristics) of claims for bodily injury
insurance were used to identify fraudulent claims. Despite this, the simplicity of
the method used has led to only limited success. Derrig and Ostaszewski’s (1995)
study of red flags and the problem of classifying fraudulent claims also shows that
there is no consensus among experts regarding fraudulent claims. They therefore
propose a fuzzy classification technique for the insurers. Weisberg and Derrig (1998)
tested the usefulness of potential red flags, quantified the effectiveness of standard
investigative techniques and mapped the ability of firms to further detect fraud.

Belhadji et al. (2000) presented an “expert system” that assists insurance company
employees in decision-making. The tool is not directly applicable to a specific insurer
because the parameters used are derived from calculations based on industry data,
but it was an important step towards the data mining and artificial-intelligence-
based fraud detection models that are prevalent today.

The novel approach (discrete choice model) presented by Artís et al. (1999; 2002)
tested the effect of the characteristics of the insured and the circumstances of the
accident on the probability of committing fraud. In addition, these studies also
focused on the problem of misclassification. Due to the nature of the model used
and the characteristics of the real automobile insurance data series, fraudulent
claims had to be overweighted in the estimation. This paved the way for the
examination of asymmetric data series (such as automobile insurance fraud) using
various overweighting or underweighting techniques. In parallel, Viaene et al. (2002)
compared the performance of different fraud detection methods. The authors of
the study used only indicators for property damage, as these are the only ones
available at an early stage of the assessment process.

After Artís et al. (1999; 2002) opened the door to oversampling or undersampling
techniques and Viaene et al. (2002) introduced the use of early stage indicators,
several authors presented some form of classification method based on
oversampling or undersampling (especially for property damage). For example,
Pérez et al. (2005) compared the performance of their consolidated tree algorithm
with that of the well-known C4.5 algorithm on an oversampled real automobile
insurance database. Bermúdez et al. (2008) proposed an asymmetric logit model
that was able to handle unbalanced data sets. A few years later, researchers
proposed two new approaches for the undersampling of the majority class to
improve the performance of classifiers in unbalanced datasets. In the first approach,
Sundarkumar et al. (2015) proposed the one-class support vector machine (OCSVM)-
based undersampling, while in the second approach Sundarkumar – Ravi (2015)
proposed the combined use of k-nearest neighbour (KNN) and OCSVM.

Šubelj et al. (2011) presented a novel expert system using social network analysis
to identify groups of fraudsters, rather than a few isolated cases of automobile
insurance fraud. Farquad et al. (2012) used a modified active-learning-based
approach in order to construct “if..., then” type rules from a support vector machine
“black box” for customer relationship management. Gepp et al. (2012) compared
the decision tree, survival analysis and discriminant analysis methodology with the
logistic regression used by Wilson (2009). The novelty of the approach proposed
by Tao et al. (2012) was that each insurance claim could be classified into two
categories (lawful and fraudulent) with two different probabilities at the same time.

Yan – Li (2015) approached the detection of automobile insurance fraud as
a problem of detecting outliers. Therefore, an improved outlier identification
method based on a version of the nearest neighbour algorithm completed with
pruning rules was proposed. Nian et al. (2016) suggested an unsupervised spectral
ranking algorithm (SRA) method to detect anomalies. Shaeiri and Kazemitabar
(2020) further developed the SRA approach and presented an implementation
methodology that allowed real-time application of SRA on large datasets. Li et al.
(2018) combined individual classifiers into multiple classifier systems to increase
classification accuracy. Wang and Xu (2018) proposed a text analysis based on deep
neural network and latent Dirichlet allocation (LDA).

Finally, some authors have approached the problem of detecting automobile
insurance fraud from a strictly financial perspective, with a strong emphasis on
cost-sensitive classification of damage. For example, Phua et al. (2004) compared
the performance of their proposed approach with various widely used techniques
and demonstrated the superior performance of the proposed method in terms of
cost savings. Viaene et al. (2007) focused on the cost of the examination process
rather than on minimising the error rate (misclassification) and showed that cost-
sensitive fraud screening can be a profitable approach for property and casualty
insurance companies. Finally, Zelenkov (2019) also proposed a cost-sensitivity-based
approach, but with an example-dependent cost-sensitive meta-algorithm, AdaBoost
(adaptive boosting), which assigned different costs not only to different classification
errors (as in previous studies) but also to different compensation cases.

For a more comprehensive review of the related international literature, including
the most important indicators used to identify fraud, the most commonly used
databases and the most current challenges in fraud identification, see Benedek et
al. (2022).

2.2. Literature in Hungarian


The use of fraud detection methods, or even insurance fraud as a scientific research
topic, is completely absent from the Hungarian literature. In this respect, this study
is certainly a pioneering one.

As there is a complete lack of scientific research on insurance fraud in Hungarian,
we briefly review some literature in Hungarian where artificial intelligence and
machine learning methods are applied to economic-financial problems.

The first economic and financial AI applications appeared in the field of corporate
bankruptcy prediction models: a combination of logistic regressions and factor
analysis was used by Hámori (2001), whose model had a classification accuracy
of 95.3 per cent. Virág – Kristóf (2005) applied a neural-network-based model for
bankruptcy prediction, using the advantage offered by multiple neural layers (4) and
the backpropagation algorithm. The accuracy of the results obtained with neural
networks exceeded the results obtained with linear discriminant analysis and logistic
regression by a few percentage points. Virág and Nyitrai (2013) were the first to
apply the support vector machine (SVM) method to data from Hungarian companies.
Using different kernel functions, they achieved 5 per cent better performance
with SVM than with neural networks. Virág and Nyitrai (2014) compared the
performance of ensemble methods, AdaBoost and bootstrap aggregating, using
C4.5 decision trees with data from nearly a thousand Hungarian companies between
2001 and 2012. Their results showed that bootstrap aggregating performed better,
although only very slightly ahead of AdaBoost. Among the more recent applications, we
mention the study by Ágoston (2022), which applies SVM, bootstrap aggregating
and random forest algorithms to bankruptcy prediction using a sample of firms in
the Budapest and Pécs urban regions. Based on the accuracy of the out-of-sample
classification indicators, the random forest seems to be the winner.

Among the AI studies outside bankruptcy prediction but within economics, the
following are also worth mentioning: Muraközy (2018) argues that machine learning,
which focuses on prediction, and econometrics, which studies causal relations,
are not substitutes but rather complementary empirical disciplines. Farkas et al.
(2020) discuss the potential applications of machine learning in agriculture. The
application of AI can also be seen in the fields of business economics (management,
marketing): Benedek (1999) analyses the efficiency of marketing actions using
statistical and data mining methods, while Danyi (2019) looks at the likely effects
of artificial intelligence in pricing policies and strategies. Bánkúty-Balog (2020)
assesses the geo-economic impacts of the spread of AI in Hungary in the context
of international competitiveness. Finally, Csillag et al. (2022) used machine-learning-
based structural topic modelling (STM) to evaluate the prevalence of environmental
topics in the media.

3. Conceptual and theoretical background

The identification of automobile insurance fraud is a binary classification problem,
so the performance of any classification algorithm can be described by the confusion
matrix in Table 1.¹

Table 1
Binary classifier confusion matrix and performance indicators used in the evaluation

                                   Predicted value
                                   Fraudulent claim       Lawful claim
Actual value   Fraudulent claim    True positive (TP)     False negative (FN)
               Lawful claim        False positive (FP)    True negative (TN)

Performance indicators:
Sensitivity (TPR) = TP / (TP + FN)
Specificity (TNR) = TN / (FP + TN)
Precision (PPV) = TP / (TP + FP)
Negative predictive value (NPV) = TN / (TN + FN)
Estimation accuracy (ACC) = (TP + TN) / (TP + FP + TN + FN)
F-score = ((1 + β²) * TPR * PPV) / (β² * TPR + PPV)

Note: In the case of the F-score, β is a coefficient to adjust the relative importance of precision and
sensitivity.

¹ The methodology and theory of confusion matrices can be traced back to the work of Green – Swets (1966).

Various performance indicators can be derived from the confusion matrix. The most
widely used measures of classifier performance are accuracy (ACC), sensitivity (TPR),
specificity (TNR) and F-score. However, these measures also have their limitations,
especially on asymmetric data sets such as automobile insurance fraud. A detailed
description of each performance indicator and a discussion of possible limitations
can be found in the work of Benedek et al. (forthcoming).
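The limitation on asymmetric data sets can be illustrated with a minimal Python sketch. It computes the Table 1 indicators (with β = 1 for the F-score) for two hypothetical classifiers on 10,000 claims of which 10 per cent are fraudulent. The function name `indicators` and all counts are illustrative assumptions, not results from any of the reviewed studies.

```python
def indicators(tp, fn, fp, tn):
    """Table 1 indicators for one confusion matrix (F-score with beta = 1)."""
    total = tp + fn + fp + tn
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity
    tnr = tn / (fp + tn) if (fp + tn) else 0.0   # specificity
    ppv = tp / (tp + fp) if (tp + fp) else 0.0   # precision
    acc = (tp + tn) / total                      # estimation accuracy
    f1 = 2 * ppv * tpr / (ppv + tpr) if (ppv + tpr) else 0.0
    return {"ACC": acc, "TPR": tpr, "TNR": tnr, "PPV": ppv, "F1": f1}

# Hypothetical classifier 1: labels every claim as lawful.
always_lawful = indicators(tp=0, fn=1000, fp=0, tn=9000)

# Hypothetical classifier 2: finds 60% of the fraud, with 900 false alarms.
modest = indicators(tp=600, fn=400, fp=900, tn=8100)

for name, result in (("always lawful", always_lawful), ("modest", modest)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

In this sketch the classifier that never flags a claim reaches a higher accuracy (0.90) than the one that detects 60 per cent of the fraud (0.87), even though its sensitivity is zero.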

However, from a business perspective, one possible way to overcome all the
problems with performance indicators is to quantify the operating costs of individual
classifiers rather than looking at the performance of different classifiers. This
approach allows for easy comparability and can take into account the costs of
the various types of misclassification. In addition, most insurers consider it more important
to minimise the costs of the detection process than to minimise the error rate of
the classifier.

To quantify the cost savings of a (semi-)automated fraud detection system, two key
factors need to be considered: (1) the cost of continued use of the systems; and
(2) the cost of operating the alternative system. Part of the cost of the ongoing use
of the systems is the cost of the manpower needed to carry out the new tasks of
the fraud analysis department. However, the most important item here is the cost
arising from false signalling by the system. If a lawful claim is deemed fraudulent by
the system, the insurer pays for the unnecessary investigation (because the system
only flags a potential fraudulent claim, but this has to be verified and proven by an
expert). Likewise, if a fraudulent claim is deemed lawful by the system, the insurer
pays the fraudster. Considering the large number of claims processed by insurers,
the costs of false signalling by the system can be very significant. In determining
the operational costs of an alternative system, Phua et al. (2004) suggest that the
alternative where the insurer takes no action to verify the legitimacy of claims and
simply pays out all claims should be considered. Thus, the cost saving of any
system (CSDM, cost saving of the decision method) proposed by Phua et al. (2004)
is quantified by equation (1):

CSDM = NA – (MC + FAC + NC + HC) (1)

where NA is the “no action cost”, i.e. the cost of the alternative where the insurer
takes no action to verify the legitimacy of the claims. Furthermore, the misses cost
(MC), false alarms cost (FAC), normals cost (NC) and hits cost (HC) are as follows:

MC = NFN * ACA;

FAC = NFP * (ACI + ACA);

NC = NTN * ACA;

HC = NTP * ACI,

where NFN is the number of false negative cases, NFP is the number of false positive
cases, NTN is the number of true negative cases, NTP is the number of true positive
cases, ACA is the average claim amount and ACI is the average cost per investigation.
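Equation (1) can be sketched in a few lines of Python. The function name `csdm_phua` and the numerical inputs are hypothetical, and the "no action cost" is computed here as the total number of claims multiplied by ACA. That is an assumption based on the description of the alternative in which all claims are simply paid out.

```python
def csdm_phua(n_tp, n_fn, n_fp, n_tn, aca, aci):
    """Cost saving of the decision method following equation (1)."""
    # Assumption: with no action, every claim is paid at the average amount.
    na = (n_tp + n_fn + n_fp + n_tn) * aca
    mc = n_fn * aca            # misses: fraudulent claims paid in full
    fac = n_fp * (aci + aca)   # false alarms: investigated, then paid
    nc = n_tn * aca            # normals: lawful claims paid as usual
    hc = n_tp * aci            # hits: only the investigation is paid
    return na - (mc + fac + nc + hc)

# Illustrative inputs only (not the survey values used later in the paper).
saving = csdm_phua(n_tp=600, n_fn=400, n_fp=900, n_tn=8100, aca=2000, aci=100)
print(saving)
```

Under this assumption about NA, the expression simplifies to NTP * ACA – (NTP + NFP) * ACI: each detected fraud saves a full claim amount, and each flagged claim costs one investigation.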

Viaene et al. (2007) defined not the cost savings of a system but its operating
costs (OC), given by equation (2); the inputs, however, are defined in the same way
as by Phua et al. (2004).

OC = MC + FAC + NC + HC (2)

What is important from a business perspective is that in both cases the authors work
under the assumption that true negative (TN) cases do not impose an additional
cost (i.e. the additional cost of a true negative case is 0) for insurers, since these
cases follow the normal claims process. However, during our interviews,²
industry experts highlighted that in practice these true negative cases also have
an additional cost. There is a similar discrepancy between business practice and the
literature when it comes to calculating the costs of true positive cases. According
to the literature, in true positive cases, the insurer does not pay the insured, i.e.
the only costs incurred are those related to the investigation. In business practice,
however, the situation is different. As several previous studies showed (e.g. Derrig
– Ostaszewski 1995; Weisberg – Derrig 1998), the vast majority of automobile
insurance fraud consists of so-called build-up³ claims. Our interviewees pointed
out that, contrary to the literature, in practice it is rare for an insurer to completely
refuse to pay. They usually offer less than the amount requested for identified build-
up claims. There are many reasons for this, such as the lengthy and costly court
process or negative marketing.

In view of the differences between the literature and the business practice
described above, we recommend the calculation method proposed by Benedek et
al. (forthcoming), given by equation (3), to determine the real costs of detecting
automobile insurance fraud:

CSDM = NA – (MC + FAC + NC + HC) (3)

where NA is the “no action cost”.⁴ Furthermore:

MC = NFN * (ACA + AAC);

FAC = NFP * (ACI + ACA);

NC = NTN * (ACA + AAC);

HC = NTP * (ACA – ASCIFC + ACI),

where AAC denotes the average administrative cost and ASCIFC the average savings in
the case of identified fraudulent claims.

² We conducted three in-depth interviews with Romanian insurance company executives and experts from
multinational insurance companies on automobile insurance fraud. A 22-question questionnaire was
then prepared and sent to all partner institutions of UNSAR (the National Association of Insurance and
Reinsurance Companies in Romania).
³ Cases where the insured or the professional repairer claims more than the actual cost of the repair.
⁴ In this paper, we use the approach presented by Phua et al. (2004), but the costing method we propose
would also work perfectly well if we used the operating costs of an alternative system instead of the “no
action cost”.
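Equation (3) can be sketched in Python as follows. The function name `csdm_proposed` and the inputs are hypothetical. The "no action cost" is again assumed to be the total number of claims multiplied by ACA, which the text does not state explicitly.

```python
def csdm_proposed(n_tp, n_fn, n_fp, n_tn, aca, aci, aac, ascifc):
    """Cost saving of the decision method following equation (3)."""
    # Assumption: with no action, every claim is paid at the average amount.
    na = (n_tp + n_fn + n_fp + n_tn) * aca
    mc = n_fn * (aca + aac)           # misses: paid, plus administration
    fac = n_fp * (aci + aca)          # false alarms: investigated, then paid
    nc = n_tn * (aca + aac)           # normals: paid, plus administration
    hc = n_tp * (aca - ascifc + aci)  # hits: reduced payout plus investigation
    return na - (mc + fac + nc + hc)

# Illustrative inputs only (not the survey values reported in the paper).
saving = csdm_proposed(n_tp=600, n_fn=400, n_fp=900, n_tn=8100,
                       aca=2000, aci=100, aac=10, ascifc=400)
print(saving)
```

Under this assumption about NA, the saving reduces to NTP * ASCIFC – (NTP + NFP) * ACI – (NFN + NTN) * AAC. A detected fraud therefore contributes only the partial saving ASCIFC rather than the full claim amount, and every uninvestigated claim carries an administrative cost.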

Finally, we should mention the preventive effect of fraud prevention programmes,
because without effective prevention, over time premiums will have to be increased
to a level that will “cope” with fraudulent payments, so that sooner or later the
premium will reach a level that will no longer be competitive in the market. In this
study, the prevention effect, which is much harder to quantify, is not explicitly
included, but it would not significantly affect the results in any case, since the cost of
prevention reduces the profitability of both classical statistical methods and AI-based
methods in the short run.

4. Results

4.1. Cost-effectiveness meta-analysis of the selected methods


After reviewing the literature and identifying the research gaps, we conducted
a meta-analysis of the selected methods, which enables us to rank and compare the
methods of automobile insurance fraud identification. First, the cost saving potential
of these methods was calculated using the proposed cost saving calculation method.

The logic behind the initially selected 24 journal articles and 3 conference papers
was twofold. On the one hand, only the studies indexed by the Web of Science
were considered, and on the other hand, we also kept in mind that we wanted to
compare the performance of models using a traditional statistical-econometric
approach from 1999–2012 with the performance of AI-based models from 2012–
2022 tested on real data sets. However, some of the 27 studies identified were
purely theoretical and offered no concrete fraud identification method. The authors
of other studies (e.g. Pathak et al. 2005; Padmaja et al. 2007; Bhowmik 2011; Xu et
al. 2011; Karamizadeh – Zolfagharifar 2016; Badriyah et al. 2018) conducted their
research without using real company datasets. Finally, there were several studies
in which the authors did not present the confusion matrix, so for these studies we
were not able to determine the inputs necessary for our costing method.

Taking into account the above limitations, there are only 12 studies left in our
sample with all the data needed to determine the cost-saving potential of each
model. In the 12 articles, the authors propose and compare a total of 35 different
methods, the full list of which can be found in Table 4 in the Appendix.

As the percentages of fraudulent claims in the analysed studies are different, the
sizes of the databases are very different, and, moreover, 2 of the 7 databases used
are from the United States, 1 from Canada, 2 from Spain, 1 from Russia and 1
from Slovenia, we first built a general framework where we assume that an insurer
processes 10,000 claims, of which 10 per cent are fraudulent. Metavariables such as
the average cost per investigation or the average claim amount were determined on
the basis of the questionnaire survey mentioned earlier. The questionnaire was fully
completed by five Romanian insurance companies with a combined market share of
nearly 70 per cent. In this study, we used a market-share-weighted average of the
values provided by the five insurers. They showed an average cost per investigation
of USD 145, an average claim amount of USD 2,420, an average saving of USD 485
for identified fraudulent claims, and an average administrative cost of USD 12.

Table 2 summarises the cost-saving potential of the 35 methods for three different
scenarios. Rows 2 to 7 of the table show the input parameters of the given scenario.
These are the input meta-parameters whose values come from industry experts
and which are always constant for each classical statistical or AI-based method.
Row 8 is the most important row, the output, since it is obtained by combining
the meta-parameters with the algorithm-specific parameters. That
is, the final operating cost of an algorithm is equal to the number of claims in
the different categories (false positive, false negative) defined by the confusion
matrix multiplied by the constant value of the meta-parameter (average cost per
investigation, average claim amount) associated with that category. In economic
language, row 8 shows how many of the 35 methods had a higher operating cost
than that of the alternative, i.e. if the insurer did not investigate the validity of the
claims and simply paid out the claims received. Counter-intuitively, the best-case
scenario here is the one with the highest rate of fraudulent claims, since in this case
even a less efficient method can achieve higher cost savings.
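The logic of row 8 can be sketched in Python under the general framework of 10,000 claims, using the scenario inputs reported in Table 2. The classifier (sensitivity 0.6, specificity 0.9) is hypothetical and is not one of the 35 methods. Comparing the operating cost with the number of claims multiplied by ACA is an assumption about how the "no action cost" is computed.

```python
N_CLAIMS = 10_000  # general framework used in the text

# Scenario inputs as reported in Table 2 (USD; fraud share as a fraction).
SCENARIOS = {
    "most likely": dict(fraud_rate=0.10, aca=2420, aci=145, aac=12, ascifc=485),
    "worst case": dict(fraud_rate=0.05, aca=2420, aci=193, aac=12, ascifc=315),
    "best case": dict(fraud_rate=0.20, aca=2420, aci=97, aac=12, ascifc=1213),
}

def operating_cost(tpr, tnr, fraud_rate, aca, aci, aac, ascifc):
    """MC + FAC + NC + HC of equation (3) for a given sensitivity/specificity."""
    n_fraud = N_CLAIMS * fraud_rate
    n_lawful = N_CLAIMS - n_fraud
    n_tp, n_fn = tpr * n_fraud, (1 - tpr) * n_fraud
    n_tn, n_fp = tnr * n_lawful, (1 - tnr) * n_lawful
    return (n_fn * (aca + aac) + n_fp * (aci + aca)
            + n_tn * (aca + aac) + n_tp * (aca - ascifc + aci))

exceeds_no_action = {}
for name, params in SCENARIOS.items():
    oc = operating_cost(tpr=0.6, tnr=0.9, **params)  # hypothetical classifier
    na = N_CLAIMS * params["aca"]  # assumption: every claim is simply paid
    exceeds_no_action[name] = oc > na

print(exceeds_no_action)
```

For this hypothetical classifier, the operating cost exceeds the "no action cost" in the most likely and worst-case scenarios but not in the best-case scenario.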

Table 2
Cost-effectiveness of methods used to identify fraudulent claims

                                                       Most likely   Worst case   Best
                                                       scenario      scenario     scenario
Number of models                                       35            35           35
Proportion of fraudulent claims (%)                    10            5            20
Average claim amount (USD)                             2,420         2,420        2,420
Average cost per investigation (USD)                   145           193          97
Average administrative cost (USD)                      12            12           12
Average savings for identified fraudulent
claims (USD)                                           485           315          1,213
Number of methods with an operating cost
higher than the “no action cost”                       27            31           0

Note: For the worst and best case scenarios, we used the extreme values provided by the insurance
companies.

We emphasise that the data summarised in Table 2 clearly illustrate the importance of
the proposed cost savings calculation method from a business perspective. While
the cost-saving calculation method proposed by Phua et al. (2004) classifies almost
all models as cost-effective, our proposed method (which takes into account the
costs incurred in the real fraud detection process) classifies only 22.85 per cent of
the models as cost-effective even in the most likely scenario and only 11.42 per
cent in the worst-case scenario, compared with 94.28 and 68.57 per cent, respectively,
under the method of Phua et al. (2004).
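The two shares for the proposed method follow from the last row of Table 2, where 27 and 31 of the 35 methods have an operating cost above the "no action cost":

```latex
\[
\frac{35 - 27}{35} = \frac{8}{35} \approx 0.2286,
\qquad
\frac{35 - 31}{35} = \frac{4}{35} \approx 0.1143 .
\]
```

The reported figures of 22.85 and 11.42 per cent appear to be truncated rather than rounded. By the same arithmetic, 94.28 and 68.57 per cent would correspond to 33 and 24 of the 35 methods. These counts are inferred from the percentages and are not stated in the text.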

4.2. Heat maps of the cost-saving potential of fraud identification methods


In view of the rather surprising results revealed by the meta-analysis, we considered
it an important step to further analyse each fraud detection method in depth and
to investigate the circumstances under which the individual methods may be more
beneficial than their counterparts. One reason for this approach is that, depending
on the input parameters used in the meta-analysis (e.g. percentage of fraudulent
claims, average cost per investigation), the cost-effectiveness of fraud detection
methods varies significantly. The other reason is that some detection methods are
unusable for some insurers, as these fraud detection methods use inputs (accident
characteristics, police/medical reports, accident photographs) that are not (or not
yet) available to the insurer.

In order to take into account as much as possible the specific characteristics of the
fraud detection methods and to perform the meta-analysis with a wide range of
input parameters, we ran three different simulations to investigate the performance of
the methods and created heat maps to visualise the results.

In the first simulation, a fixed investigation cost of USD 145 was assumed, while the
percentage of fraudulent claims and the average savings in the case of identified
fraudulent claims were varied. This approach can be very useful for insurance
companies that work with a fixed cost per investigation (for example, by hiring
a specialised external company to carry out the investigation and paying a predetermined
price for each claim), as they can easily decide which method is the
most efficient for them in the given market circumstances. For example, if an
insurance company is unable to use the fraud detection methods proposed by
Tao et al. (2012) or Bermúdez et al. (2008) because it does not have the input
parameters necessary to apply the model, but operates in a market with a high
percentage of fraudulent claims and low average savings in the case of identified
fraudulent claims, the multinomial logit model proposed by Artís et al. (1999)
may be an optimal choice (Figure 1), as it performs almost as well as the method
proposed by Tao et al. (2012). Likewise, any insurance company can easily choose
the most appropriate method based on the percentage of fraudulent claims and
the average savings in the case of fraudulent claims. For companies operating in
a market with a low percentage of fraudulent claims and low average savings, the
method proposed by Zelenkov (2019) seems to be better than the one proposed
by Sundarkumar et al. (2015), see Figure 2.
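
To illustrate how the cells of such a heat map could be computed, the sketch below ranks four of the methods on a small grid of fraud rates and average savings at the fixed investigation cost of USD 145. It rests on a simplified net-saving formula that is an assumption of this example, not the calculation of Benedek et al. (forthcoming): every flagged claim is investigated, and each correctly flagged fraudulent claim yields the average saving. Sensitivity and specificity are taken from Table 4 in the Appendix; the grid values and all function names are illustrative.

```python
ACI = 145.0  # fixed cost per investigation (USD), as in the first simulation

# (sensitivity, specificity) from Table 4 in the Appendix
METHODS = {
    "Tao et al. (2012) LDA": (0.7392, 0.9738),
    "Artís et al. (1999) multinomial logit": (0.6614, 0.9065),
    "Zelenkov (2019) EDAB.C1": (0.2510, 0.9301),
    "Sundarkumar et al. (2015) SVM": (0.9400, 0.5639),
}


def net_saving_per_claim(sens, spec, fraud_rate, avg_saving, aci=ACI):
    """Assumed simplified model: each flagged claim costs `aci` to investigate,
    each true positive recovers `avg_saving`."""
    true_pos = fraud_rate * sens
    false_pos = (1.0 - fraud_rate) * (1.0 - spec)
    return true_pos * avg_saving - (true_pos + false_pos) * aci


def rank_grid(fraud_rates, savings):
    """Return {(fraud_rate, avg_saving): [method names, best first]}."""
    grid = {}
    for p in fraud_rates:
        for s in savings:
            scores = {
                name: net_saving_per_claim(se, sp, p, s)
                for name, (se, sp) in METHODS.items()
            }
            grid[(p, s)] = sorted(scores, key=scores.get, reverse=True)
    return grid


grid = rank_grid(fraud_rates=[0.05, 0.10, 0.20], savings=[315, 485, 1213])
for cell, order in grid.items():
    print(cell, "->", order)
```

Under this simplified model, the position of a method in each cell's list plays the role of the rank shown in the heat maps; with the full cost formula and all 35 methods the ranks would range from 1 to 35.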

Figure 1
Cost-saving ability of the models proposed by Tao et al. (2012) and Artís et al. (1999) on a heat map

[Heat map with two panels: Tao et al. (2012) – linear discriminant analysis; Artís et al. (1999) – multinomial logit model. Axes: fraud rate, average cost saving and rank (1 to 35).]

Note: The cost-saving ability of the linear discriminant analysis model proposed by Tao et al. (2012) and
the multinomial logit model proposed by Artís et al. (1999) compared to the cost-saving ability of the 35
models analysed under different scenarios.

Figure 2
Cost-saving ability of the models proposed by Zelenkov (2019) and Sundarkumar et al. (2015) on a heat map

[Heat map with two panels: Zelenkov (2019) – example-dependent cost-sensitive AdaBoost (EDAB.C1); Sundarkumar et al. (2015) – support vector machine. Axes: fraud rate, average cost saving and rank (1 to 35).]

Note: The cost-saving ability of the example-dependent cost-sensitive AdaBoost (EDAB.C1) model
proposed by Zelenkov (2019) and the support vector machine model proposed by Sundarkumar et al.
(2015) compared to the cost-saving ability of the 35 models analysed under different scenarios.


For the second simulation, the savings from identified fraudulent claims were held
constant (USD 485) and the cost of investigation and the percentage of fraudulent
claims were varied. In the third simulation, the percentage of fraudulent claims
was held constant (10%) and the cost of investigation and the average savings in
the case of identified fraudulent claims were varied.

4.3. Comparison of traditional statistical and machine-learning-based methods in terms of average cost savings

After the meta-analysis and heat maps, a detailed non-parametric rank correlation
analysis of the different fraud detection methods was performed. For a detailed
discussion of Spearman’s rank correlations, see Benedek et al. (forthcoming). The
magnitude and significance of the correlations clearly show that the performance
measures used in this study result in a consistent ranking of the fraud detection
methods analysed (details in Table 3 in the Appendix).
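
As an illustration of the statistic involved, Spearman's coefficient is the Pearson correlation computed on ranks. The sketch below is a minimal pure-Python version with average ranks for ties; the function names are illustrative. The sample input is the sensitivity and specificity of six methods from Table 4, so the printed value covers only a subset and is not expected to match Table 3.

```python
from statistics import mean


def average_ranks(values):
    """Rank values (1 = smallest); tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Six methods from Table 4 (subset chosen for illustration only)
sensitivity = [0.6614, 0.5918, 0.8913, 0.7392, 0.9400, 0.2510]
specificity = [0.9065, 0.8163, 0.8667, 0.9738, 0.5639, 0.9301]
print(round(spearman(sensitivity, specificity), 3))
```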

Perhaps the most interesting question in the study is whether AI-based detection
methods are significantly more cost-effective than traditional statistical-econometric
tools.

Obviously, AI and traditional statistical econometric methods are all parts of the
same discipline generically called data science, and as such, the boundary between
them is rather subjective and fluid, especially given the dynamic evolution of AI that
is taking place before our eyes. For example, most machine learning courses start
with the methodology of linear and logistic regression, which is also part of any
standard econometrics curriculum. However, in our study, the following distinction
was made: Any method developed after the emergence of the AI terminology in
the literature was considered an AI or machine learning method. Therefore, e.g.
linear and logistic regression as well as linear discriminant analysis were classified
in the traditional category (since they do not require big data or neural nets) while
genetic algorithms, neural nets, etc. were classified in the AI category.

As a first step, the differences in average cost savings between these two groups
of methods were calculated, and the statistical significance of the differences was
tested using the Mann–Whitney non-parametric test. These comparisons were
performed on a wide range of combinations of input parameters (10,780 in total),
resulting in a synthetic cross-tabulation between the average cost per investigation and
the average savings for identified fraudulent claims.
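
The comparison for a single parameter combination can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' procedure: the net-saving formula is a simplified stand-in (every flagged claim is investigated, each true positive recovers the average saving), the fraud rate is fixed at 10 per cent, and only six methods from Table 4 are used, grouped according to the distinction described above. The Mann–Whitney U statistic itself is computed by direct pair counting.

```python
def mann_whitney_u(x, y):
    """U statistic of sample x against sample y:
    each pair with x_i > y_j counts 1, each tie counts 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def net_saving_per_claim(sens, spec, fraud_rate, avg_saving, aci):
    """Assumed simplified cost model (not the paper's full formula)."""
    true_pos = fraud_rate * sens
    false_pos = (1.0 - fraud_rate) * (1.0 - spec)
    return true_pos * avg_saving - (true_pos + false_pos) * aci


# (sensitivity, specificity) from Table 4; the selection is illustrative
TRADITIONAL = [(0.6614, 0.9065), (0.5918, 0.8163), (0.7392, 0.9738)]  # logit models, LDA
AI_BASED = [(0.9400, 0.5639), (0.4859, 0.7889), (0.2510, 0.9301)]     # SVM, MLP, AdaBoost


def compare(aci, avg_saving, fraud_rate=0.10):
    """Mean saving difference (traditional minus AI) and U for one combination."""
    trad = [net_saving_per_claim(se, sp, fraud_rate, avg_saving, aci) for se, sp in TRADITIONAL]
    ai = [net_saving_per_claim(se, sp, fraud_rate, avg_saving, aci) for se, sp in AI_BASED]
    diff = sum(trad) / len(trad) - sum(ai) / len(ai)
    return diff, mann_whitney_u(trad, ai)


for aci in (100, 150, 200):
    for avg_saving in (160, 400, 700):
        diff, u = compare(aci, avg_saving)
        print(f"ACI={aci} ASCIFC={avg_saving} mean difference={diff:.2f} U={u}")
```

The sketch stops at the U statistic; judging significance additionally requires the distribution of U under the null hypothesis, which statistical packages provide (for example `scipy.stats.mannwhitneyu`, if SciPy is available).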

Table 5 in the Appendix clearly shows that the average cost savings for the vast
majority of combinations are higher for traditional statistical methods⁵ than for
AI-based methods (the differences are positive and significant), and we
concluded that, surprisingly, there is no justification for insurance companies to
invest heavily in AI-based fraud detection algorithms at this stage. This does not
mean, of course, that these companies do not need software support in their
operations, only that in most cases the traditional statistical software is sufficient.

⁵ Although it is not the purpose of this study to examine the implementation costs of traditional statistical
and AI methods, it is highly likely that the cost implications of traditional methods in this area are also lower,
which further supports the conclusions observed in Table 5 in the Appendix.

5. Conclusions

In our research, we pointed out that there is a lack of literature examining the
cost-effectiveness of methods for detecting automobile insurance fraud. Moreover,
in the case of emerging markets, there is a complete lack of literature on the
detection of automobile insurance fraud. Therefore, in this study, we applied
the method proposed by Benedek et al. (forthcoming) to correctly calculate the
cost-saving potential of automobile insurance fraud identification. The proposed
method takes into account all costs incurred in a real fraud detection process
(with particular emphasis on the fact that in the case of a fraudulent or partially
fraudulent claim, the insurer will usually not deny payment completely but offer
partial compensation).

In this cost-effectiveness study, we conducted a meta-analysis of 35 fraud detection
methods from 12 different sources and concluded that most of the current methods
of automobile insurance fraud detection in the literature are not profitable. In
addition, we also pointed out that the approaches based on traditional statistical
methods perform better than AI-based methods for the time being. In other words,
there is no justification for insurance companies to make significant additional
investments in AI-based fraud detection algorithms at this stage, and in most
cases the use of traditional statistical software is sufficient. This result is consistent
with that presented by Benedek et al. (forthcoming). This means that the use of
traditional statistical methods is also more economical for the sample examined
in this study (pre-2012 traditional statistical methods versus post-2012 AI-based
approaches). With this result, the present study acts as a test of robustness and
confirms previous research findings.

The most important limitation of the research, which is also an opportunity for
further development, is that the input parameters in the meta-analysis are based on
previously published algorithms that were trained and tested on different datasets. A more
convincing test would be to run each of these algorithms on the same sample.


References

Ágoston, N. (2022): Mesterséges intelligencia és gépi tanulási módszerek a vállalati
fizetésképtelenség becslésére (Artificial intelligence and machine learning methods to
estimate firm insolvency). Statisztikai Szemle (Hungarian Statistical Review), 100(6):
584–609. [Link]

Artís, M. – Ayuso, M. – Guillén, M. (1999): Modelling different types of automobile insurance
fraud behaviour in the Spanish market. Insurance: Mathematics and Economics, 24(1–2):
67–81. [Link]

Artís, M. – Ayuso, M. – Guillén, M. (2002): Detection of Automobile Insurance Fraud With
Discrete Choice Models and Misclassified Claims. Journal of Risk and Insurance, 69(3):
325–340. [Link]

Association of British Insurers (2021): No Time to Lie. [Link]
articles/2021/10/detected-fraud-2020/. Downloaded: 15 November 2021.

Badriyah, T. – Rahmaniah, L. – Syarif, I. (2018): Nearest neighbour and statistics method based
for detecting fraud in auto insurance. International Conference on Applied Engineering
(ICAE), Batam, Indonesia, pp. 1–5. [Link]

Bánkúty-Balog, L. (2022): A mesterséges intelligencia elterjedésének geoökonómiai hatásai
és Magyarország (The Geo-economic Ramifications of the Spread of Artificial Intelligence
and Hungary). Külgazdaság, 66(7–8): 102–130. [Link]
8.102

Belhadji, E.B. – Dionne, G. – Tarkhani, F. (2000): A Model for the Detection of Insurance Fraud.
The Geneva Papers on Risk and Insurance - Issues and Practice, 25(4): 517–538. https://
[Link]/10.1111/1468-0440.00080

Benedek, G. (1999): Mesterséges intelligencia az üzleti világban: Marketingakciók
hatékonyságának elemzése statisztikai és Data Mining módszerekkel (Artificial intelligence
in business: analysing the effectiveness of marketing actions using statistical and data
mining methods). Vezetéstudomány-Management and Business Journal, 30(11): 33–36.

Benedek, B. – Ciumas, C. – Nagy, B.Z. (2022): Automobile insurance fraud detection in the
age of big data – a systematic and comprehensive literature review. Journal of Financial
Regulation and Compliance, 30(4): 503–523. [Link]

Benedek, B. – Ciumas, C. – Nagy, B.Z. (forthcoming): On the cost-efficiency of automobile
insurance fraud detection methods – A meta-analysis. Global Business Review, accepted
for publication, forthcoming.


Bermúdez, L. – Pérez, J.M. – Ayuso, M. – Gómez, E. – Vázquez, F.J. (2008): A Bayesian
dichotomous model with asymmetric link for fraud in insurance. Insurance: Mathematics
and Economics, 42(2): 779–786. [Link]

Bhowmik, R. (2011): Detecting auto insurance fraud by data mining techniques. Journal of
Emerging Trends in Computing and Information Sciences, 2(4): 156–162.

Csillag, J.B. – Granát, P.M. – Neszveda, G. (2022): Media Attention to Environmental
Issues and ESG Investing. Financial and Economic Review, 21(4): 129–149. [Link]
org/10.33893/FER.21.4.129

Danyi, P. (2018): A mesterséges intelligencia alkalmazása az árazásban (Artificial intelligence
in pricing). Marketing & Menedzsment (Marketing & Management), 52(3–4): 5–18.

Derrig, R.A. – Ostaszewski, K.M. (1995): Fuzzy techniques of pattern recognition in risk
and claim classification. Journal of Risk and Insurance, 62(3): 447–482. [Link]
org/10.2307/253819

Farkas, G. – Magyar, P. – Molnár, A. – Zubor-Nemes, A. (2020): Adatbányászati módszerek
alkalmazása a mezőgazdaságban – a gépi tanulás felhasználási lehetőségei (Adaptation
of data mining methods in agriculture – potential uses of machine learning). Gazdálkodás:
Scientific Journal on Agricultural Economics, 64(1): 15–24.

Farquad, M.A. – Ravi, V. – Raju, S.B. (2012): Analytical CRM in banking and finance using
SVM: a modified active learning-based rule extraction approach. International Journal of
Electronic Customer Relationship Management, 6(1): 48–73. [Link]
IJECRM.2012.046470

Gepp, A. – Wilson, H.J. – Kumar, K. – Bhattacharya, S. (2012): A Comparative Analysis of
Decision Trees Vis-a-vis Other Computational Data Mining Techniques in Automotive
Insurance Fraud Detection. Journal of Data Science, 10(3): 537–561. [Link]
org/10.6339/JDS.201207_10(3).0010

Green, D.M. – Swets, J.A. (1966): Signal detection theory and psychophysics (1 ed., Vol. 1).
New York: Wiley.

Hámori, G. (2001): A fizetésképtelenség előrejelzése logit-modellel (Logit model for predicting
insolvency). Bankszemle, 45(1–2): 65–87.

He, H. – Bai, Y. – Garcia, E. – Li, S. (2008): ADASYN: Adaptive synthetic sampling approach
for imbalanced learning. IEEE International Joint Conference on Neural Networks (IEEE
World Congress On Computational Intelligence), Hong Kong, pp. 1322–1328. [Link]
org/10.1109/IJCNN.2008.4633969


Hill, H.R. – Sandler, B. – Mokgokong, R. – Lister, S. – Ward, T. – Boyce, R. – Farooqui, U. –
Gordon, J. (2020): Cost-effectiveness of targeted screening for the identification of patients
with atrial fibrillation: evaluation of a machine learning risk prediction algorithm. Journal
Of Medical Economics, 23(4): 386–393. [Link]

III (2019): Insurance Fact Book. Insurance Information Institute. [Link]

III (2021): Background on: Insurance fraud. Insurance Information Institute. [Link]
org/article/background-on-insurance-fraud. Downloaded: 20 November 2021.

Karamizadeh, F. – Zolfagharifar, S.A. (2016): Using the Clustering Algorithms and Rule-based
of Data Mining to Identify Affecting Factors in the Profit and Loss of Third Party Insurance,
Insurance Company Auto. Indian Journal of Science and Technology, 9(7): 1–9. https://
[Link]/10.17485/ijst/2016/v9i7/87846

Lee, A. – Taylor, P. – Kalpathy-Cramer, J. – Tufail, A. (2017): Machine Learning Has Arrived!
Ophthalmology, 124(12): 1726–1728. [Link]

LII (2023): Insurance Fraud. Legal Information Institute, Cornell Law School. [Link]
[Link]/wex/insurance_fraud. Downloaded: 26 April 2023

Li, Y. – Yan, C. – Liu, W. – Li, M. (2018): A principle component analysis-based random forest
with the potential nearest neighbor method for automobile insurance fraud identification.
Applied Soft Computing, 70(September): 1000–1009. [Link]
asoc.2017.07.027

Massachusetts Regulation (1993): Division of Insurance Regulations. Massachusetts
government. [Link]
Downloaded: 26 April 2023.

Muraközy, B. (2018): Gépi tanulás, predikció és okság a közgazdaság-tudományban (Machine
Learning, Prediction and Causality in Economics). Magyar Tudomány, 179(7): 1027–1037.
[Link]

Nian, K. – Zhang, H. – Tayal, A. – Coleman, T. – Li, Y. (2016): Auto insurance fraud detection
using unsupervised spectral ranking for anomaly. The Journal of Finance and Data Science,
2(1): 58–75. [Link]

Padmaja, T.M. – Dhulipalla, N. – Bapi, R.S. – Krishna, P.R. (2007): Unbalanced data
classification using extreme outlier elimination and sampling techniques for fraud
detection. 15th International Conference on Advanced Computing and Communications
(ADCOM 2007), Guwahati, India, pp. 511–516. [Link]


Pathak, J. – Vidyarthi, N. – Summers, S.L. (2005): A fuzzy-based algorithm for auditors to
detect elements of fraud in settled insurance claims. Managerial Auditing Journal, 20(6):
632–644. [Link]

Pérez, J.M. – Muguerza, J. – Arbelaitz, O. – Gurrutxaga, I. – Martín, J.I. (2005): Consolidated
Tree Classifier Learning in a Car Insurance Fraud Detection Domain with Class Imbalance.
In: Singh, S. – Singh, M. – Apte, C. – Perner, P. (eds.): Pattern Recognition and Data Mining.
ICAPR 2005. Lecture Notes in Computer Science, vol 3686. Springer, Berlin, Heidelberg.
[Link]

Phua, C. – Alahakoon, D. – Lee, V. (2004): Minority report in fraud detection: classification
of skewed data. ACM Sigkdd Explorations Newsletter, 6(1): 50–59. [Link]
org/10.1145/1007730.1007738

Shaeiri, Z. – Kazemitabar, S.J. (2020): Fast unsupervised automobile insurance fraud detection
based on spectral ranking of anomalies. International Journal of Engineering, 33(7): 1240–
1248. [Link]

Šubelj, L. – Furlan, Š. – Bajec, M. (2011): An expert system for detecting automobile insurance
fraud using social network analysis. Expert Systems with Applications, 38(1): 1039–1052.
[Link]

Subudhi, S. – Panigrahi, S. (2017): Use of optimized Fuzzy C-Means clustering and supervised
classifiers for automobile insurance fraud detection. Journal of King Saud University
– Computer and Information Sciences, 32(5): 568–575. [Link]
jksuci.2017.09.010

Sundarkumar, G.G. – Ravi, V. (2015): A novel hybrid undersampling method for mining
unbalanced datasets in banking and insurance. Engineering Applications of Artificial
Intelligence, 37(January): 368–377. [Link]

Sundarkumar, G.G. – Ravi, V. – Siddeshwar, V. (2015): One-class support vector machine
based undersampling: Application to churn prediction and insurance fraud detection. IEEE
International Conference on Computational Intelligence and Research (ICCIC), Madurai,
India, pp. 1–7. [Link]

Tao, H. – Zhixin, L. – Xiaodong, S. (2012): Insurance fraud identification research based
on fuzzy support vector machine with dual membership. International Conference on
Information Management, Innovation Management and Industrial Engineering, Sanya,
pp. 457–460. [Link]

Viaene, S. – Ayuso, M. – Guillen, M. – Van Gheel, D. – Dedene, G. (2007): Strategies for
detecting fraudulent claims in the automobile insurance industry. European Journal of
Operational Research, 176(1): 565–583. [Link]


Viaene, S. – Derrig, R.A. – Baesens, B. – Dedene, G. (2002): A Comparison of State-of-the-Art
Classification Techniques for Expert Automobile Insurance Claim Fraud Detection. Journal
of Risk and Insurance, 69(3): 373–421. [Link]

Virág, M. – Kristóf, T. (2005): Az első hazai csődmodell újraszámítása neurális hálók
segítségével (Recalculation of the first Hungarian bankruptcy-prediction model using neural
networks). Közgazdasági Szemle (Economic Review), 52(2): 144–162.

Virág, M. – Nyitrai, T. (2013): Application of support vector machines on the basis of the
first Hungarian bankruptcy model. Society and Economy, 35(2): 227–248. [Link]
org/10.1556/SocEc.35.2013.2.6

Virág, M. – Nyitrai, T. (2014): The application of ensemble methods in forecasting bankruptcy.
Financial and Economic Review, 13(4): 178–193. [Link]
letoltes/[Link]

Wang, Y. – Xu, W. (2018): Leveraging deep learning with LDA-based text analytics to detect
automobile insurance fraud. Decision Support Systems, 105(January): 87–95. [Link]
org/10.1016/[Link].2017.11.001

Weisberg, H. – Derrig, R. (1991): Fraud and Automobile Insurance: A Report on Bodily Injury
Liability Claims in Massachusetts. Journal of Insurance Regulation, 9(4): 497–541.

Weisberg, H. – Derrig, R. (1998): Quantitative methods for detecting fraudulent automobile
bodily injury claims. Risques, 35: 75–99.

Wilson, J.H. (2009): An analytical approach to detecting insurance fraud using logistic
regression. Journal of Finance and Accountancy, 85(150): 1–15.

Xu, W. – Wang, S. – Zhang, D. – Yang, B. (2011): Random rough subspace based neural
network ensemble for insurance fraud detection. Fourth International Joint Conference on
Computational Sciences and Optimization, Kunming and Lijiang City, China, pp. 1276–1280.
[Link]

Yan, C. – Li, Y. (2015): The Identification Algorithm and Model Construction of Automobile
Insurance Fraud Based on Data Mining. Fifth International Conference on Instrumentation
and Measurement, Computer, Communication and Control (IMCCC), Qinhuangdao, China,
pp. 1922–1928. [Link]

Zelenkov, Y. (2019): Example-dependent cost-sensitive adaptive boosting. Expert Systems
with Applications, 135: 71–82. [Link]


Appendix

Table 3
Spearman’s rank correlation coefficients between rankings based on different parameters

                            Total        Sensitivity  Specificity  Precision    Negative     Estimation   F-score
                            savings                                             predictive   accuracy
                                                                                value
Total savings               1.000
Sensitivity                 0.069        1.000
                            (0.731)
Specificity                 0.831        –0.346       1.000
                            (49.41)***   (–2.57)**
Precision                   0.924        0.047        0.871        1.000
                            (24.56)***   (0.48)       (19.15)***
Negative predictive value   0.254        0.951        –0.028       0.252        1.000
                            (2.47)       (33.51)***   (–0.41)      (2.78)**
Estimation accuracy         0.947        –0.081       0.957        0.942        0.135        1.000
                            (98.34)***   (–0.62)      (38.93)***   (25.87)***   (1.57)
F-score                     0.828        0.278        0.599        0.792        0.616        0.732        1.000
                            (19.11)***   (4.01)***    (6.85)***    (15.68)***   (6.29)***    (11.03)***

Note: The formula used to determine the negative predictive value is: TN/(FN+TN). Student t-statistics in
parentheses. *Significant at 10% level; **Significant at 5% level; ***Significant at 1% level.
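
For reference, the measures compared in Table 3 can be derived from a method's sensitivity and specificity once a fraud rate is fixed. The sketch below is illustrative only: the negative predictive value follows the formula in the note above, while precision, accuracy and F-score use their standard textbook definitions, which are assumed (not confirmed by the source) to match the paper's usage. The 10 per cent fraud rate and the function names are assumptions of the example.

```python
def confusion_counts(sens, spec, fraud_rate, n_claims=1000):
    """Expected confusion-matrix cells (TP, FN, FP, TN) for n_claims claims."""
    n_fraud = n_claims * fraud_rate
    n_legit = n_claims - n_fraud
    tp = sens * n_fraud
    fn = n_fraud - tp
    tn = spec * n_legit
    fp = n_legit - tn
    return tp, fn, fp, tn


def metrics(sens, spec, fraud_rate, n_claims=1000):
    tp, fn, fp, tn = confusion_counts(sens, spec, fraud_rate, n_claims)
    precision = tp / (tp + fp)
    return {
        "precision": precision,
        "negative_predictive_value": tn / (fn + tn),  # TN/(FN+TN), as in the Table 3 note
        "accuracy": (tp + tn) / n_claims,
        "f_score": 2 * precision * sens / (precision + sens),
    }


# Linear discriminant analysis of Tao et al. (2012), values from Table 4
print(metrics(0.7392, 0.9738, fraud_rate=0.10))
```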


Table 4
The 35 fraud detection methods tested and their sensitivity and specificity

Author                      Method                                                    Sensitivity  Specificity
Artís et al. (1999)         multinomial logit model                                   0.6614       0.9065
                            nested multinomial logit model                            0.3209       0.8132
Belhadji et al. (2000)      probit regression – threshold 10%                         0.6940       0.9145
                            probit regression – threshold 20%                         0.5373       0.9596
Artís et al. (2002)         logit regression with omission error                      0.7793       0.6994
                            logit regression without omission error                   0.7703       0.7094
Bermúdez et al. (2008)      Bayesian skewed logit model                               0.8515       0.9968
                            standard logit and Bayesian logit models                  0.8515       0.6043
Wilson (2009)               logit regression                                          0.5918       0.8163
Šubelj et al. (2011)        social network analysis                                   0.8913       0.8667
Tao et al. (2012)           linear discriminant analysis                              0.7392       0.9738
                            quadratic discriminant analysis                           0.7933       0.9767
                            naive Bayesian                                            0.8351       0.9815
Farquad et al. (2012)       MALBA (logistic) – 1,000 extra instances                  0.8838       0.5534
                            MALBA (normal) – 1,000 extra instances                    0.8811       0.5588
                            ALBA – 1,000 extra instances                              0.8784       0.5656
                            MALBA – 1,000 extra instances                             0.8848       0.5560
Sundarkumar et al. (2015)   decision tree                                             0.9552       0.5658
                            multi-layer perceptron                                    0.4859       0.7889
                            support vector machine                                    0.9400       0.5639
                            probabilistic neural network                              0.9173       0.5533
                            group method of data handling                             0.7362       0.7148
Sundarkumar – Ravi (2015)   probabilistic neural network                              0.8750       0.5894
                            multi-layer perceptron                                    0.6458       0.7189
                            decision tree                                             0.9074       0.5869
                            group method of data handling                             0.5686       0.8020
                            support vector machine                                    0.9189       0.5839
Subudhi – Panigrahi (2017)  GAFCM – DT                                                0.6625       0.8765
                            GAFCM – SVM                                               0.6970       0.8471
                            GAFCM – MLP                                               0.6107       0.8400
                            GAFCM – GMDH                                              0.5727       0.7976
Zelenkov (2019)             example-dependent cost-sensitive Ada-Boost (EDAB.C1)      0.2510       0.9301
                            example-dependent cost-sensitive Ada-Boost (EDAB.C2)      0.5900       0.7327
                            example-dependent cost-sensitive Ada-Boost (EDAB.C2-ROC)  0.4477       0.8050
                            example-dependent cost-sensitive Ada-Boost (EDAB.C3)      0.2510       0.9301

Note: Traditional statistical-econometric models are indicated in bold in the original table.


Table 5
Average cost savings differences between traditional statistical and AI-based identification methods
ASCIFC (rows) \ ACI (columns) 100 110 120 130 140 150 160 170 180 190 200
160 73,100 87,426 101,753 116,079 130,405 144,732 159,058 173,384 187,711 202,037 216,363
(46)*** (45)*** (47)*** (48)*** (50)*** (51)*** (51)*** (51)*** (53)*** (53)*** (53)***
180 73,283 87,610 101,936 116,262 130,589 144,915 159,241 173,568 187,894 202,221 216,547
(45)*** (45)*** (46)*** (47)*** (48)*** (50)*** (51)*** (51)*** (51)*** (53)*** (53)***
200 73,467 87,793 102,120 116,446 130,772 145,099 159,425 173,751 188,078 202,404 216,730
(46)*** (46)*** (45)*** (46)*** (47)*** (48)*** (50)*** (51)*** (49)*** (51)*** (51)***
220 73,650 87,977 102,303 116,629 130,956 145,282 159,608 173,935 188,261 202,588 216,914
(44)*** (46)*** (46)*** (45)*** (46)*** (47)*** (48)*** (50)*** (51)*** (50)*** (51)***
240 73,834 88,160 102,487 116,813 131,139 145,466 159,792 174,118 188,445 202,771 217,097
(49)*** (46)*** (46)*** (45)*** (45)*** (47)*** (47)*** (48)*** (50)*** (50)*** (51)***
260 74,017 88,344 102,670 116,996 131,323 145,649 159,975 174,302 188,628 202,955 217,281
(49)*** (43)*** (47)*** (46)*** (45)*** (45)*** (47)*** (47.5)*** (48)*** (50)*** (50)***
280 74,201 88,527 102,853 117,180 131,506 145,833 160,159 174,485 188,812 203,138 217,464
(44)*** (49)*** (46)*** (46)*** (45)*** (45)*** (46)*** (47)*** (48)*** (48)*** (50)***
300 74,384 88,711 103,037 117,363 131,690 146,016 160,342 174,669 188,995 203,322 217,648
(42)*** (46.5)*** (43)*** (47)*** (46)*** (45)*** (45)*** (46)*** (47)*** (48)*** (48)***
320 74,568 88,894 103,220 117,547 131,873 146,200 160,526 174,852 189,179 203,505 217,831
(41)*** (48)*** (47)*** (46)*** (45)*** (46)*** (45)*** (45)*** (46)*** (47)*** (48)***
340 74,751 89,078 103,404 117,730 132,057 146,383 160,709 175,036 189,362 203,689 218,015
(42)*** (44)*** (47)*** (43)*** (46)*** (46)*** (45)*** (45)*** (45)*** (47)*** (47)***
360 74,935 89,261 103,587 117,914 132,240 146,567 160,893 175,219 189,546 203,872 218,198
(43)*** (42)*** (49)*** (46)*** (46)*** (45)*** (46)*** (45)*** (45)*** (46)*** (47)***
380 75,118 89,445 103,771 118,097 132,424 146,750 161,076 175,403 189,729 204,056 218,382
(47)*** (41)*** (46)*** (49)*** (44)*** (46)*** (46)*** (46)*** (45)*** (45)*** (46)***
400 75,302 89,628 103,954 118,281 132,607 146,934 161,260 175,586 189,913 204,239 218,565
(50)*** (42)*** (44)*** (46.5)*** (44)*** (46)*** (46)*** (46)*** (45)*** (45)*** (45)***
420 75,485 89,812 104,138 118,464 132,791 147,117 161,443 175,770 190,096 204,423 218,749
(51)*** (43)*** (42)*** (49)*** (49)*** (44)*** (47)*** (46)*** (46)*** (45)*** (45)***
440 75,669 89,995 104,321 118,648 132,974 147,301 161,627 175,953 190,280 204,606 218,932
(54)*** (43)*** (41)*** (45)*** (47)*** (44)*** (46)*** (46)*** (46)*** (46)*** (45)***
460 75,852 90,179 104,505 118,831 133,158 147,484 161,810 176,137 190,463 204,790 219,116
(60)*** (47)*** (41)*** (44)*** (49)*** (47)*** (44)*** (47)*** (45)*** (46)*** (45)***
480 76,036 90,362 104,688 119,015 133,341 147,668 161,994 176,320 190,647 204,973 219,299
(61)*** (50)*** (42)*** (42)*** (48)*** (49)*** (44)*** (46)*** (45)*** (46)*** (46)***
500 76,219 90,546 104,872 119,198 133,525 147,851 162,177 176,504 190,830 205,157 219,483
(61)*** (52)*** (43)*** (41)*** (44)*** (46.5)*** (47)*** (44)*** (47)*** (45)*** (46)***
520 76,403 90,729 105,055 119,382 133,708 148,035 162,361 176,687 191,014 205,340 219,666
(62)*** (53)*** (45)*** (41)*** (44)*** (49)*** (49)*** (43)*** (46)*** (47)*** (46)***
540 76,586 90,913 105,239 119,565 133,892 148,218 162,544 176,871 191,197 205,523 219,850
(65)*** (57)*** (47)*** (42)*** (42)*** (47.5)*** (47)*** (46)*** (44)*** (47)*** (45)***
560 76,770 91,096 105,422 119,749 134,075 148,402 162,728 177,054 191,381 205,707 220,033
(66)*** (60)*** (50)*** (43)*** (41)*** (44)*** (48)*** (49)*** (43)*** (46)*** (47)***
580 76,953 91,280 105,606 119,932 134,259 148,585 162,911 177,238 191,564 205,890 220,217
(73)*** (60)*** (51)*** (44)*** (41)*** (44)*** (49)*** (47)*** (45)*** (44)*** (47)***
600 77,137 91,463 105,789 120,116 134,442 148,769 163,095 177,421 191,748 206,074 220,400
(73)*** (61)*** (51.5)*** (45)*** (42)*** (42)*** (46)*** (46.5)*** (47)*** (43)*** (46)***
620 77,320 91,647 105,973 120,299 134,626 148,952 163,278 177,605 191,931 206,257 220,584
(76)** (62)*** (54)*** (47)*** (42)*** (41)*** (44)*** (49)*** (49)*** (44)*** (44)***
640 77,504 91,830 106,156 120,483 134,809 149,136 163,462 177,788 192,115 206,441 220,767
(77)** (65)*** (60)*** (50)*** (43)*** (41)*** (44)*** (48)*** (47)*** (47)*** (43)***
660 77,687 92,014 106,340 120,666 134,993 149,319 163,645 177,972 192,298 206,624 220,951
(82)** (66)*** (60)*** (51)*** (43)*** (41)*** (42)*** (45)*** (46)*** (49)*** (44)***
680 77,871 92,197 106,523 120,850 135,176 149,503 163,829 178,155 192,482 206,808 221,134
(87)** (68)*** (60)*** (52)*** (46.5)*** (42)*** (41)*** (44)*** (49)*** (47)*** (47)***
700 78,054 92,381 106,707 121,033 135,360 149,686 164,012 178,339 192,665 206,991 221,318
(90)** (73)*** (61)*** (54)*** (47)*** (43)*** (41)*** (44)*** (48)*** (46.5)*** (49)***
Note: ASCIFC: average savings for identified fraudulent claims; ACI: average cost per investigation.
Mann-Whitney U-statistics in parentheses. *Significant at 10% level; **Significant at 5% level;
***Significant at 1% level.
