
Received 23 January 2025, accepted 3 March 2025, date of publication 5 March 2025, date of current version 14 March 2025.

Digital Object Identifier 10.1109/ACCESS.2025.3548529

An Ensemble-Based Auto Insurance Fraud Detection Using BQANA Hyperparameter Tuning

AFSANEH GHEYSARBEIGI, MORTEZA RAKHSHANINEJAD, MOHAMMAD FATHIAN, AND FARNAZ BARZINPOUR
School of Industrial Engineering, Iran University of Science and Technology, Tehran 13114-16846, Iran
Corresponding author: Mohammad Fathian (fathian@[Link])

ABSTRACT The prevalence of insurance fraud in the auto industry poses significant financial challenges
and undermines customer trust. Despite the application of machine learning methods to reduce these losses,
current literature lacks effective tuned algorithms for detecting fraud in insurance claims. To address this
gap, this study proposes an ensemble-based method with a weighted voting strategy for auto insurance fraud
detection. The study uses the Binary Quantum-Based Avian Navigation Optimizer Algorithm (BQANA)
to optimize the hyperparameters of Support Vector Machines (SVM), Random Forest (RF), and XGBoost
classifiers, which are combined into an ensemble. To address the dataset’s imbalance, random undersampling
was applied to create five legitimate-to-fraudulent claim ratios: A:A, 1:1, 2:1, 4:1, and 8:1. The performance
of BQANA was compared with Genetic Algorithms and Simulated Annealing for hyperparameter tuning.
The results indicate that the ensemble model with BQANA-optimized hyperparameters outperforms other
methods, particularly at a 1:1 ratio, achieving 99.94% Accuracy, 98.93% Precision, 100% Recall, and a
99.46% F1-score. These metrics surpass those obtained without optimization or with traditional tuning
methods. This research highlights the efficacy of the BQANA algorithm in optimizing hyperparameters for
classification models. By combining these optimized classifiers into an ensemble, the study significantly
enhances predictive accuracy in car insurance fraud detection, offering notable improvements over
conventional methods.

INDEX TERMS Insurance fraud detection, machine learning, ensemble learning, metaheuristic,
hyperparameter tuning.

The associate editor coordinating the review of this manuscript and approving it for publication was Chao Tong.
© 2025 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. Volume 13, 2025.

I. INTRODUCTION
The financial sector is progressively contending with an escalation in fraudulent activities, particularly within the domain of auto insurance [1]. The increase in the number of vehicles and corresponding insurance policies has been paralleled by a rise in fraudulent claims, resulting in significant economic burdens. Auto insurance fraud is estimated to impose an annual financial burden exceeding $40 billion, resulting in an incremental cost of approximately $400 to $700 in premiums for the average American household [2]. Significantly, research shows that while 21% to 36% of auto insurance claims may involve fraud, only about 3% receive legal scrutiny, showing that many fraudulent claims remain undetected and potentially lead to larger undisclosed financial losses [3]. The insurance industry is huge, with more than 7,000 companies and over $1 trillion in yearly premiums, making it harder to fight fraud [4]. Auto insurance fraud typically involves exaggerated claims, false claims [5], intentional injury [6], multiple claims, or other forms of misrepresenting insurance-related information [7]. Moreover, research shows that fraudulent behavior among insurers and auto repair shops is interrelated, stemming from their mutual perceptions of each other's actions. For example, if both parties engage in fraudulent activities, both will benefit; however, if either party refrains from such activities, neither can benefit and the risk of detection increases [8]. Fraudsters often believe that insurance companies and police lack the expertise to detect fraud, leading to low interest in such crimes [9]. This belief is reinforced by two factors:

insurance companies frequently pay illegitimate claims, and insurance fraud is rarely prosecuted [4]. Traditional manual verification methods are inadequate, often producing false alarms. Consequently, it is essential to employ precise and intelligent methods, such as data mining, to effectively detect and reduce fraudulent cases [10].
Various methods are employed to detect auto insurance fraud, including statistical techniques [11], machine learning (ML) [12], and deep learning (DL) [13]. Statistical methods, such as regression and hypothesis testing, are used to identify and analyze anomalies in auto insurance data. Moreover, ML methods use intelligent algorithms to detect fraudulent activities in real time by analyzing relevant data, while DL methods leverage artificial neural networks (ANN) [14] to automatically identify complex patterns and features in large datasets. Each of these methods has its own pros and cons when applied to different types of data, making the choice of method crucial for effective fraud detection [12].
In the field of auto insurance fraud detection, three main challenges are commonly encountered: unbalanced datasets, a high number of false positives, and ML models with untuned hyperparameters. The imbalance between fraudulent and non-fraudulent samples in datasets can lead to biased models and poor fraud detection [15]. To address this, two main methods are used: oversampling and undersampling. Oversampling techniques like the synthetic minority oversampling technique (SMOTE) [16] increase the number of fraudulent cases, while undersampling methods, such as random undersampling [17], reduce the number of non-fraudulent cases. Both methods aim to balance the dataset and improve the performance and reliability of predictive models [18].
Additionally, a high number of false positives can occur, meaning legitimate claims are incorrectly flagged as fraudulent. This not only leads to customer dissatisfaction but also increases operational costs for insurance companies. Advanced anomaly detection techniques and robust validation processes can help mitigate this issue by refining model accuracy [19]. Moreover, ML models with untuned hyperparameters often result in suboptimal performance. Hyperparameter tuning is essential for enhancing the accuracy and efficiency of fraud detection models [20]. Properly tuned models can better differentiate between fraudulent and non-fraudulent claims, thereby reducing both false positives and false negatives. Our motivation is to address these challenges by incorporating balancing techniques, an ensemble learning model, and a metaheuristic-based hyperparameter tuning algorithm to enhance the accuracy and robustness of auto insurance fraud detection systems. This approach aims to develop a comprehensive solution that effectively tackles the limitations of existing methods.
Using an undersampled dataset as the input for ensemble learning, which integrates multiple base classifiers, can effectively mitigate the challenges of unbalanced datasets and high false positives in auto insurance fraud detection [21]. By reducing the number of non-fraudulent cases through techniques like random undersampling, the dataset becomes more balanced, reducing bias and enhancing model reliability. Ensemble learning, which combines multiple base classifiers with distinct features, further improves detection accuracy [22]. When these base classifiers have their hyperparameters tuned with metaheuristic algorithms, their individual performances are optimized. This optimized performance, along with the ensemble method employing weighted voting between the results of these base classifiers, ensures that the most accurate predictions are prioritized. This method collectively addresses the three primary challenges (unbalanced datasets, high false positives, and untuned hyperparameters), resulting in a more effective fraud detection system.
This study introduces an ensemble learning framework designed for the detection of auto insurance fraud, utilizing a publicly available dataset of 15,420 car insurance claims from Kaggle, recorded between 1994 and 1996. The dataset includes 32 features, combining categorical (e.g., accident area, policy type) and numerical (e.g., age, deductible) data, offering a rich foundation for analysis. However, with only 6% of claims classified as fraudulent and 94% as legitimate, the dataset presents a significant class imbalance, posing a key challenge for fraud detection. To address this, the proposed framework incorporates three base classifiers, namely random forest (RF), extreme gradient boosting (XGBoost), and support vector machines (SVM), combined into an ensemble model with a weighted voting strategy, and leverages balancing techniques to mitigate the impact of the imbalance.
The hyperparameters of each base classifier are fine-tuned using a metaheuristic optimization algorithm named the binary quantum-based avian navigation optimizer algorithm (BQANA). For comparative analysis, additional hyperparameter tuning methods, including simulated annealing (SA) and genetic algorithms (GA), are employed, with their evaluation outcomes assessed. To further explore the impact of data imbalance, the study generates four subsets with varying ratios of fraudulent to legitimate claims. Each classifier is assigned a specific weight based on its performance during classification, enhancing the accuracy of fraud identification in the final ensemble voting and prediction stages. By integrating these techniques, the framework effectively addresses the challenges of unbalanced datasets, high false positives, and untuned hyperparameters, providing a robust solution for fraud detection in the auto insurance sector. The contributions of this work are summarized as follows:

• Utilization of the BQANA metaheuristic optimization algorithm for hyperparameter tuning to enhance classifier performance.
• Development of an ensemble learning framework with RF, XGBoost, and SVM, using a weighted voting

strategy to combine base classifier results, effectively reducing false positives in fraud detection.
• Application of balancing techniques, such as random undersampling, to mitigate the significant class imbalance present in the dataset.

The rest of this paper is organized into four sections. In Section II, we review previous studies on auto insurance fraud detection. Section III discusses the research method employed in this study. The obtained results and the model's effectiveness compared to other methods are presented in Section IV. Finally, in Section V, we provide practical conclusions and implications of the findings.

II. LITERATURE REVIEW
The detection of insurance fraud remains a complex and persistent challenge in contemporary society, requiring the continuous development of innovative algorithms and methods to address it effectively. As outlined in the study by [23], fraud can be defined as the misuse of professional positions for personal gain through the intentional misappropriation of organizational resources. Additionally, the study by [24] emphasizes the growing issue of financial fraud, particularly within the banking and finance sectors, where complex organizational structures and international capital flows are exploited for illegitimate gains. This manipulation undermines economic stability and violates legal frameworks and ethical standards, thus making the development of advanced fraud detection systems increasingly vital.
Given the significant economic impact of auto insurance, numerous researchers have investigated innovative techniques to detect fraud in this domain, underscoring the need for continued improvement. These include ML techniques such as Multi-Layer Perceptron (MLP), Decision Trees (DT), SVM [25], Logistic Regression (LR), ANN, AdaBoost, Stochastic Gradient Descent (SGD) methods [18], Bagging [26], and other ensemble approaches. The application of ML techniques in various aspects of the finance domain has been the subject of extensive research in recent years.
The study by [12] examines the performance of several ML models, including SVM, RF, DT, AdaBoost, K-Nearest Neighbor (KNN), LR, Naïve Bayes (NB), and MLP. Their findings reveal that the DT model improves the overall Accuracy of the fraud detection system. Similarly, [27] investigates the use of various ML models for insurance fraud detection, including RF, AdaBoost, XGBoost, KNN, and LR. Among the models tested, the RF algorithm demonstrated the highest Accuracy and F1-score in detecting insurance fraud, thereby highlighting its strong classification performance.
Reference [28] investigated the application of various ML algorithms to detect fraudulent vehicle insurance claims. The research evaluated the performance of several models, including AdaBoost, XGBoost, NB, SVM, LR, DT, ANN, and RF, finding that AdaBoost and XGBoost outperformed the other models by achieving a classification Accuracy of 84.5%. In contrast, the LR classifiers showed poor performance with both balanced and unbalanced data. These insights emphasize the importance of selecting appropriate algorithms to enhance fraud detection systems. Furthermore, [29] examined the use of RF, LR, and ANN for fraud detection, revealing that the RF method demonstrated superior performance, achieving an Accuracy of 98.21%, Precision of 98.08%, Recall of 100%, and F1-score of 99.03%. In addition, [30] proposed an ensemble model combining basic ML algorithms (RF, DT, XGBoost, and LR) with a meta-heuristic method called Particle Swarm Optimization (PSO). After balancing the classes using SMOTE, the proposed ensemble model improved the overall Accuracy to 99%.
Several recent studies have also focused on hyperparameter tuning techniques to optimize ML models in fraud detection contexts. Researchers have adopted both exact and meta-heuristic methods, each offering distinct advantages. Exact methods, such as Grid Search (GS), as discussed in [31], provide a systematic approach to hyperparameter optimization but can be computationally intensive. In contrast, [32] employed ML techniques, specifically using GA to optimize hyperparameters. Their findings showed that incorporating GA results into the LR model increased Accuracy to 94%.
Moreover, [33] offers a thorough exploration of GA and XGBoost, focusing on hyperparameter optimization to enhance fraud detection systems in smart grid environments. The experimental findings showed a significant boost in model performance, raising Accuracy from 0.82 to 0.978. Similarly, [34] compares the proposed PSO method with GS, demonstrating that PSO can produce superior solutions more rapidly. Incorporating PSO results into a Deep Neural Network (DNN) model led to an Accuracy of 94.93%. Building on these advancements, the recent study by [35] introduces a PSO-XGBoost framework tailored for automobile insurance fraud detection. This framework leverages the optimization capabilities of PSO to fine-tune XGBoost hyperparameters, achieving a notable 95% accuracy. By enhancing model precision and interpretability, this approach provides actionable insights for early fraud prevention, further demonstrating the effectiveness of PSO in optimizing machine learning models for complex fraud detection challenges.
Data preprocessing and class imbalance handling techniques have been critical in developing effective fraud detection methods. Specifically, [36], [37], and [38] emphasized the use of SMOTE, Random Under-Sampling (RUS), and Random Over-Sampling (ROS) to address imbalanced datasets. Building on these efforts, [39] propose methods to address imbalanced datasets and missing values, which are common challenges in real-world insurance fraud detection. Their framework integrates data imputation techniques, such as KNN and multivariate imputation, with ensemble learning methods, including Random Forest, XGBoost, and stacking classifiers. By incorporating advanced resampling techniques like SMOTE and ADASYN, the model achieved superior accuracy and F1-scores, demonstrating its effectiveness in detecting fraudulent claims and further highlighting the

importance of robust preprocessing strategies. Moreover, feature selection methods, such as GA, Firefly Algorithm Optimization (FFA), PSO, ANOVA, and Chi-2 [40], [41], are frequently employed to identify the most relevant features for robust fraud detection. Additionally, some studies have utilized unsupervised learning techniques, including K-means and C-means clustering [42], [43], to enhance detection capability.
Despite these advances, several gaps remain. First, unbalanced datasets continue to pose a significant challenge [44], often leading to skewed performance metrics and overlooked fraudulent instances [45]. Second, a high number of false positives can undermine practitioner trust and inflate investigation costs [46]. Finally, many existing ML models are not rigorously tuned, resulting in suboptimal performance when dealing with complex fraud patterns [47].
To address these challenges, this study systematically analyzes different class ratios to identify the optimal approach for balancing our dataset, thereby mitigating skewed detection outcomes. We also reduce false positives through advanced feature selection, leading to more precise fraud detection. In addition, our proposed methodology adopts an ensemble learning algorithm that employs a weighted voting strategy to improve predictive performance in the insurance fraud detection field. Furthermore, we leverage BQANA to fine-tune the hyperparameters of each base classifier incorporated into the ensemble model, thereby enhancing overall detection accuracy. By tackling unbalanced data, high false-positive rates, and untuned model parameters simultaneously, our research fills a critical gap in the literature and establishes a more robust framework for detecting fraudulent activities in the auto insurance domain.
Finally, this research synthesizes key approaches in Table 1, specifically examining the BQANA technique for hyperparameter optimization in auto insurance fraud detection. Through this review, we aim to advance the field and offer valuable insights that inform future developments.

III. MATERIAL AND METHODS
A. DATASET
The current research utilizes a comprehensive dataset obtained from an insurance company's car claim records. This dataset, available on the Kaggle platform (Link to Dataset), contains a substantial collection of 15,420 insurance claims recorded from January 1994 to December 1996 [49]. For accessibility and reproducibility, details on data availability are provided in the Data and Code Availability section at the end of the paper.
The dataset contains 32 features, in addition to a target feature, as detailed in Table 2. Each sample in the dataset is labeled by a binary target feature, which is essential for classifying the claims into fraudulent or non-fraudulent categories. The dataset is characterized by its variety, containing 25 categorical and 8 numerical features and providing a rich foundation for analysis.
The employed dataset shows a clear imbalance in the distribution of fraudulent cases, which account for only 6% (923 claims) of the total, in contrast to the 94% (14,497 claims) that are non-fraudulent. This imbalance represents a fundamental challenge in fraud detection and is a significant factor to consider when selecting appropriate modeling and evaluation techniques.
In preparation for the analysis, the dataset underwent a partitioning process to facilitate the model's training, testing, and validation phases. Specifically, the data was randomly divided into three subsets: 60% was allocated for training the model, enabling it to learn and adapt to the patterns within the data; 20% was reserved for testing, providing an evaluation of the model's predictive performance on unseen data; and the remaining 20% was allocated for validation purposes, allowing an additional layer of evaluation to fine-tune the model's hyperparameters. This partitioning strategy is essential for the valid evaluation of fraud detection models. All code was implemented in Python 3.6.15. The computations were performed on a system equipped with an Intel Core i3-3220 processor and 4 GB DDR3 RAM.

B. PREPROCESSING
The preprocessing step commenced with a meticulous examination of the initial dataset to detect and eliminate redundant features that could hinder the model's learning efficiency and jeopardize result Accuracy. The features named "Policy Number" and "Age" were removed from the dataset, as "Policy Number" served solely as a distinct identifier for each claim, and "Age" duplicated the information already provided by the "Age of policyholder" feature [4]. Upon a detailed review of the prior studies outlined in Table 3, it was observed that 10 features have no impact on the model's accuracy and efficiency. Consequently, these irrelevant features were methodically eliminated from the dataset to enhance the speed and Accuracy of fraud detection in auto insurance. The final set of features removed from the dataset in our study is shown in the last row of Table 3. Additionally, to meet the algorithmic requirement for numerical input, the categorical features underwent a transformation process in which each category was assigned a unique integer value, streamlining computational processing and easing the analysis of the data.
Another significant issue that researchers face is the challenge of data imbalance [15]. Previous scientific literature has documented various strategies, including both undersampling and oversampling techniques, to address this challenge in the context of fraud detection. Previous research has explored the effectiveness of methods such as SMOTE [58], Adaptive Synthetic

TABLE 1. Studies reviewed in the field of fraud detection.

Sampling Approach (ADASYN) [59], and Tabular Generative Adversarial Networks (TGAN) [60] in decreasing the imbalance within the dataset. These techniques have been explored and evaluated for their ability to effectively handle the disproportionate representation of fraudulent and non-fraudulent cases, which is a fundamental issue in fraud detection.
In alignment with the findings of a previous study [27], we opted for RUS to rectify the balance in our dataset. Our objective was to systematically assess which undersampling ratio produces the most effective results during the training phase and, subsequently, leads to the best performance in prediction. Accordingly, the dataset was divided into training, testing, and validation subsets. Specifically, the entire dataset was split randomly, with 60% of the samples constituting the training set, 20% allocated to testing, and the remaining 20% reserved for validation.
Table 4 illustrates the five different undersampling ratios we applied to the training set. In Ratio A:A (no undersampling), all training samples remained intact, resulting in 8,699 normal and 553 fraudulent cases. By contrast, the 1:1 ratio reduces the number of normal samples to 553, creating a perfectly balanced subset. The 2:1 ratio allows for 1,106 normal and 553 fraudulent samples, while the 4:1 and 8:1 ratios further increase the number of normal samples to 2,212 and 4,424, respectively, against the same 553 fraud cases. Our intent in gradually modifying the class distribution is to identify the ratio that optimizes the detection of fraudulent activities while minimizing misclassification errors. Meanwhile, the testing and validation sets, each comprising 2,899 normal and 185 fraud samples, remain unchanged, ensuring an unbiased evaluation of the trained model.
Another significant aspect of data preprocessing is the identification of important features. Based on previous research, various feature selection techniques have been employed, as documented in Table 5. These include Boruta's algorithm [4] and meta-heuristic methods such as Ant Colony Optimization (ACO), PSO, and GA [57].
The results presented in Table 3 and Table 5 show that certain features, such as "Rep Number", "Deductible", and "Policy Type", have been recognized as significant factors in some research studies while being deemed less significant in others. When these features are recognized as important, the model shows increased Accuracy levels, prompting us to categorize these three features as significant. Consequently, in the current study, we have utilized a set of 23 features after removing 10 less significant features identified through the aforementioned feature selection techniques. This feature engineering process aims to increase the speed and Accuracy of the fraud detection model by focusing on the most relevant features.

C. MODELING
After the data preprocessing steps on the primary dataset are completed, the focus is on developing a fraud detection model. Through a comprehensive review of the existing literature in the field of fraud detection, the researchers

TABLE 2. Description of the dataset.

TABLE 3. Removed features.

TABLE 4. Undersampling training set.

have explored various modeling techniques, including LR, DT, SVM, NB [49], CatBoost, XGBoost, RF [56], KNN, and AdaBoost [27]. Based on insights from the literature, we selected SVM, RF, and XGBoost for this study, as shown in Table 6. To enhance their performance, we optimized the hyperparameters of these classifiers using the BQANA method. Once the optimal hyperparameters were determined, the models were trained using five data imbalance ratios (A:A, 1:1, 2:1, 4:1, and 8:1) as described in Table 4. Subsequently, we constructed an ensemble model in which each of the three base classifiers was assigned a weight reflecting its relative importance. Weights were calculated using 10-fold cross-validation results. Each classifier within the ensemble algorithm was allocated a weight between 0 and 1 based on its performance on the validation dataset, measured by Accuracy, Precision, Recall, and F1-score.
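The weighting step described above can be sketched in plain Python. This is a minimal illustrative sketch, not the authors' code: it assumes that each classifier's four validation metrics are multiplied into a single score, that each score is divided by the combined score of the other classifiers, and that the results are then normalized so the weights of one ensemble sum to one. The metric values below are hypothetical.

```python
def ensemble_weights(metrics):
    """Compute base-classifier weights from validation metrics.

    `metrics` is a list of (accuracy, precision, recall, f1) tuples,
    one tuple per base classifier.
    """
    # Collapse each classifier's four validation metrics into one score.
    scores = [a * p * r * f1 for (a, p, r, f1) in metrics]
    total = sum(scores)
    # Relate each score to the combined score of the *other* classifiers.
    d = [s / (total - s) for s in scores]
    # Normalize so the weights of one ensemble sum to exactly one.
    return [di / sum(d) for di in d]


# Hypothetical validation metrics for the three base classifiers
# (SVM, RF, XGBoost); in the paper these come from 10-fold cross-validation.
val_metrics = [
    (0.97, 0.95, 0.96, 0.955),  # SVM
    (0.99, 0.98, 0.99, 0.985),  # RF
    (0.98, 0.97, 0.98, 0.975),  # XGBoost
]
weights = ensemble_weights(val_metrics)
print([round(w, 3) for w in weights])  # weights sum to 1; the strongest classifier gets the largest weight
```

The resulting weights would then scale each classifier's vote when the ensemble predicts on the test set, for example via the `weights` argument of scikit-learn's `VotingClassifier`.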

TABLE 5. Methods used to select important features. Afterward, we calculated the weight for all 5 ensembles:

Ensembles Weight = {WEclf1 , WEclf2 , WEclf3 , WEclf4 , WEclf5 }

Ultimately, the ensemble algorithm with the maximum


calculated weight was selected for evaluation based on the
criteria of Accuracy, Precision, Recall, and F1-score using
the test dataset. The flowchart of the proposed method is
illustrated in Figure 1.

These weights were then used in a weighted voting technique


to determine the ensemble model’s results on the test set.
Given the five training datasets with varying ratios of
fraudulent to normal samples, we developed five distinct
ensemble models, denoted as Eclf1 , Eclf2 , . . . , Eclf5 . The
base classifiers in the first ensemble algorithm are denoted as FIGURE 1. The flowchart of the proposed method.
clf1 , clf2 . . . , clf3 . For example, Equation (1) and Equation (2)
determines the weight of the first base classifier within the In the field of ML, the definition of hyperparameters is
first ensemble algorithm. The result from Equation (1) is used in Equation (2) to calculate the weights of each classifier within the ensemble model. In each ensemble, the sum of the base classifiers' weights is constrained to equal one. The notations used in Equations (1) and (3) are as follows: Accuracy (A), Precision (P), Recall (R), and F1-Score (F1).

D_clf1 = (A_clf1 × P_clf1 × R_clf1 × F1_clf1) / Σ_{i=2}^{3} (A_clfi × P_clfi × R_clfi × F1_clfi)   (1)

W_clf1 = D_clf1 / Σ_{i=1}^{3} D_clfi   (2)

Then, as below, for all the ensembles, calculate the weights of the classifiers inside them:

E_clf1 - Classifiers Weight = {W_clf1, W_clf2, W_clf3}
E_clf2 - Classifiers Weight = {W_clf4, W_clf5, W_clf6}
E_clf3 - Classifiers Weight = {W_clf7, W_clf8, W_clf9}
E_clf4 - Classifiers Weight = {W_clf10, W_clf11, W_clf12}
E_clf5 - Classifiers Weight = {W_clf13, W_clf14, W_clf15}

The voting strategy used the calculated weights to determine each ensemble algorithm's outcome. For example, the weight of the first ensemble was computed using Equations (3) and (4), with the result from Equation (3) serving as input for Equation (4).

D_Eclf1 = (A_Eclf1 × P_Eclf1 × R_Eclf1 × F1_Eclf1) / Σ_{i=2}^{5} (A_Eclfi × P_Eclfi × R_Eclfi × F1_Eclfi)   (3)

W_Eclf1 = D_Eclf1 / Σ_{i=1}^{5} D_Eclfi   (4)

Hyperparameters are significant for any algorithm prior to starting the training step. The optimization of these hyperparameters plays a significant role in increasing the model's performance on the test dataset. Each algorithm is determined by a unique set of predefined hyperparameters. Therefore, we can describe hyperparameter tuning as an endeavor to determine the optimal values of the hyperparameters for the learning algorithm that yields the best model performance. Previous studies have explored several types of hyperparameter tuning methods, such as RS, GS [32], and meta-heuristic algorithms. Through an examination of prior research in the domain of fraud detection, various methods such as GA, Differential Evolution (DE), Artificial Bee Colony (ABC), Grey Wolf Optimizer (GWO), PSO, Teaching-Learning-Based Optimization (TLBO), and GS [34] have been used. In this study, we use BQANA, which has not been examined in past studies in the field of car insurance fraud detection.

The Quantum-based Avian Navigation optimizer Algorithm (QANA) was first introduced in [61]. A binary version of QANA, named BQANA, was then introduced in [62]; in that study, BQANA was used to identify the superior features of several high-dimensional datasets for classification. QANA was subsequently applied to engineering problems in [63] and [64]. Although QANA is powerful and has been used in versatile fields of engineering, neither this algorithm nor its binary versions have been used for hyperparameter tuning in any published paper. We aim to optimize classifiers including XGBoost, RF, and SVM across the ensemble using the BQANA optimization algorithm.

QANA is a population-based meta-heuristic algorithm that draws inspiration from the navigational patterns of
VOLUME 13, 2025 43003
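The weighting scheme in Equations (1)–(4) can be sketched in Python. This is a minimal illustration, not the authors' implementation: the metric values in the usage below are hypothetical, and `weighted_vote` is one plausible reading of the weighted voting strategy.

```python
def classifier_weights(metrics):
    """Eqs. (1)-(2): derive ensemble weights from metric products.

    metrics: list of (accuracy, precision, recall, f1) tuples,
    one per base classifier in the ensemble.
    """
    prods = [a * p * r * f1 for a, p, r, f1 in metrics]
    # Eq. (1): each classifier's product relative to the others' products
    d = [prods[i] / sum(prods[j] for j in range(len(prods)) if j != i)
         for i in range(len(prods))]
    # Eq. (2): normalize so the base classifiers' weights sum to one
    total = sum(d)
    return [di / total for di in d]


def weighted_vote(weights, predictions, threshold=0.5):
    # Fraud (1) wins when the weighted sum of binary votes crosses 0.5
    score = sum(w * p for w, p in zip(weights, predictions))
    return 1 if score >= threshold else 0
```

With three equally strong classifiers the weights come out equal (1/3 each); a classifier with a larger metric product receives a proportionally larger weight, so its vote counts more in `weighted_vote`.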


A. Gheysarbeigi et al.: Ensemble-Based Auto Insurance Fraud Detection

migratory birds during extended aerial journeys. This algorithm, QANA, is designed with a multi-flock framework and quantum-driven navigation, incorporating two mutation techniques and a qubit-crossover operator to enhance efficient exploration of the search domain. The initial step involves dividing the migratory bird population into multiple flocks in a random manner. Subsequently, the algorithm imitates the flight formation of migratory birds to share acquired information among search agents through the utilization of a V-echelon communication structure. V represents a collection of n individuals within the flock f_q, comprising a header (H) and two subgroups known as right-line (R) and left-line (L) arranged in a V-shaped configuration.

The flocks employ a quantum-based navigation method for search space exploration, incorporating a Success-based Population Distribution (SPD) strategy, two mutation methods known as ''DE/quantum/I'' and ''DE/quantum/II,'' and a qubit-crossover operator. Each flock dynamically switches between these mutation techniques, with f_m representing the flocks utilizing M_m in iteration t (as shown in Equation (5)). The variable τ_ij is set to 1 if M_m enhances a_j of the i-th flock in the set f_m; otherwise, it is assigned a value of 0.

SR_m(t) = [ Σ_{i∈f_m} ( Σ_{j=1}^{n} τ_ij / n ) / |f_m| ] × 100   (5)

The quantum mutation strategies are defined by Equations (6) and (7). Here, x_i(t) represents the current position of search agent a_i in iteration t, x_Vechelon(t) denotes the position of the subsequent search agent after a_i, and x_best(t) indicates the best search agent's location. Random selections from short-term memory (STM) and long-term memory (LTM) are denoted by x_{j∈STM}(t) and x_{j∈LTM}(t), respectively. Equation (8) is utilized to compute the trial vector v_H(t + 1) as the leader in the V-echelon structure, where L and U represent the lower and upper boundaries of the search space. Additionally, S_i denotes the quantum orientation of the bird a_i and incorporates a parameter adaptation mechanism based on a historical record of successful parameters.

v_i(t + 1) = x_best(t) + S_i(t) × (x_Vechelon(t) − x_{j∈LTM}(t)) + S_i(t) × (x_Vechelon(t) − x_best(t)) + S_i(t) × (x_{j∈LTM}(t) − x_{j∈STM}(t))   (6)

v_i(t + 1) = S_i(t) × (x_best(t) − x_Vechelon(t)) + S_i(t) × (x_i(t) − x_{j∈LTM}(t) − x_{j∈STM}(t))   (7)

v_H(t + 1) = S_i(t) × x_best + (L + (U − L) × rand(0, 1))   (8)

To generate the trial vector u_i(t + 1), the mutant vector v_i(t + 1) is combined with its parent x_i(t) using Equation (9), with |ϕ_i⟩_d representing the qubit-crossover probability for the d-th dimension. Each iteration involves the calculation of a qubit-crossover |ϕ_i⟩_d for each dimension of the trial vector u_i(t + 1) through Equation (10), where the parameter |ϕ_R⟩_d is a random integer acting as a coefficient for adjusting the length of the vector |ϕ_i⟩_d within the Bloch sphere [61].

u_id(t + 1) = { x_id(t + 1), if |ϕ_i|_d < rand;  v_id(t + 1), if |ϕ_i|_d ≥ rand }   (9)

|ϕ_i⟩_d = |ϕ_R⟩_d × ( cos(θ/2)|0⟩ + e^{iϕ} sin(θ/2)|1⟩ ),  θ, ϕ = rand × π/2   (10)

As per the findings of a prior study [61], QANA demonstrates superior performance compared to other established optimizers across diverse continuous search space benchmark assessments; it surpasses its rivals in terms of both exploration and exploitation capabilities. Consequently, the foundational components of the conventional QANA are adapted to formulate its binary counterpart. In the binary QANA formulation, the initial solutions are generated at random within the interval [0, 1]. Following this initialization, the iterative procedure is carried out until the predefined termination criterion, typically the maximum number of iterations, is met. Based on the study [62], using the threshold method for binary conversion of continuous solutions yields significantly improved outcomes compared to transfer functions like S-shaped, V-shaped, U-shaped, and Z-shaped.

The performance of the SVM, RF, and XGBoost models within an ensemble model relies on the precise selection of optimal hyperparameters, so in this study our objective is to employ BQANA to select an optimal hyperparameter set for each base classifier inside the ensemble. A list of possible parameters associated with these three models is outlined in Table 6.

The SVM model includes the c, kernel, degree, gamma, shrinking, and tol hyperparameters [34]. In the case of the RF model, fine-tuning involved hyperparameters such as solver type, n_estimators, criterion, max_depth, min_samples_split, min_samples_leaf, max_features, and bootstrap [34]. The tuning of the XGBoost model centered on hyperparameters such as learning_rate, n_estimators, min_weight_fraction_leaf, max_depth, min_impurity_decrease, colsample_bytree, reg_alpha, reg_lambda, and subsample [33]. The range of each hyperparameter was identified through a review of the documentation available on the Scikit-learn ([Link] [Link]) platform as well as related scientific literature.

D. FITNESS EVALUATION
To optimize the hyperparameters of the XGBoost, SVM, and RF classifiers, we employed the BQANA algorithm. This algorithm first generates proposed solutions for the model hyperparameters in the first iteration. Each solution's length is equal to the number of hyperparameters of the classifier. Subsequently, the fitness function calculates the mean of Accuracy, Precision, Recall, and F1-score for each of these solutions and selects the best solutions from the first iteration.
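As an illustration of how a candidate solution vector can be mapped onto concrete hyperparameter values, the sketch below uses a small, assumed search space; the names and ranges here are placeholders, and the actual search space is the one given in Table 6.

```python
# Assumed, illustrative search space; the paper's actual ranges are in Table 6.
SPACE = [
    ("n_estimators",  "int",   50,   500),
    ("max_depth",     "int",    2,    20),
    ("learning_rate", "float",  0.01, 0.3),
]

def decode(position):
    """Map one solution (a vector in [0, 1], one component per
    hyperparameter) to concrete hyperparameter values."""
    params = {}
    for x, (name, kind, lo, hi) in zip(position, SPACE):
        value = lo + x * (hi - lo)          # scale into the allowed range
        params[name] = round(value) if kind == "int" else value
    return params
```

Under this sketch, `decode([0.0, 0.0, 0.0])` yields the lower bound of every range and `decode([1.0, 1.0, 1.0])` the upper bound; the fitness function would then train and score a classifier with the decoded values.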

TABLE 6. XGBoost, SVM, and RF hyperparameters to be optimized by BQANA, SA, and GA.

However, we repeated this process to select the optimal solutions for a total of 10 iterations. We employed four classification metrics to assess the predictive abilities of the solutions generated by BQANA: Accuracy, Precision, Recall, and F1-score. Following a fitness evaluation of the generated solutions, we rank them based on the mean performance across the four metrics.

The accuracy metric, defined as the ratio of the number of correct predictions to the total number of predictions [65], relies on True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN), as outlined below.

Accuracy = (TN + TP) / (TN + FP + FN + TP)   (11)

The precision formula signifies the ratio of positive predictions that were accurately identified.

Precision = TP / (TP + FP)   (12)

The recall formula, provided below, indicates the proportion of actual positive cases that were correctly recognized.

Recall = TP / (TP + FN)   (13)

The balance between Precision and Recall can lead to a situation where a model performs well on one metric but poorly on the other. The F1-score tackles this challenge by taking both metrics into account simultaneously [4]. It is formulated as follows:

F1-Score = 2 / (1/Precision + 1/Sensitivity)   (14)

E. COMPARISON
In this study, we employed the BQANA meta-heuristic algorithm to optimize the hyperparameters of the SVM, RF, and XGBoost models, as well as their ensemble, and then compared the results against two other metaheuristic methods (SA and GA).

In Section III, we explained the mechanism of BQANA for hyperparameter tuning. In comparison, SA begins with a defined initial temperature and a cooling coefficient, leading to a gradual reduction in temperature with each iteration. At each iteration, SA evaluates a new set of randomly selected hyperparameters. If this new set results in a better fitness function value, it is used for subsequent iterations. As the algorithm progresses, the likelihood of accepting suboptimal solutions decreases. The fitness function within SA assesses the efficacy of a solution in relation to the defined problem [66].

GA, on the other hand, starts by creating an initial population of potential solutions, typically represented as binary strings, though other data structures may also be used. An objective function evaluates each individual's performance within the population. Higher-quality solutions are more likely to be selected for reproduction. During the crossover phase, the genetic information of two parent solutions merges to produce new offspring, aiming to create an improved solution that incorporates beneficial features of the parents. The mutation step introduces random changes in the offspring's genetic composition to maintain genetic diversity and prevent the algorithm from becoming too uniform or trapped in local optima [67].

The hyperparameter values used for BQANA, SA, and GA are presented in Table 7. These parameters were selected based on guidelines from previous studies, specifically [61] for BQANA, [66] for SA, and [67] for GA, which detail standard parameter settings and best practices for each metaheuristic method. Similar to BQANA, SA and GA also employ an iterative evaluation of the fitness function to identify an optimal solution. In our experiments, we ran each of these algorithms (BQANA, SA, and GA) ten times, with each run consisting of 100 iterations. The best results obtained from these ten runs for each algorithm are illustrated in Figures 2, 3, and 4. For the comparative evaluation, we used the ''[Link]'' insurance dataset, partitioning it into three subsets for training (60%), testing (20%), and validation (20%). We began by fine-tuning the hyperparameters of the SVM, RF, and XGBoost classifiers using BQANA for each of the specified data ratios (A:A, 1:1, 2:1, 4:1, and 8:1). After determining the optimal hyperparameters, we trained each classifier separately and then combined them into an ensemble learning model, with the goal of further improving fraud detection performance.

We first evaluated the performance of each base classifier and the ensemble model across the five data ratios (A:A, 1:1, 2:1, 4:1, and 8:1) before hyperparameter tuning. Next, we employed the meta-heuristic algorithms BQANA, SA,
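The metric computations in Equations (11)–(14) and the mean-based fitness used to rank candidate hyperparameter sets can be sketched as follows; the function name and the confusion-matrix inputs are illustrative.

```python
def fitness(tp, tn, fp, fn):
    """Mean of the four metrics in Eqs. (11)-(14), used to rank solutions."""
    accuracy = (tn + tp) / (tn + fp + fn + tp)   # Eq. (11)
    precision = tp / (tp + fp)                   # Eq. (12)
    recall = tp / (tp + fn)                      # Eq. (13)
    f1 = 2 / (1 / precision + 1 / recall)        # Eq. (14), harmonic mean
    return (accuracy + precision + recall + f1) / 4
```

A perfect confusion matrix gives a fitness of 1.0; any false positives or false negatives pull the mean down, so solutions are ranked by how well they balance all four metrics at once.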
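The SA acceptance rule and the GA operators described in the comparison above can be written down compactly. These are generic textbook sketches, not the implementations used in the study:

```python
import math
import random

def sa_accept(delta, temperature):
    """Metropolis rule: always accept improvements; accept a worse
    solution with probability exp(-delta / T), which shrinks as T cools."""
    if delta <= 0:                      # delta = new_cost - old_cost
        return True
    return random.random() < math.exp(-delta / temperature)

def ga_crossover(p1, p2):
    """One-point crossover over two equal-length binary strings."""
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def ga_mutate(bits, rate=0.01):
    """Flip each bit independently with a small probability."""
    return [b ^ 1 if random.random() < rate else b for b in bits]
```

As the temperature falls, `sa_accept` converges toward greedy hill-climbing, while the GA maintains diversity through `ga_mutate` so the population does not collapse into a local optimum.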

and GA to optimize the hyperparameters of these classifiers, and re-evaluated their performance using the same data ratios.

TABLE 7. Hyperparameters and values used in the three metaheuristic optimization algorithms BQANA, GA, and SA.

TABLE 8. Hyperparameter values for the SVM model tuned using BQANA, GA, and SA across different training set ratios.

IV. RESULTS AND DISCUSSION
In this section, we present the outcomes of applying three metaheuristic algorithms (BQANA, SA, and GA) to fine-tune the hyperparameters of the SVM, RF, and XGBoost models across five dataset ratios (A:A, 1:1, 2:1, 4:1, and 8:1). After determining the optimal hyperparameters for each model-dataset combination, we leveraged the tuned SVM, RF, and XGBoost classifiers as base learners in an ensemble approach, aiming to enhance overall performance. Tables 8, 9, and 10 summarize the best hyperparameter values identified for SVM, RF, and XGBoost, respectively, categorized by the metaheuristic optimization algorithm (BQANA, SA, and GA) and the training set ratio used during the tuning process.

The effectiveness and performance of the metaheuristic algorithms in tuning the hyperparameters of the ML models are depicted in Figures 2, 3, and 4. The iterative process of the BQANA algorithm, as shown in Figure 2, begins by evaluating the fitness value of the initial solution and refining this solution over 100 iterations to identify the optimal hyperparameter configuration. Similarly, Figures 3 and 4 illustrate the fitness values at each iteration for the SA and GA algorithms, respectively. These figures provide a clear visualization of how each algorithm converges toward improved solutions over the course of 100 iterations, demonstrating their effectiveness in optimizing ML model performance through hyperparameter tuning.

FIGURE 2. BQANA hyperparameter selection optimization plot.

We first assessed the performance of each base classifier and the ensemble model across the five data ratios prior to hyperparameter tuning, as shown in Table 11. Subsequently, we applied the metaheuristic algorithms BQANA, SA, and GA to optimize the hyperparameters of these classifiers. Following this tuning process, we re-evaluated their performance using the same dataset ratios. The comparative results, highlighting the impact of hyperparameter optimization on each classifier and the ensemble model, are presented in Table 12.

TABLE 9. Hyperparameter values for the RF model tuned using BQANA, GA, and SA across different training set ratios.

TABLE 10. Hyperparameter values for the XGBoost model tuned using BQANA, GA, and SA across different training set ratios.

FIGURE 3. SA hyperparameter selection optimization plot.

Tables 11 and 12 summarize the evaluation metrics, including Accuracy, Precision, Recall, and F1-score, before and after hyperparameter tuning. These results demonstrate the effectiveness of the metaheuristic algorithms employed in this study (BQANA, SA, and GA) in optimizing the hyperparameters of the ML models. The improvement in performance across the classifiers and the ensemble model highlights the significance of these methods in accurately distinguishing between fraudulent and legitimate insurance claims.


FIGURE 4. GA hyperparameter selection optimization plot.

TABLE 11. Evaluation metrics of ML models before hyperparameter tuning.

In terms of Accuracy, the ensemble method with BQANA achieved the best performance at a 1:1 ratio, recording an Accuracy of 99.94%. This surpasses the Accuracy of 99.87% and 99.84% achieved by SA and GA, respectively, emphasizing the superior optimization capability of BQANA. In contrast, when hyperparameter tuning was not applied, the ensemble model with a 1:1 ratio achieved an Accuracy of 99.77%, which is lower than the results obtained with BQANA. This comparison highlights the added value of hyperparameter optimization.

For Recall, the ensemble model using BQANA at a 1:1 ratio achieved a perfect score of 100%, while GA and SA produced Recall scores of 99.46% and 98.92%, respectively. This result further underscores BQANA's ability to correctly identify fraudulent claims while minimizing false negatives. Without hyperparameter optimization, the Recall at a 1:1 ratio was also 99.46%, but the results with BQANA demonstrated an enhanced capacity for identifying suspicious claims with fewer false positives.

The Precision and F1-score of the ensemble method using BQANA at a 1:1 ratio were also noteworthy, achieving scores of 98.93% and 99.46%, respectively. These metrics consistently outperformed those obtained by SA and GA across all ratios. In comparison, the best Precision and F1-score achieved without hyperparameter tuning were 96.84% and 98.13%, which are lower than the values achieved with BQANA.

Figure 5 visually illustrates the superior performance of the BQANA algorithm across evaluation metrics, particularly for the ensemble model at the 1:1 ratio. Figures 6 and 7 depict the corresponding results for the SA and GA algorithms, respectively, confirming that BQANA consistently outperforms both methods across all metrics and ratios. These findings establish BQANA as a highly effective metaheuristic algorithm for optimizing hyperparameters in ML models for fraud detection, leading to substantial improvements in overall performance.

FIGURE 5. BQANA results compare plot.

FIGURE 6. SA results compare plot.

We also evaluated the computational efficiency of the BQANA algorithm compared to SA and GA. The runtime required for hyperparameter tuning across the different machine learning models (SVM, RF, XGBoost) and training set ratios (A:A, 1:1, 2:1, 4:1, and 8:1) is detailed in Table 13. The results show that BQANA, while achieving superior performance metrics, requires slightly more computation time than SA and GA due to its quantum-inspired complexity. For instance, when applied to the SVM model with a 1:1 ratio, the runtime for BQANA was approximately 512 seconds, compared to 398 seconds for GA and 312 seconds for SA.
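Runtime figures like those in Table 13 can be collected with simple wall-clock timing; the helper below is illustrative and not the instrumentation used in the paper.

```python
import time

def timed(fn, *args, **kwargs):
    """Run one tuning job and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```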


TABLE 12. Evaluation metrics of ML models after hyperparameter tuning.

TABLE 13. Runtime comparison for hyperparameter tuning across machine learning models and train set ratios.

FIGURE 7. GA results compare plot.

This difference is consistent across other models and ratios, as detailed in Table 13, reflecting the trade-off between computational efficiency and enhanced model performance. The results clearly underscore the importance of selecting optimal hyperparameters over relying on default settings or traditional approaches such as Grid Search, Random Search, or manual tuning. Hyperparameter optimization through metaheuristic methods not only reduces the manual effort involved in deploying machine learning models but also enhances their predictive accuracy and robustness. Furthermore, this approach supports the development of advanced methodologies for solving complex real-world problems, such as insurance fraud detection.

To further validate the effectiveness of the proposed method, we compared the best-performing model obtained in this study (an ensemble model with hyperparameters tuned using BQANA at a 1:1 training set ratio) with models developed in other state-of-the-art studies. The comparison, presented in Table 14, highlights the superiority of our proposed method in classifying insurance claims accurately. Our methodology leverages extensive simulations involving various machine learning models (SVM, RF, XGBoost) and hyperparameter tuning algorithms (BQANA, SA, and GA) to identify the optimal model configuration.

The comparative studies involve diverse methodologies, including PSO-based hyperparameter optimization for XGBoost [35], stacking models with oversampling techniques to address class imbalance [39], ensemble CNN models for feature extraction and classification [68], hybrid data augmentation approaches for resampling [27], fuzzy clustering techniques integrated with advanced classifiers [56],


TABLE 14. Comparison of evaluation metrics with related studies.

and neural networks optimized for imbalanced datasets [23]. Despite their effectiveness in specific scenarios, the results show that our ensemble model with BQANA tuning outperforms these approaches across multiple evaluation metrics (Accuracy, Precision, Recall, and F1-score), demonstrating its robustness and adaptability to the complex challenge of insurance fraud detection.

The advantages of using metaheuristic algorithms, particularly BQANA, extend beyond superior performance metrics to their potential real-world applicability. For instance, the ensemble model tuned with BQANA achieved an Accuracy of 99.94% and a perfect Recall of 100% at the 1:1 ratio, outperforming SA and GA by notable margins. The balanced dataset provided by the 1:1 ratio amplifies the effectiveness of hyperparameter optimization, enabling the model to accurately detect fraudulent claims while minimizing false positives. These results, supported visually by Figures 5, 6, and 7, highlight the consistent superiority of the BQANA-tuned ensemble method compared to other approaches. Furthermore, the computational efficiency of BQANA, as shown in Table 13, demonstrates its feasibility for practical deployment. While BQANA incurs slightly higher computational costs compared to SA and GA, this is justified by its substantial performance gains, making it a viable solution for real-world scenarios where accuracy is paramount.

The robustness of the proposed framework is particularly relevant in addressing challenges commonly faced in real-world settings, such as noisy or incomplete datasets. The adaptability of metaheuristic algorithms, demonstrated through evaluations across multiple training set ratios, ensures that the models maintain robust performance even in imbalanced or imperfect data conditions. Moreover, the automated nature of hyperparameter tuning minimizes the reliance on manual intervention, making the framework scalable for integration into existing insurance fraud detection workflows.

The practical implications of this study are significant, especially for insurance companies. By accurately identifying fraudulent claims, the proposed framework has the potential to reduce financial losses and administrative burdens. However, challenges such as handling real-world data inconsistencies and ensuring seamless integration with existing IT infrastructures must be addressed. For instance, incorporating preprocessing steps to handle noisy or missing data and optimizing runtime for larger datasets will be crucial for operational deployment. Additionally, the ensemble approach requires compatibility with existing fraud detection systems, which may necessitate customization or incremental integration strategies. Despite these challenges, the demonstrated effectiveness of the BQANA-tuned ensemble model underscores its potential as a transformative tool for modern insurance fraud detection systems.

V. CONCLUSION
Insurance fraud remains a critical challenge, necessitating sophisticated detection systems. This study introduced a robust ensemble learning framework with hyperparameter optimization using the BQANA algorithm, outperforming GA and SA in enhancing model performance. Using the imbalanced Carclaims dataset, the ensemble model tuned with BQANA and a 1:1 ratio achieved 99.94% accuracy, demonstrating superior results across all evaluation metrics. These findings underscore the significance of addressing data imbalances, employing metaheuristic-based hyperparameter tuning, and leveraging ensemble techniques to improve fraud detection systems.

Despite its promising outcomes, this study is limited to a single dataset, which may affect the generalizability of the findings. Additionally, while BQANA achieved the best predictive performance, it incurred slightly higher computational costs compared to GA and SA. Future work should focus on validating the proposed framework on diverse datasets, enhancing computational efficiency, and integrating deep learning techniques to further advance fraud detection capabilities.

DATA AND CODE AVAILABILITY
The dataset and code used in this study are publicly available in the Figshare repository under the DOI: 10.6084/[Link].28207571.

REFERENCES
[1] E. W. T. Ngai, Y. Hu, Y. H. Wong, Y. Chen, and X. Sun, ''The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature,'' Decis. Support Syst., vol. 50, no. 3, pp. 559–569, Feb. 2011.
[2] L. Maiano, A. Montuschi, M. Caserio, E. Ferri, F. Kieffer, C. Germanò, L. Baiocco, L. R. Celsi, I. Amerini, and A. Anagnostopoulos, ''A deep-learning-based antifraud system for car-insurance claims,'' Expert Syst. Appl., vol. 231, Nov. 2023, Art. no. 120644.
[3] A. Singhal, N. Singhal, Divya, and K. Sharma, ''Machine learning methods for detecting car insurance fraud: Comparative analysis,'' in Proc. 3rd Int. Conf. Intell. Technol. (CONIT), Jun. 2023, pp. 1–5.
[4] F. Aslam, A. I. Hunjra, Z. Ftiti, W. Louhichi, and T. Shams, ''Insurance fraud detection: Evidence from artificial intelligence and machine learning,'' Res. Int. Bus. Finance, vol. 62, Dec. 2022, Art. no. 101744.


[5] F. Gao, J. W. Lien, and J. Zheng, ''Exaggerating to break-even: Reference-dependent moral hazard in automobile insurance claims,'' Available SSRN, vol. 2021, pp. 1–51, Feb. 2021.
[6] A. Ahmed, A. F. M. Sadullah, and A. S. Yahya, ''Errors in accident data, its types, causes and methods of rectification-analysis of the literature,'' Accident Anal. Prevention, vol. 130, pp. 3–21, Sep. 2019.
[7] H. Lando, ''Optimal rules of negligent misrepresentation in insurance contract law,'' Int. Rev. Law Econ., vol. 46, pp. 70–77, Jun. 2016.
[8] A. M. Macedo, C. V. Cardoso, J. S. M. Neto, and C. A. da Costa Brás da Cunha, ''Car insurance fraud: The role of vehicle repair workshops,'' Int. J. Law, Crime Justice, vol. 65, Jun. 2021, Art. no. 100456.
[9] J. Jung and B.-J. Kim, ''Insurance fraud in Korea, its seriousness, and policy implications,'' Frontiers Public Health, vol. 9, Nov. 2021, Art. no. 791820.
[10] W. Hilal, S. A. Gadsden, and J. Yawney, ''Financial fraud: A review of anomaly detection techniques and recent advances,'' Expert Syst. Appl., vol. 193, May 2022, Art. no. 116429.
[11] T. Badriyah, L. Rahmaniah, and I. Syarif, ''Nearest neighbour and statistics method based for detecting fraud in auto insurance,'' in Proc. Int. Conf. Appl. Eng. (ICAE), Oct. 2018, pp. 1–5.
[12] L. Rukhsar, W. H. Bangyal, K. Nisar, and S. Nisar, ''Prediction of insurance fraud detection using machine learning algorithms,'' Mehran Univ. Res. J. Eng. & Technol., vol. 41, no. 1, pp. 33–40, Jan. 2022.
[13] C. Gomes, Z. Jin, and H. Yang, ''Insurance fraud detection with unsupervised deep learning,'' J. Risk Insurance, vol. 88, no. 3, pp. 591–624, Sep. 2021.
[14] S. Mirjalili, ''Evolutionary algorithms and neural networks,'' in Evolutionary Algorithms and Neural Networks: Theory and Applications (Studies in Computational Intelligence), vol. 780. Cham, Switzerland: Springer, 2019, pp. 43–53.
[15] M. Rakhshaninejad, M. Fathian, B. Amiri, and N. Yazdanjue, ''An ensemble-based credit card fraud detection algorithm using an efficient voting strategy,'' Comput. J., vol. 65, no. 8, pp. 1998–2015, Aug. 2022.
[16] I. M. Nur Prasasti, A. Dhini, and E. Laoh, ''Automobile insurance fraud detection using supervised classifiers,'' in Proc. Int. Workshop Big Data Inf. Secur. (IWBIS), Oct. 2020, pp. 47–52.
[17] D. Trisanto, N. Rismawati, M. Mulya, and F. Kurniadi, ''Effectiveness undersampling method and feature reduction in credit card fraud detection,'' Int. J. Intell. Eng. Syst., vol. 13, no. 2, pp. 173–181, Apr. 2020.
[18] B. Itri, Y. Mohamed, B. Omar, and Q. Mohamed, ''Empirical oversampling threshold strategy for machine learning performance optimisation in insurance fraud detection,'' Int. J. Adv. Comput. Sci. Appl., vol. 11, no. 10, pp. 1–6, 2020.
[19] X. Liang, Y. Gao, and S. Xu, ''ASE: Anomaly scoring based ensemble learning for highly imbalanced datasets,'' Expert Syst. Appl., vol. 238, Mar. 2024, Art. no. 122049.
[20] S. Dalal, B. Seth, M. Radulescu, C. Secara, and C. Tolea, ''Predicting fraud in financial payment services through optimized hyper-parameter-tuned XGBoost model,'' Mathematics, vol. 10, no. 24, p. 4679, Dec. 2022.
[21] A. Y. Kusdiyanto and Y. Pristyanto, ''Machine learning models for classifying imbalanced class datasets using ensemble learning,'' in Proc. 5th Int. Seminar Res. Inf. Technol. Intell. Syst. (ISRITI), Dec. 2022, pp. 648–653.
[22] A. A. Feitosa-Neto, J. C. Xavier-Júnior, A. M. P. Canuto, and A. C. M. Oliveira, ''A study of model and hyper-parameter selection strategies for classifier ensembles: A robust analysis on different optimization algorithms and extended results,'' Natural Comput., vol. 20, no. 4, pp. 805–819, Dec. 2021.
[23] M. A. Caruana and L. Grech, ''Automobile insurance fraud detection,'' Commun. Statist., Case Stud., Data Anal. Appl., vol. 7, no. 4, pp. 520–535, Oct. 2021.
[24] A. Ali, S. Abd Razak, S. H. Othman, T. A. E. Eisa, A. Al-Dhaqm, M. Nasser, T. Elhassan, H. Elshafie, and A. Saif, ''Financial fraud detection based on machine learning: A systematic literature review,'' Appl. Sci., vol. 12, no. 19, p. 9637, Sep. 2022.
[25] S. Subudhi and S. Panigrahi, ''Use of optimized fuzzy C-means clustering and supervised classifiers for automobile insurance fraud detection,'' J. King Saud Univ.-Comput. Inf. Sci., vol. 32, no. 5, pp. 568–575, Jun. 2020.
[26] M. K. Severino and Y. Peng, ''Machine learning algorithms for fraud prediction in property insurance: Empirical evidence using real-world microdata,'' Mach. Learn. Appl., vol. 5, Sep. 2021, Art. no. 100074.
[27] Z. S. Rubaidi, B. Ben Ammar, and M. Ben Aouicha, ''Vehicle insurance
[28] H. I. Okagbue and O. Oyewole, ''Prediction of automobile insurance fraud claims using machine learning,'' Sci. Temper, vol. 14, no. 3, pp. 756–762, Sep. 2023.
[29] E. Nabrawi and A. Alanazi, ''Fraud detection in healthcare insurance claims using machine learning,'' Risks, vol. 11, no. 9, p. 160, Sep. 2023.
[30] B. P. Verma, V. Verma, and A. Badholia, ''Hyper-tuned ensemble machine learning model for credit card fraud detection,'' in Proc. Int. Conf. Inventive Comput. Technol. (ICICT), Jul. 2022, pp. 320–327.
[31] O. R. Sanchez, M. Repetto, A. Carrega, and R. Bolla, ''Evaluating ML-based DDoS detection with grid search hyperparameter optimization,'' in Proc. IEEE 7th Int. Conf. Netw. Softwarization (NetSoft), Jun. 2021, pp. 402–408.
[32] M. Tayebi and S. E. Kafhali, ''Hyperparameter optimization using genetic algorithms to detect frauds transactions,'' in Proc. Int. Conf. Artif. Intell. Comput. Vis. Springer, Jan. 2021, pp. 288–297.
[33] A. Mehdary, A. Chehri, A. Jakimi, and R. Saadane, ''Hyperparameter optimization with genetic algorithms and XGBoost: A step forward in smart grid fraud detection,'' Sensors, vol. 24, no. 4, p. 1230, Feb. 2024.
[34] M. Tayebi and S. El Kafhali, ''Performance analysis of metaheuristics based hyperparameters optimization for fraud transactions detection,'' Evol. Intell., vol. 17, no. 2, pp. 921–939, Apr. 2024.
[35] N. Ding, X. Ruan, H. Wang, and Y. Liu, ''Automobile insurance fraud detection based on PSO-XGBoost model and interpretable machine learning method,'' Insurance, Math. Econ., vol. 120, pp. 51–60, Jan. 2025.
[36] P. Mrozek, J. Panneerselvam, and O. Bagdasar, ''Efficient resampling for fraud detection during anonymised credit card transactions with unbalanced datasets,'' in Proc. IEEE/ACM 13th Int. Conf. Utility Cloud Comput. (UCC), Dec. 2020, pp. 426–433.
[37] C. G. Tekkali and K. Natarajan, ''Smart fraud detection in e-transactions using synthetic minority oversampling and binary Harris hawks optimization,'' in Proc. Comput., Mater. & Continua, Jan. 2023, vol. 75, no. 2, pp. 3171–3187.
[38] M. Hanafy and R. Ming, ''Using machine learning models to compare various resampling methods in predicting insurance fraud,'' J. Theor. Appl. Inf. Technol., vol. 99, no. 12, pp. 2819–2833, 2021.
[39] A. A. Khalil, Z. Liu, A. Fathalla, A. Ali, and A. Salah, ''Machine learning based method for insurance fraud detection on class imbalance datasets with missing values,'' IEEE Access, vol. 12, pp. 155451–155468, 2024.
[40] X. Li, ''Identifying the optimal machine learning model for predicting car insurance claims: A comparative study utilising advanced techniques,'' Academic J. Bus. Manag., vol. 5, no. 3, pp. 112–120, 2023.
[41] E. Ileberi, Y. Sun, and Z. Wang, ''A machine learning based credit card fraud detection using the GA algorithm for feature selection,'' J. Big Data, vol. 9, no. 1, p. 24, Dec. 2022.
[42] A. Ghorbani and S. Farzai, ''Fraud detection in automobile insurance using a data mining based approach,'' Int. J. Mechatronics, Elektrical Comput. Technol., vol. 8, no. 27, pp. 3764–3771, 2018.
[43] H. Ahmad, B. Kasasbeh, B. Al-Dabaybah, and E. Rawashdeh, ''EFN-SMOTE: An effective oversampling technique for credit card fraud detection by utilizing noise filtering and fuzzy c-means clustering,'' Int. J. Data Netw. Sci., vol. 7, no. 3, pp. 1025–1032, 2023.
[44] H. Kaur, H. S. Pannu, and A. K. Malhi, ''A systematic review on imbalanced data challenges in machine learning: Applications and solutions,'' ACM Comput. Surv., vol. 52, no. 4, pp. 1–36, Jul. 2020.
[45] S. Das, S. Datta, and B. B. Chaudhuri, ''Handling data irregularities in classification: Foundations, trends, and future challenges,'' Pattern Recognit., vol. 81, pp. 674–693, Sep. 2018.
[46] F. G. Rebitschek, G. Gigerenzer, and G. G. Wagner, ''People underestimate the errors made by algorithms for credit scoring and recidivism prediction but accept even fewer errors,'' Sci. Rep., vol. 11, no. 1, p. 20171, Oct. 2021.
[47] M. E. Lokanan and V. Maddhesia, ''Supply chain fraud prediction with machine learning and artificial intelligence,'' Int. J. Prod. Res., vol. 63, no. 1, pp. 286–313, Jan. 2025.
[48] S. Subudhi and S. Panigrahi, ''Effect of class imbalanceness in detecting automobile insurance fraud,'' in Proc. 2nd Int. Conf. Data Sci. Bus. Anal. (ICDSBA), Sep. 2018, pp. 528–531.
[49] B. Itri, Y. Mohamed, Q. Mohammed, and B. Omar, ''Performance comparative study of machine learning algorithms for automobile insurance fraud detection,'' in Proc. 3rd Int. Conf. Intell. Comput. Data Sci. (ICDS), Oct. 2019, pp. 1–4.
[50] S. Padhi and S. Panigrahi, ''Decision templates based ensemble classifiers
fraud detection based on hybrid approach for data augmentation,’’ J. Inf. for automobile insurance fraud detection,’’ in Proc. Global Conf.
Assurance & Secur., vol. 18, no. 5, pp. 135–146, 2023. Advancement Technol. (GCAT), Oct. 2019, pp. 1–5.

VOLUME 13, 2025 43011


A. Gheysarbeigi et al.: Ensemble-Based Auto Insurance Fraud Detection

[51] S. Harjai, S. K. Khatri, and G. Singh, ‘‘Detecting fraudulent insurance claims using random forests and synthetic minority oversampling technique,’’ in Proc. 4th Int. Conf. Inf. Syst. Comput. Netw. (ISCON), Nov. 2019, pp. 123–128.
[52] N. S. Patil, S. Kamanavalli, S. Hiregoudar, S. Jadhav, S. Kanakraddi, and N. D. Hiremath, ‘‘Vehicle insurance fraud detection system using robotic process automation and machine learning,’’ in Proc. Int. Conf. Intell. Technol. (CONIT), Jun. 2021, pp. 1–5.
[53] S.-Z. Shareh Nordin, Y. B. Wah, N. K. Haur, A. Hashim, N. Rambeli, and N. A. Jalil, ‘‘Predicting automobile insurance fraud using classical and machine learning models,’’ Int. J. Electr. Comput. Eng., vol. 14, no. 1, p. 911, Feb. 2024.
[54] M. Zhu, Y. Zhang, Y. Gong, C. Xu, and Y. Xiang, ‘‘Enhancing credit card fraud detection: A neural network and SMOTE integrated approach,’’ 2024, arXiv:2405.00026.
[55] M. Abdul Salam, K. M. Fouad, D. L. Elbably, and S. M. Elsayed, ‘‘Federated learning model for credit card fraud detection with data balancing techniques,’’ Neural Comput. Appl., vol. 36, no. 11, pp. 6231–6256, Apr. 2024.
[56] S. K. Majhi, ‘‘Fuzzy clustering algorithm based on modified whale optimization algorithm for automobile insurance fraud detection,’’ Evol. Intell., vol. 14, no. 1, pp. 35–46, Mar. 2021.
[57] S. Subudhi and S. Panigrahi, ‘‘Detection of automobile insurance fraud using feature selection and data mining techniques,’’ Int. J. Rough Sets Data Anal., vol. 5, no. 3, pp. 1–20, Jul. 2018.
[58] E. Ileberi, Y. Sun, and Z. Wang, ‘‘Performance evaluation of machine learning methods for credit card fraud detection using SMOTE and AdaBoost,’’ IEEE Access, vol. 9, pp. 165286–165294, 2021.
[59] M. Zakariah, S. A. AlQahtani, and M. S. Al-Rakhami, ‘‘Machine learning-based adaptive synthetic sampling technique for intrusion detection,’’ Appl. Sci., vol. 13, no. 11, p. 6504, May 2023.
[60] X. Zhao and S. Guan, ‘‘CTCN: A novel credit card fraud detection method based on conditional tabular generative adversarial networks and temporal convolutional network,’’ PeerJ Comput. Sci., vol. 9, Oct. 2023, Art. no. e1634.
[61] H. Zamani, M. H. Nadimi-Shahraki, and A. H. Gandomi, ‘‘QANA: Quantum-based avian navigation optimizer algorithm,’’ Eng. Appl. Artif. Intell., vol. 104, Sep. 2021, Art. no. 104314.
[62] M. H. Nadimi-Shahraki, A. Fatahi, H. Zamani, and S. Mirjalili, ‘‘Binary approaches of quantum-based avian navigation optimizer to select effective features from high-dimensional medical data,’’ Mathematics, vol. 10, no. 15, p. 2770, Aug. 2022.
[63] M. H. Nadimi-Shahraki, ‘‘An effective hybridization of quantum-based avian navigation and bonobo optimizers to solve numerical and mechanical engineering problems,’’ J. Bionic Eng., vol. 20, no. 3, pp. 1361–1385, May 2023.
[64] R. R. Mostafa, O. Kisi, R. M. Adnan, T. Sadeghifar, and A. Kuriqi, ‘‘Modeling potential evapotranspiration by improved machine learning methods using limited climatic data,’’ Water, vol. 15, no. 3, p. 486, Jan. 2023.
[65] A. Singh, A. Jain, and S. E. Biable, ‘‘Financial fraud detection approach based on firefly optimization algorithm and support vector machine,’’ Appl. Comput. Intell. Soft Comput., vol. 2022, pp. 1–10, Jun. 2022.
[66] I.-S. Oh, J.-S. Lee, and B.-R. Moon, ‘‘Hybrid genetic algorithms for feature selection,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 11, pp. 1424–1437, Nov. 2004.
[67] A. Kuznetsov, M. Karpinski, R. Ziubina, S. Kandiy, E. Frontoni, O. Peliukh, O. Veselska, and R. Kozak, ‘‘Generation of nonlinear substitutions by simulated annealing algorithm,’’ Information, vol. 14, no. 5, p. 259, Apr. 2023.
[68] Y. Abakarim, M. Lahby, and A. Attioui, ‘‘A bagged ensemble convolutional neural networks approach to recognize insurance claim frauds,’’ Appl. Syst. Innov., vol. 6, no. 1, p. 20, Jan. 2023.

AFSANEH GHEYSARBEIGI received the M.S. degree in information technology engineering, majoring in e-commerce (machine learning in car insurance fraud detection), from Iran University of Science and Technology, Iran, in 2024. Her research interests include decision support systems, industrial engineering, artificial intelligence, machine learning, and deep learning.

MORTEZA RAKHSHANINEJAD received the M.S. degree in information technology engineering, majoring in e-commerce (machine learning in fraud detection systems), from Iran University of Science and Technology, Iran, in 2019. His research interests include applied machine learning, information systems, computational biology, bioinformatics, and big data.

MOHAMMAD FATHIAN received the M.S. and Ph.D. degrees in industrial engineering from Iran University of Science and Technology, Tehran. He is currently a Professor with the School of Industrial Engineering, Iran University of Science and Technology. He works in the areas of information technology and industrial engineering. He has more than 90 journal articles and five books in the areas of industrial engineering and information technology.

FARNAZ BARZINPOUR received the Ph.D. degree from Tarbiat Modares University, in 2004. She is currently an Associate Professor of industrial engineering with Iran University of Science and Technology. She teaches operations research, meta-heuristics algorithms, principles of logistics and supply chain engineering, research methodology, and integer programming. Her research interests include optimization and meta-heuristic algorithms, health systems engineering, and supply chain management.

