Transparency and Privacy: The Role of Explainable AI and Federated Learning in Financial Fraud Detection
Tomisin Awosika, Raj Mani Shukla, and Bernardi Pranggono
School of Computing and Information Science, Anglia Ruskin University, UK
arXiv:2312.13334v1 [cs.LG] 20 Dec 2023

Abstract: Fraudulent transactions and how to detect them remain a significant problem for financial institutions around the world. The need for advanced fraud detection systems to safeguard assets and maintain customer trust is paramount for financial institutions, but several factors make the development of effective and efficient fraud detection systems a challenge. One such factor is that fraudulent transactions are rare and many transaction datasets are imbalanced; that is, there are significantly fewer samples of fraudulent transactions than legitimate ones. This data imbalance can affect the performance or reliability of the fraud detection model. Moreover, due to the data privacy laws that all financial institutions are subject to, sharing customer data to facilitate a higher-performing centralized model is impossible. Furthermore, the fraud detection technique should be transparent so that it does not affect the user experience. Hence, this research introduces a novel approach using Federated Learning (FL) and Explainable AI (XAI) to address these challenges. FL enables financial institutions to collaboratively train a model to detect fraudulent transactions without directly sharing customer data, thereby preserving data privacy and confidentiality. Meanwhile, the integration of XAI ensures that the predictions made by the model can be understood and interpreted by human experts, adding a layer of transparency and trust to the system. Experimental results, based on realistic transaction datasets, reveal that the FL-based fraud detection system consistently demonstrates high performance metrics. This study grounds FL's potential as an effective and privacy-preserving tool in the fight against fraud.

Index Terms: Fraud Detection, Explainable AI, Federated Learning, Machine Learning

I. INTRODUCTION

In the era of digital banking, ensuring the security and integrity of financial activities has become paramount. While this transformation offers unparalleled convenience, it also exposes users to the vulnerabilities of cyber threats, a significant one being bank account fraud. Financial frauds, particularly in online banking and credit card transactions, pose serious threats to the global economy, the trustworthiness of financial institutions, and the financial well-being of individuals. According to the UK Finance 2022 report, billions are lost annually due to fraudulent activities, highlighting the need for more robust detection mechanisms [1].

With this, financial institutions have conducted and continue to undertake rigorous research to combat and identify fraud, irrespective of its nature. Nevertheless, fraud remains complicated due to its ever-evolving tactics and diverse behaviors. A prevalent domain that is the subject of extensive research is bank-related fraud [2], which encompasses the bank account fraud on which this research is based.

Bank account fraud differs from other financial deceptions in its methods, impacts, and detection challenges. Unlike credit card fraud, where unauthorized transactions can be quickly detected due to unusual spending patterns, bank account fraud can manifest itself in subtler ways, such as unauthorized funds transfers, account takeovers, or even identity theft leading to the creation of new accounts [3]. The consequences for the victim can be long-lasting, both financially and emotionally. Understanding and mitigating these threats requires thorough research, underpinned by rich and diverse datasets.

In the quest to develop systems that can detect bank account fraud, Machine Learning (ML) is frequently adopted because it effectively trains systems to deliver precise predictions based on data inputs. The choice of a specific machine learning algorithm is contingent upon the nature of the data and the specific type of fraud that the system is trying to identify. Data sets of bank account transactions not only hold confidential data but also display an imbalance, with fraudulent transactions less frequent than legitimate ones. Such characteristics present obstacles in devising a robust fraud detection system.

Banks employ their proprietary data to train different ML models to recognize potentially fraudulent activities, reflecting a centralized ML methodology. This centralized approach is predominant in the financial sector today, credited largely to its proficiency in processing vast data volumes and discerning underlying patterns [4]. However, one challenge with the centralized model is that different banks often face diverse fraudulent patterns, which could hinder their ability to spot new fraudulent behaviors. This is where Federated Learning comes into the picture.

Federated Learning (FL) is a novel privacy-preserving approach to decentralized machine learning [5], [6]. It presents a potential solution to this predicament by enabling model training on local devices and only sharing aggregated updates. The core difference between the centralized method and FL lies in collaboration and data protection. While the former remains confined within the walls of one bank, the latter is a collective effort that spans multiple banks.

In the context of fraud detection, FL stands out as more than just a novel technological approach; it represents a collaborative and confidential countermeasure against fraudulent schemes. This growing significance arises
from its capacity to combine insights from different institutions without the need for direct data exchange. Moreover, FL focuses on sharing model updates rather than heavy data, which is both faster and more efficient and ensures that customer data is not compromised.

Additionally, AI-based fraud detection techniques are black-box in nature and not transparent. For critical applications, like bank fraud detection, it is imperative that the AI system is accurate as well as trustworthy. To address the problem, we integrate Explainable AI (XAI) methods into the FL-based banking fraud detection system. In this regard, the proposed method not only preserves user privacy and provides a collaborative infrastructure to train AI models, but is also trustworthy. Thus, in this research, a fraud detection technique is proposed that uses the combined strengths of FL and XAI. The standout benefits of this methodology are numerous, with a central emphasis on user privacy preservation and transparency. The following is a summary of the contributions of this study.

1) Incorporate an FL approach for an advanced fraud detection system to ensure that individual data remains localized, with only model updates being centralized. This design inherently bolsters the privacy of banking datasets, a crucial advantage in our data-sensitive age.
2) Develop a Deep Neural Network (DNN) model tailored to recognize patterns associated with fraudulent activities across federated databases, ensuring high accuracy.
3) Integrate the XAI technique to provide transparency in the model's decisions, ensuring a novel approach to fraud detection systems.
4) Integrate the proposed FL-based system into a web-based application to visualize the practicability of the proposed approach.

The remaining sections of this paper are organized as follows. Section II covers the relevant literature review and related work. In Section III, we present our methodology. We explain our detailed implementation in Section IV. In Section V, we discuss our results. The research is concluded in Section VI.

II. RELATED WORK

Fraud detection, a very old challenge, has seen immense evolution over the years in response to the development of technology and the schemes and strategies employed by fraudsters [7]. While fraud has been a disturbing menace from ancient times, the advancement of new technology has amplified the avenues for fraudulent behavior. Technological advancements, such as communication platforms and digital finance tools, that are meant to benefit us can unintentionally also give an advantage to malicious individuals whose main goal is to cause harm. This has resulted in the rise of new types of fraud, such as mobile telecommunications scams and computer breaches.

Understanding the nature and underpinnings of fraudulent behavior has been a subject of significant scholarly attention [8]. A foundational approach to this study has been the 'fraud triangle', a conceptual model that demystifies the factors that lead an individual to commit fraud. The authors emphasize three key elements: the incentive or pressure driving the fraudulent act, the presence of an opportunity, and the rationalization or justification by the perpetrator. Expanding on this classic model, [9] added depth by introducing three more elements: the actual act of fraud, the methods employed to conceal it, and the subsequent 'conversion', where the fraudsters benefit from their deceptive actions.

As financial institutions grapple with a staggering volume of fraudulent transactions, innovative solutions rooted in machine learning and deep learning have emerged at the forefront to identify and mitigate these risks. At its core, machine learning is a subset of artificial intelligence that blends computer algorithms with statistical modeling [10]. This synthesis allows computers to perform tasks without explicit programming. Instead, the system learns from the training data and uses the experiential knowledge stored to make predictions or take actions. Within machine learning's orbit is deep learning, which harnesses artificial neural networks to decipher more intricate relationships in data. The depth and intricacy of these networks, like Convolutional Neural Networks (CNN) or Restricted Boltzmann Machines (RBM), enable them to capture unique relationships across large datasets.

A variety of machine learning and deep learning methods have been explored in the academic sphere for fraud detection. For instance, a study by [11] delved into the efficacy of k-Nearest Neighbors (KNN), Support Vector Machines (SVM), and ensemble classifiers in detecting fraud. Their research emphasized the challenges, like the highly unbalanced data where fraudulent transactions are far fewer than legitimate ones, and the dynamic nature of fraud, which necessitates regularly updated machine learning algorithms. Meanwhile, other researchers like [12] explored the application of machine learning algorithms like Random Forest and ensemble models like AdaBoost. Sharma et al. in [13] discussed the application of Auto-Encoders in the fraud detection framework. Auto-Encoders are specialized neural networks designed for data encoding. They operate by compressing input data into a compact representation and subsequently reconstructing it. Any significant reconstruction error, especially in a model trained on legitimate transactions, can flag potential anomalies. In parallel, the Restricted Boltzmann Machine (RBM) can learn a probability distribution over its set of inputs. The RBM's ability to detect intricate patterns in unlabeled data makes it apt for identifying unauthorized transactions in vast, imbalanced datasets where fraudulent activities are but a minuscule fraction [14].

Historically, many ML-based approaches to fraud detection have been centralized. In a centralized learning system, individual assets or clients transmit their data to a central hub or server, where data management and training occur [15]. However, this centralized approach presents challenges, especially in industries that handle confidential data. The transfer of confidential data poses risks related to latency, data security, and privacy. In many cases, data owners might be unwilling or legally restricted from sharing sensitive information, complicating the process.

In response to these challenges, the focus has shifted to
decentralized learning approaches. FL stands out as a promising solution [16]. Originally developed for cellular phones, FL offers a mechanism in which the end user's data remains on their device, ensuring data privacy. In this approach, instead of sending raw data, only model updates or changes are sent to the central server, which aggregates these updates to refine the global model [16]. Notably, FL is adaptable and can handle both homogeneous, independent and identically distributed (IID) data and heterogeneous, non-independent and identically distributed (non-IID) data. This flexibility is especially beneficial when assets have different failure modes or operate under diverse conditions.

However, the choice between centralized and decentralized models does not only depend on data security and privacy concerns. The effectiveness of the chosen machine learning model also plays a pivotal role in fraud detection. Although recurrent neural networks (RNNs), especially long short-term memory networks (LSTMs), have been popular for such predictions, they have inherent limitations. For instance, LSTMs, due to their sequential nature, tend to forget long sequences from earlier time-steps [17].

The choice between centralized and decentralized machine learning systems in finance and asset management is multifaceted. While the centralized approach offers simplicity and direct control, it poses significant challenges related to data transfer, privacy, and security. On the other hand, decentralized systems, especially those based on FL, provide a more private and potentially efficient approach. As industries continue to evolve and prioritize operational reliability and safety, the balance between these two systems will likely shift, with more innovations in decentralized learning approaches leading the way.

FL presents a potential solution to this predicament by enabling model training on local devices and only sharing aggregated updates. FL's core idea is to train machine learning models across multiple decentralized devices or servers that hold local data samples, without the need to exchange the data itself. The central server aggregates these updates to form a global model, which is then sent back to each device. This global model is refined with more rounds of localized training and aggregation, leading to a model that benefits from all available data without directly accessing it [18]. A study by [19] details the growing concern about data privacy that creates obstacles for banks to share data. Concurrently, most fraud detection systems are developed internally, keeping model details confidential to maintain data security.

To address the problem, in this paper, we propose a federated learning-based architecture for banking fraud detection. Furthermore, we integrate XAI techniques into the proposed FL-based method to impart the additional advantages of transparency and trust to create a more robust financial system. In addition to developing a theoretical infrastructure for the FL platform, we also deploy it using a web-based framework.

III. METHODOLOGY

In this section, we explain the proposed system architecture and the methodology adopted in this research.

A. System architecture

The proposed architecture of the system, shown in Fig. 1, outlines the structure and flow of the FL approach for detecting bank fraud. Central to this architecture is a server that orchestrates the coordination of model training and aggregation across numerous banks and financial institutions. Each institution functions as a distinct node, housing its own local DNN model trained on proprietary data, ensuring that data never leaves its premises, thereby bolstering data privacy. Through periodic communication, these local models transmit their insights, not the data itself, to refine and improve a global model. This architecture not only capitalizes on the collective intelligence of all participating entities but also respects the imperative need for data security in the financial domain.

Fig. 1: Proposed Federated Learning Architecture

The client and the server each have distinct roles and actions that they perform. The server initializes the global model, often starting with random weights or from a pre-trained model, and distributes this model to all participating clients (banks) for local training. Once clients complete their local training and send back model updates, the server is responsible for aggregating these updates to refine the global model. This typically involves averaging weights, but more sophisticated aggregation algorithms can also be employed. After aggregation, the server may validate the newly updated global model using a held-out validation set to ensure its performance meets the required standards. The server oversees the synchronization of model updates, ensuring that clients are working with the most recent version of the global model. It also manages any necessary communication between clients, although direct client-to-client communication is typically minimal in FL.

Clients, on the other hand, are responsible for training the received global model on their local dataset. This involves running several training epochs to refine the model based on their specific data. After local training, clients send the model updates (e.g., weight changes) back to the central server. This does not involve sending any raw data, thus preserving data privacy. Clients ensure that raw data never leaves their local environment. All data preprocessing, cleaning, and training are
done in-house, ensuring data confidentiality and compliance with privacy regulations. After the server aggregates updates and refines the global model, clients receive this updated model for subsequent rounds of training. Using the locally trained model, clients can perform real-time fraud detection on new transactions, leveraging the insights from the collective intelligence of the FL system without compromising data security. This round-based exchange keeps the globally shared model consistent across clients and promotes its convergence.

Instead of using generic models, a fine-tuned Deep Neural Network (DNN) model was specifically designed to recognize patterns associated with fraudulent activities across federated databases. The architecture of the model capitalizes on the immense computational strength of DNNs to discern and pinpoint fraudulent activities more effectively.

Furthermore, the proposed system model combines FL with XAI to bring forth a novel approach to fraud detection. While FL ensures efficient model training across various devices without compromising data privacy, XAI offers transparent and interpretable model decisions. This combination is pivotal, especially in sectors where understanding the rationale behind a model's decision is as critical as the decision itself [20]. With FL, the benefits are manifold. First and foremost, data privacy and security are significantly enhanced, as raw data remains at its origin, mitigating the risk of breaches during transfers to centralized servers, a pivotal safeguard considering the vulnerability of sensitive financial data to cyber-attacks [21]. Furthermore, FL capitalizes on efficient data utilization by training models on real-time, varied data from multiple origins, leading to a more comprehensive and current fraud detection system [22].

Fig. 2: Workflow of the proposed System
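The server and client roles described in this subsection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's implementation: the DNN is swapped for a tiny logistic-regression model so that the example runs with the standard library alone, and the function names (`local_train`, `aggregate`, `make_client_data`), the toy data, the learning rate, and the number of rounds are all hypothetical choices made for the example.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def local_train(global_w, data, lr=0.1, epochs=5):
    """Client step: start from the global weights and run SGD on local data only."""
    w = list(global_w)
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # Gradient of binary cross-entropy for a logistic model: (p - y) * x
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    # Only the weights and the sample count are returned, never the data.
    return w, len(data)

def aggregate(updates):
    """Server step: average client weights, weighted by local sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]

def make_client_data(n, rng):
    # Hypothetical toy data: the label is 1 when the first feature is positive.
    rows = []
    for _ in range(n):
        x1 = rng.uniform(-1, 1)
        rows.append(([x1, 1.0], 1 if x1 > 0 else 0))  # second entry acts as a bias term
    return rows

rng = random.Random(0)
clients = [make_client_data(n, rng) for n in (30, 50, 20)]  # three simulated banks
global_w = [0.0, 0.0]                                        # server initializes the model
for round_id in range(10):
    updates = [local_train(global_w, data) for data in clients]  # raw data stays local
    global_w = aggregate(updates)                                # server refines the model
```

Weighting each client by its sample count is one common aggregation choice; a plain unweighted mean or a more elaborate aggregation rule could be substituted in `aggregate` without changing the rest of the loop.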

B. Proposed Federated Learning-based model

Utilizing FL for fraud detection not only leverages the power of collective data without compromising individual data privacy but also promotes more collaborative efforts among institutions to combat fraud in an ever-evolving landscape.

A typical ML model update in a centralized setting, using Stochastic Gradient Descent (SGD), can be represented as:

$$W_{t+1} = W_t - \eta \nabla L(W_t)$$

where $W$ represents the model parameters, $\eta$ is the learning rate, and $\nabla L(W_t)$ is the gradient of the loss function $L$ with respect to the model parameters at iteration $t$.

In FL, this update is not done centrally. Instead, each client (device or server) computes its update, and these are aggregated in some way to update the global model. The core idea behind the Federated averaging algorithm proposed by McMahan et al. [16] is to modify the standard SGD by computing several updates on each client and then averaging these updates on the server.

For a given global model $W_t$, each client $k$ computes its update from its local data:

$$W^{k}_{t+1} = W_t - \eta \nabla L_k(W_t)$$

where $L_k$ is the local loss on client $k$. After each client has computed its local update, the server aggregates these updates to form the global model update. This aggregation in Federated averaging is typically a weighted sum of the local updates:

$$W_{t+1} = \sum_{k} \frac{n_k}{n} W^{k}_{t+1}$$

where $n_k$ is the number of data points on client $k$, and $n$ is the total number of data points across all clients.

This process repeats for several rounds until convergence. A key advantage is that only the model updates (and not the raw data) are communicated, which helps in maintaining data privacy. In essence, Federated averaging offers a compromise between local and centralized learning, allowing models to benefit from diverse local data sources without compromising user privacy.

C. Explainable AI integration

In the sphere of finance, interpretability is a necessity. The decisions and predictions made by models have profound real-world implications, so understanding these decisions is paramount. XAI has emerged to bridge this gap between the opaque nature of certain models and the requirement for transparency. These XAI techniques, when integrated into the FL model, empower stakeholders with insights, enhancing confidence in the model's decisions. Moreover, understanding
feature importance aids in model debugging and refinement, addressing potential pitfalls or biases.

One popular XAI technique is SHAP (SHapley Additive exPlanations) [23]. Originating from game theory, SHAP values provide a unified measure of feature importance by attributing the difference between the model's prediction and the average prediction to each feature. For a given instance and feature, the SHAP value is the weighted average marginal contribution of that feature over all possible combinations of the other features.

The SHAP value for feature $j$ is calculated as:

$$\phi_j(f) = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \left[ f(S \cup \{j\}) - f(S) \right]$$

where:
• $N$ is the set of all features.
• $S$ is a subset of $N$ without feature $j$.
• $f(S)$ is the prediction of the model for the input features in set $S$.

IV. IMPLEMENTATION DETAILS

This section presents the details of the implementation of the proposed technique as shown in Fig. 2. The steps involve preliminary checks, data processing, FL development, and XAI integration techniques.

A. Dataset

The data set referenced in this paper is sourced from [24], encompassing realistic data based on a present-day real-world dataset for fraud detection. The dataset contains 29,042 entries, spread across 32 distinctive features. This rich dataset incorporates various data types: 17 columns of integer type, 10 columns representing floating-point numbers, and 5 columns containing categorical or string data.

Some of the dataset features are as follows:
1) fraud_bool: This is a binary feature and target variable for predictive models. It is an indicator of whether the record was fraudulent (1) or not (0).
2) income: Represents the income of the user and is a continuous variable of type float.
3) name_email_similarity: A continuous variable, capturing the similarity score between the name and email.
4) prev_address_months_count, current_address_months_count: Indicators of the duration (in months) at the previous and current addresses.
5) customer_age: Age of the customer.
6) days_since_request: A continuous variable that represents the number of days since a particular request (maybe a credit request) was made.
7) intended_balcon_amount: The amount on the balcony or a credit amount.
8) payment_type, employment_status, housing_status, source, device_os: These are categorical features indicating the method of payment, the employment status of the customer, their housing situation, where the data came from, and the operating system of the device used, respectively.
9) credit_risk_score: A continuous variable possibly indicating the riskiness of providing credit to the individual or entity.
10) email_is_free: A binary variable indicating whether the email provider is free (like Gmail, Yahoo) or not.
11) phone_home_valid, phone_mobile_valid: Binary or score-based indicators denoting the validity of home and mobile phone numbers.
12) month: This probably indicates the month when the data was recorded or the transaction occurred.

The distribution of the dataset's features is shown in Fig. 3 and Fig. 4.

Fig. 3: Characteristics of the dataset

Fig. 4: Characteristics of the dataset

B. Data balancing

Since the data was highly imbalanced, as depicted in Fig. 5, balancing was performed using the Synthetic Minority Over-sampling Technique (SMOTE) [25]. The SMOTE algorithm is a popular technique to address class imbalance by generating synthetic samples in the feature space. It is used to ensure that both classes (majority and minority) have an equal number of samples, thus addressing any imbalance present in the training data.

Fig. 5: Imbalanced distribution of the proposed dataset

C. Data Pre-Processing

In the conducted analysis, missing data in the dataset was managed using two distinct strategies. For numerical attributes, the mean value of the respective column was used to impute the missing entries. In contrast, for categorical attributes, the mode, or most frequently occurring value, was employed for imputation purposes.

In the data preprocessing stage, outlier removal was undertaken for columns containing floating-point values. Utilizing the Interquartile Range (IQR) technique, any data values that were beyond 1.5 × IQR from the first (Q1) or third quartile (Q3) were identified as outliers and consequently excluded from the dataset to enhance the data's robustness [26].

Subsequently, to delve into the relationships and potential dependencies among the numeric attributes, a correlation matrix was computed, as depicted in Fig. 6. This matrix was then visualized as a heatmap, providing a color-coded depiction of the pairwise linear relationships among variables. The hues in this heatmap ranged from shades representing perfect negative correlation to those indicating perfect positive correlation, allowing for quick identification of strong correlations or potential multicollinearity scenarios.

Fig. 6: Fraud Dataset Correlation Matrix

D. Feature Selection

To enhance the model's ability to learn from the data by presenting it in a more amenable format, feature engineering was performed:

1) Binning: The continuous income column was binned into intervals to create the binned income column. Binning is a technique used to convert continuous data into discrete groups (or bins) [27]. By converting the income into ten discrete bins and labeling them with integers, the model can potentially discern patterns or trends more easily across income ranges rather than individual income values.
2) One-hot Encoding: Columns such as employment_status, housing_status, payment_type, source, and device_os were one-hot encoded. One-hot encoding is a process by which categorical variables are converted into a numerical format that machine learning algorithms can use directly. For each unique value in the original categorical column, a new binary (0 or 1) column is created. This transformation is essential for models that require numerical input, such as neural networks.

E. Deep Learning model

The deep learning (DL) model chosen for this study comprises a three-layer dense neural network. The initial layer contains 64 neurons and employs the ReLU activation function, taking its input shape from the validation dataset's feature dimension. The subsequent layer, also activated by ReLU, has 32 neurons. The terminal layer, designed for binary classification, consists of a single neuron activated by a sigmoid function. The model's weights are adjusted using the Adam optimization algorithm, and the binary cross-entropy function evaluates prediction losses.
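The layer structure described for the deep learning model (64 ReLU units, 32 ReLU units, one sigmoid output, scored with binary cross-entropy) can be illustrated with a small forward-pass sketch. This is a hedged, standard-library-only illustration and not the study's code: the weight initialization, the placeholder input width of 10 features, and all function names are assumptions, and the Adam training loop is deliberately left out.

```python
import math
import random

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dense(inputs, weights, biases, activation):
    """Fully connected layer: activation(W x + b), one weight row per output neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    # Small uniform random weights; this initialization scheme is an assumption.
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_model(n_features, seed=0):
    """Layer sizes follow the description: n_features -> 64 -> 32 -> 1."""
    rng = random.Random(seed)
    sizes = [n_features, 64, 32, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def predict(model, x):
    (w1, b1), (w2, b2), (w3, b3) = model
    h1 = dense(x, w1, b1, relu)            # first hidden layer, 64 ReLU units
    h2 = dense(h1, w2, b2, relu)           # second hidden layer, 32 ReLU units
    return dense(h2, w3, b3, sigmoid)[0]   # single sigmoid unit: fraud probability

def binary_cross_entropy(y_true, p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)        # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def count_parameters(model):
    return sum(len(w) * len(w[0]) + len(b) for w, b in model)

model = build_model(n_features=10)         # 10 input features is a placeholder
p = predict(model, [0.5] * 10)
loss = binary_cross_entropy(1, p)
```

With the placeholder width of 10 inputs, `count_parameters(model)` gives (10·64 + 64) + (64·32 + 32) + (32·1 + 1) = 2817 trainable values, which is also the size of the update each client would send to the server per round under that assumption.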
7

F. Training and Validation V. E VALUATION AND R ESULTS


In this research, data splitting was carried out after the In this section, the performance metrics of the federated
data had been pre-processed. The widely accepted method of model were well evaluated. Additionally, we also showcase the
splitting the data into 80% for training and 20% for testing web-based framework developed in this research. Furthermore,
ensuring random and unbiased partitioning was carried out the power of XAI was harnessed to understand the model’s
in this research. To incorporate these datasets into FL, the decisions, identifying key features that play pivotal roles in
adjusted training dataset was split into three parts: X train1, detecting fraudulent activities. Through this exploration, the
X train2, and X train3. intention is to validate the effectiveness of this approach and
provide insights that could reshape the landscape of fraud
detection strategies.
G. Performance Metrics
To assess the performance of an ML model, it is crucial to employ metrics that reflect the accuracy and reliability of the model. For binary classification tasks, the confusion matrix is a commonly used, straightforward tool. It records four distinct prediction outcomes: (i) True Positive (TP), where a fraudulent transaction is accurately identified as fraud; (ii) False Positive (FP), where a legitimate transaction is mistakenly labeled as fraud; (iii) False Negative (FN), where a fraudulent transaction is incorrectly marked as legitimate; and (iv) True Negative (TN), where a legitimate transaction is correctly classified as such. Several metrics were computed on the validation dataset to assess model performance:
• Accuracy is the proportion of all predictions that are correct, (TP + TN) / (TP + TN + FP + FN).
• Precision is the fraction of transactions flagged as fraud that are actually fraudulent, TP / (TP + FP).
• Recall is the fraction of fraudulent transactions that are correctly flagged, TP / (TP + FN).
• F1-Score is the harmonic mean of precision and recall, 2 × Precision × Recall / (Precision + Recall), and is particularly important when classes are imbalanced.

H. Explainable AI integration
For feature importance visualization, the SHAP method is employed; it interprets each feature's impact relative to a specified baseline value [23].

I. Simulation Setup
The simulation setup combines FL with modern DL frameworks to show how individual client-side models contribute to the learning of a global model. The FL architecture decentralizes model training across multiple clients without sharing raw data. After training, SHAP is used to explain model decisions and make them more interpretable. The simulation was built with a suite of software tools that support FL and the subsequent evaluations. Flask, a lightweight web application framework, was used for server-client interactions [28]. The TensorFlow library handled DL operations, in particular constructing, training, and evaluating the neural network model [29]. Cross-Origin Resource Sharing (CORS) was managed through the Flask-CORS extension, enabling cross-origin AJAX requests. The server was configured to run locally and was accessible via port 5000.

A. Web-based framework
Fig. 7a represents the FL setup. The HTML page displays the status of each client as it trains on its local dataset with the global model fetched from the central server and then sends its model weights (not real data) back to the server. A client can be in one of three states: updating, training, or idle. A client is idle while it waits for the updated global model to be sent from the server. The server and clients exchange updates in real time; the Gradient Updates tab, which shows ‘33’, gives the number of times the federated learning model has been trained on the client systems. For each iteration, the accuracy, precision, recall, and F1-score are calculated and updated on the page.
Fig. 7c depicts a real-time graph of accuracy over updates. The graph refreshes as the Gradient Updates count increases and new accuracy values are computed. The accuracy curve indicates that the model converges as updates from all clients are aggregated, reaching a final accuracy of 93%. Similarly, Fig. 7b and Fig. 7d show the training process and the weight updates, respectively.

Fig. 7: The FL dashboard description of different pages: (a) Federated Learning Dashboard; (b) Federated Learning Dashboard (training process); (c) Federated Learning Dashboard (performance metrics); (d) Federated Learning Dashboard (Weights).

B. Model Performance Metrics
The accuracy of the FL model, shown in Fig. 8a, indicates that the model converges as updates from all clients are aggregated over time. The accuracy rises steeply and reaches a final score of 93%.
Fig. 8b shows that the precision of the FL model increases roughly linearly with the number of epochs until around 40 epochs. After that point, the precision starts to fluctuate and increases more slowly, which may indicate that the model is reaching the limit of what it can learn from the training data. Fig. 8c shows that the recall of the FL model increases with the number of epochs, indicating that the model improves its ability to identify positive examples over time. Fig. 8d shows that the F1-score of the FL model also increases with the number of epochs, so the model's predictions improve over time. However, the rate of improvement decreases as the number of epochs increases, because the model eventually learns all that it can from the training data and any further improvement is minimal.
The research aimed to understand the benefits of using an FL-based approach for advanced fraud detection systems in terms of data privacy. Unlike centralized ML approaches, FL did not share real data; instead, model weights were sent to a central server, which aggregated them to produce a global model. This iterative exchange forms the core idea of the federated learning model.
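The aggregation step described above can be sketched in a few lines. The source does not spell out the aggregation rule, so this minimal sketch assumes sample-size-weighted averaging in the style of FedAvg [16]. The function name `federated_average`, the three toy clients, and their sample counts are hypothetical, and each layer is flattened to a plain list of floats for brevity (in a Keras model these would be the arrays returned by `model.get_weights()`).

```python
from typing import List, Sequence

Weights = List[List[float]]  # one flat list of parameters per layer


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Sample-size-weighted average of client weights (FedAvg-style, assumed rule)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_weights: Weights = []
    for layer in range(len(client_weights[0])):
        averaged = [0.0] * len(client_weights[0][layer])
        for weights, size in zip(client_weights, client_sizes):
            share = size / total  # this client's contribution to the average
            for i, value in enumerate(weights[layer]):
                averaged[i] += share * value
        global_weights.append(averaged)
    return global_weights


# Three hypothetical clients, each holding a tiny two-layer model.
clients = [
    [[1.0, 2.0], [0.5]],
    [[3.0, 4.0], [1.5]],
    [[5.0, 6.0], [2.5]],
]
sizes = [100, 100, 200]  # hypothetical local sample counts

new_global = federated_average(clients, sizes)
print(new_global)
```

Only the weight lists and sample counts reach the server in this sketch; the clients' transactions never leave their machines, which is the privacy property the paragraph above refers to.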

Fig. 8: Performance metrics: (a) Accuracy of the Federated Learning Model; (b) Precision of the Federated Learning Model; (c) Recall of the Federated Learning Model; (d) F1-score of the Federated Learning Model.

C. Explainable AI Insights
We used SHAP plots for model explainability [23]. In the SHAP plots, the color coding (red and blue) represents the positive and negative impact of feature values on the predicted output, relative to the baseline value. The baseline value is the average of all model output values over the dataset and serves as a reference point. Red indicates that a particular feature value, shown as a number, increased the prediction value, while blue indicates that it decreased the prediction value. The feature value 2.086 on the red side of the two client plots indicates that the feature has a positive impact on the prediction of the FL model. The magnitude of a SHAP value (that is, its distance from the baseline) indicates the strength of a feature's influence. Higher magnitudes, whether positive or negative, signify that the feature has a strong impact on the prediction. For example, the feature values 2.15, 0.418, and 0.517 in the second SHAP plot (Fig. 9b) have more impact on the model prediction than 1.15 and -0.8.
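The quantities described above (baseline, sign, and magnitude) can be illustrated with a small worked example. This is a sketch under stated assumptions, not the pipeline used in the paper: it replaces the neural network with a hypothetical linear model, for which the SHAP value of feature i under feature independence has the closed form w_i (x_i − mean of feature i over the background data) [23]. All weights and data points below are made up for illustration.

```python
from statistics import mean
from typing import List, Sequence


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Linear stand-in model: f(x) = bias + sum_i w_i * x_i."""
    return bias + sum(w * v for w, v in zip(weights, x))


def linear_shap(weights: Sequence[float],
                background: Sequence[Sequence[float]],
                x: Sequence[float]) -> List[float]:
    """SHAP values of a linear model, assuming independent features:
    phi_i = w_i * (x_i - mean of feature i over the background data)."""
    feature_means = [mean(column) for column in zip(*background)]
    return [w * (v - m) for w, v, m in zip(weights, x, feature_means)]


# Hypothetical model parameters and background (reference) dataset.
weights = [2.0, -1.0, 0.5]
bias = 0.25
background = [
    [0.0, 1.0, 2.0],
    [2.0, 3.0, 2.0],
    [4.0, 2.0, 2.0],
]
x = [3.5, 4.0, 2.0]  # the single transaction being explained

# Baseline: average model output over the background data.
baseline = mean(predict(weights, bias, row) for row in background)
phi = linear_shap(weights, background, x)

for i, value in enumerate(phi):
    if value > 0:
        color = "red (raises the prediction)"
    elif value < 0:
        color = "blue (lowers the prediction)"
    else:
        color = "no effect"
    print(f"feature {i}: SHAP value {value:+.2f} -> {color}")

# Additivity: baseline plus all SHAP values recovers the model output.
print(baseline + sum(phi), predict(weights, bias, x))
```

The final line shows the additivity property that makes the plots readable: the prediction equals the baseline plus the sum of the per-feature contributions, so a longer red or blue bar corresponds directly to a larger shift away from the baseline.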
Fig. 9: Explainable AI insights: (a) SHAP Plot of Client 1; (b) SHAP Plot of Client 2; (c) SHAP Plot of Client 3.

VI. CONCLUSION
In the FL-based approach presented for banking fraud detection, a decentralized approach to model training was employed, allowing clients to train on their local data and subsequently share model updates with a central server. This approach prioritizes data privacy, as raw data remain local while still benefiting from the insights of diverse datasets. The project utilized Flask for server-client communication and handled potential scalability challenges by integrating multi-threading. Client weights, which are multidimensional arrays representing the learned parameters of the neural network, played a crucial role in aggregating insights from individual models to update the global model. By integrating SHAP, the project not only focuses on achieving accurate model predictions but also sheds light on which features are most influential in driving these predictions. This can be especially beneficial in sensitive domains where understanding the rationale behind predictions is as important as the predictions themselves. In the context of the project, SHAP offers a pathway to build trust and ensure that the decisions of the federated model can be understood and justified. Overall, the project highlights the efficacy and potential of FL in scenarios where centralized data collection might be impractical or undesirable due to privacy or security concerns.

REFERENCES
[1] UKFinance, “Annual fraud report 2022,” https://www.ukfinance.org.uk/policy-and-guidance/reports-and-publications/annual-fraud-report-2022, 2022.
[2] A. Abdallah, M. A. Maarof, and A. Zainal, “Fraud detection system: A survey,” Journal of Network and Computer Applications, vol. 68, pp. 90–113, 2016.
[3] A. Pascual, K. Marchini, S. Miller, and J. S. . Research., “2017 identity fraud: securing the connected life,” Javelin (February 1), http://www.javelinstrategy.com/coverage-area/2017-identity-fraud, 2017.
[4] S. Bhattacharyya, S. Jha, K. Tharakunnel, and J. C. Westland, “Data mining for credit card fraud: A comparative study,” Decision support systems, vol. 50, no. 3, pp. 602–613, 2011.
[5] L. T. Rajesh, T. Das, R. M. Shukla, and S. Sengupta, “Give and take: Federated transfer learning for industrial iot network intrusion detection,” 2023.
[6] S. Vyas, A. N. Patra, and R. M. Shukla, “Histopathological image classification and vulnerability analysis using federated learning,” 2023.
[7] R. J. Bolton and D. J. Hand, “Statistical fraud detection: A review,” Statistical science, vol. 17, no. 3, pp. 235–255, 2002.
[8] H. Van Driel, “Financial fraud, scandals, and regulation: A conceptual framework and literature review,” Business History, 2018.
[9] G. M. Trompeter, T. D. Carpenter, N. Desai, K. L. Jones, and R. A. Riley, “A synthesis of fraud-related research,” Auditing: A Journal of Practice & Theory, vol. 32, no. Supplement 1, pp. 287–321, 2013.
[10] P. Raghavan and N. El Gayar, “Fraud detection using machine learning and deep learning,” in 2019 international conference on computational intelligence and knowledge economy (ICCIKE). IEEE, 2019, pp. 334–339.
[11] M. Zareapoor, P. Shamsolmoali et al., “Application of credit card fraud detection: Based on bagging ensemble classifier,” Procedia computer science, vol. 48, no. 2015, pp. 679–685, 2015.
[12] K. Randhawa, C. K. Loo, M. Seera, C. P. Lim, and A. K. Nandi, “Credit card fraud detection using adaboost and majority voting,” IEEE access, vol. 6, pp. 14 277–14 284, 2018.
[13] Sharma, M. Abhilash, Raj, B. R. Ganesh, Ramamurthy, B., and Bhaskar, R. Hari, “Credit card fraud detection using deep learning based on auto-encoder,” ITM Web Conf., vol. 50, p. 01001, 2022. [Online]. Available: https://doi.org/10.1051/itmconf/20225001001
[14] A. Pumsirirat and Y. Liu, “Credit card fraud detection using deep learning based on auto-encoder and restricted boltzmann machine,” International Journal of advanced computer science and applications, vol. 9, no. 1, 2018.
[15] S. Kamei and S. Taghipour, “A comparison study of centralized and decentralized federated learning approaches utilizing the transformer architecture for estimating remaining useful life,” Reliability Engineering & System Safety, vol. 233, p. 109130, 2023.
[16] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Artificial intelligence and statistics. PMLR, 2017, pp. 1273–1282.
[17] I. Benchaji, S. Douzi, and B. El Ouahidi, “Credit card fraud detection model based on lstm recurrent neural networks,” Journal of Advances in Information Technology, vol. 12, no. 2, 2021.
[18] S. Bharati, M. Mondal, P. Podder, and V. Prasath, “Federated learning: Applications, challenges and future directions,” International Journal of Hybrid Intelligent Systems, vol. 18, no. 1-2, pp. 19–35, 2022.
[19] W. Yang, Y. Zhang, K. Ye, L. Li, and C.-Z. Xu, “Ffd: A federated learning based method for credit card fraud detection,” in Big Data–BigData 2019: 8th International Congress, Held as Part of the Services Conference Federation, SCF 2019, San Diego, CA, USA, June 25–30, 2019, Proceedings 8. Springer, 2019, pp. 18–32.
[20] F. Doshi-Velez and B. Kim, “Towards a rigorous science of interpretable machine learning,” arXiv preprint arXiv:1702.08608, 2017.
[21] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, J. Konečnỳ, S. Mazzocchi, B. McMahan et al., “Towards federated learning at scale: System design,” Proceedings of machine learning and systems, vol. 1, pp. 374–388, 2019.
[22] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 2, pp. 1–19, 2019.
[23] S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” Advances in neural information processing systems, vol. 30, 2017.
[24] S. Jesus, J. Pombal, D. Alves, A. Cruz, P. Saleiro, R. P. Ribeiro, J. Gama, and P. Bizarro, “Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation,” Advances in Neural Information Processing Systems, 2022.
[25] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “Smote: synthetic minority over-sampling technique,” Journal of artificial intelligence research, vol. 16, pp. 321–357, 2002.
[26] X. Wan, W. Wang, J. Liu, and T. Tong, “Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range,” BMC medical research methodology, vol. 14, pp. 1–13, 2014.
[27] M. Hegland, “Data mining techniques,” Acta numerica, vol. 10, pp. 313–355, 2001.
[28] G. Dwyer, Flask By Example. Packt Publishing Ltd, 2016.
[29] T. Hope, Y. S. Resheff, and I. Lieder, Learning tensorflow: A guide to building deep learning systems. O’Reilly Media, Inc., 2017.
