DeepCardFraud Abakarim
An Efficient Real Time Model For Credit Card Fraud Detection Based On Deep
Learning
3 authors:
Abdelbaki Attioui
ENS- Université Hassan II de Casablanca Morocco
All content following this page was uploaded by Mohamed Lahby on 16 July 2019.
ABSTRACT
In the last decades, machine learning has achieved notable results in various areas of data processing and classification, which has made the creation of real-time interactive and intelligent systems possible. The accuracy and precision of those systems depend not only on the correctness of the data, logically and chronologically, but also on the time at which the feedback is produced. This paper focuses on one of these systems: the fraud detection system. In order to have a more accurate and precise fraud detection system, banks and financial institutions are investing more and more in perfecting the algorithms and data analysis technologies used to identify and combat fraud. Therefore, many solutions and algorithms using machine learning have been proposed in the literature to deal with this issue. However, comparison studies exploring deep learning paradigms are scarce and, to our knowledge, the proposed works do not consider the importance of a real-time approach for this type of problem. To cope with this, we propose a live credit card fraud detection system based on deep neural network technology. Our proposed model is based on an auto-encoder and classifies, in real time, credit card transactions as legitimate or fraudulent. To test the effectiveness of our model, four different binary classification models are used as a comparison. The benchmark shows that our proposed model gives promising results compared with existing solutions in terms of accuracy, recall and precision.
KEYWORDS
Deep Learning, Real-Time Data, Binary Classification, Fraud Detection.
ACM Reference Format:
Youness Abakarim, Mohamed Lahby, Abdelbaki Attioui. 2018. An Efficient Real Time Model For Credit Card Fraud Detection Based On Deep Learning. In Proceedings of October 2018 (SITA’18). ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
SITA’18, Rabat, Morocco, 2018. ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00 https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
In recent years the number of bank transactions via credit cards has risen drastically, and with it the number of frauds and card thefts. The 2018 Association for Financial Professionals Payments Fraud Survey, underwritten by J.P. Morgan [1], reported a new increase in payments fraud. An unprecedented 78 percent of all organizations experienced payments fraud last year, according to the survey of 700 treasury and finance professionals. With the rise of digital payment, financial institutions have lost billions due to credit card fraud [2]. Because of this, banks and financial institutions are faced with the challenge of building effective and proactive fraud detection systems. Machine learning represents a promising solution to this problem, using the gathered historical customer data and real-time transaction details [3].
In the banking and financial sectors, machine learning is actively used today for different applications, notably in portfolio management, trading, risk analysis, and fraud prevention and detection. In the financial landscape, for example, machine learning is used to build chatbots, artificial intelligence software that can interact with customers and respond to their queries. In trading, Decision Trading Support Systems, or Algorithmic Trading, are used to make extremely fast decisions [4]. Moreover, one of the primary uses of machine learning in the banking industry is protection against fraud. With the help of ML algorithms, detecting suspicious activities has become an easier task. Based on the transaction history, machine learning has provided promising new methods to analyze the behavior of users and detect whether there is a fraud or not [12] [14] [15]. In this work, we focus on one aspect of the latter: credit card fraud detection.
In a McKinsey article published in 2017 [5], deep learning is presented as a very promising solution to deal with fraud in financial transactions, making the best use of banks' big data. Deep learning is a generic term that refers to machine learning using deep multilayer artificial neural networks (ANN). It is a biologically inspired model of human neurons, composed of multilevel hidden layers of nonlinear processing units, where each neuron is able to send data to a connected neuron within the hidden layers [7]. Deep neural networks have attracted much attention in the field of machine learning. They currently provide the best precision and accuracy on many problems, with promising results in many fields, notably in binary classification.
From a data analysis point of view, fraud detection is a binary classification problem: the transaction data is analyzed and classified as "legitimate" or "fraudulent". Binary classification is a simple case of classification, where a collection of data is classified into two classes based on some features. It is mainly used in situations where we want to predict a specific outcome that can only take two distinct values. Some typical examples include medical diagnosis, spam detection or, in our case, fraud detection. Although fairly simple, binary classification is a fundamental problem. There are numerous paradigms used for learning binary classifiers, such as Decision Trees, Neural Networks, Bayesian Classification, Support Vector Machines, Logistic Regression, K-Nearest Neighbors, etc. [8]
On the other hand, real-time data processing has become an important field of research. We speak of real-time data when the information is delivered directly after it has been gathered, making
the data accuracy time dependent. A real-time system was described by John A. Stankovic (1988) as: "One which controls an environment by receiving data, processing them, and returning the results sufficiently quick to affect the environment at that time" [11]. The novelty of our work is the real-time approach to credit card fraud detection, using state-of-the-art machine learning technology. By using a deep neural network based on an auto-encoder, our model classifies a live feed of consumer transactions and gives a real-time decision on whether they are legitimate or fraudulent.
To test the effectiveness of our model, different binary classification models are used as a benchmark. Although the criteria used by banks differ from one institution to another, the use of linear models is well known in the banking sector [17]. Therefore, in the present work, we have selected logistic regression as a comparison, as well as linear SVM regression. Support vector machines (SVM) were first introduced by Vapnik [6] as a state-of-the-art technique to solve binary classification problems. SVMs have drawn a lot of attention in recent studies due to their strong performance as classifiers. However, in some cases nonlinear regression ANNs have given good results in comparison with linear models [18]. Hence, we have also selected a nonlinear artificial neural network algorithm as a benchmark. As for the data sample, in this exercise we have used the publicly available European card transactions of September 2012.
This paper presents two key points. First, a real-time deep learning approach to the credit card fraud detection problem, based on an auto-encoder. Second, a comparison of different binary classification methods applied to this financial problem. We observe that our model, based on deep machine learning with an auto-encoder, gives promising results on this binary classification problem.
The rest of the paper is organized as follows. In Section 2 we give a short review of related work. In Section 3 we describe the architecture of our deep neural network prediction model. In Section 4 we describe our testing environment, then show and analyze the test results. Finally, in Section 5 we draw some conclusions and indicate future work.
2 RELATED WORK
Over the past few years, many solutions have been proposed to cope with the problem of credit card fraud. The major approaches are either statistical (see the survey of Bolton and Hand in 2002 [9]) or based on artificial intelligence [10]. Various paradigms have been used; some notable examples are regression models, Support Vector Machines, Restricted Boltzmann Machines, Artificial Neural Networks, etc.
Statistical models are increasingly applied to financial data mining tasks, including logistic regression, regression analysis, multiple discriminant analysis, the Probit method, etc. [23]. In the literature, logistic regression is widely used for binary classification problems [24]. In [15] the author compared various traditional models for fraud detection; logistic regression was found to be the most accurate of the traditional methods. SVMs have drawn a lot of attention in recent studies due to their strong performance as classifiers. Unlike ANNs, which minimize empirical risk, SVMs are based on structural risk minimization. They use a nonlinear mapping to transform the input data into a multidimensional feature space. In recent years there has been much research on the use of SVMs in binary classification problems, in particular for image classification [25] and for finance-related problems [26]. The authors of [27] compared SVM with various other paradigms for detecting credit card fraud, and focused on creating new inputs by aggregating common transactional variables.
An artificial neural network (ANN) is a computing system that consists of networks of interconnected elementary computing units, designed to imitate the functioning of neurons in the human brain. Ghosh and Reilly [16] used data from a credit card issuer: a neural network based fraud detection system was trained on a large sample of labeled credit card account transactions and tested on a holdout data set that consisted of all account activity over a subsequent two-month period. The method gave good results, with high fraud account detection and fewer false positives by a factor of 20 compared with traditional methods. Cardwatch [19], developed in 1997, is a data mining system based on an artificial neural network trained with historical customer data. By processing the transactions of a specific customer, it detects possible anomalies. Dorronsoro et al. [20] presented an online system for fraud detection of credit card operations based on a neural classifier. The system was installed in a transactional hub and relies on the information of the operation and its previous history. A nonlinear version of Fisher's discriminant analysis was used to overcome the imbalance between normal and fraudulent operations. In 2014 [21] the authors proposed a credit card fraud detection model using frequent item set mining. In [22] the authors used a feed-forward ANN for detecting fraudulent transactions; the method was found to be effective.
Recently, with the hype surrounding deep learning, some works have shown that deep neural networks, such as recurrent neural networks, give promising results in this field. However, since deep learning is a new approach in machine learning, the use and analysis of various deep learning paradigms remains scarcely explored. In [29] the authors used a convolutional neural network, a feed-forward neural network inspired by the animal visual cortex, to classify a set of card transactions as fraudulent or not. In [30] the authors used a Hidden Markov Model in an eCommerce fraud detection model; the method was found to detect 80% of fraudulent transactions. In [31] the authors found that a deep learning method based on an auto-encoder performed better than gradient boosted trees for fraud detection. However, the proposed works do not consider the importance of a real-time approach for this type of problem and, to our knowledge, comparison studies exploring deep learning paradigms are scarce. To cope with this situation we introduce a live binary classification method based on a deep learning auto-encoder. As a second contribution, we propose an exhaustive comparison with some typical binary classification algorithms. Our motivation for the choice of benchmark algorithms is shown in Tab. 3.
3 OUR REAL-TIME CLASSIFICATION APPROACH
3.1 Prediction Model
As a first approach we propose a classification method based on two stages. The first stage is a periodic offline training on the historical data, by which we build our machine learning models. This stage first consists of a
feature engineering process, to transform the transaction data into features and labels for our machine learning classification. The data is then split into test and training sets. Afterward, our models are built with the training features and labels. We test the models with the test features to get our first predictions and then compare the test predictions with our test labels. The process is repeated until we are satisfied with the models' accuracy. In the second stage the models are used for prediction on a live stream of new data. Fig. 1 shows the methodology followed to produce the results.
To build the mentioned model we used the following technologies :
• Tensorflow : For building our machine learning model.
• Kafka : For building our real-time streaming data pipeline.
• Memsql : For data pipelines.
Fig. 2 shows how the different technologies are combined to build our live classification model.
For experiments, implementation and analysis we are using the following environment: Python 3.5, Keras over the TensorFlow backend, GPU GTX 660, Memory 8 GB.
TensorFlow is an open-source software library for machine learning across a range of tasks [34]. In our study, we used its symbolic math library. TensorFlow is the main library used here for building and training our ML algorithms.
Keras is an open-source neural network library written in Python [36]. Due to its capability to run over TensorFlow, it is a good choice for our study.
Kafka is an Apache distributed streaming platform. It is used to build real-time streaming applications that react to and transform streams of data. In our study, Kafka is used for building the real-time streaming data pipelines.
Memsql is a distributed SQL database. We have chosen this distribution for its wide use with Apache Kafka and the amount of documentation available on its implementation.
Data set Characteristics | Multivariate
Attribute characteristics | Categorical, Integer
Associated Task | Classification
Number of instances | 284807
Number of Attributes | 30
Missing Values | N/A
3.2 Deep Auto-Encoder
For deep learning we use an auto-encoder. Auto-encoders are neural networks whose input and output are equal. Their architecture is basically two stacked Restricted Boltzmann Machines parallel to each other, see Fig. 3.
Figure 3: Auto-encoder
An auto-encoder consists of two parts, the encoder and the decoder :
• Encoder : compresses the input into a smaller number of bits. The part of the network with the fewest bits is called the bottleneck or the "maximum point of compression", since at this point the input is compressed the most. These compressed bits that represent the original input are together called an “encoding” of the input.
• Decoder : here the input is reconstructed using the encoding
of the input. A successful encoding is when the decoder is
able to reconstruct the input exactly as it was fed to the
encoder.
In this work we use a hyperbolic tangent function "tanh" for the
encoding and decoding of the input to the output. The equations
are as follows :
Encoder
h(x) = tanh(W x) (1)
Decoder
â = tanh(W∗ h(x)) (2)
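Equations (1) and (2) can be illustrated with a small forward pass. The following is only a sketch under stated assumptions: W∗ is read here as a separate decoder weight matrix, the layer sizes and random weights are hypothetical, and the mean squared error used to measure reconstruction quality is an assumption, since the text does not name the loss. Only the Python standard library is used.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def encode(W, x):
    """Equation (1): h(x) = tanh(W x)."""
    return [math.tanh(z) for z in matvec(W, x)]


def decode(W_star, h):
    """Equation (2): reconstruction = tanh(W* h(x))."""
    return [math.tanh(z) for z in matvec(W_star, h)]


def reconstruction_error(x, x_hat):
    """Mean squared error between input and reconstruction (assumed loss)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


rng = random.Random(0)
n_in, n_hidden = 6, 3  # hypothetical sizes: 6 features, bottleneck of 3

# Hypothetical, untrained weights; a real model would learn these.
W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
W_star = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]

x = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1]  # one made-up, already scaled input
h = encode(W, x)            # compressed "encoding" at the bottleneck
x_hat = decode(W_star, h)   # reconstruction of the input
err = reconstruction_error(x, x_hat)
```

Because tanh maps into (-1, 1), inputs have to be scaled into that range before the decoder can reproduce them exactly, which is one reason a preprocessing step would normally precede such a model.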
The reconstruction error is minimized by back-propagation: the error signal is computed and propagated backward through the network. The error between the desired and actual output values is used as the criterion, and parameter gradients are used to carry out the back-propagation.
3.3 Deep Neural network
For our auto-encoder deep neural network we used 3 encoders and 3 decoders, for a total of 6 hidden layers. The composition of the neural network is shown in Fig. 4. Because of the strong results obtained with the "tanh" activation function, it is used in every hidden layer of the auto-encoder neural network.
4 EXPERIMENTAL RESULTS
4.1 Dataset
The data set used in this work contains the transactions made in two days by European cards in September 2012, gathered and analyzed during a research collaboration of Worldline and the Machine Learning Group of ULB on big data mining and fraud detection. It is freely available on Kaggle. See Tab. 1.
The data contains only numerical values. Due to confidentiality the values were changed by a PCA transformation. The features Time and Amount have not been transformed, and all the other features are represented by V0, V1, . . . V26 values. See Tab. 2.
4.2 Performance metrics
This data set labels each transaction as fraudulent or not. We have 492 frauds out of 284807 transactions, which is highly unbalanced: frauds account for 0.173% of the data. To solve this class imbalance, random over-sampling is used. Fig. 5 shows the distribution of the data set after over-sampling.
Figure 5: Data set distribution after over-sampling; the fraud class accounts for 3.1% of all transactions
The data set is split into training and test sets. For a pre-trained model performance check, we split the data into two separate training sets and one independent test set for final model comparison. See Tab. 4.
4.3 Paradigms
In order to measure the efficiency of deep learning in this case study, three learning classification methods were chosen for
Support Vector Machine | Linear SVM Regression | SVM has been shown to surpass traditional neural network models for solving complex nonlinear problems, which makes SVM a good choice for problems with complex, changeable data structures such as fraud detection [32] [33].
Regression | Logistic Regression | As mentioned in Section 2, in [15] the author used logistic regression to perform various credit card fraud experiments; it was found to give promising results in comparison with the other traditional methods.
Classical ANN | Non Linear Auto regression NN | Nonlinear regression ANNs have given good results in comparison with linear models [18].
Deep Learning | Deep Neural network (DNN) with auto encoders | Besides being data-specific, auto-encoders do not need explicit labels to train on; the labels are self-generated within the model. Due to the diversity of fraudulent transactions, we believe that auto-encoders could be the best fit for this task.
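The paradigms listed above are compared in terms of accuracy, recall and precision. As a minimal sketch of how these three metrics can be computed from predictions, and of why accuracy alone is misleading on this data set, the example below scores a hypothetical classifier that labels every transaction as legitimate. The function names are illustrative and are not taken from the authors' implementation; only the counts 492 and 284807 come from the text.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), with 1 = fraudulent and 0 = legitimate."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy_precision_recall(y_true, y_pred):
    """Compute the three benchmark metrics from label lists."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Convention chosen here: report 0.0 when a ratio is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


N_TOTAL, N_FRAUD = 284807, 492  # counts reported for the data set

y_true = [1] * N_FRAUD + [0] * (N_TOTAL - N_FRAUD)
y_always_legit = [0] * N_TOTAL  # hypothetical trivial classifier

acc, prec, rec = accuracy_precision_recall(y_true, y_always_legit)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

The trivial classifier reaches an accuracy above 99.8% while detecting no fraud at all (recall of 0), which is why recall and precision are reported alongside accuracy and why the class imbalance is addressed before training.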