1st Babylon International Conference on Information Technology and Science 2021 (BICITS 2021), Babil, Iraq

Classification of Spam Emails using Deep learning


2021 1st Babylon International Conference on Information Technology and Science (BICITS) | 978-1-6654-1573-6/21/$31.00 ©2021 IEEE | DOI: 10.1109/BICITS51482.2021.9509909

Nuha H. Marza
Department of Computer Science, Science College for Women
University of Babylon, Babylon, Iraq

Mehdi E. Manaa
Department of Information Networks, College of Information Technology
University of Babylon, Babylon, Iraq

Hussein A. Lafta
Department of Computer Science, Science College for Women
University of Babylon, Babylon, Iraq

Abstract: The Internet has become an integral part of modern life, and one of its most critical aspects is collaboration. Email is a communication tool that can be used for both personal and professional purposes, and every day a wide range of people use it to connect globally. Spam messages are not requested by the addressee and are therefore often regarded as unwanted bulk emails. Currently, Spam emails are produced in large numbers, and this volume alone causes real frustration for both Internet users and providers. For instance, Spam degrades user analysis data, encourages the spread of network viruses, increases the load on network traffic, consumes mail server storage, wastes time and network bandwidth, and makes legitimate emails harder to find among the Spam. It is therefore necessary to prevent the spread of Spam. Since several data mining techniques are beneficial in preserving security, they can also be of use in classifying Spam email. In the present work, the Min-hash technique is combined with the Deep Neural Network (DNN) algorithm to classify emails into Spam and Ham. The results indicate that a remarkably high accuracy rate (98%) is obtained by using this combination, which means that it is an effective method to be adopted and further developed in the field of Spam detection and classification.

Keywords: Spam emails, Data Mining, Classification, Deep Learning, Data Security, Min-hash.

I. INTRODUCTION
Network security is the process whereby a safe environment is provided for computers, users, and programs in order to perform their essential functions. It is achieved by taking both physical and software protective measures to deter unauthorized access, misuse, malfunction, modification, deterioration, or improper disclosure of the underlying network infrastructure. At the heart of information security there are three key goals to be achieved, namely ensuring confidentiality, integrity, and availability [1]. Fig. 1 illustrates the information security triad of CIA (Confidentiality, Integrity, and Availability).

Fig. 1. The Confidentiality, Integrity and Availability triad for Information Security

Email has now become one of the easiest and cheapest means of contact. Its popularity, however, has also increased the volume of Spam emails over the past few years. Data mining classification algorithms are used to categorize an email as Spam or non-Spam (Ham). Email is an effective form of online communication because it saves resources and tends to minimize communication time, making it a popular means of both private and technical communication. Corporate email also supports simple transmission of data and files worldwide. On other occasions, the emails that users send are affected by numerous attacks, which can be active or passive. Emails are often received from unknown sources, and some of them contain meaningless information that is not important or relevant to the recipient. Spam mail is a well-known way of transmitting unnecessary or bulk data to a list of particular or random email addresses. Spam mail can thus be defined as a subset of Internet Spam whereby the same or a similar message is delivered by email to many recipients [2]. Spam emails may also include malware in the form of scripts or other executable files that can harm the user's system. Most Spam mailing lists are built by thoroughly searching Usenet and stealing Internet mailing lists. Such emails meet the three key criteria below:
1) Anonymity: the sender's address and name are concealed.
2) Bulk mailing: messages are delivered to a very large group of recipients.
3) Unsolicited: receivers do not request the email.
The aim of this paper is to use the Min-hash technique combined with Deep Neural Networks (DNN) to identify Spam communications and eventually prevent the issues that Spam
emails create. The proposed method is built and implemented through the use of data analysis.

II. RELATED WORKS
According to a survey of the literature, various works have previously been published that demonstrate the architectures, techniques, and algorithms with which deep learning is used to classify email. Nevertheless, the Min-hash technique has not commonly been used for email classification; typically only a one-dimensional Convolutional Neural Network (1D CNN) was used to classify emails into Ham and Spam.

DNNs have been widely used in many fields since the development of Deep Learning (DL), which eventually highlighted the critical security issues related to DNNs. Many studies have examined the security of DNNs and identified several vulnerabilities by proposing a number of attack methods [3], including black-box attacks [4] and white-box attacks [5].

The study presented in [6] makes use of a Deep Neural Network classifier, which is one of the DL architectures, to classify a dataset. The authors coupled the classifier with the Discrete Wavelet Transform (DWT), a strong feature extraction method, and principal component analysis (PCA). Their results were very good across all performance measures.

In [7], the authors classified emails into Spam and not-Spam (Ham) by analyzing the whole content (i.e., both image and text) and processing it through independent classifiers using Convolutional Neural Networks. They proposed two hybrid multi-modal architectures by fusing the image and text classifiers. This study is close to our research.

A comprehensive survey is presented in [8], wherein the authors shed light on a broad range of text classification algorithms, such as the Support Vector Machine, Decision Tree, and Rule-based Classifiers.

The authors in [9] present the preliminary results of their study, in which deep learning is used in legal document analysis. Their experiments were performed on four datasets from actual legal matters, and the deep learning findings were compared with the results obtained using an SVM algorithm. Their outcomes revealed that the Convolutional Neural Network (CNN) performs well when trained on a larger dataset, and it is therefore a suitable tool for text classification in the legal industry.

In the studies and experiments presented in [10] [11] [12] [13], the authors of each work proposed novel strategies for email Spam identification based on SVM and feature extraction. Applying the proposed strategies to test data sets resulted in high accuracy rates of about 98% in all cases.

III. OVERVIEW OF THE DEEP LEARNING
Deep Learning (DL) can be defined as a sub-field of machine learning that focuses on learning several levels of representation. It builds a hierarchy of features in which higher levels are defined from lower levels, and the same lower-level features can be used to describe several higher-level features [6] [14], as illustrated in Fig. 2.

Fig. 2. Machine learning and Deep Learning are subfields of Artificial Intelligence.

Different DL architectures exist. A popular type is the Convolutional Neural Network (CNN), which has often been used recently because of its ability to perform complicated operations through the use of convolution filters [6] [15]. A standard CNN structure consists of a sequence of feedforward layers that apply convolutional filters and pooling, followed by several fully-connected layers that convert the previous layers' 2D feature maps into 1D vectors for classification [15] [16].

DNNs are also a type of DL architecture and have been used successfully for classification and regression in a variety of fields. In this type of network, the information flows from the input layer to the output layer through several hidden layers (more than two), as in a typical feed-forward network [17].

Fig. 3 shows a standard DNN architecture, which includes an input layer with neurons for the input features, an output layer with neurons for the output classes, and hidden layers. Fig. 4 illustrates the neural node in detail.

Fig. 3. Structure diagram of deep neural network
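As a minimal sketch of the feed-forward computation described above, the following Python fragment propagates one input vector through a fully connected network in which every node applies an activation function to the weighted sum of its inputs plus a bias. The hidden-layer widths (12, 24, 48, 24, 12) are the ones reported in this paper. The four-value input, the single output node, the ReLU and sigmoid activations, the random initial weights, and all function names are assumptions made for illustration only and are not taken from the authors' implementation.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (random, untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    # Numerically stable form, so that large |z| cannot overflow math.exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(x, layers):
    """Propagate x through the layers: ReLU on hidden layers, sigmoid on the output."""
    activation = x
    for index, (weights, biases) in enumerate(layers):
        act_fn = sigmoid if index == len(layers) - 1 else relu
        # Each node: weighted sum of its inputs plus a bias, then the activation.
        activation = [
            act_fn(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


rng = random.Random(0)
# Assumed layout: 4 inputs, the paper's five hidden layers, 1 output node.
sizes = [4, 12, 24, 48, 24, 12, 1]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
spam_score = forward([1.0, 2.0, 1.0, 0.0], layers)[0]  # value between 0 and 1
```

Because the weights are random and untrained, `spam_score` carries no meaning here; the sketch only shows how data flows from the input layer through the hidden layers to the output.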
Fig. 4. Neural node in Deep Learning.

The DL paradigm extends the standard NN by incorporating multiple hidden layers between the input and output layers, in order to model more complex and non-linear relations. Because of its outstanding results on a number of problems, this approach has attracted the interest of researchers in recent years [6]. Throughout this work, a system with five hidden layers was proposed: 12 nodes in the first hidden layer, 24 in the second, 48 in the third, and 24 and 12 in the fourth and fifth layers, respectively. 32 batches were set up for training.

IV. DATA MINING
Data mining processes are statistical in nature. Various procedures are adopted in data mining and associated with different data mining techniques. In the present work, the Min-hash technique is adopted to gather and obtain relevant information. Classification, in turn, is a statistical model that assigns the data to predefined, fixed groups [18].

A. Min-hash Technique
Min-hash is mostly used in applications that process massive amounts of data. One of Google's techniques [19], for example, uses it for text similarity. The major stages involved are k-shingling, the Min-hash function, and the signature matrix.

B. Hashing shingles (k-shingles)
In document similarity, the term k-shingle is often used. In this technique, documents are split into a number of tokens depending on the length k [20]. Suppose that a text contains the string "This email is Spam and not legitimate". If the value k=3 is used, then the total number of generated tokens is (n-k+1), where n is the total number of words in the document and k is the shingle length. For this string, n=7 and k=3 give 5 shingles. In the present study, the text files are divided into shingles of length k. Table I shows in detail how the shingles are produced when the k value of the k-shingle equals 3.

TABLE I. CHARACTERISTIC MATRIX BASED ON K-SHINGLES (K=3)
Shingle           E1   E2
This email is     1    1
email is Spam     1    1
is Spam and       1    0
Spam and is       0    1
and is not        1    1
:                 :    :
Shingle m         m    m

The algorithm that implements the major steps of k-shingle hashing for emails, where the input is the emails (E1, E2, …, En) and the output is the characteristic matrix (M), is as follows:
1) Preprocess the email text by:
a) removing the punctuation.
b) adjusting the white space.
2) Choose the number k.
3) Split each email into groups of words based on k.
4) Hash the set of shingles (shingling).
5) Find the existing tokens in the emails.
6) Generate the characteristic matrix.

C. Min-hash Functions (for each shingle)
The Min-hash method takes tokenized text and transforms it into a collection of hash integers, after which it finds the lowest value (minimum). Equation (1) gives the general form of the Min-hash function [21].

$$h(x) = (ax + b) \bmod P \tag{1}$$

where a and b are two random values, x is the hash value of the token, and P is a prime number (greater than the maximum value of x) [20].

In this research, the Min-hash method is used to generate the matrix, and the shingle codes are generated through the use of a single hash function. Table II shows the result of applying the hash functions to the k-shingle characteristic matrix.

TABLE II. VALUES OF CHARACTERISTIC MATRIX WITH MIN-HASH
Shingle (hashed)   E1   E2   H1   H2   H3   H4
998816769          1    1    5    2    3    9
351110407          1    1    8    10   10   0
316870923          1    0    1    6    2    6
1976815438         0    1    9    2    7    9
339473466          1    1    6    9    1    1
:                  :    :    :    :    :    :
m                  m    m    m    m    m    m

The algorithm that implements the main steps of Min-hash hashing on emails, where the input is the characteristic matrix M and the hash functions (h1, h2, h3, …, hn) and the output is the signature matrix (S), is as follows:
1) Pick n random hash functions h1, h2, h3, …, hn.
2) Construct the signature matrix S from the characteristic matrix M, where each row (i) is a hash function and each column (c) is an email.
3) Set SIG(i,c) as the signature matrix element for the i-th hash function and column c.
4) Convert the long bit vector into short signatures.
5) For every column c, do the following for each row r:
a) if c has 0 in row r, do nothing.
b) if c has 1 in row r, then for each i = 1, 2, …, n set SIG(i,c) to the smaller of the current value of SIG(i,c) and hi(r).
Then Pr[hπ(c1) = hπ(c2)] = sim(c1, c2); that is, the probability that two columns receive the same Min-hash value equals their Jaccard similarity.

D. Min-hash Signatures
After finding the similarities in the emails, as well as the Min-hash for all emails, the Min-hash signatures can be obtained by constructing the signature matrix, considering each row in the order in which it appears. SIG(i, c) is the signature matrix element for the i-th hash function and column c. Initially, SIG(i, c) is set to ∞ for all i and c. Row r is then processed in the following manner:
1) Calculate h1(r) to hn(r) for the shingle in row r.
2) Do the following for each column c:
a) If c has 0 in row r, do nothing.
b) If c has 1 in row r, then for each i = 1, 2, ..., n set SIG(i, c) to the smaller of the current value of SIG(i, c) and hi(r) [21], as shown in Table III.

TABLE III. SIGNATURE MATRIX FOR ALL THE EMAILS
Hash   Email 1   Email 2
#1     1         5
#2     2         2
#3     1         1
#4     0         0

V. THE PROPOSED METHODOLOGY
The proposed methodology consists of the following steps:
1) Data set: The dataset used in this study can be found on Kaggle, which is a machine learning database. The "Spam filter" dataset contains 5725 instances, with two columns for the class and the email string. Fig. 5 shows a sample of the dataset.
2) Data cleaning: Data cleaning is an essential aspect of data science. Working with skewed data will cause a large number of problems. This process entails breaking the text down into words and dealing with punctuation and case. In this work, unnecessary values are excluded, stop words, symbols, and punctuation marks are eliminated, and data types are converted.
3) Calculating hashing shingles (k-shingles): K-shingles of length k=3 were used in this work. The characteristic matrix and the signature matrix are used to generate a dense matrix from a large sparse matrix. The characteristic matrix was made dense using h=4 Min-hash functions. The Min-hash values were used to build a signature matrix. These values are fed to the deep neural network as input vectors.
4) Calculating hash crc32 (shingles): After the k-shingle stage, the crc32 hash function is used. The Min-hash algorithm is then applied to the hashed k-shingle tokens.
5) The data is split into training and testing sets.
6) The number of hidden layers, the number of nodes in each layer, the number of training batches, and the number of training data for each batch are set for a DNN with several hidden layers.
7) The DNN Spam classifier is used to classify the checked emails, so as to decide whether or not they are considered to be Spam. The DNN classifier is created by training the DNN, and the output of this stage is the optimum weights.
8) The obtained email classification results are compared with the real labels to assess the accuracy of the algorithm. The result is given as Spam or Ham.
The essential stages of the proposed system are illustrated in Fig. 6.
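The shingling and Min-hash stages described above can be sketched in Python as follows. This is an illustrative reading of the steps, not the authors' code: the function names, the lowercasing in `preprocess`, the choice of the prime `PRIME`, and the coefficients passed to `min_hash` are assumptions. The integers that `zlib.crc32` returns depend on the exact preprocessing and encoding, so they are not expected to reproduce the hashed shingle values listed in Table II. The last part feeds the rows of Table II to the update rule of Section IV.D and recovers the signature matrix of Table III.

```python
import string
import zlib

# A prime above the 32-bit CRC32 range (assumed choice for P in equation (1)).
PRIME = 4294967311


def preprocess(text):
    """Lowercase, remove punctuation, and collapse runs of white space."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def k_shingles(text, k=3):
    """Return the n - k + 1 word-level shingles of the preprocessed text."""
    words = preprocess(text).split()
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def crc32_ids(shingles):
    """Hash each shingle to a 32-bit integer with CRC32."""
    return [zlib.crc32(s.encode("utf-8")) for s in shingles]


def min_hash(ids, a, b, prime=PRIME):
    """Apply h(x) = (a*x + b) mod P to every shingle id and keep the minimum."""
    return min((a * x + b) % prime for x in ids)


def signature_matrix(rows, n_hashes, n_emails):
    """Build SIG from rows of (membership bits per email, hash values h1..hn)."""
    sig = [[float("inf")] * n_emails for _ in range(n_hashes)]
    for bits, hashes in rows:
        for c, bit in enumerate(bits):
            if bit == 0:
                continue  # email c does not contain this shingle
            for i, h in enumerate(hashes):
                sig[i][c] = min(sig[i][c], h)
    return sig


shingles = k_shingles("This email is Spam and not legitimate")  # 7 - 3 + 1 = 5 shingles
ids = crc32_ids(shingles)
one_value = min_hash(ids, a=3, b=7)  # a and b are arbitrary example coefficients

# Rows of Table II: ((E1, E2), (H1, H2, H3, H4)).
table_ii = [
    ((1, 1), (5, 2, 3, 9)),
    ((1, 1), (8, 10, 10, 0)),
    ((1, 0), (1, 6, 2, 6)),
    ((0, 1), (9, 2, 7, 9)),
    ((1, 1), (6, 9, 1, 1)),
]
sig = signature_matrix(table_ii, n_hashes=4, n_emails=2)
# sig == [[1, 5], [2, 2], [1, 1], [0, 0]], the values of Table III
```

Each column of `sig` is the short signature of one email. In the proposed system these columns are the input vectors of the DNN.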

Fig. 5. Sample of the data set used in this work.

Fig. 6. The main steps of the proposed system. [Flowchart: Input data set (Spam emails) → Data cleaning → Min-hash technique (calculate k-shingles; calculate hash crc32(shingles); calculate characteristic matrix and Min-hash; calculate signature matrix) → Split data (training data, test data) → Set the number of hidden layers, the number of nodes on each layer, the number of training batches, and the number of training data for each batch in a DNN with several hidden layers → Use the DNN classifier to classify the Spam emails (DNN classifier, optimum weights) → Prediction step → Spam email / Ham email]

VI. RESULT AND DISCUSSION
The results of the experiment conducted in this study revealed that the DNN with the Min-hash function outperformed the DNN used on its own, particularly in terms of email classification accuracy. It was therefore observed that combining Min-hash with the DNN had a positive effect on classification.

Throughout this work, a system with five hidden layers was proposed: 12 nodes in the first hidden layer, 24 in the second, 48 in the third, and 24 and 12 in the fourth and fifth layers, respectively. 32 batches were set up for training. The Python environment was used to carry out this work, and the data was split 70% to 30% for training and testing, respectively. Previous authors did not use Min-hash with a DNN but used a DNN only. The accuracy is defined as follows [22]:

$$\text{Accuracy} = \frac{\text{number of correctly identified emails}}{\text{total number of emails}} \tag{2}$$

or

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

where TP, TN, FP, and FN represent the True Positive, True Negative, False Positive, and False Negative counts, respectively [23]. The Recall is defined as follows:

$$\text{Recall} = \frac{TP}{TP + FN} \tag{4}$$

Precision is defined as follows:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{5}$$

Table IV clearly shows that Min-hash with a deep neural network achieved high accuracy compared with the authors in [6] and [7], who obtained accuracy below 97%.

The main contribution of this work is using the hash to form the sparse characteristic matrix and then using Min-hash to generate a signature matrix with dense dimensions. The values of this signature matrix are used as the input to the deep neural network to obtain high accuracy for Ham and Spam emails.

TABLE IV. COMPARISON OF COMPUTATIONAL RESULTS (ACCURACY, RECALL AND PRECISION) FOR DNN & MIN-HASH AND DNN
Method           Accuracy                Recall                  Precision
Min-hash + DNN   Proposed system = 98%   Proposed system = 95%   Proposed system = 88%
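Equations (3) to (5) can be evaluated directly from the confusion-matrix counts, as in the short Python sketch below. The function names and the toy label lists are hypothetical and serve only to illustrate the formulas. They are not the experimental data behind Table IV.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count TP, TN, FP, FN, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, recall, precision) following equations (3), (4), and (5)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision


# Hypothetical labels for five emails (illustration only).
y_true = ["spam", "ham", "spam", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "ham", "spam"]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy, recall, precision = classification_metrics(tp, tn, fp, fn)
```

In this toy case one Spam email is missed (FN = 1) and no Ham email is flagged (FP = 0), so recall falls below precision. The guards against zero denominators are a design choice of the sketch and are not part of the definitions.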
VII. CONCLUSION
The Min-hash technique had not previously been used for email classification, as emails were classified into Ham and Spam using a 1D CNN. This paper presented an email classification algorithm in which data mining is used to design and execute an effective method to differentiate and classify email into Spam and Ham. Neural networks have a lot to offer the computing community. Their ability to learn allows them to be very adaptable and powerful. Furthermore, there is no need to understand the internal mechanics of the task. Because of their parallel architecture, they are also well suited to real-time systems, owing to their fast response and computation times. The training and test data generated by the Min-hash technique were fed into the DNN algorithm, which uses several hidden layers and generates the NN classifier through training. Compared to alternative works, the proposed method was observed to be relatively more effective, as the accuracy rate obtained was remarkably high (98%). The findings show that the signature matrix is well suited to this task, as it supports speed, confidentiality, and integrity. The accuracy performance measure achieved a high score.

ACKNOWLEDGMENT
The authors would like to thank the University of Babylon, College of IT, for supporting this paper.

REFERENCES
[1] D. Coss and S. Samonas, “The CIA Strikes Back: Redefining Confidentiality, Integrity and Availability in Security,” Journal of Information System Security, vol. 10, no. 3, pp. 21–45, 2014.
[2] T. Sultana, K. A. Sapnaz, F. Sana, and N. Mrs. Jamedar, “email-based-spam-detection.pdf,” Int. J. Eng. Res. Technol., vol. 9, no. 06, June, 2020.
[3] M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar, “Can machine learning be secure?,” Proc. 2006 ACM Symp. Information, Comput. Commun. Secur. ASIACCS ’06, vol. 2006, pp. 16–25, 2006, doi: 10.1145/1128817.1128824.
[4] M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar, “Can machine learning be secure?,” Proc. 2006 ACM Symp. Information, Comput. Commun. Secur. ASIACCS ’06, vol. 2006, no. March, pp. 16–25, 2006, doi: 10.1145/1128817.1128824.
[5] B. Liang, M. Su, W. You, W. Shi, and G. Yang, “Cracking classifiers for evasion: A case study on the google’s phishing pages filter,” 25th Int. World Wide Web Conf. WWW 2016, pp. 345–356, 2016, doi: 10.1145/2872427.2883060.
[6] H. Mohsen, E.-S. A. El-Dahshan, E.-S. M. El-Horbaty, and A.-B. M. Salem, “Classification using deep learning neural networks for brain tumors,” Future Computing and Informatics Journal, vol. 3, no. 1, pp. 68–71, 2018, doi: 10.1016/j.fcij.2017.12.001.
[7] S. Shikhar and S. Biswas, “Multimodal Spam Classification Using Deep Learning Techniques,” 2017 13th Int. Conf. Signal-Image Technol. Internet-Based Syst., vol. 978-1–5386, pp. 346–349, 2017.
[8] C. C. Aggarwal and C. X. Zhai, “A Survey of Text Classification Algorithms,” Elsevier, vol. 9781461432, pp. 163–222, 2013.
[9] F. Wei, H. Qin, S. Ye, and H. Zhao, “Empirical Study of deep learning for text classification in legal document review,” arXiv, 2019.
[10] B. K. Dedeturk and B. Akay, “Spam filtering using a logistic regression model trained by an artificial bee colony algorithm,” Applied Soft Computing Journal, vol. 91, 2020, doi: 10.1016/j.asoc.2020.106229.
[11] M. A. Hassan and N. Mtetwa, “Feature Extraction and Classification of Spam Emails,” 5th International Conference on Soft Computing and Machine Intelligence, ISCMI 2018, pp. 93–98, 2018, doi: 10.1109/ISCMI.2018.8703222.
[12] S. Sumathi and G. K. Pugalendhi, “Cognition based spam mail text analysis using combined approach of deep neural network classifier and random forest,” J. Ambient Intell. Humaniz. Comput., vol. 0123456789, 2020, doi: 10.1007/s12652-020-02087-8.
[13] D. K. Dewangan and P. Gupta, “Email Spam Classification Using Support Vector.pdf,” International Journal for Research in Applied Science & Engineering Technology (IJRASET), vol. 6, no. VI, June 2018. Available at www.ijraset.com.
[14] G. Litjens et al., “A survey on deep learning in medical image analysis,” Med. Image Anal., vol. 42, pp. 60–88, 2017, doi: 10.1016/j.media.2017.07.005.
[15] Y. Pan et al., “Brain tumor grading based on Neural Networks and Convolutional Neural Networks,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, vol. 2015-Novem, pp. 699–702, 2015, doi: 10.1109/EMBC.2015.7318458.
[16] “Diving Deep into Deep Learning: History, Evolution, Types and Applications,” International Journal of Innovative Technology and Exploring Engineering, vol. 9, no. 3, pp. 2835–2846, 2020, doi: 10.35940/ijitee.a4865.019320.
[17] A. Anuse and V. Vyas, “A novel training algorithm for convolutional neural network,” Complex Intell. Syst., vol. 2, no. 3, pp. 221–234, 2016, doi: 10.1007/s40747-016-0024-6.
[18] M. Awad and M. Foqaha, “Email Spam Classification Using Hybrid Approach of Rbf Neural Network and Particle Swarm Optimization,” International Journal of Network Security & Its Applications, vol. 8, no. 5, pp. 19–38, 2016, doi: 10.5121/ijnsa.2016.8402.
[19] A. S. Das, M. Datar, A. Garg, and S. Rajaram, “Google news personalization: Scalable online collaborative filtering,” 16th International World Wide Web Conference, WWW2007, pp. 271–280, 2007, doi: 10.1145/1242572.1242610.
[20] J. Leskovec, A. Rajaraman, and J. D. Ullman, “Mining of Massive Datasets,” Mining of Massive Datasets, 2020, doi: 10.1017/9781108684163.
[21] J. Leskovec, A. Rajaraman, and J. D. Ullman, “Mining of Massive Datasets,” Mining of Massive Datasets, 2014, doi: 10.1017/cbo9781139924801.
[22] M. Majumder, “Email Classification Using Artificial Neural Network,” pp. 49–54, 2015, doi: 10.1007/978-981-4560-73-3_3.
[23] M. Rout, J. K. Rout, and H. Das, “Correction to: Nature Inspired Computing for Data Science,” vol. SCI 871, 2020, pp. C1–C1.