Multiclass Classification of DGA Based Malware Using NLP
The binary experiment is designed to answer the ML question of separating legitimate fully
qualified domain names (FQDNs) from malicious algorithmically generated domains (AGDs),
treating all malware families as a single category. The multiclass experiment goes beyond
the binary experiment: it not only identifies legitimate FQDNs but also sorts malware
samples according to their families (Mattia Zago et al., 2019).
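The difference between the two experiments comes down to how the same samples are labeled. The following is a minimal sketch, not taken from Zago et al.; the domain strings, the "legit" marker, and the function names are hypothetical, and the family names are borrowed from later in this section purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical samples: (domain, family). "legit" marks a legitimate FQDN.
SAMPLES: List[Tuple[str, str]] = [
    ("example.com", "legit"),
    ("placeholder-one.com", "matsnu"),
    ("placeholder-two.net", "suppobox"),
    ("placeholder-three.net", "suppobox"),
]


def binary_labels(samples: List[Tuple[str, str]]) -> List[int]:
    """Binary setting: every malware family collapses into class 1."""
    return [0 if family == "legit" else 1 for _, family in samples]


def multiclass_labels(
    samples: List[Tuple[str, str]],
) -> Tuple[List[int], Dict[str, int]]:
    """Multiclass setting: each family (and 'legit') gets its own class id."""
    families = sorted({family for _, family in samples})
    index = {family: i for i, family in enumerate(families)}
    return [index[family] for _, family in samples], index


if __name__ == "__main__":
    print(binary_labels(SAMPLES))
    print(multiclass_labels(SAMPLES))
```

The binary labels discard family information, so a model trained on them cannot answer which family produced a domain; the multiclass labels keep it.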
According to Peck et al., machine learning models that attempt DGA classification based only
on the domain name itself, such as the ones considered in their paper, might not be sufficient
to detect a DGA like CharBot. The result highlights the need for ML models that exploit
additional context features such as the IP addresses that the domains are mapped to, or
temporal access patterns (e.g. how often the domain was requested, and when) [3], [16]–[18],
as was done successfully for dictionary DGAs [10] (Jonathan Peck et al., 2019).
Research questions
Which feature reduction strategy optimally approximates the data? Preliminary results using
nonlinear feature reduction techniques seem promising. Candidate features: character features,
Unicode features, bag-of-words model, n-grams.
Can we increase the performance of multiclass classification by balancing our data?
Can we detect all malware families?
Can the encoding technique improve performance?
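To make the character n-gram encoding mentioned in the questions above concrete, here is a minimal sketch of a bag-of-character-n-grams encoder. It is an illustration under stated assumptions, not the encoding this research settles on: the function names, the default n = 2, and the choice to ignore n-grams missing from the vocabulary are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def char_ngrams(domain: str, n: int = 2) -> List[str]:
    """Return the overlapping character n-grams of a lowercased domain."""
    text = domain.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_vocab(domains: Iterable[str], n: int = 2) -> Dict[str, int]:
    """Map every n-gram seen in the training domains to a column index."""
    grams = sorted({g for d in domains for g in char_ngrams(d, n)})
    return {g: i for i, g in enumerate(grams)}


def encode(domain: str, vocab: Dict[str, int], n: int = 2) -> List[int]:
    """Encode a domain as a vector of n-gram counts over the vocabulary.

    N-grams that are not in the vocabulary are ignored (an assumption).
    """
    vector = [0] * len(vocab)
    for gram, count in Counter(char_ngrams(domain, n)).items():
        if gram in vocab:
            vector[vocab[gram]] = count
    return vector


if __name__ == "__main__":
    vocab = build_vocab(["abab"])
    print(vocab)
    print(encode("abab", vocab))
```

The resulting vectors grow with the vocabulary size, which is one reason a feature reduction step is a natural follow-up question.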
Objectives
To study and analyze the properties of each malware family.
To apply a deep learning solution to multiclass classification.
To analyze the statistical properties of the malicious domains of a specific family.
In the same year, Daniel S. Berman proposed a 1D application of capsule networks to DGA detection.
The work used CapsNet, CNN, and LSTM models to detect different types of DGA malware []. The
experiment failed to detect some families, such as vawtrak, Vidro, Sphinx, corebot, virut, and
cryptowall.
The greatest weakness of all the models tested is their deficiency in detecting real word-based
DGAs. In some cases, these real word-based DGAs use a limited dictionary to
generate domain names and change that dictionary after some time. This manifests in three
ways. The first is that when the model is trained on data from that DGA, time is not taken into
account and the model fails to detect the malicious domain names, as is the case for matsnu
and gozi. The second is when the model can only detect the malicious domain names when it is
trained on data from that DGA, regardless of time, but fails to detect it otherwise, as is the case
for unknowndropper. Finally, there are models that initially perform well but after time passes,
performance significantly declines because of a change in the DGA generator, as is the case
with pizd and suppobox. Developing a model capable of detecting malicious domains in all
three of these situations is critical, and all models tested here fail to do so []
(Ryan R. Curtin et al., 2019).
Jonathan Peck et al. presented a novel DGA called CharBot, which is capable of producing large
numbers of unregistered domain names. In their experiments, state-of-the-art classifiers for
real-time DGA detection performed very poorly against it, including the recently published
methods FANCI (a random forest based on human-engineered features) and LSTM.MI (a deep
learning approach). They highlighted a dangerous weakness of modern DGA classifiers, namely
their vulnerability to extremely simple attacks that make no use of sophisticated machine
learning techniques.
Yanchen Qiao et al. proposed a DGA domain name classification method based on Long Short-Term
Memory (LSTM) with an attention mechanism []. They used the character sequence of the domain name as
the feature, but because the dataset was imbalanced, the method performed poorly on 10 of the 18
malware classes.
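One common way to counter the kind of imbalance described above is to weight each class inversely to its frequency in the loss function. The sketch below is not the method of Qiao et al.; it uses the widely known "balanced" heuristic, weight = N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the number of samples in class c. The function name and the toy labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return a weight per class, inversely proportional to its frequency.

    Uses weight_c = N / (K * n_c); rare classes get weights above 1,
    frequent classes get weights below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    # Toy, made-up label distribution: one frequent and one rare family.
    toy_labels = ["frequent_family"] * 6 + ["rare_family"] * 2
    print(balanced_class_weights(toy_labels))
```

Weights like these can be passed to a weighted loss so that errors on rare families cost more than errors on frequent ones. Whether this actually improves multiclass performance is exactly what the balancing research question asks.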
Xiaochun Yun et al. proposed Khaos, a novel DGA with high anti-detection ability based on neural
language models and the Wasserstein Generative Adversarial Network (WGAN). The experimental results
show that Khaos outperforms the nine other DGAs it was compared with on every detection index of the
detection approaches tested; the other DGAs were detected with poor performance.
Much research has been done on classifying DGA-based malware, but it has been unsuccessful in
identifying some malware families. Current detection systems still struggle to categorize domains
according to their malware family because of the nature of the reduction and classification algorithms
used, inappropriate context, and unbalanced data (Duc Tran, ). This research aims to fill this gap in
multiclass classification by using different encoding techniques in deep learning.