
Received 14 June 2023, accepted 30 June 2023, date of publication 10 July 2023, date of current version 14 July 2023.

Digital Object Identifier 10.1109/ACCESS.2023.3293526

Darknet Traffic Analysis: Investigating the Impact of Modified Tor Traffic on Onion Service Traffic Classification
ISHAN KARUNANAYAKE 1,3 (Graduate Student Member, IEEE), NADEEM AHMED 1,3, ROBERT MALANEY 1,3 (Senior Member, IEEE), RAFIQUL ISLAM 2,3, AND SANJAY K. JHA 1,3 (Senior Member, IEEE)
1 Institute for Cybersecurity (IFCYBER), University of New South Wales (UNSW), Sydney, NSW 2052, Australia
2 School of Computing and Mathematics, Charles Sturt University, Albury, NSW 2640, Australia
3 Cyber Security Cooperative Research Centre, Joondalup, WA 6027, Australia

Corresponding author: Ishan Karunanayake ([email protected])


This work has been supported by the Cyber Security Research Centre Limited (CSCRC) whose activities are partially funded by the
Australian Government’s Cooperative Research Centres Programme.

ABSTRACT Classifying network traffic is important for traffic shaping and monitoring. In the last two decades, with the emergence of privacy concerns, the importance of privacy-preserving technologies has risen. The Tor network, which provides anonymity to its users and supports anonymous services known as Onion Services, is a popular way to achieve online anonymity. However, this anonymity (especially with Onion Services) is frequently misused, encouraging governments and law enforcement agencies to attempt to de-anonymise these services. Therefore, in this paper, we investigate the classifiability of Onion Service traffic, focusing on three main contributions. First, we try to distinguish Onion Service traffic from other Tor traffic. The techniques we use can identify Onion Service traffic with >99% accuracy. However, there are several modifications that can be made to Tor traffic to obfuscate its information leakage. In our second contribution, we evaluate how our techniques perform when such modifications have been applied to the Tor traffic. Our experimental results show that these conditions make Onion Service traffic less distinguishable (in some cases, the accuracy drops by more than 15%). In our final contribution, we identify the most influential feature combinations for our classification problem and evaluate their impact.

INDEX TERMS Traffic classification, machine learning, Onion Services, Tor, anonymity, feature selection.

(The associate editor coordinating the review of this manuscript and approving it for publication was Nazar Zaki.)

I. INTRODUCTION
Tor [1] is an anonymity network that hides the identity of its users by routing their traffic through multiple intermediary nodes. Tor also supports the provision of anonymous services, known as Onion Services (also known as hidden services), with .onion as the top-level domain name. Tor's ability to act as a censorship circumvention tool has encouraged security experts, network defenders, and law enforcement agencies to identify Tor traffic from other encrypted and non-encrypted traffic [2], [3]. For example, [3], [4] tried to classify Tor traffic from non-Tor traffic, [2], [5] tried to classify the application types in Tor traffic, and [6] tried to classify Tor traffic from other anonymity network traffic such as I2P traffic and Webmix traffic. However, in this work, we intend to explore the distinguishability of Onion Service traffic from standard Tor traffic using traffic analysis. We formulate three research questions to act as a foundation for our work.
First, we try to answer the question, RQ1: Is it possible to classify Onion Service traffic from other standard Tor traffic? A standard Tor circuit that is created to visit a web service on the Internet via Tor consists of three Tor nodes. An Onion Service circuit, which is the only way to access an Onion Service, consists of six Tor nodes. As the traffic in both these circuits (standard Tor and Onion Service) is encrypted, we assume that we can use the information leaked from the

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 11, 2023 70011

metadata (e.g. direction, timestamps, packet size) to identify unique patterns that can distinguish them. Onion Services have been used to host illegal websites, and more recently, they have been used as Command and Control (C&C) servers for botnets [7], [8]. Therefore, governments and law enforcement agencies want to track and shut down such services and regulate Onion Service traffic [9]. Even businesses might find it useful to restrict access to such websites in order to protect their systems from potential bad actors (e.g. hackers) and attacks. As a result, having techniques for identifying Onion Service traffic can be useful for two main reasons: 1. Such techniques can act as a stepping stone for fingerprinting of Onion Services. 2. They can be useful to restrict Onion Service traffic in sensitive and confidential systems.
Second, we try to investigate the same problem under different settings. Specifically, we try to investigate RQ2: How do our results for RQ1 hold when we use modified Tor traffic? There are certain techniques that can be implemented in Tor to change its traffic patterns. Introducing padding [10], using dummy bursts and delays [11], and splitting the traffic [12] are a few examples of such techniques. These techniques¹ have been developed with the intention of obfuscating the information leakage of Tor traffic. The main importance of answering RQ2 is that we can confirm whether our findings from RQ1 will hold true as and when such modifications are introduced to the Tor traffic. If we are still able to distinguish Onion Service traffic, it is an indication that these modifications, if they are realised in the future, are not effective in masking Onion Service traffic. If the modifications do affect Onion Service classifiability, it opens up questions about the validity of prior works, such as [3] and [6], in a setting with those modifications implemented. As outcomes of RQ2 can open up further research avenues on Tor traffic classification, we argue RQ2 is worth evaluating.
In order to investigate RQ1 and RQ2, we employ passive network analysis, in which we utilise traces of network traffic captured at points between the client and the Tor entry node. We first extract fifty features from each traffic trace, which creates a unique fingerprint of that trace. A traffic trace refers to a set of consecutive packets transmitted between the client and the entry node in a given duration. We use three machine learning classifiers that have shown promising results in network traffic classification in the past and evaluate how they work in our scenarios.
As our third research question, we investigate RQ3: What features impact Onion Service traffic classification most, and what level of performance do they provide? We considered two factors when crafting the fifty features used in this work. (i) We mentioned that standard Tor traffic passes through three Tor nodes, while Onion Service traffic passes through six. Intuitively, this difference should lead to major differences in latency. Therefore, we focused on features that are based on timing statistics. (ii) Also, we use features that have a proven track record of working well in revealing patterns in network traffic [13]. However, we use three feature selection techniques to infer which features have a better relationship with the traffic types used in our work and conduct experiments to evaluate the classifier performance with different feature combinations.
Overall, we make the following specific contributions in this work.
1) First, we try to classify standard Tor traffic and Onion Service traffic. We evaluate the applicability of three supervised machine learning algorithms, namely K-Nearest Neighbor, Random Forest, and Support Vector Machines, for our evaluations. We also extract fifty different features, which are given as input to the machine learning classifiers. Our results show that most of these algorithms can identify Onion Service traffic from other Tor traffic with a high degree of accuracy (given that we use effective features and sufficient samples). We find Random Forests can predict the results more than 100 times faster than the other techniques, making it an ideal candidate for real-world online implementations.
2) Second, we further evaluate the identifiability of Onion Service traffic when certain modifications (e.g. padding, traffic splitting) are introduced to Tor traffic. We use Tor traffic generated with two techniques, WTFPAD [10] and TrafficSliver [12], in our experiments. Our experiments show that the accuracy values we obtained for RQ1 drop considerably (in some cases, even by more than 15%) when these modifications are introduced. This provides a strong indication that such modifications can affect the results obtained in prior Tor traffic classification works (listed in Table 1).
3) Then, we use different feature selection metrics (Information Gain, Pearson Correlation, and Fisher Score) to select different feature combinations that have the most effect on the classification and investigate the performance of the classifiers. Our results provide insights into the importance of different features and suggest that the higher performance of the classifiers is, in fact, highly dependent on the selection of features.
The rest of the paper is organised as follows. In Section II, we provide background and related work for Tor traffic classification. We describe our dataset and features in Section III. In Section IV, we explain the experimentation process with the results. In Section V, we discuss several insights we obtain from our work along with potential future directions. We conclude in Section VI.

¹ These techniques are commonly identified as Website Fingerprinting defences. Website Fingerprinting refers to a passive de-anonymisation attack executed on Tor users, where an adversary tries to identify a user's online activity. More information is provided in Section II.

II. BACKGROUND AND RELATED WORK
In this section, we provide background information on the Tor network and Onion Services, which is useful in understanding our work, and present the related work on Tor traffic classification.




FIGURE 1. Connections to a normal web service and an Onion Service via Tor.

A. TOR NETWORK AND ONION SERVICES
The Tor network is comprised of multiple relays, as shown in Figure 1. When a user connects to a destination service, their traffic is encrypted and routed through a selected set of intermediary relays. In the case where a user connects to a normal web service through Tor, the Tor circuit generally consists of three relays, known as the entry, middle, and exit. Although this setup adds some latency to the communication, it provides strong anonymity to the user. An entity that has the ability to monitor the traffic at any point of the Tor circuit is unable to link a user with their destination (e.g. finding their IP addresses). However, in this scenario, the web service is not anonymous. Its IP address or domain name is public knowledge. Tor has introduced Onion Services to overcome this problem. An Onion Service is simply a normal web service that can only be accessed via the Tor network. A user accessing an Onion Service does not know the actual IP address (hence the location) of the Onion Service. When a user connects to an Onion Service, the circuit generally consists of six relays and is created as follows. The Onion Service first selects a few random relays from the Tor network as its Introduction Points. Then it advertises a service descriptor containing the addresses of the introduction points and the Onion Service's public key in another Tor node called the Hidden Service Directory (HSDir). Next, the Onion Service operator advertises the .onion address of the Onion Service to potential users, normally via other Onion Services, blogs, and social media. A user requires a Tor client (a small piece of software that can manage Tor-related operations) to search for this Onion Service. The Tor client retrieves the service descriptor of the Onion Service from the relevant HSDir. The Tor client then selects a random Tor relay known as the Rendezvous Point (RP), establishes a circuit to it, and sends a message to the Onion Service via the introduction points. This message contains the address of the RP and a one-time cookie. Finally, the Onion Service initiates a Tor connection to the RP and completes the circuit [1].

B. MODIFIED TOR TRAFFIC
As mentioned previously, Website Fingerprinting is a passive de-anonymisation attack that can be executed with minimal resources. In a conventional Website Fingerprinting attack, the attacker trains a machine learning classifier to identify the websites visited by a user. Tor traffic collected between the Tor client and the entry node is used to extract features for the classifier. Once the attacker has an effective model, they can use it to determine the websites a Tor user is visiting by intercepting the traffic and feeding its fingerprint to the model. The success of this attack depends on the information leakage of Tor traffic (usually inferred using metadata, as the payload is encrypted). Therefore, several defences, such as WTFPAD [10] and TrafficSliver [12], have been proposed to obfuscate the information leakage in Tor traffic.
• WTFPAD: This defence uses adaptive padding and tries to conceal traffic bursts and other traffic features. Here, adaptive padding refers to padding when the channel is not being used. That way, there will be fake traffic during channel idle times, affecting the formation of unique patterns.
• TrafficSliver: This technique uses a traffic splitting and multipathing strategy. It means that the network traffic is split and sent over multiple entry nodes. As this technique does not add any additional packets or delays, it is more efficient than other defences.
In this paper, we aim to understand how these modifications affect the classifiability of Tor traffic.

C. MACHINE LEARNING
Machine learning is a major sub-area of artificial intelligence and is widely adopted for applications such as network traffic classification and malware detection. In general, when using a machine learning system for encrypted traffic classification tasks, it is first necessary to collect traffic traces under relevant (and realistic) conditions. Then, important features such as packet sizes, inter-arrival times, and the number of packets are extracted from these traffic traces. After obtaining this data, it is important to process and clean the data before evaluation through machine learning algorithms. Checking for missing values, removing duplicates, and handling categorical values are some steps used in the data pre-processing stage. Once there is a processed dataset, it is split into two main parts: training and testing. The next step is to use a suitable machine learning classifier on the training dataset and build a model. Finally, the model is evaluated on the testing dataset using metrics such as accuracy, precision, and recall (these metrics will be detailed in Section IV). Moreover, it can be useful to investigate the performance of a model in terms of training and prediction times. There are two main types of problems handled by machine learning: regression and classification. In this work, we focus on the classification capabilities of machine learning and use three traditional machine learning algorithms in our experiments. We use the term traditional to describe machine learning algorithms that do not use Artificial Neural Networks (ANNs). While these traditional machine learning algorithms need more human intervention than ANNs, especially with preparing data and tuning hyper-parameters, they have their advantages. For example, traditional machine learning algorithms can generally work well with less data than ANNs, and their results are much more interpretable.
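The generic workflow just described (split a processed dataset, train a classifier, evaluate it on held-out data) can be sketched with scikit-learn, the library used later in Section IV. This is an illustrative sketch only: the feature matrix below is synthetic random data standing in for extracted traffic features, not the paper's experimental code.

```python
# Minimal sketch of the generic traffic-classification workflow: split a
# (here synthetic) feature matrix 80:20, train a classifier, and report
# accuracy, precision, and recall on the held-out portion.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_traces, n_features = 400, 50          # fifty features per trace, as in the paper
X = rng.normal(size=(n_traces, n_features))
y = rng.integers(0, 2, size=n_traces)   # 0 = TOR, 1 = OS (labels are illustrative)
X[y == 1, :10] += 1.0                   # shift one class so the toy data is learnable

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

acc = accuracy_score(y_test, pred)
prec = precision_score(y_test, pred)
rec = recall_score(y_test, pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

On real traces, the synthetic matrix would be replaced by the fifty statistical features of Section III, and the classifier choice and hyper-parameters would be tuned as described in Section IV.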




• K-Nearest Neighbour (KNN): This algorithm selects the K (an integer) nearest data points for a new data point and classifies the new data point into the most common category of those K nearest points.
• Random Forest (RF): RF builds multiple decision trees by using randomly selected features and combining them together. A decision tree refers to a model that learns from training data while developing decision rules for different outcomes. During the prediction phase, these rules help to determine the target outcome for a given input. Having multiple decision trees enables RF to provide more accurate predictions.
• Support Vector Machines (SVM): SVM tries to find the best hyperplane that can separate multiple classes, i.e., the hyperplane that divides the classes while having the largest distances between the hyperplane and the nearest elements of each class.

D. RELATED WORK
Prior works have tried to address other questions related to classifying Tor traffic, for example, classifying Tor traffic from non-Tor traffic and classifying Tor traffic from other anonymity networks' traffic. Different works in the literature have employed different machine learning-based classification methods with different sets of features. We describe these works in this section.
Bai et al. [14] attempted to identify Tor traffic and Webmix traffic [15] by using a stepwise matching technique. They extracted different types of fingerprints, including information in the packet header, specific strings in the packets, and some statistical features such as packet length and frequency. Their approach could identify Tor traffic with an accuracy of 95.98%.
AlSabah et al. [2] tried to classify Tor traffic into the type of application being used. They considered interactive web browsing and bulk downloading (mainly BitTorrent and streaming applications) as the application types in their work. They identified that bulk downloading takes up a large bandwidth while contributing to a very small percentage of the total connections. The aim of the authors in [2] was to provide different Quality of Service (QoS) to different traffic classes. They used features such as cell Inter-Arrival Times (IAT), circuit lifetime, and the amount of data sent upstream and downstream, and classification algorithms such as Naive Bayes, Bayesian Networks, and Decision Trees. The experiments of [2] show an accuracy of over 95% in classifying the application type of Tor circuits on a live Tor network.
He et al. [5] also tried to determine the application type the encrypted Tor traffic contained. In contrast to AlSabah et al.'s work, He et al. considered the following application types: P2P, Web, File Transfer Protocol (FTP), and Instant Messaging (IM). They used Profile Hidden Markov Models (PHMM) as their classifier and flow-based features such as burst volumes and direction of packets as features. They obtained accuracy figures of up to 92% in their experiments.
Lashkari et al. [3] evaluated how Tor traffic can be distinguished from non-Tor traffic by only using time-based features. They extracted 23 time-based features from the traffic they captured and used machine learning algorithms such as K-Nearest Neighbor, Decision Trees, and Random Forests. Their techniques could classify Tor traffic from non-Tor traffic with a precision and recall of more than 95%. In addition, they tried to classify Tor traffic into different applications, including VoIP, video-streaming, audio-streaming, browsing, chat, FTP, and P2P, and achieved precision and recall values of around 80%.
Deep Learning (DL) is a subset of machine learning that is widely used in applications such as image classification and speech recognition. Generally, algorithms that consist of ANNs fall under this category. Kim and Anpalagan [4] applied Convolutional Neural Networks (CNN) to classify Tor traffic from non-Tor traffic. They used hexadecimal raw packets with a CNN and obtained an overall accuracy of 99.3% in identifying different application types. They used the same dataset as Lashkari et al. [3] and showed that their method outperformed the techniques in [3].
Although Tor is very popular as an anonymity network, it is not the only anonymity system currently in use. Montieri et al. [6] carried out experiments to classify traffic from different anonymity systems, including Tor, I2P [16], and JonDonym (formerly known as Webmix) [15]. They used four sets of features, including flow-based statistics (e.g. flow direction, duration, inter-arrival time statistics), histogram representations of packet lengths and inter-arrival times, and sequences of packets. Bayesian classifiers and tree-based classifiers were used in their work. Their results show an accuracy of 99.87% for classifying traffic belonging to different networks and 73.99% for determining the application type by using flow-based features.

III. DATASET AND FEATURE SELECTION
In this section, we present the details of the datasets we used in our study. We also provide information on the features and feature selection metrics we used.

A. DATASETS
We use two publicly available datasets, which we refer to in this paper as TOR (representing standard Tor traffic) and OS (representing Onion Service traffic), in our experiments. The TOR [17] dataset consists of 95000 traffic traces collected by accessing 95 websites (1000 traces per site) over the Tor network. The OS [18] dataset contains 41503 traces from 539 Onion Services (77 traces per site). Both these datasets have been collected using a similar approach from the real Tor network and do not contain any of the modifications we need for our experiments in RQ2. We also created four new simulated datasets using the techniques proposed in [10] and [12]. We refer to these as 'WTFPAD-TOR', 'WTFPAD-OS', 'TrafficSliver-TOR', and 'TrafficSliver-OS'. We refer




TABLE 1. Summary of related work.

readers to the original papers for more information regarding the data collection process [17], [18] and the simulation of traffic modifications [10], [12].

B. FEATURE EXTRACTION
Each packet's timestamp and direction (incoming or outgoing) are included in the datasets we have available. From that information, we extract fifty statistical features from each traffic trace to be used in our experiments. We selected many features which involve timing statistics, and used insights and features identified from prior works [13]. We then carry out a set of experiments to identify which of those features work best for our research question (see next subsection). The details of these features are provided below, where we assign a number (1-50) to each feature so as to easily identify them in Figure 2.

1) THIRTEEN FEATURES BASED ON INTER-ARRIVAL TIMES (IATS)
IAT is the time difference between two successive packets arriving at a particular point. We extract the maximum (1), mean (2), Standard Deviation (SD) (3), and 75th percentile (4) of the IATs for all incoming packets (from the entry node to the Tor client); the max (5), mean (6), SD (7), and 75th percentile (8) of IATs for outgoing packets (from the Tor client to the entry node); and the max (9), mean (10), SD (11), and 75th percentile (12) for all packets. Finally, we calculate the sum of all the IAT-related features (13).

2) THIRTEEN FEATURES BASED ON THE FLOW DURATION
Here, we calculate features related to the duration of the flow with respect to the starting time of that flow. For example, let us take the 25th percentile duration for incoming packets. To calculate this feature, we first separate all the incoming packets from that flow. Then we find the 25th percentile packet in the incoming packet sequence and calculate the time that packet was received since the start of the flow (14). Similarly, we calculate the 50th percentile (15), 75th percentile (16), and the total duration (17) for incoming packets. Then, we calculate the same features for the outgoing packets (25th (18), 50th (19), 75th (20), total (21)) and finally for all packets (25th (22), 50th (23), 75th (24), total (25)). Finally, we calculate the sum of those duration-related features (26).

3) EIGHT FEATURES BASED ON THE NUMBER OF PACKETS
The number of incoming packets (27), outgoing packets (28), and the total number of packets (29) in a flow were extracted from the dataset. In addition, we calculated the number of incoming (30) and outgoing (31) packets among the first 30 packets, and the same for the last 30 packets (incoming (32), outgoing (33)). Finally, the sum of the values of these latter seven features (34) was calculated.

4) FIVE FEATURES BASED ON THE PACKET CONCENTRATION
When calculating these features, we first divided a trace into segments of 20 successive packets. Then we calculated the number of outgoing packets in each of the segments and recorded the values sequentially. Finally, we calculated the mean (35), min (36), max (37), SD (38), and median (39) of those values.

5) FIVE FEATURES BASED ON THE PACKET FREQUENCY
Defining the packet frequency as the average number of packets transmitted per second, we calculated the mean (40), max (41), min (42), SD (43), and median (44) of the packet frequencies.

6) FOUR FEATURES BASED ON THE PACKET ORDER
These are a set of features where we first separate the incoming and outgoing packets in a flow. Then we number the packets in each group from zero and create two separate lists. For example, if there are 5 incoming packets and 3 outgoing packets in a flow, we number the incoming packets as 0,1,2,3




and 4 and the outgoing packets as 0,1,2. Finally, we calculate strong positive correlation, and 0 indicates no correla-
the mean (45) and the SD (46) of the incoming packet list and tion. When doing feature selection, we want to identify
the outgoing packet list (mean (47), SD (48)). the features with a strong relationship with the class, and
it does not matter whether it is positive or negative as
7) TWO FEATURES BASED ON THE PACKET PERCENTAGE long as that feature improves the classification accuracy.
These last two features simply provides the number of incom- Therefore, when selecting the features, if a feature has a
ing (49) and outgoing packets (50) as a fraction of the total high correlation with the class, that means they have a
number of packets in the flow. stronger relationship and can be considered as a more
important feature. Second, we can only calculate r for
C. FEATURE SELECTION numerical values. When we have categorical variables
We use three metrics that help quantify the importance of such as the ones we use as labels in our dataset, they
a feature: Information gain, Pearson correlation, and Fisher have to be first converted into nominal values. Nominal
Score. More details about these metrics and how we use them values are discrete values that do not have any numerical
are described below. relationships. Therefore, when calculating r, each value
• Information gain (IG) evaluates the importance of a of the nominal variable is considered a separate category,
particular feature by measuring its information gain with and a binary indicator variable is created for each value.
respect to the class [19]. It is calculated as the difference between the entropy of the class before and after splitting the dataset on the feature, as shown by Equation 1:

IG(Class, Feature) = H(Class) − H(Class | Feature),  (1)

where H(Class) is the entropy of the class and H(Class | Feature) is the conditional entropy of the class given the feature. In general, the entropy H(X) of a class X is calculated as shown in Equation 2:

H(X) = − Σ_{i=1}^{N} p_i log₂ p_i,  (2)

where N is the number of classes and p_i is the proportion of instances of the ith class.
In a machine learning context, entropy is used to measure the degree of impurity (i.e., the degree of uncertainty) in a dataset with respect to the class labels. As suggested by Equations 1 and 2, a high information gain indicates that the entropy of the class labels is significantly reduced after splitting the dataset, compared to the original dataset's entropy. It shows that the split has reduced the degree of uncertainty and suggests that the feature is important in distinguishing the different classes of the dataset.

• Pearson's correlation shows how strongly two variables are linearly correlated and is defined as follows:

r = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / ( √(Σ_{i=1}^{n} (x_i − x̄)²) · √(Σ_{i=1}^{n} (y_i − ȳ)²) ),  (3)

where r is Pearson's correlation coefficient, x_i and y_i are the ith data points of the variables x and y, x̄ and ȳ represent the sample means of the same variables, and n is the number of data points in the dataset. When using this coefficient for feature selection, a couple of adjustments are made. For one, the absolute value of the coefficient is considered instead of the exact value; the values of r range from −1 to 1, where −1 indicates a strong negative correlation and 1 indicates a strong positive correlation. In addition, the categorical class labels are encoded as indicator variables. For example, in our case, we have OS and TOR as classes, which act as indicator variables. Therefore, for every data point, the value of each indicator variable is set to 1 if that variable is present and 0 otherwise. Then the individual correlation is calculated for each indicator variable, and finally, the weighted average is taken. The weight is usually proportional to the frequency of the indicator variable in the dataset [19].

• Fisher Score is also a widely used metric in supervised learning, which simply tells how much information a feature can reveal about the class (e.g., OS/TOR). The Fisher Score can be defined as below [20], [21]:

F_k = ( Σ_{i=1}^{c} n_i (x̄_{ki} − x̄_k)² ) / ( Σ_{i=1}^{c} n_i σ_{ki}² ),  (4)

where F_k is the Fisher Score of the kth feature, c is the number of classes, n_i is the number of instances in the ith class, x̄_k is the mean of the kth feature over all classes, and x̄_{ki} and σ_{ki} are the mean and standard deviation of the kth feature in the ith class, respectively. In Equation 4, Σ_{i=1}^{c} n_i (x̄_{ki} − x̄_k)² refers to the between-class scatter of the kth feature, i.e., the variation of the class means around the overall mean. If this is high, it means the classes can easily be separated. Likewise, Σ_{i=1}^{c} n_i σ_{ki}² represents the within-class scatter, i.e., the variation of the data points within each class. If this number is low, it means that the data points of each class are situated close together, again making the classes easy to separate. Overall, a high Fisher Score implies that a feature can be used to easily separate classes and hence can be used to identify the most important features.

IV. ANALYSIS
For all experiments, we divide the dataset 80:20 into training and testing sets. We then do a grid search with 10-fold cross-validation on the training dataset and evaluate the best model (best hyperparameter combination) on the testing dataset. We used the Scikit-learn (https://scikit-learn.org/stable/) library for machine learning and Pandas (https://pandas.pydata.org/) for data analysis and manipulation.
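As a concrete illustration of the three feature-selection metrics above, the following is a minimal NumPy sketch on toy data. The threshold-based split for information gain and the toy TOR/OS labels are our own assumptions for illustration; the paper does not specify its exact implementation.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy H(X) of a label vector (Equation 2)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels, threshold):
    """IG = H(Class) - H(Class | Feature), splitting the feature at a threshold (Equation 1)."""
    left, right = labels[feature <= threshold], labels[feature > threshold]
    h_cond = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - h_cond

def pearson_r(feature, indicator):
    """|Pearson correlation| between a feature and a 0/1 class indicator (Equation 3)."""
    x, y = feature - feature.mean(), indicator - indicator.mean()
    return abs(np.sum(x * y) / (np.sqrt(np.sum(x ** 2)) * np.sqrt(np.sum(y ** 2))))

def fisher_score(feature, labels):
    """Between-class scatter over within-class scatter (Equation 4)."""
    classes = np.unique(labels)
    overall_mean = feature.mean()
    between = sum(np.sum(labels == c) * (feature[labels == c].mean() - overall_mean) ** 2
                  for c in classes)
    within = sum(np.sum(labels == c) * feature[labels == c].var() for c in classes)
    return between / within

# Toy data: a feature that separates the two classes well.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])              # 0 = TOR, 1 = OS
x = np.array([1.0, 2.0, 1.5, 2.5, 8.0, 9.0, 8.5, 9.5])

print(information_gain(x, y, threshold=5.0))  # → 1.0 (a perfect split)
print(pearson_r(x, y))                        # → ≈0.99
print(fisher_score(x, y))                     # → 39.2
```

All three scores are high here because the feature cleanly separates the classes, which is exactly the property the metrics reward.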

70016 VOLUME 11, 2023


I. Karunanayake et al.: Darknet Traffic Analysis: Investigating the Impact of Modified Tor Traffic

TABLE 2. Performance of machine learning classifiers to distinguish Onion Service traffic from other Tor traffic.
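The experimental pipeline described above (an 80:20 split, grid search with 10-fold cross-validation, and the accuracy/precision/recall metrics reported in Table 2) can be sketched with Scikit-learn as follows. The synthetic data and the small parameter grid are placeholders, not the paper's actual fifty-feature dataset or search space.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Placeholder data standing in for the 50-feature TOR/OS traces (imbalanced classes).
X, y = make_classification(n_samples=1000, n_features=50,
                           weights=[0.75, 0.25], random_state=0)

# 80:20 split, stratified to preserve the class imbalance.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

# Grid search with 10-fold cross-validation on the training set only.
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    param_grid={"n_estimators": [50, 100]}, cv=10)
grid.fit(X_tr, y_tr)

# Evaluate the best hyperparameter combination on the held-out test set.
pred = grid.best_estimator_.predict(X_te)
print(accuracy_score(y_te, pred),
      precision_score(y_te, pred),
      recall_score(y_te, pred))
```

Note that `GridSearchCV` refits the best model on the full training set, so the held-out 20% is only ever used for the final evaluation, matching the setup described in Section IV.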

A. CLASSIFYING ONION SERVICE TRAFFIC FROM OTHER STANDARD TOR TRAFFIC
Here, we answer RQ1: Is it possible to classify Onion Service traffic from other standard Tor traffic? To identify the classifiability of Onion Service traffic from other Tor traffic, we used the TOR and OS datasets. This gave us a dataset of 136,503 samples, labelled either as TOR or OS, depending on the original dataset. We used three machine learning algorithms on the dataset, namely, K-Nearest Neighbor (KNN), Random Forests (RF), and Support Vector Machine (SVM), and evaluated their performance. We have provided more information on these algorithms in Section II. Table 2 shows the results we obtained when we evaluated the classifiers with the testing dataset.
For evaluating the machine learning classifiers, we used a few metrics, including accuracy ((TP + TN) / (TP + TN + FP + FN)), precision (TP / (TP + FP)), recall (TP / (TP + FN)), training time, and testing time. Here, TP, TN, FP, and FN refer to true positives, true negatives, false positives, and false negatives, respectively. Generally, if the dataset is balanced (the same number of samples is available for each class), accuracy is enough to provide a good assessment of a classifier's performance. However, as the TOR-OS dataset we use here is imbalanced, we use precision and recall as additional metrics to assess the performance. The reason for this is that precision and recall are not affected by the number of true negatives, which can provide misleading insights in an imbalanced data setting. In Table 2, we show results for classifying Onion Service traffic from other Tor traffic using all fifty features we have extracted.
From Table 2, we can get the answer to our first research question (RQ1) from the experiments we did with the original (no defence) dataset. We can clearly see that Onion Service traffic is highly distinguishable from other Tor traffic. All three classifiers we used performed with 99% accuracy. It could be argued that such good accuracy is not surprising, given the clear differences that are present in the datasets for each class of traffic. Non-machine-learning-based algorithms would likely provide similar outcomes, with the machine learning classifiers we have listed adding improvement at the 10% level (see later discussion). The importance of our machine learning-based approach is perhaps more forthcoming in the next section, where we discuss the same problem under more difficult circumstances.

B. IDENTIFYING TRAFFIC CLASSES WHEN WEBSITE FINGERPRINTING DEFENCES DEPLOYED
Here, we answer RQ2: How do our results for RQ1 hold when we use modified Tor traffic? In order to obtain answers to our second research question, we carried out the next set of experiments. To recall, in RQ2, we intend to find how certain modifications to Tor traffic, produced by state-of-the-art Website Fingerprinting defences, namely, WTFPAD [10] and TrafficSliver [12], affect Onion Service traffic classification. We used the simulated datasets we created using the WTFPAD and TrafficSliver techniques; our experimental process is as follows.
We first combined the WTFPAD-TOR and WTFPAD-OS datasets to create the WTFPAD dataset. As the WTFPAD technique uses a padding mechanism, it does not change the number of samples of the original TOR and OS datasets. Next, we combined the TrafficSliver-TOR and TrafficSliver-OS datasets to create the TrafficSliver dataset. As TrafficSliver uses a traffic-splitting strategy, the number of samples increased in this scenario. We configured TrafficSliver to split a trace into three sub-traces. We also used the Batched Weighted Random splitting strategy, which is the best splitting strategy suggested in [12]. After obtaining the sub-traces, we removed those with a small number of packets (as they could not be used to extract some of our features). As a result, our TrafficSliver dataset consisted of 376,763 traffic traces (271,778 TOR traces and 104,985 OS traces). We ran the same experiments (except for the ones with the least important features) on the WTFPAD and the TrafficSliver datasets. Table 2 shows our results for these experiments. We can obtain several insights by analysing these results in detail.

1) WTFPAD
SVM and RF show a slight reduction (∼0.6%) in accuracy on the WTFPAD dataset, while KNN shows a considerable reduction of more than 6%. Both RF and KNN show a larger reduction in precision and recall compared to SVM. From these results, we can say WTFPAD does reduce the classifier performance to some extent but still allows them to distinguish Onion Service vs standard Tor traffic successfully. This behaviour is intuitive given that WTFPAD applies padding to both Onion Service and standard Tor traffic. Although it may change the number of packets and some




inter-arrival timing statistics, those changes apply to both types of traffic, keeping their differences intact.

2) TRAFFICSLIVER
When we ran our experiments on the TrafficSliver dataset, we observed a significant difference compared to all the previous results we obtained. For KNN, RF, and SVM, the accuracy drops are ∼17%, ∼10%, and ∼8%, respectively. This implies that the TrafficSliver modification is actually capable of reducing the identifiability of Onion Service traffic from standard Tor traffic. However, we can also observe that the recall values stay above 90%. This is a result of the models predicting TOR more often than OS. As TrafficSliver splits the traffic into multiple sub-traces, these sub-traces can have contrasting timestamps to the original TOR-OS dataset. Therefore, we can assume that the split traces for OS and TOR have similar characteristics to a certain extent. In addition, we should mention that this performance reduction occurs despite the number of samples increasing (and that the large increase in training and testing times is a consequence of this high number of samples).

C. FEATURE IMPACTS ON TRAFFIC CLASSIFICATION
Here, we answer RQ3: What features impact Onion Service traffic classification most, and what level of performance do they provide? We can answer the first part of RQ3 by looking at Figure 2. From Figure 2a, we can notice that four features, namely, the mean (35) and median (39) outgoing packet concentrations along with the two packet percentage features (49, 50), have more information gain with the two classes compared to other features (see Section III for more information about the features and the numbers associated with them). Similarly, Figure 2b shows that the above four features, along with the minimum (36) and SD (38) of packet concentrations, have a high correlation with the class labels, while Figure 2c confirms those results using Fisher Scores. These are the top six features that seem to have the greatest relationship with the traffic type. All three of our feature selection metrics filter out six features out of the fifty. Our next step is to investigate the impact of these features on the performance of the classifiers we used to evaluate RQ1 and RQ2.

1) WHAT LEVEL OF PERFORMANCE DO DIFFERENT FEATURES PROVIDE FOR RQ1?
In Table 3, we show the results we obtained for all three classifiers when we only used the top six features we identified (Features 35, 36, 38, 39, 49, and 50). In addition, we have used the least important features with the original (no defence) dataset to obtain further insights into their performance impact. These least important features consist of six inter-arrival time statistics (Features 2, 3, 4, 10, 11, and 12). The main objective of these experiments is to get an overall sense of the performance of the classifiers, regardless of the input features.
We can see some interesting insights into how the classifier performance changes with different (top and bottom) features from Table 3. We can see that the top six features we selected contribute the majority of the classification ability of the classifiers. Almost all classifiers can classify Onion Service traffic with similar performance using only the top six features (all other features seem redundant). However, we can see that the RF classifier relies more on the feature set compared to the other two classifiers. RF's accuracy drops by 9.2% when using the least important features instead of the most important ones. For KNN, this number is only 1.6%, and for SVM, it is 4%. These observations confirm that while the features we have used here play a major role in a classifier's performance, the classifiers themselves have a large impact on the overall result. We can also conclude that using all fifty features is redundant for RQ1.

2) WHAT LEVEL OF PERFORMANCE DO DIFFERENT FEATURES PROVIDE FOR RQ2?
In Table 4, we present the performance of each classifier on the modified traffic with only the top six features. For WTFPAD, when we only use the top six features, we can see a more visible reduction in performance than what we saw earlier with all fifty features. Previously, we noticed that the top six features are sufficient to provide almost 99% accuracy, precision, and recall on the original (no defence) dataset. However, when the WTFPAD modifications are implemented, the top six features are not enough to reach that almost perfect classifier performance. Still, all the classifiers show precision and recall of more than 95%, which is quite good. When it comes to TrafficSliver, we can observe a drastic change in performance when we use only the top six features. Both accuracy and precision values for all classifiers drop below 80% in this scenario, while the recall values stay above 90%. This observation is quite similar to what we observed in Table 2. Overall, we can conclude that the top six features are not sufficient to classify Onion Service traffic from other Tor traffic when the modifications to the traffic are implemented.

D. DETERMINING ONION SERVICE TRAFFIC CHANGES WHEN DEFENCES APPLIED
Next, we combine three datasets, OS, WTFPAD-OS, and TrafficSliver-OS, into a single dataset and label each trace as ORIG (for original), WTFPAD, or TrafficSliver, respectively. This new dataset consists of 187,991 traces. The main reason for doing this experiment is that we want to determine how similar these modified traces are to the original Onion Service traffic. As this dataset has three different classes, we have to carry out a multi-class classification experiment to evaluate it.
Table 5 shows the precision and recall for each individual class when we use all fifty features, and when we use the top six important features we mentioned in Section III. As all classes have a different number of samples (traces), precision and recall are the best metrics for evaluation. From this table, we can get three important insights: (i) Overall, compared to KNN and RF, SVM has a slightly better ability to classify




FIGURE 2. Feature Selection Results for TOR-OS combined dataset. Features 1-13 are based on Inter-Arrival Times. Features 14-26 are based on the
flow duration. Features 27-34 are based on the number of packets. Features 35-39 are based on packet concentration. Features 40-44 are based on
packet frequency. Features 45-48 are based on the packet order. Features 49 and 50 are based on the packet percentage.

TABLE 3. Performance of machine learning classifiers on the original (no defence) dataset with subsets of features.

TABLE 4. Performance of machine learning classifiers with top six features when defences are applied.

FIGURE 3. Confusion Matrix of different classifiers to distinguish various traffic types.
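Confusion matrices such as those summarised in Figure 3 can be produced with Scikit-learn's `confusion_matrix`. The handful of toy predictions below is illustrative only; the class names mirror the three labels used in this experiment.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy predictions over the three classes used in Section IV-D.
classes = ["ORIG", "TrafficSliver", "WTFPAD"]
y_true = np.array(["ORIG", "ORIG", "TrafficSliver", "WTFPAD", "WTFPAD", "TrafficSliver"])
y_pred = np.array(["ORIG", "WTFPAD", "TrafficSliver", "WTFPAD", "ORIG", "TrafficSliver"])

# Rows are true classes, columns are predicted classes (order fixed by `labels`).
cm = confusion_matrix(y_true, y_pred, labels=classes)
print(cm)
```

Off-diagonal counts show which traffic types are confused with which, e.g., how often WTFPAD-modified traces are predicted as ORIG.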

these traffic types in a multi-class setting. (ii) TrafficSliver traffic seems to have very distinctive features from the others, which supports our previous observations in Table 2 and Table 4. (iii) The features play a more important role in this classification task relative to the binary classification experiments we discussed earlier. We can see that the precision and recall values dropped by more than 20% in some cases when only the top six features were used.
We also provide the confusion matrices for the three classifiers (with all fifty features) in Figure 3, which help us to get a better understanding of the similarities between the different traffic types. Out of 37,599 testing samples, there are 8,346 original OS samples, 20,961 TrafficSliver samples, and 8,292 WTFPAD samples. By observing Figure 3, we can see that all traffic types are largely identified. This suggests that when modifications are used, Onion Service traffic




TABLE 5. Distinguishability of Onion Service traffic with different modifications.

FIGURE 4. TOR-OS experiment with the six least important features and different number of samples per class.

changes significantly from its original version. Another important observation from Figure 3 is that WTFPAD-modified traffic is more frequently misclassified as original Tor traffic compared to TrafficSliver-modified traffic, confirming our earlier observations and conclusions.

V. DISCUSSION
A. INSIGHTS
Earlier, we mentioned that traffic flows through six nodes in an Onion Service circuit, while in a standard Tor circuit, this number is three. This can cause a difference in packet latency, which is captured by the feature set we use. Intuitively, it is the time-related features that should have the higher classification ability among the features we used. However, the top six features in our experiments are not directly related to time, but to the number of packets. This shows that although the outcome of our experiments is as expected, the reasons behind it are not a direct consequence of our initial intuition. However, we cannot disregard that intuition completely, as the six least important features, all consisting of IATs, also provide significant accuracy, precision, and recall when it comes to identifying Onion Service traffic.
In addition, we argue that there are three main reasons for the good results we report for RQ1. These are the features we have selected, the number of samples in the datasets we have used, and the optimized classifiers we use. In order to support this argument, we conducted a supplementary experiment in which we vary the number of samples we use with the least important features for the TOR-OS dataset. Figure 4 shows the results of this experiment. Here, we can clearly observe that if we did this experiment with subpar features, we would not have obtained 95% accuracy even with 10,000 samples per class.

B. RUN TIMES
It is perhaps important that we discuss the impact of training and testing times (refer to Table 2). When it comes to training time, we have carried out our experiments in an offline setting, i.e., we are not training the classifier in real time. If our techniques were to be used in a real-world application, the training would likewise be carried out offline, so the training time may not have much of an impact in a real-world setting. However, if the dataset is extremely large and the system requires frequent training, it will be useful to employ a technique with a low training overhead. KNN shows the best training time figures for our problem, while SVM shows the worst.
When we consider the testing time, which is representative of the time taken to predict the class of a previously unseen traffic trace, we argue that it is vital to have a low value for it. To further elaborate, let us assume a system designed to alert on Onion Service activity inside a company. Such a system has to monitor all the network traffic going out of the company and flag any suspicious traces. If the prediction for a single traffic capture takes considerable time, that system will either require lots of resources (e.g., computing power, memory) or go into overload. Out of the classifiers we have used, RF shows the best testing times, which in some cases are more than 100 times better than those of the other classifiers. Therefore, RF would definitely be the ideal candidate to be used in a real-time application. We can also see another interesting observation among the testing times: KNN has higher testing times than training times, which is quite striking in Table 2. The reason for this is that the computational complexity of KNN predictions is a function of the number of samples and the number of features. That is why, when we reduce the number of features, the testing time drops drastically. This result shows why it is important to find top-performing features for our classification problem.

C. IMPACT
As we previously mentioned, traffic classification is useful for traffic shaping and traffic monitoring (linking user activity and blocking anonymous traffic). In our case, as we are trying to classify anonymous traffic, it is more relevant to traffic monitoring. As Tor is a censorship circumvention tool, some governments try to restrict access to Tor. If Tor traffic (and Onion Service traffic, for that matter) can be identified easily, then it is easy to implement techniques to restrict that traffic. Also, if there are institutions and businesses that deal with sensitive information and want to restrict access




TABLE 6. Hyperparameter Tuning: Optimum values used for experiments related to RQ1 and RQ2.

to the dark web² from their computers but do not require restricting overall Tor access, having techniques to isolate Onion Service traffic can be a great starting point to cater for that requirement. Furthermore, the results from our second research question shed some light on the validity of this work when modifications are introduced to the Tor traffic. Our new results suggest that the results of prior works on Tor traffic classification might be invalid ([3], [6]) in the event these modifications are applied to the Tor traffic. Finally, our results highlight that it is important to consider the overall classifiability of Tor traffic when developing (Website Fingerprinting) defences to mask information leakage in Tor traffic.

² Dark web here simply refers to Onion Services.

D. SELECTION OF CLASSIFIERS AND FEATURES
We used insights from existing literature when selecting our classifiers and features. For example, Lashkari et al. [3] used KNN and RF in their work to classify Tor traffic from non-Tor traffic. In addition, most of our features have been previously used by Hayes and Danezis [13] for a Website Fingerprinting attack. By evaluating the related literature, we decided to use these classifiers and features, which ultimately gave good results for our specific problem. We also had to carry out experiments to identify the best hyperparameters for our models. In Table 6, we have mentioned some of the hyperparameters we selected. It should be noted that we tested other additional parameters, but the best values we obtained for them turned out to be the default values used in the Scikit-learn library. Therefore, we have not mentioned them here.

E. SCALABILITY
In this study, we used a relatively large dataset consisting of 136,503 data samples for the first experiment and 513,266 samples for the second. In Figure 4, we showed that our results actually get better with more samples. Therefore, we can argue that our techniques are scalable.

F. FUTURE WORK
Now that we have established that Onion Service traffic is distinguishable, and that the modifications can reduce this distinguishability, we mention potential future work associated with this line of research. First, the impact of similar modifications on some of the related work could be investigated. For example, although Lashkari et al. [3] show that Tor traffic can be easily differentiated from non-Tor traffic, we argue that the performance of their classifiers may be reduced significantly if modifications such as those in WTFPAD or TrafficSliver are installed in Tor. This argument applies to all the other related work mentioned in Table 1. Second, an evaluation of how these modifications affect the fingerprintability of individual Onion Services could be attempted. These modifications were originally proposed as Website Fingerprinting defences, and the original papers carried out experiments to show that these changes to Tor traffic do reduce the fingerprintability of websites. However, none of these latter works attempts to investigate the impact those changes have on the fingerprintability of Onion Services. This issue may be worth further investigation.

VI. CONCLUSION
In this work, we answered three research questions focused on Onion Service traffic classification. We evaluated the applicability of supervised machine learning models to classify Onion Service traffic from other Tor traffic. We extracted fifty features from each traffic trace and used that feature set as input to the machine learning classifiers. Our results showed that KNN, RF, and SVM classifiers have the ability to distinguish Onion Service traffic from other Tor traffic with 99% accuracy. Then, we tried to identify whether state-of-the-art Website Fingerprinting defences affect the classifiability of Tor traffic. These defences introduce different modifications to try and obfuscate information leakage from traffic, and we evaluated how those changes affect Onion Service traffic classification. Our experiments showed that these defences reduce the performance of the above classifiers, combined with our feature set, for Onion Service traffic classification. However, we observed that the modified Tor traffic is still distinguishable. Moreover, we used three feature selection metrics, namely, information gain, Pearson's correlation, and Fisher Score, to identify the top features for this task. Those top features were able to provide >98% success in classifying Onion Service traffic from other Tor traffic. However, they could not provide such good results when modified Tor traffic traces were used.

REFERENCES
[1] R. Dingledine, N. Mathewson, and P. Syverson, "Tor: The second-generation onion router," in Proc. 13th USENIX Secur. Symp. (SSYM), San Diego, CA, USA, Aug. 2004, pp. 303–320.
[2] M. Al Sabah, K. Bauer, and I. Goldberg, "Enhancing Tor's performance using real-time traffic classification," in Proc. ACM Conf. Comput. Commun. Secur. (CCS), New York, NY, USA, Oct. 2012, pp. 73–84.
[3] A. H. Lashkari, G. D. Gil, M. S. I. Mamun, and A. A. Ghorbani, "Characterization of Tor traffic using time based features," in Proc. 3rd Int. Conf. Inf. Syst. Secur. Privacy (ICISSP), Porto, Portugal, Feb. 2017, pp. 253–262.
[4] M. Kim and A. Anpalagan, "Tor traffic classification from raw packet header using convolutional neural network," in Proc. 1st IEEE Int. Conf. Knowl. Innov. Invention (ICKII), Jeju Island, South Korea, Jul. 2018, pp. 187–190.
[5] G. He, M. Yang, J. Luo, and X. Gu, "Inferring application type information from Tor encrypted traffic," in Proc. 2nd Int. Conf. Adv. Cloud Big Data (CBD), Washington, DC, USA, Nov. 2014, pp. 220–227.
[6] A. Montieri, D. Ciuonzo, G. Aceto, and A. Pescapé, "Anonymity services Tor, I2P, JonDonym: Classifying in the dark (web)," IEEE Trans. Dependable Secure Comput., vol. 17, no. 3, pp. 662–675, May 2020.




[7] (May 2017). WCry Ransomware Analysis. Accessed: Apr. 26, 2023. [Online]. Available: https://www.secureworks.com/research/wcry-ransomware-analysis
[8] (Jul. 2019). Keeping a Hidden Identity: Mirai C&Cs in Tor Network. Accessed: Apr. 26, 2023. [Online]. Available: https://blog.trendmicro.com/trendlabs-security-intelligence/keeping-a-hidden-identity-mirai-ccs-in-tor-network/
[9] (Nov. 2014). Global Action Against Dark Markets on Tor Network. Accessed: Aug. 4, 2020. [Online]. Available: https://www.europol.europa.eu/newsroom/news/global-action-against-dark-markets-tor-network
[10] M. Juarez, M. Imani, M. Perry, C. Diaz, and M. Wright, "Toward an efficient website fingerprinting defense," in Proc. 21st Eur. Symp. Res. Comput. Secur. (ESORICS), Heraklion, Greece, Sep. 2016, pp. 27–46.
[11] T. Wang and I. Goldberg, "Walkie-talkie: An efficient defense against passive website fingerprinting attacks," in Proc. 26th USENIX Secur. Symp. (SEC), Vancouver, BC, Canada, Aug. 2017, pp. 1375–1390.
[12] W. De la Cadena, A. Mitseva, J. Hiller, J. Pennekamp, S. Reuter, J. Filter, T. Engel, K. Wehrle, and A. Panchenko, "TrafficSliver: Fighting website fingerprinting attacks with traffic splitting," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur. (CCS), New York, NY, USA, Nov. 2020, pp. 1971–1985.
[13] J. Hayes and G. Danezis, "k-fingerprinting: A robust scalable website fingerprinting technique," in Proc. 25th USENIX Conf. Secur. Symp. (SEC), Austin, TX, USA, Aug. 2016, pp. 1187–1203.
[14] X. Bai, Y. Zhang, and X. Niu, "Traffic identification of Tor and web-mix," in Proc. 8th Int. Conf. Intell. Syst. Design Appl. (ISDA), Kaohsiung, Taiwan, vol. 1, Nov. 2008, pp. 548–551.
[15] O. Berthold, H. Federrath, and S. Köpsell, "Web MIXes: A system for anonymous and unobservable Internet access," in Proc. Int. Workshop Design Issues Anonymity Unobservability (Lecture Notes in Computer Science, vol. 2009), H. Federrath, Ed., Berkeley, CA, USA, Jul. 2000, pp. 115–129.
[16] B. Zantout and R. Haraty, "I2P data communication system," in Proc. 10th Int. Conf. Netw. (ICN), Sint Maarten, The Netherlands, Jan. 2011, pp. 401–409.
[17] P. Sirinam, M. Imani, M. Juarez, and M. Wright, "Deep fingerprinting: Undermining website fingerprinting defenses with deep learning," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur. (CCS), Toronto, ON, Canada, Oct. 2018, pp. 1928–1943.
[18] R. Overdorf, M. Juárez, G. Acar, R. Greenstadt, and C. Díaz, "How unique is your .onion?: An analysis of the fingerprintability of Tor onion services," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur. (CCS), Dallas, TX, USA, Oct. 2017, pp. 2021–2036.
[19] I. H. Witten, E. Frank, and M. A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed. San Francisco, CA, USA: Morgan Kaufmann, 2011.
[20] X. He, D. Cai, and P. Niyogi, "Laplacian score for feature selection," in Proc. Adv. Neural Inf. Process. Syst. (NIPS), Vancouver, BC, Canada, Dec. 2005, pp. 507–514.
[21] M. Gan and L. Zhang, "Iteratively local Fisher score for feature selection," Appl. Intell., vol. 51, pp. 6167–6181, Aug. 2021.

ISHAN KARUNANAYAKE (Graduate Student Member, IEEE) received the B.Sc. degree in electronic and telecommunication engineering from the University of Moratuwa, Sri Lanka, in 2018. He is currently pursuing the Ph.D. degree with the School of Computer Science and Engineering, University of New South Wales (UNSW), Australia. He is affiliated with the Cyber Security Cooperative Research Centre (CSCRC), Australia. His research interests include network security, malware analysis, anonymity and privacy, and machine learning.

NADEEM AHMED received the M.S. and Ph.D. degrees in computer science and engineering from the University of New South Wales, Sydney, NSW, Australia, in 2000 and 2007, respectively. He was the Head of the Computing Department, School of Electrical Engineering and Computer Science, National University of Sciences and Technology, Pakistan. He is currently a Senior Research Fellow with the Cyber Security Cooperative Research Centre (CSCRC), Australia. His research interests include cyber security, the IoT, wireless sensor networks, and software-defined networking.

ROBERT MALANEY (Senior Member, IEEE) received the B.Sc. degree in physics from the University of Glasgow and the Ph.D. degree in physics from the University of St Andrews, Scotland. He has previously held research positions with Caltech, the University of California, Berkeley, CA, USA, and the University of Toronto. He was formerly a Principal Research Scientist with CSIRO. He is currently a Professor with the School of Electrical Engineering and Telecommunications, University of New South Wales, Australia. He has more than 200 publications.

RAFIQUL ISLAM is currently an Associate Professor with the School of Computing and Mathematics, Charles Sturt University, Australia. He has led the Cybersecurity Research Team and has developed a strong background in leadership, sustainability, and collaborative research in the area. He has a strong publication record and has published more than 170 peer-reviewed research papers, book chapters, and books. He has a strong research background in cybersecurity with a specific focus on malware analysis and classification, authentication, security in the cloud, privacy in social media, the IoT, and the dark web. He is an Associate Editor of the International Journal of Computers and Applications and a Guest Editor of various reputed journals.

SANJAY K. JHA (Senior Member, IEEE) is currently a Full Professor with the School of Computer Science and Engineering, University of New South Wales (UNSW), Australia, where he leads the Cybersecurity Cooperative Research Centre. He is a Chief Scientist and the Research Director of the UNSW Institute for Cybersecurity (IFCYBER). He is the principal author of the book Engineering Internet QoS. His research interests include network security, wireless mesh and sensor networks, and the Internet of Things. His editorial affiliations include the IEEE TRANSACTIONS ON MOBILE COMPUTING and the IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING. He is a Co-Editor of the book Wireless Sensor Networks: A Systems Perspective.

