Detecting Distributed Denial of Service Attacks Using Source IP Address Monitoring
$\Delta_1 = \Delta_2 = \dots = \Delta_n$, which means the time slots are of equal length. The choice of $\Delta_n$ is a compromise between making $\Delta_n$ small so that the detection engine can quickly detect an attack, and making $\Delta_n$ large so that the detection engine has less computation load because it checks the traffic less often.
Let $T_n$ represent the set of unique IP addresses and $D_n$ represent the items of the IP Address Database (IAD) at the end of the time interval $\Delta_n$ ($n = 1, 2, 3, \dots$). As we discussed before, $|T_n - T_n \cap D_n|$, which represents the number of new IP addresses in $\Delta_n$, can be used to detect the DDoS attack. However, $|T_n - T_n \cap D_n|$ varies according to the position of the network traffic monitoring point (NTMP) and different $\Delta_n$. We can normalize this value by defining $X_n = \frac{|T_n - T_n \cap D_n|}{|T_n|}$, which will not be affected by the NTMP and $\Delta_n$. Consequently, we use $X_n$ for our detection mechanism.
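The normalized feature can be illustrated with a few lines of Python. This is a minimal sketch, assuming the addresses of one interval are available as a list of strings and the IAD as a set; the function name, the sample addresses, and the handling of an empty interval are assumptions made for illustration, not part of the scheme described here.

```python
def new_address_fraction(interval_sources, iad):
    """Return X_n = |T_n - (T_n ∩ D_n)| / |T_n| for one measurement interval."""
    t_n = set(interval_sources)          # T_n: unique source addresses seen
    if not t_n:
        return 0.0                       # assumption: an idle interval counts as "no new addresses"
    new_addresses = t_n - set(iad)       # T_n minus its overlap with D_n
    return len(new_addresses) / len(t_n)


# Hypothetical data: three known addresses, one previously unseen source.
iad = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
interval_sources = ["10.0.0.1", "10.0.0.2", "10.0.0.2", "192.0.2.7"]
x_n = new_address_fraction(interval_sources, iad)
print(f"X_n = {x_n:.3f}")  # one new address out of three unique sources
```

Because the count of new addresses is divided by the number of unique addresses in the same interval, the value stays in [0, 1] regardless of the link speed or the interval length.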
C. Implementation of Our Source IP Address Monitoring
(SIM) Scheme
1) System Architecture: Figure 4 provides an overview of our SIM scheme. The SIM scheme consists of three parts: detection engine, decision engine, and filtering engine. The detection engine analyzes the incoming traffic pattern to detect any abnormalities. The decision engine summarizes the results from the detection engine and decides whether an attack is occurring. The filtering engine filters the attack traffic according to the identified attack traffic pattern. Note that there are two detection engines. The first detection engine is used to detect non-distributed attacks from a single source, while the second detection engine is used to detect highly distributed denial of service attacks.

Fig. 5. The placement of the SIM scheme

There are two steps in the detection engines. First, the detection engine sorts the incoming IP flows according to source IP addresses, and identifies whether there is an IP flow with an unusually large number of packets. If there is, we activate the filtering engine to block this abnormal IP flow. This step is very effective for defending against some naive DoS attacks launched from a single source or a small number of sources. The second step is the core technology of our SIM scheme, which is shown in the shaded part of Figure 4. This step is designed to defend against sophisticated DDoS attacks and is described in detail in the following sections. As we can see from Figure 4, the detection engine monitors the traffic through a passive (read-only) interface which is pre-configured with a non-routable IP address. This implementation feature makes the detection engine immune to attacks since it is invisible to the attacker. When no attack is detected in the detection engine, a control signal is sent to the edge router (footnote 1) to stop the filtering engine.
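The first step (grouping packets by source address and flagging a flow with an unusually large number of packets) might be sketched as follows. The text does not say how "unusually large" is decided, so the share-of-traffic rule, the `max_share` and `min_packets` parameters, and the sample addresses are all hypothetical.

```python
from collections import Counter


def heavy_sources(packet_sources, max_share=0.5, min_packets=100):
    """Return source addresses whose share of the interval's packets exceeds max_share.

    max_share and min_packets are illustrative thresholds, not values from the paper.
    """
    counts = Counter(packet_sources)     # packets per source IP address
    total = sum(counts.values())
    if total < min_packets:              # too little traffic to judge
        return []
    return sorted(src for src, c in counts.items() if c / total > max_share)


# Hypothetical interval: one source sends 900 of 1000 packets.
packets = ["198.51.100.9"] * 900 + [f"10.0.0.{i}" for i in range(100)]
suspects = heavy_sources(packets)
print(suspects)  # addresses that would be handed to the filtering engine
```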
2) Placement of the Detection Mechanism: Wang et al. [30] discussed how attack detection can be performed at either the first-mile or last-mile edge routers. Our Source IP address Monitoring (SIM) scheme can be installed at either the first-mile or the last-mile edge router, or both. However, each edge router can be both the first-mile and last-mile router, depending on the direction of traffic flows between the local network and the Internet, as shown in Figure 5. For the packets going out of the local network, the edge router is their first-mile router. On the other hand, for the incoming packets into the local network, the edge router is their last-mile router. Thus we can deploy the SIM on both the inbound and outbound interfaces of the edge router.

Footnote 1: We use the term edge router to refer to the router that provides access to the Internet for the victim's subnetwork that we are defending.

The first-mile SIM of the edge router plays the primary role in detecting a flooding attack, due mainly to its proximity to the sources of the flooding attack. However, the detection sensitivity may decline as the size of the attack group increases. In a large-scale DDoS attack, the flooding sources can be orchestrated so that individual attack traffic flows cause only an insignificant deviation from the normal traffic pattern.
In contrast, the last-mile SIM can quickly detect the attacks as all of the flooding traffic is aggregated at the last-mile router. Although it cannot provide any hint about the flooding sources, a filtering engine, such as History-based IP Filtering [24], can be triggered to protect the victim. To bring down the victim under protection, the flooding sources have to significantly increase their flooding rates. However, this increased flooding traffic makes it easier to detect the flooding attack and its sources at the first-mile routers.
V. ABRUPT CHANGE DETECTION
In order to detect a DDoS attack, we need to be able to detect changes in our detection feature over time. However, our detection feature is a random variable due to the stochastic nature of Internet traffic. Consequently, before describing the proposed flooding detection mechanism, we discuss the theoretical background of our detection algorithm.
A. Change Detection Modelling
Internet traffic can be viewed as a complex stochastic model and any traffic abnormality, for example, an HDDoS attack, can lead to an abrupt change of the model. Our goal is to detect the change in the number of new IP addresses. There are two approaches to detect this change. One is fixed-size batch detection, which monitors the change of mean value every fixed time period. Another is sequential change-point detection, which monitors the variables successively. The latter is designed to detect a change in the model as soon as possible after its occurrence, which meets the key design requirement for our detection engine. Thus, we can model our task as a sequential change point detection problem. Consider the illustrative example in Figure 6. For the random sequence $\{X_n\}$, there is a step change of the mean value at $m$ from $\alpha$ to $\alpha + h$. We require an algorithm to detect changes of at least step size $h$ and estimate $m$ in a sequential manner so that the detection delay and false positive rate are both minimized. The random sequence $\{X_n\}$ can be formalized as follows:

$$X_n = \alpha + \xi_n I(n < m) + (h + \eta_n) I(n \ge m), \tag{5}$$

where $\xi = \{\xi_n\}_{n=1}^{\infty}$ and $\eta = \{\eta_n\}_{n=1}^{\infty}$ are random sequences such that $E(\xi_n) = E(\eta_n) \equiv 0$, and $h \ne 0$. $I(H)$ is the indicator function: it equals 1 when the condition $H$ is satisfied and 0 otherwise.

Fig. 6. The CUSUM algorithm (top panel: $X_n$; middle panel: $Z_n$; bottom panel: $y_n$ with threshold $N$)

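To make the step-change model of Eq. 5 concrete, the following sketch generates a synthetic sequence whose mean moves from $\alpha$ to $\alpha + h$ at index $m$. The bounded uniform noise is an assumption chosen only because it has zero mean; the paper does not specify a noise distribution, and the parameter values are illustrative.

```python
import random


def step_change_sequence(alpha, h, m, length, noise=0.01, seed=0):
    """Generate X_1..X_length with mean alpha before m and alpha + h from m onward."""
    rng = random.Random(seed)
    xs = []
    for n in range(1, length + 1):
        eps = rng.uniform(-noise, noise)          # zero-mean noise, stands in for xi_n / eta_n
        mean = alpha if n < m else alpha + h      # I(n < m) versus I(n >= m)
        xs.append(mean + eps)
    return xs


# Hypothetical values: normal mean 0.02, attack adds 0.10 starting at n = 51.
xs = step_change_sequence(alpha=0.02, h=0.10, m=51, length=100)
print(min(xs[:50]), max(xs[:50]))   # stays near alpha
print(min(xs[50:]), max(xs[50:]))   # stays near alpha + h
```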
B. The CUSUM Algorithm
The CUSUM (Cumulative Sum) algorithm is a commonly used algorithm in statistical process control, which can detect the change of mean value of a statistical process. CUSUM relies on the fact that if a change occurs, the probability distribution of the random sequence will also change. Generally, CUSUM requires a parametric model for the random sequence so that the probability density function can be applied to monitor the sequence. Unfortunately, the Internet is a very dynamic and complicated entity, and the theoretical construction of Internet traffic models is a complex open problem, which is beyond the scope of this paper. Thus, a key challenge is how to model $\{X_n\}$. Since non-parametric methods are not model-specific, they are more suitable for analyzing the Internet. In our experiment, we applied the non-parametric CUSUM method [4] in our detection algorithm. This general approach is based on the model presented in Wang et al. [30] for attack detection using CUSUM. The main idea behind the non-parametric CUSUM algorithm is that we accumulate values of $X_n$ that are significantly higher than the mean level under normal operation. One of the advantages of this algorithm is that it monitors the input random variables in a sequential manner so that real-time detection is achieved.
Let us begin by defining our notation before we give a formal definition of our algorithm. As we mentioned in Sec. IV-B.3, $X_n$ represents the fraction of new IP addresses in the measurement interval $\Delta_n$. The top graph in Figure 6 shows an illustrative example of $\{X_n\}$. In normal operation, this fraction will be close to 0, i.e., $E(X_n) = \alpha$ with $\alpha$ close to 0, since there is only a small proportion of IP addresses that are new to the network under normal conditions [16] [24]. However, one of the assumptions for the non-parametric CUSUM algorithm [4] is that the mean value of the random sequence is negative during normal conditions, and becomes positive when a change occurs. Thus, without loss of any statistical feature, $\{X_n\}$ is transformed into another random sequence $\{Z_n\}$ with negative mean $a$, i.e., $Z_n = X_n - \beta$, where $a = \alpha - \beta$ (see the middle graph of Figure 6). Parameter $\beta$ is a constant value for a given network condition, and it helps to produce a random sequence $\{Z_n\}$ with a negative mean so that the negative values of $\{Z_n\}$ will not accumulate over time. When an attack happens, $Z_n$ will suddenly become large and positive, i.e., $h + a > 0$, where $h$ can be viewed as a lower bound of the increase in $Z_n$ during an attack. Hence, $Z_n$ with a positive value ($h + a > 0$) is accumulated to indicate whether an attack happens or not (see the bottom graph of Figure 6). One thing worth noting is that $h$ is defined as the minimum increase of the mean value during an attack and it is not the threshold for the bandwidth attack detection. The attack detection threshold $N$ is used for $y_n$, the accumulated positive values of $Z_n$, which is illustrated in Figure 6. Our change detection is based on the observation of $h$. Now our detection problem is to find the abrupt change in the random sequence $\{Z_n\}$, which is described as follows:

$$Z_n = a + \xi_n I(n < m) + (h + \eta_n) I(n \ge m), \tag{6}$$

where $a < 0$, $-a < h < 1$, and other conditions are the same as Eq. 5.
The formal definition of the non-parametric CUSUM algorithm is as follows:

$$y_n = S_n - \min_{1 \le k \le n} S_k, \tag{7}$$

where $S_k = \sum_{i=1}^{k} Z_i$, with $S_0 = 0$ at the beginning, and $y_n$ is our test statistic. In order to reduce the overhead for online implementation, we use the recursive version of the non-parametric CUSUM algorithm [1][4][3][30], which is shown as follows:

$$y_n = (y_{n-1} + Z_n)^+, \qquad y_0 = 0, \tag{8}$$

where $x^+$ is equal to $x$ if $x > 0$ and 0 otherwise. A large $y_n$ is a strong indication of an attack.
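The relationship between the cumulative-sum form and the recursive form can be checked numerically. The sketch below is illustrative only: the input values are made up, and the direct form takes the running minimum over $k = 0, \dots, n$ (that is, it includes $S_0 = 0$), which is the convention under which it coincides exactly with the recursion $y_n = (y_{n-1} + Z_n)^+$. With the minimum taken from $k = 1$, the two forms agree from the first time the partial sum drops below zero, which happens quickly when the normal mean is negative.

```python
from itertools import accumulate


def cusum_direct(z):
    """y_n = S_n - min_{0<=k<=n} S_k, with S_0 = 0 included in the minimum."""
    partial = [0.0] + list(accumulate(z))   # S_0, S_1, ..., S_n
    ys = []
    running_min = partial[0]
    for s in partial[1:]:
        running_min = min(running_min, s)
        ys.append(s - running_min)
    return ys


def cusum_recursive(z):
    """y_n = max(0, y_{n-1} + Z_n) with y_0 = 0 (the recursive form)."""
    ys, y = [], 0.0
    for z_n in z:
        y = max(0.0, y + z_n)
        ys.append(y)
    return ys


# Hypothetical Z_n values: negative under normal operation, positive during an attack.
z = [-0.05, -0.03, 0.20, 0.10, -0.05, 0.30]
direct = cusum_direct(z)
recursive = cusum_recursive(z)
print([round(y, 2) for y in recursive])
```

The recursive form needs only the previous value of the statistic, which is why it suits an online detector.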
As we see in the bottom graph of Figure 6, $y_n$ represents the cumulative positive values of $Z_n$. We consider the change to have occurred at time $\tau_N$ if $y_{\tau_N} \ge N$. The decision function can be described as follows:

$$d_N(y_n) = \begin{cases} 0 & \text{if } y_n \le N; \\ 1 & \text{if } y_n > N. \end{cases}$$

$N$ is the threshold for attack detection and $d_N(y_n)$ represents the decision at time $n$: 1 if the test statistic $y_n$ is larger than $N$, which indicates an attack, and 0 otherwise, which indicates normal operation (no statistical feature change for the random sequence $\{Z_n\}$).
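The decision rule can be combined with the recursive statistic to report the first interval at which an alarm is raised. This is a hedged sketch: the function name and the sample values are assumptions, and the comparison follows the decision function above (strictly greater than the threshold).

```python
def first_alarm(z, threshold):
    """Return the first 1-based index n with y_n > threshold, or None if no alarm."""
    y = 0.0
    for n, z_n in enumerate(z, start=1):
        y = max(0.0, y + z_n)        # recursive CUSUM update
        if y > threshold:            # decision function d_N(y_n) = 1
            return n
    return None


# Hypothetical sequence: five normal intervals followed by three attack intervals.
z = [-0.05] * 5 + [0.05] * 3
alarm_at = first_alarm(z, threshold=0.05)
print(alarm_at)
```

In this toy input the statistic equals the threshold after the first attack interval and exceeds it after the second, so the alarm is delayed by one interval relative to a rule that uses "greater than or equal".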
C. Analysis of the CUSUM algorithm
It has been proved in earlier literature [1][4] that if the values in a time series are independent and identically distributed with a parametric model, CUSUM is asymptotically optimal for a variety of change point detection problems. There are two requirements to apply CUSUM to the aforementioned random sequence $\{Z_n\}$. First, the dependence between random variables decreases with the increase of time. Second, the random variable is bounded by a finite value. This has been formalized in [4][30] as follows:
A: $\psi(s)$, the $\psi$-mixing coefficient of $\{Z_n\}$, approaches 0 as $s \to \infty$. Let $\{Z_n\}_{n=1}^{\infty}$ be a random sequence on a probability space $(\Omega, \mathcal{F}, P)$. Let $\mathcal{F}_j^k = \sigma\{\omega : Z_j, Z_{j+1}, \dots, Z_k\}$ ($1 \le j \le k < \infty$), which is the $\sigma$-algebra generated by random vectors $\{Z_n\}_{n=j}^{k}$. The $\psi$-mixing coefficient is defined as follows:

$$\psi(s) \stackrel{\text{def}}{=} \sup_{t \ge 1} \; \sup_{\substack{A \in \mathcal{F}_1^t,\; B \in \mathcal{F}_{t+s}^{\infty} \\ P(A)P(B) \ne 0}} \left| \frac{P(AB)}{P(A)P(B)} - 1 \right|, \tag{9}$$

where sup stands for supremum. As we see from the equation above, if the dependency among $\{Z_n\}$ is very weak, for example, long range dependent arrival processes, $\psi(s)$ will approach 0 as $s \to \infty$.
B: The one-dimensional distribution of $Z_n$ satisfies the following regularity condition: $\exists H > 0$ such that $E(e^{tZ_n}) < \infty$ for $|t| \le H$, which means $Z_n$ will not be infinitely large.
In [1][4], the two conditions are described in more detail. Since $\{Z_n\}$ is derived from Internet traffic, where long range dependent arrival processes are common, the dependency among $\{Z_n\}$ samples decays as the time interval increases. Thus, condition A is satisfied. Since $0 \le X_n \le 1$ and $Z_n = X_n - \beta$, where $\beta$ is a finite constant, $Z_n$ is also a finite value. Therefore, condition B is satisfied. Consequently, our detection variable $Z_n$ can easily satisfy these two weak requirements.
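The boundedness argument for condition B can be written out in one step. The following is a short worked derivation, assuming only what the text states ($0 \le X_n \le 1$ and a finite constant $\beta$); the explicit bound $1 + |\beta|$ is introduced here for illustration.

```latex
% Assumes only 0 \le X_n \le 1 and a finite constant \beta, as stated in the text.
\begin{align*}
  -\beta \le Z_n = X_n - \beta \le 1 - \beta
    &\;\Longrightarrow\; |Z_n| \le 1 + |\beta|, \\
  E\!\left(e^{tZ_n}\right) \le e^{|t|\,(1 + |\beta|)} &< \infty
    \quad \text{for every finite } t .
\end{align*}
```

Since the bound holds for every finite $t$, any $H > 0$ satisfies the regularity condition.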
There are two key measures that are used to evaluate bandwidth attack detection systems. The first is the false alarm rate, which is one of the biggest concerns among the anomaly detection community. If a system produces too many false alarms, it will require a lot of time to investigate whether each alarm indicates a real attack or not. If the attack reaction (such as packet filtering) is taken in response to a false alarm, innocent traffic will be unfairly punished and normal network services will be disturbed. The second is the detection time. A bandwidth attack detection system should detect the attack as soon as possible so that proper reaction schemes can be applied earlier to minimize or eliminate the attack damage.
Unfortunately, these two parameters are a conflicting pair. It is hard to shorten the detection time and reduce the false alarm rate at the same time. Therefore, a trade-off must be made between these two. As we mentioned before, the CUSUM algorithm is said to be optimal in minimizing the detection time as well as reducing the false alarm rate [1][4][30].
According to previous theoretical work [4] on the non-parametric CUSUM algorithm, the detection time $\tau_N$ and the normalized detection time after a change occurs $\rho_N$ are defined as follows:

$$\tau_N = \inf\{n : d_N(\cdot) = 1\}, \qquad \rho_N = \frac{(\tau_N - m)^+}{N}, \tag{10}$$

where inf represents the infimum, and $m$ represents the starting time of the attack, both of which are illustrated in Figure 6. The relation between $\rho_N$ and the lower bound of actual increase $h$ during an attack is described as follows:

$$\rho_N \to \gamma = \frac{1}{h - |a|}, \tag{11}$$

where $a$ ($a < 0$) is the mean of $\{Z_n\}$ during normal operation and $h - |a|$ is the lower bound of the mean of $\{Z_n\}$ [...]

$N = \frac{(m + 1) - m}{20} = 0.05$.
However, the attack traffic at the first-mile router (close to the attack source) is much more diluted. This is because sophisticated attackers can generate attack traffic from multiple sources so that the attack sources do not stand out from the background traffic, i.e., the change value $h$ contributed by the attack traffic will be small. In order to find a balance between detection sensitivity and false alarm rate, we choose $|a| = 0.01$, $h = 0.02$ at the first-mile router. For the first-mile router, the most challenging task is to reduce the false positive rate because of the sparse attack traffic. Thus, we let $\tau_N = m + 3$ and get $\gamma = 100$ and $N = 0.03$. All these derived values satisfy the requirement for an asymptotically optimal CUSUM algorithm. However, all these values can be adjusted to suit the local network conditions.

Fig. 7. The trace-driven simulation experiment

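The arithmetic behind these parameter choices can be reproduced with a small helper. This is a sketch under stated assumptions: it treats the limit in Eq. 11 as an equality, so that $N = (\tau_N - m)/\gamma$ with $\gamma = 1/(h - |a|)$. The function name is hypothetical, and the last-mile value $h = 0.10$ is inferred from $\gamma = 20$ and $|a| = 0.05$ rather than quoted from the text.

```python
def cusum_parameters(h, a_abs, delay_intervals):
    """Return (gamma, N) where gamma = 1 / (h - |a|) and N = delay / gamma."""
    if h <= a_abs:
        raise ValueError("h must exceed |a| so that the attack mean h + a is positive")
    gamma = 1.0 / (h - a_abs)
    threshold = delay_intervals / gamma   # N = (tau_N - m) / gamma
    return gamma, threshold


# First-mile router: |a| = 0.01, h = 0.02, tau_N = m + 3 (values from the text).
first_mile = cusum_parameters(h=0.02, a_abs=0.01, delay_intervals=3)
# Last-mile router: |a| = 0.05, tau_N = m + 1; h = 0.10 is an inferred value.
last_mile = cusum_parameters(h=0.10, a_abs=0.05, delay_intervals=1)
print(first_mile, last_mile)
```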
VI. PERFORMANCE EVALUATION
The CUSUM algorithm detects changes based on the cumulative effect of the changes made in the random sequence instead of using a single threshold to check every variable. Therefore, with the deployment of the CUSUM algorithm, the performance of our detection scheme will not be affected by whether the attack rate is bursty or constant. To evaluate the efficacy of our detection scheme SIM, we conducted the following simulation experiments.
As shown in Figure 7, we created different types of DDoS attack traffic and merged them with the normal traffic. SIM was then applied to detect the attacks from the merged traffic. The normal traffic traces used in our study were collected at different times from three different sources. The first set was gathered at the University of Auckland [14] with an OC3 (155.52 Mbps) Internet access link [13]. The second data trace was taken from the DARPA intrusion detection data set [17]. The third data trace was taken on a 9 MBit/sec Internet connection in Bell Labs [25]. A summary of the data traces used in our experiment is listed in Table I.
In order to evaluate the effectiveness of our detection feature and algorithm, we add simulated attacks to the normal background traffic traces in Table I. We embedded a 5-minute DDoS attack with an attack rate of 160 packets/s in the Auck-IV-in trace on 19 March 2001. Both the attack length and the attack rate are representative values that are commonly observed in the Internet [19] [21]. As shown in Figure 8, we can hardly observe any sign of attack when analyzing the traffic purely on traffic volume because of the bursty nature of Internet traffic. In contrast, we can easily observe a large peak caused by the attack traffic when analyzing the percentage of new IP addresses in the measurement interval. This is because the percentage of new IP addresses stays at a very low value during normal operation. This makes the attacks detectable using our Source IP address Monitoring (SIM) scheme, even when the attacks are highly distributed.

TABLE I
SUMMARY OF THE PACKET TRACES USED FOR TESTING

Trace         Trace Length   Created Time   Traffic Type
Auck-IV-in    3 weeks        March 2001     Uni-directional
Auck-IV-out   3 weeks        March 2001     Uni-directional
DARPA         3 weeks        1999           Bi-directional
Bell-I        1 week         May 2002       Bi-directional

Fig. 8. Effect of choice of detection feature on detecting the occurrence of an attack. Top panel: traffic volume (number of packets every 10 seconds); bottom panel: $X_n$ (percentage of new IP addresses every 10 seconds); time axis from 12pm to 4pm, with the attack marked in both panels.

A. Normal Traffic Behavior
Auck-IV-in and Auck-IV-out represent the normal traffic behavior for a medium network (OC-3 connection to the backbone Internet), while Bell-I represents normal traffic behavior for an intranet (with a 100 Mbit Ethernet connection to a local ISP). For evaluating the first-mile router SIM, we use the traffic which goes from the local network to the Internet as the background traffic. For evaluating the last-mile router SIM, we use traffic which goes from the Internet to the local network as the background traffic. Our detection feature is the percentage of new IP addresses observed in each 10 second interval ($X_n$). Figure 9 shows the behavior of this detection feature when applied to the three traces. The variable $X_n$ is more stable in the Auck-IV-out trace (Figure 9(b)) than in the Auck-IV-in and Bell-I traces (Figure 9(a) and Figure 9(c)). The reason lies in the fact that the population of users within a local network, such as the University of Auckland, is more stable than the population of users who access that network from the Internet. Thus, there are very few IP addresses which are new to the IP Address Database (IAD). In contrast, the Bell-I data trace is bi-directional and contains the traffic from users outside the network, which results in its large variance. In our experiment, we use the Bell-I data trace as the background traffic for the last-mile router.
Figure 10 illustrates the corresponding CUSUM statistics $y_n$, which are derived by applying our detection algorithm to the aforementioned three traces. Consider the Auck-IV-out trace as an example to demonstrate how we obtain $y_n$. The mean value of $X_n$, which is $E(X_n) = \alpha$, can be obtained by the learning engine using the traffic statistics before detection. For the Auck-IV-in trace, $\alpha = 0.0205$. Since the implementation we are dealing with is in the last-mile router, we use $|a| = 0.05$ and $N = 0.05$ according to the discussion in Section V-D. Thus, $\beta = 0.0705$, and $Z_n = X_n - 0.0705$. Now, we can calculate $y_n$ according to Eq. 8. According to Figure 10(a), $y_n$ is very stable. It is interesting to see that there are some separated bursts in Figure 10. These bursts are caused by the bursty nature of Internet traffic. However, a burst in Internet traffic is normally very short, so it does not produce a large accumulated value of $y_n$ as attack traffic does. Thus, these separated bursts are far below the threshold $N = 0.05$, as shown in Figure 10, which provides a large safety margin. Therefore, the false alarm rate in our trace-driven experiment is reduced to zero. It is worth noting that $\alpha$ is updated periodically, in order to ensure that it represents the most accurate estimation of the random sequence $\{X_n\}$.
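The derivation of $\beta$ from the learned mean can be sketched as follows, using $a = \alpha - \beta$ with $a < 0$, so that $\beta = \alpha + |a|$. The helper names and the three training samples are hypothetical. Estimating $\alpha$ as a plain average of a training window is an assumption, since the text only says that the learning engine obtains it from traffic statistics and updates it periodically.

```python
from statistics import fmean


def offset_beta(training_x, a_abs):
    """Estimate alpha as the mean of training samples and return (alpha, beta = alpha + |a|)."""
    alpha = fmean(training_x)
    return alpha, alpha + a_abs


def to_z(x_values, beta):
    """Shift X_n by beta so that the normal-operation mean of Z_n is negative."""
    return [x - beta for x in x_values]


# Hypothetical training samples chosen so that alpha matches the 0.0205 quoted in the text.
alpha, beta = offset_beta([0.0200, 0.0210, 0.0205], a_abs=0.05)
z = to_z([0.02, 0.03, 0.40], beta)   # the last value mimics an attack interval
print(round(alpha, 4), round(beta, 4), [round(v, 4) for v in z])
```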
B. DDoS Detection
1) Randomly Spoofed DDoS attacks: We used the labelled DDoS attack scenario in the DARPA Intrusion Detection Data Set [17] as an example to demonstrate the performance of our detection algorithm. The DDoS attack we observed here is a naive one, which uses randomly spoofed IP addresses. The labelled attack started at time t = 3s and lasted for 5 seconds. Since the labelled attack is very short, we set the measurement interval to be 0.01 second. As we see from Figure 11, there is an abrupt change in the curve, which represents the percentage of new IP addresses in a time period of 0.01 second. Thus, it is easy to detect a DDoS attack with randomly spoofed source IP addresses. However, this is not the focus of our detection algorithm, which is designed to defend against much more sophisticated DDoS attack scenarios.

Fig. 9. The percentage of new IP addresses calculated in the time bin size of 10 seconds for each packet trace: (a) Auck-IV-in Trace, (b) Auck-IV-out Trace, (c) Bell-I Trace. Each panel plots the value of $X_n$ and its mean value against time (minutes).

Fig. 10. CUSUM test statistics under normal operation for each packet trace: (a) Auck-IV-in Trace, (b) Auck-IV-out Trace, (c) Bell-I Trace. Each panel plots the value of $y_n$ and the detection threshold against time (minutes).

Fig. 11. 2000 DARPA Dataset: DDoS Attack Scenario 1. The plot shows $X_n$, the percentage of new IP addresses in 0.01 second, against time (seconds).

Fig. 12. The DDoS attack detection sensitivity in the first-mile router using the Auck-IV-out trace: (a) attacks with 10 new IP addresses, (b) attacks with 4 new IP addresses, (c) attack with 2 new IP addresses. Each panel plots the CUSUM statistic $y_n$ and the detection threshold against time (minutes).

Fig. 13. The DDoS attack detection sensitivity for the last-mile router using the Auck-IV-in trace: (a) attacks with 200 new IP addresses, (b) attacks with 40 new IP addresses, (c) attack with 18 new IP addresses. Each panel plots the CUSUM statistic $y_n$ and the detection threshold against time (minutes).

2) DDoS attacks with a small number of randomly spoofed IP addresses: In an attempt to avoid detection by our scheme, attackers may try to constrain the number of spoofed IP addresses that they use. Similarly, in the case of distributed reflector denial of service (DRDoS) attacks, the number of source IP addresses of the attack traffic depends on the number of reflectors. Thus, the attacker can control the number of new IP addresses used in the attack. However, there is a lower bound on the number of new IP addresses used, since the number of IP packets for a single IP address will increase as the number of source IP addresses used decreases. Therefore, this type of attack will be detected by our first detection engine, as we mentioned in Sec. IV-C.1.
To test the detection sensitivity for DDoS attacks with different numbers of new IP addresses, we conducted the following experiment. We used the Auck-IV-in trace as the background traffic for the last-mile router detection evaluation, and the Auck-IV-out trace as the background traffic for the first-mile router detection evaluation. As mentioned before, our detection algorithm is not affected by whether the attack traffic is bursty or constant since the detection is based on the cumulative effect of the attack traffic. For the simplicity of the experiment design, we assume the attack traffic rate to be constant. The attack period is set to be 5 minutes, which is a commonly observed attack period in the Internet [19]. The attack traffic rate for the last-mile router is set to be 500 Kbps in order to constitute an effective bandwidth attack on medium-size victim networks, which in our case is the network of the University of Auckland.
Let $J$ represent the number of IP addresses in the attack traffic which are new to the network. We tested different values of $J$ in our simulation, and the detection performance for the first-mile and last-mile routers is shown in Figure 12 and Figure 13 respectively. We repeated the attack detection under a variety of different network conditions, and listed both the average detection accuracy and detection time in Table II and Table III.

TABLE II
DETECTION PERFORMANCE OF THE FIRST-MILE ROUTER

J     Detection Accuracy   Detection Time (seconds)
2     99%                  69.7
4     100%                 20.1
6     100%                 18.9
8     100%                 10
10    100%                 10

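The way the detection delay depends on the number of new addresses can be illustrated with a toy calculation. This is not the trace-driven experiment: it assumes a constant, hypothetical background (`background_total` unique addresses per interval, of which `background_new` are new), a constant attack contributing `j` new addresses per interval, and illustrative values for `beta` and `threshold`. The resulting numbers are not the reported results and only show the qualitative trend.

```python
def detection_delay_seconds(j, background_total=1000, background_new=20,
                            beta=0.03, threshold=0.03,
                            interval_s=10, attack_intervals=30):
    """Return seconds until y_n exceeds the threshold, or None if the attack ends first."""
    # X_n during the attack: new addresses over unique addresses in the interval.
    x_attack = (background_new + j) / (background_total + j)
    y = 0.0
    for n in range(1, attack_intervals + 1):      # 30 intervals of 10 s = 5 minutes
        y = max(0.0, y + (x_attack - beta))       # recursive CUSUM on Z_n = X_n - beta
        if y > threshold:
            return n * interval_s
    return None


for j in (10, 40, 200):
    print(j, detection_delay_seconds(j))
```

With these made-up settings, larger values of `j` are detected within one or two intervals, while a very small `j` never pushes the statistic above zero, which mirrors the trade-off discussed in the text.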
As we can see from the simulation results, our detection algorithm is very robust in both the first-mile and last-mile routers. For the last-mile router, we can detect the DDoS attack with $J = 18$ within 81.1 seconds with 100% accuracy, and detect the DDoS attack with $J = 15$ within 127.3 seconds with 90% accuracy. Given that the attack traffic length is no more than 5 minutes, only attack traffic with $J < 18$ has the possibility of sometimes avoiding our detection. However, by forcing the attacker to use a small number of new IP addresses, we can detect the attack by observing the abrupt change in the number of packets per source IP address using the first detection engine, which is described in Sec. IV-C.1.
For the first-mile router, we can achieve 99% detection accuracy even when there are only 2 new IP addresses in the attack traffic. The reason lies in the fact that the background traffic for the first-mile router is very clear. Generally, there will be very few IP addresses that are new to the network since all the valid IP packets originate from within the same network. Since the IP addresses in the IP Address Database (IAD) will expire and be removed after a certain time period, the IP addresses within the subnetworks which have not been used recently will be new to the IAD. This is very similar to ingress filtering [10]. However, ingress filtering cannot detect the attack when the spoofed IP addresses are within the subnetworks. In contrast, our first-mile router detection algorithm can detect the spoofed IP addresses within the subnetworks if they are new to the IAD.
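The expiry behavior described above can be sketched with a small last-seen table. Everything here is an assumption made for illustration: the class name, the `ttl_seconds` parameter, and the rule that an address is refreshed whenever it is observed. How addresses are admitted to the IAD is not modeled.

```python
class IPAddressDatabase:
    """Toy IAD: an address counts as known until ttl_seconds pass without it being observed."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._last_seen = {}              # address -> time of last observation

    def is_new(self, address, now):
        seen = self._last_seen.get(address)
        return seen is None or now - seen > self.ttl

    def observe(self, address, now):
        self._last_seen[address] = now    # refresh the entry

    def expire(self, now):
        """Remove stale entries and return how many were dropped."""
        stale = [a for a, t in self._last_seen.items() if now - t > self.ttl]
        for a in stale:
            del self._last_seen[a]
        return len(stale)


# Hypothetical timeline: an internal address is seen once and then goes quiet.
db = IPAddressDatabase(ttl_seconds=100)
db.observe("10.0.0.5", now=0)
print(db.is_new("10.0.0.5", now=50))    # still known
print(db.is_new("10.0.0.5", now=200))   # expired, so it counts as new again
```

Under this sketch, an internal address that has been idle for longer than the expiry period would count as new, which is the property that lets the first-mile detector notice spoofed addresses drawn from within the subnetwork.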
It is worth noting that we choose our detection interval $\Delta_n = 10$ s in our experiment, which is a conservative choice for a real implementation. If we decrease the detection interval by using more computing resources, we can reduce the detection time accordingly.

TABLE III
DETECTION PERFORMANCE OF THE LAST-MILE ROUTER

J     Detection Accuracy   Detection Time (seconds)
15    90%                  127.3
18    100%                 81.1
40    100%                 18.9
60    100%                 10
200   100%                 10

VII. RELATED WORK
Gil proposes a scheme called MULTOPS [12] to detect denial of service attacks by monitoring the packet rate in both the up and down links. This scheme is based on the fact that a router observing a certain packet rate in one direction with a significantly lower packet rate in the opposite direction can suspect that the slower side is unable to cope with the traffic it is receiving and may, therefore, be under attack. However, this scheme assumes that packet rates between two hosts are symmetric, which is not always the case. For example, real audio/video streams are asymmetric, and with the wide use of online movies and online news, where the packet rate from the server is much higher than from the client, false positive rates will become a big concern for this scheme. MULTOPS indexes the IP flows using source IP addresses and assumes that IP addresses cannot be spoofed given the implementation of ingress filtering [10]. Unfortunately, this assumption is also not valid in the current Internet for the following reasons. First, some ISPs might not be willing to implement ingress filtering due to the overhead of the scheme or just lack of motivation. Second, ingress filtering is normally implemented in the edge router, and there is no control of the spoofed IP packets once they get through the edge router.
Wang et al. [30] developed a scheme to detect SYN flooding attacks by observing the ratio of the number of SYN packets to the number of FIN packets. This scheme assumes that SYN packets and FIN packets always come in pairs in normal TCP connections. As is also addressed in their paper, the attacker can easily bypass this scheme by sending mixed SYN and FIN attack packets. They also proposed to utilize the SYN-SYN/ACK pair for detection instead of the SYN-FIN pair, which makes it harder for the attacker to counter this scheme. This alternative has the following problems. First, it is very difficult to manage the cooperation between the first-mile router and the last-mile router. Second, this scheme is not effective for attacks that use non-adaptive protocols, such as UDP.
While the single denial of service attack is characterized by a large traffic volume, the distributed denial of service (DDoS) attack is characterized not only by a large traffic volume but also by multiple attack sources. Furthermore, the attacker can use randomly spoofed IP addresses to dilute the attack traffic volume per source IP address. This adds more difficulties to DDoS detection schemes that are based on traffic volume, especially when the detection scheme is deployed several hops away from the victim.
We propose a new detection scheme called Source IP address Monitoring (SIM) which monitors the increase in new IP addresses over a certain period. SIM is based on an intrinsic feature of DDoS attacks, i.e., the presence of a large number of spoofed IP addresses. By detecting an abnormal increase in the new IP addresses, we can identify a distributed denial of service attack. Our detection scheme puts the attackers in a dilemma. They either choose to expose their identities by using a small number of IP addresses or risk being quickly detected by using randomly spoofed IP addresses.
The efficacy of our detection mechanism is validated by trace-driven simulations. The evaluation results show that Source IP address Monitoring (SIM) has both a short detection time and high detection accuracy. Moreover, due to its close proximity to the flooding sources, our detection mechanism not only raises an alarm on ongoing DDoS attacks but also reveals the location of the flooding sources. We have demonstrated the sensitivity of our scheme for detecting distributed denial of service attacks by investigating the minimum number of sources SIM can detect.
VIII. CONCLUSION AND FUTURE WORK
In this paper we proposed a scheme to detect distributed
denial of service attacks by monitoring the increase in
new IP addresses. We have also presented a sequential
change point detection algorithm that can identify when
an attack has occurred. We demonstrated the efficiency
and robustness of this scheme by using trace-driven
simulations. The experimental results on the Auckland
traces show that we can detect DDoS attacks with 100%
accuracy using as few as 18 new IP addresses at the
last-mile router, and DDoS attacks using as few as 2 new
IP addresses at the first-mile router. Our online
detection algorithm is fast and has a very low computing
overhead. Furthermore, our first-mile router SIM has the
advantage over ingress filtering [10] that it can detect
attack traffic with spoofed source IP addresses within
the subnetworks. Our future work will include combining
other network traffic statistics to detect bandwidth
attacks and using distributed detection to detect DDoS
attacks.
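To illustrate how a sequential change point detector can
run online with very little computation, the following is
a generic non-parametric CUSUM sketch applied to a series
of per-interval new-IP fractions. The function name
cusum_alarm, the offset of 0.3, the threshold of 1.0 and
the sample series are all assumptions chosen for
illustration; they are not the parameter values used in
our experiments.

```python
def cusum_alarm(series, offset, threshold):
    """Return the index of the first interval at which the cumulative
    sum statistic exceeds `threshold`, or None if it never does."""
    y = 0.0
    for n, x in enumerate(series):
        # Subtracting `offset` makes the drift negative under normal
        # traffic, so the statistic stays clamped at zero until a
        # sustained increase occurs.
        y = max(0.0, y + x - offset)
        if y > threshold:
            return n
    return None


# Hypothetical new-IP fractions: quiet traffic, then a flood.
quiet = [0.05, 0.10, 0.08, 0.12, 0.07]
flood = quiet + [0.90, 0.95, 0.92]

quiet_alarm = cusum_alarm(quiet, offset=0.3, threshold=1.0)
flood_alarm = cusum_alarm(flood, offset=0.3, threshold=1.0)
```

In this sketch the quiet series never raises an alarm,
while the flood series raises one at index 6, the second
interval after the increase begins. Each interval costs
one addition and one comparison, which is why this style
of detector suits online use.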
ACKNOWLEDGEMENT
We would like to thank the Waikato Applied Network
Dynamics Research Group, the Internet Traffic Research
Group at Bell Labs and the Information Systems Technology
Group at MIT Lincoln Laboratory for making available
their data traces.
REFERENCES
[1] M. Basseville and I. V. Nikiforov. Detection of Abrupt Changes:
Theory and Application. Prentice Hall, 1993.
[2] S. Bellovin. The ICMP traceback message. Internet
Draft, IETF, March 2000. draft-bellovin-itrace-05.txt.
http://www.research.att.com/~smb.
[3] Rudolf B. Blažek, Hongjoong Kim, Boris Rozovskii, and
Alexander Tartakovsky. A novel approach to detection of
denial-of-service attacks via adaptive sequential and batch-
sequential change-point detection methods. In Proceedings of
IEEE Systems, Man and Cybernetics Information Assurance
Workshop, June 2001.
[4] B. E. Brodsky and B. S. Darkhovsky. Nonparametric Methods in
Change-point Problems. Kluwer Academic Publishers, 1993.
[5] CNN. Cyber-attacks batter web heavyweights.
http://www.cnn.com/2000/TECH/computing/02/09/cyber.attacks.01/index.html,
February 2000.
[6] CNN. Immense network assault takes down Yahoo.
http://www.cnn.com/2000/TECH/computing/02/08/yahoo.assault.idg/index.html,
February 2000.
[7] CNN. Denial-of-service attacks on the rise?
http://www.cnn.com/2002/TECH/internet/04/09/dos.threat.idg/index.html,
April 2002.
[8] Drew Dean, Matt Franklin, and Adam Stubblefield. An algebraic
approach to IP traceback. In Network and Distributed System
Security Symposium, NDSS '01, February 2001.
[9] Sven Dietrich, Neil Long, and David Dittrich. Analyzing dis-
tributed denial of service attack tools: The shaft case. In Proceed-
ings of 14th Systems Administration Conference, LISA, 2000.
[10] P. Ferguson and D. Senie. Network ingress filtering: Defeating
denial of service attacks which employ IP source address
spoofing. RFC 2267, IETF, January 1998.
[11] Steve Gibson. Distributed reflection denial of service.
http://grc.com/dos/drdos.htm, Feb. 2002.
[12] Thomer M. Gil and Massimiliano Poletto. Multops: a data-
structure for bandwidth attack detection. In Proceedings of the
10th USENIX Security Symposium, 2001.
[13] Waikato Applied Network Dynamics Research Group.
http://wand.cs.waikato.ac.nz/wand/wits/auck/4/.
[14] Waikato Applied Network Dynamics Research Group. Auckland
university data traces. http://wand.cs.waikato.ac.nz/wand/wits/.
[15] John Ioannidis and Steven M. Bellovin. Implementing pushback:
Router-based defense against DDoS attacks. In Proceedings of
Network and Distributed System Security Symposium, Catama-
ran Resort Hotel San Diego, California. 6-8 February 2002. The
Internet Society, February 2002.
[16] Jaeyeon Jung, Balachander Krishnamurthy, and Michael
Rabinovich. Flash crowds and denial of service attacks:
Characterization and implications for CDNs and web sites. In
Proceedings of the 11th World Wide Web Conference, Honolulu,
Hawaii, USA, May 7-11, 2002.
[17] MIT Lincoln Laboratory. 2000 DARPA intrusion detection
scenario specific data sets. http://www.ll.mit.edu/IST/.
[18] Ratul Mahajan, Steven M. Bellovin, Sally Floyd, John Ioannidis,
Vern Paxson, and Scott Shenker. Controlling high bandwidth
aggregates in the network. Technical report, AT&T Center for
Internet Research at ICSI (ACIRI) and AT&T Labs Research,
February 2001.
[19] David Moore, Geoffrey M. Voelker, and Stefan Savage.
Inferring internet Denial-of-Service activity. In Proceedings of the
USENIX Security Symposium, pages 9-22, August 2001.
[20] news.com. Attack disables music industry web site, July 2002.
http://news.com.com/2100-1023-947072.html?tag=politech.
[21] Ross Oliver. Countering SYN flood
denial-of-service attacks, August 29 2001.
http://www.usenix.org/events/sec01/invitedtalks/oliver.pdf.
[22] Vern Paxson. An analysis of using reflectors for distributed
denial-of-service attacks. Computer Communication Review
31(3), July 2001.
[23] Vern Paxson and Sally Floyd. Wide-area traffic: The failure
of Poisson modeling. IEEE/ACM Transactions on Networking,
3(3), June 1995.
[24] Tao Peng, Christopher Leckie, and Kotagiri Ramamohanarao.
Prevention from distributed denial of service attacks using
history-based IP filtering. In Proceedings of ICC 2003 (to
appear), Anchorage, Alaska, USA, August 2003.
[25] NLANR PMA and the Internet Traffic Research group. Bell Labs
- I data set. http://pma.nlanr.net/Traces/long/bell1.html.
[26] Y. Rekhter and T. Li. A border gateway protocol 4 (BGP-4).
http://www.ietf.org/rfc/rfc1771.txt, March 1995.
[27] Stefan Savage, David Wetherall, Anna Karlin, and Tom Ander-
son. Practical network support for IP traceback. In Proceedings
of the 2000 ACM SIGCOMM Conference, August 2000.
[28] F. D. Smith, F. H. Campos, K. Jeffay, and D. Ott. What TCP/IP
protocol headers can tell us about the web. In Proceedings of ACM
SIGMETRICS 2001, June 2001.
[29] Dawn X. Song and Adrian Perrig. Advanced and
authenticated marking schemes for IP traceback.
In Proceedings of IEEE INFOCOM 2001, 2001.
http://paris.cs.berkeley.edu/~perrig/projects/iptraceback/tr-iptrace.ps.gz.
[30] Haining Wang, Danlu Zhang, and Kang G. Shin. Detecting SYN
flooding attacks. In Proceedings of IEEE INFOCOM 2002, June
2002.
[31] S. Felix Wu, Lixia Zhang, Dan Massey, and Allison Mankin.
Intension-Driven ICMP Trace-Back. Internet Draft, IETF, Febru-
ary 2001. draft-wu-itrace-intension-00.txt.
[32] David K. Y. Yau, John C. S. Lui, and Feng Liang. Defending
against distributed denial-of-service attacks with max-min fair
server-centric router throttles. In Proceedings of IEEE Interna-
tional Workshop on Quality of Service (IWQoS), Miami Beach,
FL, May 2002.