Anomaly Detection
Adapted from slides by Jing Gao, SUNY Buffalo
Definition of Anomalies
• An anomaly is a pattern in the data that does not
conform to the expected behavior
• Also referred to as outliers, exceptions,
peculiarities, surprises, etc.
• Anomalies translate to significant (often critical)
real life entities
– Cyber intrusions
– Credit card fraud
Real World Anomalies
• Credit Card Fraud
– An abnormally high purchase
made on a credit card
• Cyber Intrusions
– Computer virus spread over
Internet
Simple Example
[Figure: 2-D scatter plot with two dense regions N1 and N2, two
isolated points o1 and o2, and a small cluster O3]
• N1 and N2 are regions of normal behavior
• Points o1 and o2 are anomalies
• Points in region O3 are anomalies
Related problems
• Rare Class Mining
• Chance Discovery
• Novelty Detection
• Exception Mining
• Noise Removal
Key Challenges
• Defining a representative normal region is challenging
• The boundary between normal and outlying behavior is
often not precise
• The exact notion of an outlier is different for different
application domains
• Limited availability of labeled data for
training/validation
• Malicious adversaries
• Data might contain noise
• Normal behavior keeps evolving
Aspects of Anomaly Detection Problem
• Nature of input data
• Availability of supervision
• Type of anomaly: point, contextual, collective
• Output of anomaly detection
• Evaluation of anomaly detection techniques
Data Labels
• Supervised Anomaly Detection
– Labels available for both normal data and anomalies
– Similar to skewed (imbalanced) classification
• Semi-supervised Anomaly Detection
– Limited amount of labeled data
– Combine supervised and unsupervised techniques
• Unsupervised Anomaly Detection
– No labels assumed
– Based on the assumption that anomalies are very rare
compared to normal data
Type of Anomalies
• Point Anomalies
• Contextual Anomalies
• Collective Anomalies
Point Anomalies
• An individual data instance is anomalous w.r.t.
the data
[Figure: the same 2-D scatter plot; o1, o2, and the points in O3 are
anomalous with respect to the rest of the data]
Contextual Anomalies
• An individual data instance is anomalous within a context
• Requires a notion of context
• Also referred to as conditional anomalies
[Figure: time series in which one point is labeled anomalous in its
context although the same value is normal elsewhere]
Collective Anomalies
• A collection of related data instances is anomalous
• Requires a relationship among data instances
– Sequential Data
– Spatial Data
– Graph Data
• The individual instances within a collective anomaly are not
anomalous by themselves
[Figure: time series with an anomalous subsequence highlighted]
Output of Anomaly Detection
• Label
– Each test instance is given a normal or anomaly label
– This is especially true of classification-based
approaches
• Score
– Each test instance is assigned an anomaly score
• Allows the output to be ranked
• Requires an additional threshold parameter
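The score-based output above can be sketched in a few lines; the scores and the 0.5 cutoff here are purely illustrative:

```python
# Hypothetical anomaly scores for four test instances.
scores = {"a": 0.95, "b": 0.40, "c": 0.87, "d": 0.10}

# Score output: rank instances from most to least anomalous.
ranked = sorted(scores, key=scores.get, reverse=True)

# Label output: requires the additional threshold parameter.
threshold = 0.5
labels = {x: ("anomaly" if s > threshold else "normal")
          for x, s in scores.items()}
```

The ranking needs no threshold; turning scores into normal/anomaly labels does.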
Metrics for Performance Evaluation
                 PREDICTED CLASS
                   +        -
 ACTUAL     +   a (TP)   b (FN)
 CLASS      -   c (FP)   d (TN)

• Measure used in classification:

  Accuracy = (a + d) / (a + b + c + d)
           = (TP + TN) / (TP + TN + FP + FN)
Limitation of Accuracy
• Anomaly detection
– Number of negative examples = 9990
– Number of positive examples = 10
• If model predicts everything to be class 0,
accuracy is 9990/10000 = 99.9 %
– Accuracy is misleading because model does not
detect any positive examples
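The arithmetic above can be checked directly; this sketch uses a dummy prediction vector standing in for a model that always predicts class 0:

```python
# 9990 negatives, 10 positives, and a trivial "always class 0" model.
n_neg, n_pos = 9990, 10
y_true = [0] * n_neg + [1] * n_pos
y_pred = [0] * (n_neg + n_pos)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / n_pos

print(accuracy)  # 0.999 -- looks excellent
print(recall)    # 0.0   -- yet not a single anomaly is detected
```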
Cost Matrix
                 PREDICTED CLASS
 C(i|j)           +         -
 ACTUAL     +   C(+|+)    C(-|+)
 CLASS      -   C(+|-)    C(-|-)

C(i|j): Cost of misclassifying a class j example as class i
Computing Cost of Classification
Cost matrix:
                 PREDICTED CLASS
 C(i|j)           +       -
 ACTUAL     +    -1      100
 CLASS      -     1        0

Model M1:                      Model M2:
       PREDICTED CLASS                PREDICTED CLASS
         +      -                       +      -
ACTUAL + 150    40             ACTUAL + 250    45
CLASS  -  60   250             CLASS  -   5   200

Accuracy = 80%                 Accuracy = 90%
Cost = 3910                    Cost = 4255
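The costs in the example can be recomputed by summing count × C(i|j) over all four cells:

```python
# Cost matrix from the slide, keyed by (actual, predicted).
cost = {("+", "+"): -1, ("+", "-"): 100, ("-", "+"): 1, ("-", "-"): 0}

def total_cost(counts):
    """counts maps (actual, predicted) -> number of instances."""
    return sum(n * cost[cell] for cell, n in counts.items())

m1 = {("+", "+"): 150, ("+", "-"): 40, ("-", "+"): 60, ("-", "-"): 250}
m2 = {("+", "+"): 250, ("+", "-"): 45, ("-", "+"): 5, ("-", "-"): 200}

print(total_cost(m1))  # 3910 -- lower cost despite lower accuracy
print(total_cost(m2))  # 4255
```

M1 wins under this cost matrix because missed positives (cost 100) dominate, and M1 misses fewer of them.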
Cost-Sensitive Measures
• Precision (p) = a / (a + c)

• Recall (r) = a / (a + b)

• F-measure (F) = 2rp / (r + p) = 2a / (2a + b + c)

• Weighted Accuracy = (w1·a + w4·d) / (w1·a + w2·b + w3·c + w4·d)
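The measures above, computed from confusion-matrix cells a (TP), b (FN), c (FP), d (TN); the counts reuse model M1 from the cost example:

```python
# Confusion-matrix cells for model M1: a=TP, b=FN, c=FP, d=TN.
a, b, c, d = 150, 40, 60, 250

precision = a / (a + c)
recall = a / (a + b)
f_measure = 2 * recall * precision / (recall + precision)

# The two F-measure formulas agree: 2rp/(r+p) == 2a/(2a+b+c).
assert abs(f_measure - 2 * a / (2 * a + b + c)) < 1e-12
```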
ROC (Receiver Operating Characteristic)
• ROC curve plots TPR (Recall) on the y-axis
against FPR (FP/#N) on the x-axis
• Performance of each classifier represented as
a point on the ROC curve
– changing the algorithm's threshold, sample
distribution, or cost matrix changes the location of
the point
ROC Curve
- 1-dimensional data set containing 2 classes (positive and negative)
- any point located at x > t is classified as positive
At threshold t:
TP = 0.5, FN = 0.5, FP = 0.12, TN = 0.88
ROC Curve
(TPR,FPR):
• (0,0): declare everything
to be negative class
• (1,1): declare everything
to be positive class
• (1,0): ideal
• Diagonal line:
– Random guessing
– Below diagonal line:
• prediction is opposite of the
true class
Using ROC for Model Comparison
• Comparing two models:
– M1 is better for small FPR
– M2 is better for large FPR
• Area under the ROC curve (AUC)
– Ideal: Area = 1
– Random guess: Area = 0.5
How to Construct an ROC curve
• Calculate the outlier scores of the given instances
• Sort the instances according to the scores in decreasing order
• Apply a threshold at each unique value of the score
• Count the number of TP, FP, TN, FN at each threshold
– TP rate, TPR = TP / (TP + FN)
– FP rate, FPR = FP / (FP + TN)

Instance   Score   Label
    1      0.95      +
    2      0.93      +
    3      0.87      -
    4      0.85      -
    5      0.85      -
    6      0.85      +
    7      0.76      -
    8      0.53      +
    9      0.43      -
   10      0.25      +
How to construct an ROC curve

Class          +     -     +     -     -     -     +     -     +     +
Score        0.25  0.43  0.53  0.76  0.85  0.85  0.85  0.87  0.93  0.95

Threshold >= 0.25  0.43  0.53  0.76  0.85  0.85  0.85  0.87  0.93  0.95  1.00
TP             5     4     4     3     3     3     3     2     2     1     0
FP             5     5     4     4     3     2     1     1     0     0     0
TN             0     0     1     1     2     3     4     4     5     5     5
FN             0     1     1     2     2     2     2     3     3     4     5
TPR            1    0.8   0.8   0.6   0.6   0.6   0.6   0.4   0.4   0.2    0
FPR            1     1    0.8   0.8   0.6   0.4   0.2   0.2    0     0     0

[ROC curve plotted from the (FPR, TPR) points above]

Area under the ROC curve = probability that a randomly sampled positive
example will score higher than a randomly sampled negative example
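The walkthrough above can be reproduced in a short sketch: sweep a threshold over the unique scores, count TP/FP to get the (FPR, TPR) points, and check the probabilistic reading of AUC (the fraction of positive-negative pairs ranked correctly, with ties counting half):

```python
# The ten scored instances from the slide, as (score, class) pairs.
data = [(0.95, "+"), (0.93, "+"), (0.87, "-"), (0.85, "-"), (0.85, "-"),
        (0.85, "+"), (0.76, "-"), (0.53, "+"), (0.43, "-"), (0.25, "+")]

P = sum(1 for _, c in data if c == "+")
N = len(data) - P

# One (FPR, TPR) point per unique threshold (score >= t means positive).
roc = []
for t in sorted({s for s, _ in data} | {1.00}):
    tp = sum(1 for s, c in data if s >= t and c == "+")
    fp = sum(1 for s, c in data if s >= t and c == "-")
    roc.append((fp / N, tp / P))

# AUC as a pairwise ranking probability (ties count 0.5).
pos = [s for s, c in data if c == "+"]
neg = [s for s, c in data if c == "-"]
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
           for p in pos for n in neg)
auc = wins / (len(pos) * len(neg))  # 0.56 for this data set
```

Note that tied scores (the three 0.85s) collapse into a single threshold here, whereas the slide's table lists them column by column.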
Applications of Anomaly Detection
• Network intrusion detection
• Insurance / Credit card fraud detection
• Healthcare Informatics / Medical diagnostics
• Image Processing / Video surveillance
• …
Anomaly Detection Schemes
• General Steps
– Build a profile of the “normal” behavior
• Profile can be patterns or summary statistics for the overall population
– Use the “normal” profile to detect anomalies
• Anomalies are observations whose characteristics
differ significantly from the normal profile
• Methods
– Statistical-based
– Distance-based
– Model-based
Statistical Approaches
• Assume a parametric model describing the
distribution of the data (e.g., normal distribution)
• Apply a statistical test that depends on
– Data distribution
– Parameter of distribution (e.g., mean, variance)
– Number of expected outliers (confidence limit)
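A minimal parametric sketch, assuming the data follow a single normal distribution; the sample and the z > 2 cutoff are illustrative (the cutoff plays the role of the confidence limit):

```python
import statistics

# Illustrative sample; 25.0 is an injected anomaly.
data = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 25.0]

# Estimate the distribution parameters from the data.
mu = statistics.mean(data)
sigma = statistics.stdev(data)

# Flag points whose z-score exceeds the chosen confidence limit.
outliers = [x for x in data if abs(x - mu) / sigma > 2]
```

One caveat visible even here: the anomaly itself inflates the estimated mean and standard deviation, which is why a stricter 3-sigma cutoff would miss it in this tiny sample.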
Limitations of Statistical Approaches
• Most of the tests are for a single attribute
• In many cases, data distribution may not be
known
• For high dimensional data, it may be difficult
to estimate the true distribution
Distance-based Approaches
• Data is represented as a vector of features
• Three major approaches
– Nearest-neighbor based
– Density based
– Clustering based
Nearest-Neighbor Based Approach
• Approach:
– Compute the distance between every pair of data
points
– There are various ways to define outliers:
• Data points for which there are fewer than p
neighboring points within a distance D
• The top n data points whose distance to the k-th
nearest neighbor is greatest
• The top n data points whose average distance to the k
nearest neighbors is greatest
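The second definition above (top n points by distance to the k-th nearest neighbor) can be sketched on a toy 2-D data set with Euclidean distance; the points, k, and n are illustrative:

```python
import math

# Four clustered points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]

def kth_nn_dist(p, k):
    """Distance from p to its k-th nearest neighbor."""
    dists = sorted(math.dist(p, q) for q in points if q != p)
    return dists[k - 1]

k, n = 2, 1
ranked = sorted(points, key=lambda p: kth_nn_dist(p, k), reverse=True)
outliers = ranked[:n]  # the isolated point (10, 10)
```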
Distance-Based Outlier Detection
• For each object o, examine the # of other objects in the r-
neighborhood of o, where r is a user-specified distance
threshold
• An object o is an outlier if most (taking π as a fraction
threshold) of the objects in D are far away from o, i.e., not in
the r-neighborhood of o
• An object o is a DB(r, π) outlier if
  |{o′ : dist(o, o′) ≤ r}| / |D| ≤ π
• Equivalently, one can check the distance between o and its k-
th nearest neighbor o_k, where k = ⌈π·|D|⌉. o is an outlier if
dist(o, o_k) > r
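A sketch of the DB(r, π) test above on a toy 2-D data set (the data, r, and π are illustrative):

```python
import math

# Small illustrative data set D.
D = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]

def is_db_outlier(o, r, pi):
    """o is a DB(r, pi) outlier if at most a fraction pi of D
    lies within its r-neighborhood (o itself included)."""
    inside = sum(1 for x in D if math.dist(o, x) <= r)
    return inside / len(D) <= pi
```

For example, `is_db_outlier((10, 10), 2, 0.3)` holds because only the point itself falls in its r-neighborhood, while `(0, 0)` has four of the five points nearby and is not an outlier.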
Density-based Approach
• For each point, compute the density of its local neighborhood
• Points whose local density is significantly lower than their
nearest neighbors' local densities are considered outliers

Example: in the nearest-neighbor approach, p2 is not considered an
outlier, while a density-based approach may find both p1 and p2 to
be outliers.
[Figure: p1 isolated far from a dense cluster; p2 near a dense
cluster but in a sparse region; nearest-neighbor distances of p2
and p3 shown for comparison]
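A minimal relative-density sketch in the spirit of this idea (simpler than full LOF): score each point by its k-NN distance divided by the average k-NN distance of its k nearest neighbors, so scores well above 1 indicate a locally sparse point. The data set and k are illustrative:

```python
import math

# A tight cluster near the origin, one sparse point, and a second cluster.
points = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1), (1, 1),
          (5, 5), (5, 5.1), (5.1, 5), (5.1, 5.1)]

def knn(p, k):
    return sorted((q for q in points if q != p),
                  key=lambda q: math.dist(p, q))[:k]

def knn_dist(p, k):
    return math.dist(p, knn(p, k)[-1])

def density_score(p, k=3):
    """Ratio of p's k-NN distance to its neighbors' average k-NN
    distance; >> 1 means p is much sparser than its neighbors."""
    neighbors = knn(p, k)
    return knn_dist(p, k) / (sum(knn_dist(q, k) for q in neighbors) / k)
```

Here `(1, 1)` gets a score far above 1 because its neighbors sit in a dense cluster, while cluster members score close to 1.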
Clustering-Based
• Basic idea:
– Cluster the data into groups of different density
– Choose points in small clusters as candidate outliers
– Compute the distance between candidate points and
non-candidate clusters
• If candidate points are far from all other non-candidate
points, they are outliers
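A crude sketch of the steps above, assuming a naive single-pass grouping (a real implementation would use a proper clustering algorithm and merge overlapping groups); the data, eps, and min_size are illustrative:

```python
import math

points = [(0, 0), (0, 1), (1, 0), (1, 1), (20, 20)]
eps, min_size = 2.0, 2

# Naive grouping: attach each point to the first group containing a
# point within eps, otherwise start a new group.
clusters = []
for p in points:
    for c in clusters:
        if any(math.dist(p, q) <= eps for q in c):
            c.append(p)
            break
    else:
        clusters.append([p])

# Points in small clusters are candidate outliers; confirm them by
# their distance to every large (non-candidate) cluster.
big = [c for c in clusters if len(c) >= min_size]
candidates = [p for c in clusters if len(c) < min_size for p in c]
outliers = [p for p in candidates
            if all(min(math.dist(p, q) for q in c) > eps for c in big)]
```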
Classification-Based Methods
• Idea: Train a classification model that can distinguish “normal”
data from outliers
• Consider a training set that contains samples labeled as
“normal” and others labeled as “outlier”
– But, the training set is typically heavily biased: # of “normal”
samples likely far exceeds # of outlier samples
• Handle the imbalanced distribution
– Oversampling positives and/or undersampling negatives
– Alter decision threshold
– Cost-sensitive learning
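The "alter decision threshold" option can be sketched with hypothetical classifier scores (not from a real model): with rare positives, lowering the cutoff below the default 0.5 trades precision for recall.

```python
# Hypothetical (score, true label) pairs; label 1 marks an outlier.
scored = [(0.9, 1), (0.6, 0), (0.4, 1), (0.35, 1), (0.2, 0), (0.1, 0)]

def recall_at(threshold):
    tp = sum(1 for s, y in scored if s >= threshold and y == 1)
    return tp / sum(1 for _, y in scored if y == 1)

print(recall_at(0.5))  # 1/3 -- default cutoff misses two outliers
print(recall_at(0.3))  # 1.0 -- lowered cutoff recovers all three
```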
One-Class Model
• A classifier is built to describe only the normal class
• Learn the decision boundary of the normal class using
classification methods such as one-class SVM
• Any samples that do not belong to the normal class (not
within the decision boundary) are declared outliers
• Adv: can detect new outliers that may not appear close to
any outlier objects in the training set
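A toy one-class sketch, standing in for a real one-class SVM: fit a very simple boundary (center plus radius) to normal-only training data and declare anything outside it an outlier. The training points and the slack factor of 2 are illustrative; an actual one-class SVM learns a far more flexible boundary.

```python
import math, statistics

# Training data contains only normal samples.
normal_train = [(0, 0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.2)]

# Boundary: a circle around the centroid, with some slack.
cx = statistics.mean(x for x, _ in normal_train)
cy = statistics.mean(y for _, y in normal_train)
radius = 2 * max(math.dist((cx, cy), p) for p in normal_train)

def is_outlier(p):
    """Anything outside the learned boundary is declared an outlier."""
    return math.dist((cx, cy), p) > radius
```

Because the model describes only the normal class, a point like `(5, 5)` is flagged even though nothing like it appeared during training.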
Take-away Message
• Definition of outlier detection
• Applications of outlier detection
• Evaluation of outlier detection techniques
• Unsupervised approaches (statistical, distance,
density-based)