Docs Slides Lecture15

This document discusses anomaly detection using Gaussian and multivariate Gaussian distributions. It introduces anomaly detection and some example applications, then models normal data points with a multivariate Gaussian distribution parameterized by a mean vector and covariance matrix. It describes fitting these parameters from a training set and using the fitted density to score new data points: a point whose density falls below a threshold is flagged as anomalous.
Anomaly detection
Problem motivation
Machine Learning
Anomaly detection example
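Aircraft engine features:
$x_1$ = heat generated
$x_2$ = vibration intensity
Dataset: $\{x^{(1)}, x^{(2)}, \dots, x^{(m)}\}$
New engine: $x_{test}$
[Figure: scatter plot with axes $x_1$ (heat) and $x_2$ (vibration)]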
Andrew Ng
Density estimation
Dataset: $\{x^{(1)}, x^{(2)}, \dots, x^{(m)}\}$
Is $x_{test}$ anomalous?
[Figure: scatter plot with axes $x_1$ (heat) and $x_2$ (vibration)]
Anomaly detection example
Fraud detection:
$x^{(i)}$ = features of user $i$’s activities
Model $p(x)$ from data.
Identify unusual users by checking which have $p(x) < \varepsilon$
Manufacturing
Monitoring computers in a data center.
$x^{(i)}$ = features of machine $i$
$x_1$ = memory use, $x_2$ = number of disk accesses/sec,
$x_3$ = CPU load, $x_4$ = CPU load/network traffic.
Anomaly detection
Gaussian distribution
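Gaussian (Normal) distribution
Say $x \in \mathbb{R}$. If $x$ is Gaussian-distributed with mean $\mu$ and variance $\sigma^2$, we write $x \sim \mathcal{N}(\mu, \sigma^2)$.
$$p(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$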
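The Gaussian density with mean $\mu$ and variance $\sigma^2$ can be evaluated directly. The following is a minimal sketch; the function name `gaussian_pdf` and the sample values are illustrative assumptions, not part of the slides.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at x (sigma2 is the variance)."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma2))

# The density peaks at the mean and is symmetric around it.
peak = gaussian_pdf(0.0, mu=0.0, sigma2=1.0)
left = gaussian_pdf(-1.0, mu=0.0, sigma2=1.0)
right = gaussian_pdf(1.0, mu=0.0, sigma2=1.0)
```

A larger variance lowers the peak and widens the curve, since the area under the density always equals 1.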
Gaussian distribution example

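Parameter estimation
Dataset: $\{x^{(1)}, x^{(2)}, \dots, x^{(m)}\}$, $x^{(i)} \in \mathbb{R}$
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad \sigma^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)} - \mu\right)^2$$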
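Estimating the mean and variance of a single feature from a dataset can be sketched as below. This assumes the $1/m$ (maximum-likelihood) variance estimate rather than the $1/(m-1)$ one; the helper name and the toy numbers are made up for illustration.

```python
import numpy as np

def estimate_gaussian_1d(x):
    """Return (mu, sigma2) using the 1/m (maximum-likelihood) estimates."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma2 = ((x - mu) ** 2).mean()   # divides by m, not m - 1
    return mu, sigma2

# Toy dataset: mean is 5, squared deviations sum to 32, so variance is 32 / 8 = 4.
mu, sigma2 = estimate_gaussian_1d([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

For large $m$ the two variance conventions give nearly the same value.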
Anomaly detection
Algorithm
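Density estimation
Training set: $\{x^{(1)}, x^{(2)}, \dots, x^{(m)}\}$
Each example is $x \in \mathbb{R}^n$
$$p(x) = p(x_1; \mu_1, \sigma_1^2)\, p(x_2; \mu_2, \sigma_2^2) \cdots p(x_n; \mu_n, \sigma_n^2) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2)$$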
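Anomaly detection algorithm
1. Choose features $x_j$ that you think might be indicative of anomalous examples.
2. Fit parameters $\mu_1, \dots, \mu_n, \sigma_1^2, \dots, \sigma_n^2$:
$$\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}, \qquad \sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x_j^{(i)} - \mu_j\right)^2$$
3. Given new example $x$, compute $p(x)$:
$$p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)$$
Anomaly if $p(x) < \varepsilon$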
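The fitting and scoring steps of the algorithm can be sketched in NumPy as follows. The training data is synthetic, and the threshold `epsilon`, the function names, and the two new points are hypothetical values chosen only to illustrate the flow.

```python
import numpy as np

def fit_independent_gaussians(X):
    """Step 2: per-feature mean and variance (1/m estimates). X is (m, n)."""
    mu = X.mean(axis=0)
    sigma2 = ((X - mu) ** 2).mean(axis=0)
    return mu, sigma2

def density(X, mu, sigma2):
    """Step 3: p(x) as the product over features of univariate Gaussians."""
    per_feature = np.exp(-((X - mu) ** 2) / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
    return per_feature.prod(axis=1)

rng = np.random.default_rng(0)
# Synthetic "normal" examples with two features.
X_train = rng.normal(loc=[5.0, 10.0], scale=[1.0, 2.0], size=(500, 2))
mu, sigma2 = fit_independent_gaussians(X_train)

epsilon = 1e-4                                   # hypothetical threshold
X_new = np.array([[5.0, 10.0], [12.0, 30.0]])    # a typical point, a far-away point
p_new = density(X_new, mu, sigma2)
is_anomaly = p_new < epsilon                     # anomaly if p(x) < epsilon
```

With many features the product can underflow, so a practical variant might sum log-densities and compare against $\log \varepsilon$ instead.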
Anomaly detection example

Anomaly detection
Developing and evaluating an anomaly detection system
The importance of real-number evaluation
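When developing a learning algorithm (choosing features, etc.), making decisions is much easier if we have a way of evaluating our learning algorithm.
Assume we have some labeled data, of anomalous and non-anomalous examples ($y = 0$ if normal, $y = 1$ if anomalous).
Training set: $x^{(1)}, x^{(2)}, \dots, x^{(m)}$ (assume normal examples/not anomalous)
Cross validation set: $(x_{cv}^{(1)}, y_{cv}^{(1)}), \dots, (x_{cv}^{(m_{cv})}, y_{cv}^{(m_{cv})})$
Test set: $(x_{test}^{(1)}, y_{test}^{(1)}), \dots, (x_{test}^{(m_{test})}, y_{test}^{(m_{test})})$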
Aircraft engines motivating example
10000 good (normal) engines
20 flawed engines (anomalous)

Training set: 6000 good engines
CV: 2000 good engines ($y = 0$), 10 anomalous ($y = 1$)
Test: 2000 good engines ($y = 0$), 10 anomalous ($y = 1$)

Alternative (CV and test sets share the same 4000 good engines, since only 10000 are available):
Training set: 6000 good engines
CV: 4000 good engines ($y = 0$), 10 anomalous ($y = 1$)
Test: 4000 good engines ($y = 0$), 10 anomalous ($y = 1$)
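The first split (6000 / 2000 / 2000 good engines, with the 20 flawed engines divided between CV and test) might be built as below. The index arrays are hypothetical stand-ins for the actual engine records, and the random shuffle is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
good = rng.permutation(10000)      # shuffled indices of the 10000 good engines (y = 0)
flawed = rng.permutation(20)       # shuffled indices of the 20 flawed engines (y = 1)

train_good = good[:6000]                           # training set: good engines only
cv_good, test_good = good[6000:8000], good[8000:]  # 2000 good engines each
cv_flawed, test_flawed = flawed[:10], flawed[10:]  # 10 anomalous each

# Labels for the cross validation set: 0 for good, 1 for anomalous.
y_cv = np.concatenate([np.zeros(len(cv_good), dtype=int),
                       np.ones(len(cv_flawed), dtype=int)])
```

No engine appears in more than one set here, which keeps the test-set estimate independent of the choices made on the CV set.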
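Algorithm evaluation
Fit model $p(x)$ on training set $\{x^{(1)}, \dots, x^{(m)}\}$
On a cross validation/test example $x$, predict
$$y = \begin{cases} 1 & \text{if } p(x) < \varepsilon \text{ (anomaly)} \\ 0 & \text{if } p(x) \geq \varepsilon \text{ (normal)} \end{cases}$$
Possible evaluation metrics:
- True positive, false positive, false negative, true negative
- Precision/Recall
- F1-score

Can also use cross validation set to choose parameter $\varepsilon$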
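Choosing $\varepsilon$ on the cross validation set can be sketched as a small search that keeps the threshold with the best F1-score. The helper names, the candidate thresholds, and the toy densities below are assumptions made for illustration.

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 from true/false positives and false negatives (positive class: y = 1)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def choose_epsilon(p_cv, y_cv, candidates):
    """Return the candidate threshold with the highest F1 on the CV set."""
    scores = [f1_score(y_cv, (p_cv < eps).astype(int)) for eps in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

# Toy CV densities: the two anomalies (y = 1) have the smallest p(x).
p_cv = np.array([0.30, 0.25, 0.20, 0.001, 0.15, 0.002])
y_cv = np.array([0, 0, 0, 1, 0, 1])
best_eps, best_f1 = choose_epsilon(p_cv, y_cv, candidates=[0.0005, 0.01, 0.22])
```

Classification accuracy is a poor metric here because the classes are so skewed: predicting $y = 0$ for everything would already score very high.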
Anomaly detection
Anomaly detection vs. supervised learning
Anomaly detection vs. Supervised learning
Anomaly detection:
• Very small number of positive examples ($y = 1$). (0-20 is common).
• Large number of negative ($y = 0$) examples.
• Many different “types” of anomalies. Hard for any algorithm to learn from positive examples what the anomalies look like; future anomalies may look nothing like any of the anomalous examples we’ve seen so far.
Supervised learning:
• Large number of positive and negative examples.
• Enough positive examples for algorithm to get a sense of what positive examples are like, future positive examples likely to be similar to ones in training set.
Anomaly detection vs. Supervised learning
Anomaly detection:
• Fraud detection
• Manufacturing (e.g. aircraft engines)
• Monitoring machines in a data center
Supervised learning:
• Email spam classification
• Weather prediction (sunny/rainy/etc).
• Cancer classification
Anomaly detection
Choosing what features to use
Non-gaussian features
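One common remedy for a feature whose histogram is far from Gaussian is to transform it (for example with a logarithm or a fractional power) before fitting. The sketch below assumes a right-skewed, strictly positive feature and a log transform; the synthetic data and the `skewness` helper are illustrative only.

```python
import numpy as np

def skewness(x):
    """Sample skewness: third standardized moment (near 0 for symmetric data)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

rng = np.random.default_rng(0)
x_raw = rng.lognormal(mean=0.0, sigma=1.0, size=5000)   # synthetic right-skewed feature
x_log = np.log(x_raw)                                   # candidate replacement feature

skew_raw, skew_log = skewness(x_raw), skewness(x_log)
```

Plotting a histogram before and after the transform is the usual way to judge whether the new feature looks closer to Gaussian.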
Error analysis for anomaly detection
Want $p(x)$ large for normal examples $x$.
$p(x)$ small for anomalous examples $x$.
Most common problem:
$p(x)$ is comparable (say, both large) for normal and anomalous examples
Monitoring computers in a data center
Choose features that might take on unusually large or small values in the event of an anomaly.
$x_1$ = memory use of computer
$x_2$ = number of disk accesses/sec
$x_3$ = CPU load
$x_4$ = network traffic
Anomaly detection
Multivariate Gaussian distribution
Motivating example: Monitoring machines in a data center
[Figure: plots with axes CPU Load and Memory Use]
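Multivariate Gaussian (Normal) distribution
$x \in \mathbb{R}^n$. Don’t model $p(x_1), p(x_2),$ etc. separately.
Model $p(x)$ all in one go.
Parameters: $\mu \in \mathbb{R}^n$, $\Sigma \in \mathbb{R}^{n \times n}$ (covariance matrix)
$$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$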
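The multivariate Gaussian density with mean vector $\mu$ and covariance matrix $\Sigma$ can be evaluated as in the sketch below. The function name and the example covariance are assumptions chosen to show how correlation changes the density.

```python
import numpy as np

def multivariate_gaussian_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at a single point x of shape (n,)."""
    n = mu.shape[0]
    diff = x - mu
    # Solve Sigma @ z = diff instead of forming the inverse explicitly.
    quad = diff @ np.linalg.solve(Sigma, diff)
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(Sigma))
    return float(np.exp(-0.5 * quad) / norm)

mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])   # positively correlated features (illustrative values)

p_along = multivariate_gaussian_pdf(np.array([1.0, 1.0]), mu, Sigma)     # follows the correlation
p_against = multivariate_gaussian_pdf(np.array([1.0, -1.0]), mu, Sigma)  # goes against it
```

Both points are equally far from the mean in each coordinate, yet the one that goes against the correlation gets a much lower density. A per-feature model would score them identically.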
Multivariate Gaussian (Normal) examples
[Figures: six slides of example plots]
Anomaly detection
Anomaly detection using the multivariate Gaussian distribution
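Multivariate Gaussian (Normal) distribution
Parameters $\mu, \Sigma$
$$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$
Parameter fitting:
Given training set $\{x^{(1)}, x^{(2)}, \dots, x^{(m)}\}$
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad \Sigma = \frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)} - \mu\right)\left(x^{(i)} - \mu\right)^T$$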
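Anomaly detection with the multivariate Gaussian
1. Fit model $p(x)$ by setting
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad \Sigma = \frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)} - \mu\right)\left(x^{(i)} - \mu\right)^T$$
2. Given a new example $x$, compute
$$p(x) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$
Flag an anomaly if $p(x) < \varepsilon$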
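The two steps above can be sketched end to end as follows. The training data is synthetic, and the threshold `epsilon`, the function names, and the test points are hypothetical.

```python
import numpy as np

def fit_multivariate_gaussian(X):
    """mu and Sigma from an (m, n) training matrix, using the 1/m estimates."""
    m = X.shape[0]
    mu = X.mean(axis=0)
    centered = X - mu
    Sigma = centered.T @ centered / m
    return mu, Sigma

def density(X, mu, Sigma):
    """p(x; mu, Sigma) for each row of X."""
    n = mu.shape[0]
    centered = X - mu
    # Row-wise quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    quad = np.sum(centered * np.linalg.solve(Sigma, centered.T).T, axis=1)
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

rng = np.random.default_rng(0)
true_cov = [[1.0, 0.9], [0.9, 1.0]]                 # synthetic, strongly correlated features
X_train = rng.multivariate_normal([0.0, 0.0], true_cov, size=2000)
mu, Sigma = fit_multivariate_gaussian(X_train)

epsilon = 1e-3                                      # hypothetical threshold
X_new = np.array([[1.5, 1.5], [1.5, -1.5]])
flags = density(X_new, mu, Sigma) < epsilon         # flag an anomaly if p(x) < epsilon
```

Only the second point is flagged: each of its coordinates is unremarkable on its own, but the combination is unusual given the fitted correlation.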
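Relationship to original model
Original model: $p(x) = p(x_1; \mu_1, \sigma_1^2) \times p(x_2; \mu_2, \sigma_2^2) \times \cdots \times p(x_n; \mu_n, \sigma_n^2)$
Corresponds to multivariate Gaussian
$$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$
where $\Sigma = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2)$, i.e. all off-diagonal entries are zero.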
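The correspondence can be checked numerically: with a diagonal covariance matrix, the multivariate density equals the product of the per-feature densities. The parameter values below are arbitrary illustrative choices.

```python
import numpy as np

mu = np.array([1.0, -2.0, 0.5])
sigma2 = np.array([0.5, 2.0, 1.5])     # per-feature variances (illustrative values)
x = np.array([0.3, -1.0, 2.0])

# Original model: product of univariate Gaussian densities.
p_product = np.prod(
    np.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
)

# Multivariate Gaussian with Sigma = diag(sigma2).
Sigma = np.diag(sigma2)
diff = x - mu
quad = diff @ np.linalg.solve(Sigma, diff)
norm = (2.0 * np.pi) ** (len(mu) / 2.0) * np.sqrt(np.linalg.det(Sigma))
p_multivariate = np.exp(-0.5 * quad) / norm
```

The two values agree up to floating-point rounding, because the determinant of a diagonal matrix is the product of its diagonal entries and the quadratic form splits into a sum over features.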
Original model vs. Multivariate Gaussian
Original model:
• Manually create features to capture anomalies where $x_1, x_2$ take unusual combinations of values.
• Computationally cheaper (alternatively, scales better to large $n$)
• OK even if $m$ (training set size) is small
Multivariate Gaussian:
• Automatically captures correlations between features
• Computationally more expensive
• Must have $m > n$, or else $\Sigma$ is non-invertible.
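The $m > n$ requirement can be seen by checking the rank of the estimated covariance matrix. This is a small sketch on synthetic data; the helper name and the sizes are assumptions.

```python
import numpy as np

def covariance_1_over_m(X):
    """Sigma = (1/m) * sum of outer products of the centered examples."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / X.shape[0]

rng = np.random.default_rng(0)
n = 5
Sigma_small = covariance_1_over_m(rng.normal(size=(3, n)))     # m = 3 <= n
Sigma_large = covariance_1_over_m(rng.normal(size=(200, n)))   # m = 200 > n

rank_small = np.linalg.matrix_rank(Sigma_small)   # at most m - 1, so Sigma is singular
rank_large = np.linalg.matrix_rank(Sigma_large)   # full rank for this data
```

Redundant features (for example, one feature that is an exact linear combination of others) also make $\Sigma$ singular, regardless of $m$.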
