Baseline Monitoring Anomaly Detection

zabbix baseline

Uploaded by

reza sepehri

BASELINE MONITORING AND ANOMALY DETECTION
• all our microphones are muted
• ask your questions in Q&A, not in the Chat
• use Chat for discussion, networking or applause
• use the hashtag if you post something on your social media channels: #ZabbixMeetingOnline
ZABBIX 6.0 LTS
One of the new features of Zabbix 6.0 LTS is a focus on anomaly detection

Zabbix 6.0 offers:

• The ability to use trends to analyze large periods of data
• Baseline calculation using the baselinewma and baselinedev functions
• Anomalous metric detection using the trendstl function

01 BASELINE MONITORING OVERVIEW
BASELINE AND ANOMALIES
Anomaly detection is a type of data analytics whose goal is detecting unusual patterns in a dataset:
Data must be normally distributed, following a set of rules
Anomaly detection measures how far a data point is away from the mean
When the value deviates too much from the mean, it is considered to be anomalous

[Figure: Anomaly]

BASELINE VS FIXED TRIGGERS
BASELINE MONITORING CAN MONITOR VALUES WHICH FOLLOW A PATTERN
Fixed trigger thresholds will either give false alarms or ignore the problem
Baseline monitoring can adapt to such situations

[Figure annotations: This situation was "normal" a few weeks ago; Anomaly]
ANOMALY DETECTION OVERVIEW
Anomaly detection is based on a set of statistical functions using:
Standard deviation (σ)
Mean absolute deviation (MAD)
Weighted moving average (WMA)
Seasonal and Trend decomposition using Loess (STL)

STANDARD DEVIATION
The standard deviation σ defines how far the normal distribution is spread around the mean
When a metric is normally distributed it follows some interesting laws:
• 68% of all values fall between [mean-σ, mean+σ]
• 95% of all values fall between [mean-2*σ, mean+2*σ]
• 99.7% of all values fall between [mean-3*σ, mean+3*σ]
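
As a quick check of these percentages, the following Python sketch computes the share of a normal distribution that lies within k standard deviations of the mean. It relies only on the standard library; the helper name coverage is illustrative and not part of Zabbix.

```python
from statistics import NormalDist


def coverage(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Print the coverage for 1, 2 and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage(k):.1%}")
```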

MEAN ABSOLUTE DEVIATION (MAD)
The mean absolute deviation (MAD) of a dataset is the average distance between each data point and the mean
• Calculate the mean
• Calculate how far away each data point is from the mean using positive distances (deviations)
• Sum those deviations together
• Divide the sum by the number of data points
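
The four steps above can be sketched in a few lines of Python. This is a minimal illustration of the calculation as described on this slide, not the code Zabbix runs; the function name is hypothetical.

```python
def mean_absolute_deviation(values):
    """Average absolute distance between each data point and the mean."""
    # Step 1: calculate the mean.
    mean = sum(values) / len(values)
    # Step 2: positive distance of each data point from the mean.
    deviations = [abs(v - mean) for v in values]
    # Steps 3 and 4: sum the deviations and divide by the number of points.
    return sum(deviations) / len(values)


print(mean_absolute_deviation([2, 4, 6, 8]))
```

For the sample [2, 4, 6, 8] the mean is 5 and the distances are 3, 1, 1 and 3, so the result is 8 / 4 = 2.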

WEIGHTED MOVING AVERAGE ALGORITHM
A weighted moving average (WMA) puts more weight on recent data and less on past data
The most recent data is more heavily weighted, and contributes more to the final WMA value
The weighting factor used to calculate the WMA is determined by the period

For example, a 5-period WMA, where P1 is the most recent value, would be calculated as follows:

WMA = ((P1 * 5) + (P2 * 4) + (P3 * 3) + (P4 * 2) + (P5 * 1)) / (5 + 4 + 3 + 2 + 1)
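
The same calculation for any number of periods can be sketched in Python. The sketch assumes the values are passed newest first, so that the most recent value receives the largest weight; the function name is illustrative.

```python
def weighted_moving_average(values_newest_first):
    """Linearly weighted moving average; the newest value gets the largest weight."""
    n = len(values_newest_first)
    weights = range(n, 0, -1)  # n, n-1, ..., 1
    weighted_sum = sum(v * w for v, w in zip(values_newest_first, weights))
    return weighted_sum / sum(weights)


# 5-period example: (10*5 + 20*4 + 30*3 + 40*2 + 50*1) / 15
print(weighted_moving_average([10, 20, 30, 40, 50]))
```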

02 TIMESHIFTS
TIMESHIFT SYNTAX
Zabbix can use an absolute or relative timeshift to compare current and past periods of data
Relative timeshift specifies the time period relative to the current time:

trendavg(/host/key,1d:now-3d)

Absolute timeshift specifies the time period for analysis:

trendavg(/host/key,1d:now/d-3d)
RELATIVE TIMESHIFT
Relative timeshift specifies a sliding time period relative to the current time:

[Figure: two sliding 1d windows on a timeline, trendavg(//key,1d:now-3d) and trendavg(//key,1d), 3 days apart]
Three days ago: 2022-04-24 00:00:00 (now-3d: 2022-04-24 16:00:00)
Two days ago: 2022-04-25 00:00:00
Yesterday: 2022-04-26 00:00:00
Today: 2022-04-27 00:00:00
Now: 2022-04-27 16:00:00

ABSOLUTE TIMESHIFT
Absolute timeshift specifies a fixed time period calculated from the end of the period:

[Figure: two fixed 1d windows on a day-aligned timeline, trendavg(//key,1d:now/d-3d) and trendavg(//key,1d:now/d+1d), with a 3-day span marked]
Three days ago: 2022-04-24 00:00:00
Two days ago: 2022-04-25 00:00:00
Yesterday: 2022-04-26 00:00:00
Today: 2022-04-27 00:00:00
Now: 2022-04-27 17:00:00
Tomorrow: 2022-04-28 00:00:00
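
To make the difference between the two timeshift styles concrete, here is a small Python sketch that resolves both kinds of window into start and end timestamps. It assumes that now/d rounds down to midnight of the current day, so that 1d:now/d means "yesterday". The function names are hypothetical, and only day-based absolute shifts are covered.

```python
from datetime import datetime, timedelta


def relative_window(now, period, shift=timedelta(0)):
    """period:now-shift, a window of length `period` ending at now - shift."""
    end = now - shift
    return end - period, end


def absolute_day_window(now, period_days, shift_days=0):
    """<N>d:now/d+<shift>d, a window ending at midnight today plus shift_days."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)  # now/d
    end = midnight + timedelta(days=shift_days)
    return end - timedelta(days=period_days), end


now = datetime(2022, 4, 27, 17, 0, 0)
print(relative_window(now, timedelta(days=1), timedelta(days=3)))  # 1d:now-3d
print(absolute_day_window(now, 1, -3))                             # 1d:now/d-3d
print(absolute_day_window(now, 1, +1))                             # 1d:now/d+1d
```

The relative window moves with every evaluation, while the absolute window stays fixed until the day boundary is crossed.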

03 TREND FUNCTIONS
TREND FUNCTIONS
Zabbix 6.0 offers eight different trend functions for long-term data analysis
Trend analysis:
• trendsum(/host/key,time period:time shift)
• trendavg(/host/key,time period:time shift)
• trendcount(/host/key,time period:time shift)
• trendmax(/host/key,time period:time shift)
• trendmin(/host/key,time period:time shift)
• trendstl(/host/key,eval period:time shift,detection period,season,<dev>,<devalg>,<s_window>)
Baseline calculation:
• baselinedev(/host/key,data period:time shift,season_unit,num_seasons)
• baselinewma(/host/key,data period:time shift,season_unit,num_seasons)

STORING TRENDS
Trends in the trend cache are calculated in real time, independently of history data
TrendCache always holds the current trend values for every item
• to calculate the new average after the nth number, multiply the old average by n−1, add the new number, and divide the total by n
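
The incremental average rule in the bullet above can be written as a short Python sketch. It only illustrates the arithmetic; the function name is hypothetical and this is not the Zabbix server code.

```python
def update_average(old_avg, new_value, n):
    """New average after the nth value, computed from the previous average."""
    return (old_avg * (n - 1) + new_value) / n


# Feed values one at a time without keeping the full list in memory.
avg = 0.0
for n, value in enumerate([10.0, 20.0, 30.0], start=1):
    avg = update_average(avg, value, n)

print(avg)
```

Because only the running average and the count are needed, the average can be maintained without access to the individual history values.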

It is possible to store only trends data

TREND CACHES
Zabbix trends are written to the database at the beginning of each hour
Trends for the current hour are unavailable for trend functions
Two different Zabbix server internal caches are used by trends

### Option: TrendCacheSize


# Size of trend write cache, in bytes. Shared memory size for storing trends data.
# Range: 128K-2G
# Default:
TrendCacheSize=512M

### Option: TrendFunctionCacheSize


# Size of trend function cache, in bytes.
# Shared memory size for caching calculated trend function data.
# Range: 128K-2G
# Default:
TrendFunctionCacheSize=256M

TRENDS VS HISTORY
TRENDS UTILIZE LESS SPACE THAN HISTORY IN BOTH DATABASE AND MEMORY CACHES
When functions with a timeshift are used for analysis, all data between now and the function period is stored in the memory cache
History data could take up gigabytes in such scenarios; trends are much more efficient

trendsum(//key,1d:now/d-3d)

[Figure: data held in the TrendFunctionCache for the period 1d:now/d-3d, shown on a timeline]
Three days ago: 2022-04-24 00:00:00
Two days ago: 2022-04-25 00:00:00
Yesterday: 2022-04-26 00:00:00
Today: 2022-04-27 00:00:00
Now: 2022-04-27 17:00:00
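
A rough back-of-the-envelope comparison shows why trends are cheaper to cache than history. The sketch assumes one trend record per item per hour and a hypothetical item update interval of 60 seconds; both the interval and the function name are illustrative.

```python
def rows_in_window(window_hours, update_interval_s):
    """Number of history values vs. hourly trend records for one item."""
    history_rows = window_hours * 3600 // update_interval_s
    trend_rows = window_hours  # assumption: one trend record per hour
    return history_rows, trend_rows


# Four days of data for a single item polled every 60 seconds (assumed interval).
history_rows, trend_rows = rows_in_window(window_hours=96, update_interval_s=60)
print(history_rows, trend_rows, history_rows // trend_rows)
```

With these assumptions history holds 60 times as many values as trends for the same window, and the gap grows with shorter update intervals.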
04 BASELINE MONITORING
BASELINE MONITORING OVERVIEW
Baseline monitoring can be used to analyze recent data by:
Comparing it to a baseline from previous periods (baselinewma)
Calculating the number of deviations from previous periods (baselinedev)

Previous data periods must be defined as seasons using:


Season units (h, d, w, M, y) - cannot be smaller than data period
Number of seasons

baseline*(/host/key,1h:now/h,"d",3)
baseline function based on the last full hour within the last 3-day period

MORE BASELINE FUNCTION EXAMPLES

baseline*(/host/key,1d:now/d,"M",6)
baseline function based on the previous day and the same day of month in the previous 6 months
If the date does not exist in a previous month, last day of month will be used

baseline*(/host/key,2h:now/h,"d",7)
baseline function based on the last two hours and the same hours within a 7-day period

baseline*(/host/key,1w:now/w,"M",3)
baseline function based on the previous week and other weeks within a 3-month period
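
To show how a data period and its seasons line up, the following Python sketch lists the windows that a call such as baseline*(/host/key,1h:now/h,"d",3) would look at. It assumes each season is the same data period shifted back by a whole number of season units. The "M" and "y" units are left out because calendar months and years vary in length, and all names are hypothetical.

```python
from datetime import datetime, timedelta

# Fixed-length season units only; "M" and "y" need calendar arithmetic.
SEASON_UNITS = {"h": timedelta(hours=1), "d": timedelta(days=1), "w": timedelta(weeks=1)}


def season_windows(now, period_hours, season_unit, num_seasons):
    """Windows of the same hours in each of the preceding seasons, newest first."""
    period_end = now.replace(minute=0, second=0, microsecond=0)  # now/h
    period_start = period_end - timedelta(hours=period_hours)
    step = SEASON_UNITS[season_unit]
    return [
        (period_start - step * k, period_end - step * k)
        for k in range(1, num_seasons + 1)
    ]


# 1h:now/h with season unit "d" and 3 seasons, evaluated at 13:30.
for start, end in season_windows(datetime(2022, 4, 27, 13, 30), 1, "d", 3):
    print(start, "to", end)
```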
BASELINEWMA() FUNCTION
Calculates baseline data by averaging data from the same timeframe in multiple equal time periods
Weighted moving average algorithm (WMA) is used
Baseline can be compared to recent trends data to detect anomalies

baselinewma(/host/key,data period:time shift,season_unit,num_seasons)

trendavg(/Web/nginx.req,1d:now/d) > baselinewma(/Web/nginx.req,1d:now/d,"w",4) * 2

Web requests yesterday (1d:now/d) are more than twice as high as the baseline for the same weekday over the last 4 weeks ("w",4)
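
The comparison behind such a trigger can be sketched in Python. The sketch assumes linear weights, as in the 5-period WMA example earlier in the slides, with the most recent season weighted highest; the exact weighting Zabbix applies is not stated here. The function names and sample numbers are hypothetical.

```python
def wma_baseline(season_averages_newest_first):
    """Weighted moving average over per-season averages (newest weighted highest)."""
    n = len(season_averages_newest_first)
    weights = range(n, 0, -1)
    total = sum(v * w for v, w in zip(season_averages_newest_first, weights))
    return total / sum(weights)


def exceeds_baseline(current, season_averages_newest_first, factor=2.0):
    """True when the current value is more than `factor` times the baseline."""
    return current > wma_baseline(season_averages_newest_first) * factor


# Yesterday's average vs. the same weekday in the previous 4 weeks (sample data).
previous_weeks = [100.0, 110.0, 90.0, 100.0]
print(wma_baseline(previous_weeks))
print(exceeds_baseline(250.0, previous_weeks))
```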

BASELINEDEV() FUNCTION
Calculates the number of deviations (σ) between the last data period and the same periods in preceding seasons
The stddevpop algorithm is used (standard deviation calculated over the entire population)
A high number of deviations indicates an anomaly

baselinedev(/host/key,data period:time shift,season_unit,num_seasons)

baselinedev(/Production server/system.cpu.load,1h:now/h,"d",10) > 3

Check whether the load for the last hour is more than 3 deviations away from the mean, using the same one-hour period over the last 10 days
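
A minimal Python sketch of the idea follows: measure how many population standard deviations the last value lies from the mean of the seasonal values. Whether Zabbix includes the last period in the population is not stated on the slide; this sketch excludes it, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def deviations_from_baseline(last_value, season_values):
    """Distance of last_value from the seasonal mean, in population std deviations."""
    center = mean(season_values)
    sigma = pstdev(season_values)  # population standard deviation (stddevpop)
    if sigma == 0:
        # All seasons identical: any difference is infinitely many deviations.
        return 0.0 if last_value == center else float("inf")
    return abs(last_value - center) / sigma


# Sample data: seasonal values with mean 2.0 and population std deviation 1.0.
print(deviations_from_baseline(5.0, [1.0, 3.0]) > 3)
```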

05 ANOMALY DETECTION USING PATTERNS
TRENDSTL FUNCTION
STL decomposes data in predefined intervals and finds anomalies based on a repeating pattern.

The trendstl() function uses a deviation algorithm to detect anomalies and returns the anomaly rate (0 - 1):
• The function compares a smaller detection period to a larger evaluation period
• The standard deviation measures how far values are from the average
• By default, the MAD algorithm is used; stddevpop or stddevsamp can also be selected
• The number of deviations can be specified (default is 3)
• s_window is the span (in lags) of the loess window for seasonal extraction

trendstl(/host/key,eval period:time shift,detection period,season,<dev>,<devalg>,<s_window>)

SEASONAL-TREND DECOMPOSITION USING LOESS
Seasonal-Trend decomposition using LOESS (STL) is a robust method of time series decomposition
The STL method uses locally fitted regression models to decompose a time series into
• trend components
• seasonal components
• residual components
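
To illustrate what the three components mean, here is a deliberately simplified additive decomposition in Python. It is not STL: instead of LOESS regression it uses a centered moving average for the trend and per-phase averages for the seasonal part. All names and the sample series are hypothetical.

```python
from statistics import mean


def decompose(values, period):
    """Split a series into trend, seasonal and residual parts (additive model)."""
    n = len(values)
    half = period // 2
    # Trend: centered moving average; windows are truncated at the edges.
    trend = [mean(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]
    detrended = [v - t for v, t in zip(values, trend)]
    # Seasonal: average detrended value for each position within the period.
    phase_means = [mean(detrended[p::period]) for p in range(period)]
    seasonal = [phase_means[i % period] for i in range(n)]
    # Residual: whatever trend and seasonality do not explain.
    residual = [v - t - s for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual


series = [10, 20, 10, 20, 10, 20, 10, 60]  # the last value breaks the pattern
trend, seasonal, residual = decompose(series, period=2)
print([round(r, 2) for r in residual])
```

Values that fit the repeating pattern leave small residuals, so a large residual is a candidate anomaly.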

TRENDSTL FUNCTION EXAMPLE

trendstl(/host/net.if.out[eth0],30d:now/d,7d,12h) > 0.1

Analyzes the last 30 days of trend data
Finds the anomaly rate for the last 7 days of that period
Expects a periodicity of 12h (traffic varies between day and night)
The number of deviations that counts as an anomaly is 3 (default)
The MAD algorithm is used (default)
The trigger fires when the anomaly rate is larger than 0.1 (10% of all values)
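
How an anomaly rate could be derived from decomposition residuals is sketched below in Python. This is an illustration under stated assumptions, not the Zabbix implementation: the deviation is the mean absolute deviation as defined earlier in the slides, it is computed over the whole evaluation period, and the rate is the share of detection-period values that lie more than dev deviations from the mean. The function name and sample data are hypothetical.

```python
def anomaly_rate(residuals, detection_len, dev=3.0):
    """Share of the last `detection_len` residuals that deviate too far."""
    if detection_len <= 0 or detection_len > len(residuals):
        raise ValueError("detection_len must be between 1 and len(residuals)")
    center = sum(residuals) / len(residuals)
    # Mean absolute deviation over the whole evaluation period.
    mad = sum(abs(r - center) for r in residuals) / len(residuals)
    recent = residuals[-detection_len:]
    anomalies = [r for r in recent if abs(r - center) > dev * mad]
    return len(anomalies) / len(recent)


# Sample residuals: a regular pattern with one spike in the detection period.
residuals = [1.0, -1.0] * 10 + [1.0, -1.0, 20.0, -1.0]
rate = anomaly_rate(residuals, detection_len=4)
print(rate, rate > 0.1)
```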

Thank you
www.zabbix.com
