Lab 2: Grouping and Aggregation
1. Introduction
This lab builds on your previous knowledge of DQL with the addition of methods for grouping and aggregating
data, as well as adding fields to your log entries.
2. Command: Summarize
The summarize command groups records that share the same values for a specified field and aggregates
them. Keep in mind that summarize can appear only once in a query, but that single command can perform
multiple aggregations.
Resources
summarize
count()
countIf()
filterOut
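The query for Lab 2.1 itself appears to be missing from this copy; based on the follow-on query in 2.2, it was likely:
fetch logs, from:now() -1m
| summarize count()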
This returns the number of logs found within the last minute.
⚙ 2.2 Lab – Log Source
In many cases, you will want to know the origin of log files. To do this, you can use the log.source parameter
within your query.
1. Building on the query from 2.1, add the log source as shown here:
fetch logs, from:now() -1m
| summarize count(), by:{log.source}
2. Take note of the surrounding braces.
3. Run the query.
The results will look something like this:
2.3 Additional Practice
Try these additional options for practice:
1. Fetch the count of all logs summarized by status:
fetch logs
| summarize count(), by:{status}
2. Try adding in the count by status for the last hour.
3. Try filtering out (filterOut) logs that do not have a status, i.e., where status == "NONE":
fetch logs, from:now() -1h
| filterOut status == "NONE"
| summarize count(), by:{status}
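The same result can also be reached with the filter command and a negated condition; a quick sketch:
fetch logs, from:now() -1h
| filter status != "NONE"
| summarize count(), by:{status}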
3. Command: Sort
Sorting records is standard practice with any type of query. In DQL, use the sort command to order results
in either ascending (default) or descending order.
Resources
sort
⚙ 3.2 Lab – Sort by Log Source Count
In the previous labs, we summarized the count of logs by log source. To sort by that count, we can
add this line: | sort `count()`
1. Enter the full query as:
fetch logs, from:now() -10m
| summarize count(), by:{log.source}
| sort `count()`
2. Add the option so the list is sorted with the log source with the most logs at the top.
The results will be similar to:
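For reference, descending order is requested with the desc keyword, so the full query for step 2 would look like:
fetch logs, from:now() -10m
| summarize count(), by:{log.source}
| sort `count()` desc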
4. Naming
You may have noticed in 3.2 that summarizing and then sorting by the count() of logs was a little awkward:
the sort command had to reference the function call itself. A best practice is to assign names to these
values instead of referring to the function. Named values can then be reused more easily.
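For example, the query from 3.2 can be rewritten with a named value (the name logCount here is just an illustration):
fetch logs, from:now() -10m
| summarize logCount = count(), by:{log.source}
| sort logCount desc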
5. Working with Fields
Resources
fields
fieldsAdd
formatTimestamp
fieldsRemove
fieldsRename
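As a quick illustration of these commands working together (the field names shown are examples and may differ in your environment; fieldsRemove works like fields but drops the listed fields instead of keeping them):
fetch logs, from:now() -10m
| fieldsAdd time = formatTimestamp(timestamp, format:"HH:mm:ss")
| fieldsRename source = log.source
| fields time, source, content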
6. Calculations
Now that you have seen how to summarize data, use functions, add names, and create fields, let’s pull that all
together into a query that performs a calculation and then stores it in a field.
Resources
round
toString
concat
3. Next, use those values to calculate the percentage of ERROR logs and add it to a field. This is where
naming is important because we can use totallogs and errorlogs in the formulas:
fetch logs, from:now() -1h
| summarize totallogs = count(), errorlogs = countIf(loglevel=="ERROR")
| fieldsAdd percentErrorLogs = (toDouble(errorlogs)/toDouble(totallogs))*100
6.2 Additional Practice
Try using the following functions to clean up the display of the percentage:
a. Use round to remove decimal places.
b. Use toString to convert it to a string and concat to add a "%" sign.
c. Remove the fields so only the percentErrorLogs remains.
d. Go to Actions and select ‘Pin to dashboard’ to display the value.
fetch logs, from:now() -1h
| summarize totallogs = count(), errorlogs = countIf(loglevel=="ERROR")
| fieldsAdd percentErrorLogs =
concat(toString(round((toDouble(errorlogs)/toDouble(totallogs))*100)),"%")
| fieldsRemove errorlogs, totallogs