Apache Druid: Sudhindra Tirupati Nagaraj
http://static.druid.io/docs/druid.pdf
● Initially considered HBase, which was too slow for aggregate queries
Load Balancing
● Coordinators periodically read the set of segments and assign them to historical nodes (discovered via ZooKeeper)
● Query patterns feed a cost-based optimization that decides whether to spread out or co-locate segments from different
data sources
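A minimal sketch of the idea, in Python. The class names, cost terms, and weights are illustrative assumptions, not Druid's actual coordinator/balancer logic: servers that already hold segments from the same data source or overlapping time ranges are penalized, so load spreads out.

# Toy cost-based placement of a segment onto a historical node.
# Everything here (names, weights, cost terms) is an illustrative
# assumption, not Druid's actual balancer implementation.
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class Segment:
    data_source: str
    start: int   # interval start, e.g. epoch hours
    end: int     # interval end

@dataclass
class Historical:
    name: str
    segments: List[Segment] = field(default_factory=list)

def overlap(a: Segment, b: Segment) -> int:
    """Length of the time overlap between two segments (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))

def placement_cost(server: Historical, seg: Segment) -> float:
    """Higher cost for servers that already hold similar segments,
    so segments from one data source / time range get spread out."""
    cost = 0.0
    for s in server.segments:
        if s.data_source == seg.data_source:
            cost += 1.0
        cost += 0.1 * overlap(s, seg)
    return cost

def assign(servers: List[Historical], seg: Segment) -> Historical:
    """Pick the cheapest server and hand it the segment."""
    best = min(servers, key=lambda srv: placement_cost(srv, seg))
    best.segments.append(seg)
    return best

servers = [Historical("historical-1"), Historical("historical-2")]
for hour in range(4):
    chosen = assign(servers, Segment("wikipedia", hour, hour + 1))
    print(chosen.name)   # alternates, spreading wikipedia segments across servers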
Availability
Rules
● The gains from supporting joins are offset by the problems they cause under high-throughput, join-heavy
workloads
● It is possible to materialize columns into streams and perform a hash-based or sort-merge join, but this
requires a lot of computation
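To make the second bullet concrete, here is a minimal hash-join sketch in Python over two already-materialized streams. The table names, keys, and rows are invented for illustration; the point is only that even the simple version must buffer the entire build side in memory, which hints at the computation and memory cost under a join-heavy, high-throughput workload.

# Hash join over two "materialized" column streams.
# The table names, keys, and rows are made up for illustration; this is
# not Druid code, just the generic technique the bullet refers to.
from collections import defaultdict
from typing import Dict, Iterable, Iterator

def hash_join(left: Iterable[Dict], right: Iterable[Dict], key: str) -> Iterator[Dict]:
    """Build a hash table on the right stream, probe it with the left stream."""
    table = defaultdict(list)
    for r in right:                      # build side: fully buffered in memory
        table[r[key]].append(r)
    for l in left:                       # probe side: streamed row by row
        for r in table.get(l[key], []):
            yield {**l, **r}

impressions = [{"ad_id": 1, "impressions": 100}, {"ad_id": 2, "impressions": 50}]
clicks      = [{"ad_id": 1, "clicks": 7}]

for row in hash_join(impressions, clicks, key="ad_id"):
    print(row)   # {'ad_id': 1, 'impressions': 100, 'clicks': 7}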
Storage Format
● Data tables are called Data Sources (similar to Rockset collections). A table is partitioned into
“segments”; a segment typically holds 5-10 million rows spanning a period of time. Segments are
immutable.
● Common query shape: filter, then aggregate the matching rows (e.g., sum of all expenses for San Francisco)
● Bitmap indexes per dimension value. For example, for each city there is a bitmap marking the
rows that contain that city: San Francisco -> rows [0, 1, 2] -> [1, 1, 1, 0]; New York -> row [3] -> [0, 0, 0, 1].
Bitmaps are compressed further with a bitmap compression algorithm (Druid uses Concise).
Two bitmaps can also be combined: the sum of all expenses in San Francisco and New York is obtained by
filtering with [1, 1, 1, 0] OR [0, 0, 0, 1] -> [1, 1, 1, 1] (see the sketch after the query example below).
● No SQL. Queries are JSON objects sent over HTTP, e.g. the timeseries query and its result below:
Request:
{
  "queryType": "timeseries",
  "dataSource": "wikipedia",
  "intervals": "2013-01-01/2013-01-08",
  "filter": {
    "type": "selector",
    "dimension": "page",
    "value": "Ke$ha"
  },
  "granularity": "day",
  "aggregations": [
    {
      "type": "count",
      "name": "rows"
    }
  ]
}

Response:
[
  {
    "timestamp": "2012-01-01T00:00:00.000Z",
    "result": {
      "rows": 393298
    }
  },
  {
    "timestamp": "2012-01-02T00:00:00.000Z",
    "result": {
      "rows": 382932
    }
  },
  ...
  {
    "timestamp": "2012-01-07T00:00:00.000Z",
    "result": {
      "rows": 1337
    }
  }
]
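The bitmap bullet under Storage Format can be sketched in a few lines of Python. This is only an illustration: raw lists stand in for Concise-compressed bitmaps, and the expense values are made up. Build one bitmap per city, OR them to filter, then aggregate only the selected rows.

# Toy bitmap index: one bitmap per dimension value; OR bitmaps to filter,
# then aggregate a metric over the selected rows. The expense numbers are
# made up, and real Druid stores these bitmaps Concise-compressed rather
# than as raw Python lists.
city     = ["San Francisco", "San Francisco", "San Francisco", "New York"]
expenses = [25, 42, 17, 30]

def bitmap_for(value, column):
    """1 where the column equals the value, 0 elsewhere."""
    return [1 if v == value else 0 for v in column]

sf = bitmap_for("San Francisco", city)       # [1, 1, 1, 0]
ny = bitmap_for("New York", city)            # [0, 0, 0, 1]

combined = [a | b for a, b in zip(sf, ny)]   # [1, 1, 1, 1]

total = sum(e for e, bit in zip(expenses, combined) if bit)
print(total)                                  # 114 = sum of expenses in SF and NY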
Query Performance
Test data: