IFQL and the future of
InfluxData
Paul Dix

Founder & CTO

@pauldix

paul@influxdata.com
Evolution of a query
language…
REST API
SQL-ish
Vaguely Familiar
select percentile(90, value) from cpu
where time > now() - 1d and
"host" = 'serverA'
group by time(10m)
0.8 -> 0.9
Breaking API change, addition of tags
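Roughly what that break looked like (the 0.8 form here is an illustrative sketch with hypothetical naming, not exact 0.8 syntax): 0.8 typically encoded dimensions in the series name, while 0.9 made them first-class tags.
0.8-era, dimensions baked into the series name:
select percentile(90, value) from cpu.serverA.usage_user
0.9+, dimensions as tags, filterable in the where clause:
select percentile(90, value) from cpu where "host" = 'serverA'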
Functional or SQL?
Afraid to switch…
Difficult to improve & change
It’s not SQL!
Kapacitor
Fall of 2015
Kapacitor’s TICKscript
stream
|from()
.database('telegraf')
.measurement('cpu')
.groupBy(*)
|window()
.period(5m)
.every(5m)
.align()
|mean('usage_idle')
.as('usage_idle')
|influxDBOut()
.database('telegraf')
.retentionPolicy('autogen')
.measurement('mean_cpu_idle')
.precision('s')
Hard to debug
Steep learning curve
Not Recomposable
Second Language
Rethinking Everything
Kapacitor is Background
Processing
Stream or Batch
InfluxDB is batch interactive
IFQL and unified API
Building towards 2.0
Project Goals
Photo by Glen Carrie on Unsplash
One Language to Unite!
Feature Velocity
Decouple storage from
compute
Iterate & deploy
more frequently
Scale
independently
Workload
Isolation
Decouple language from
engine
{
"operations": [
{
"id": "select0",
"kind": "select",
"spec": {
"database": "foo",
"hosts": null
}
},
{
"id": "where1",
"kind": "where",
"spec": {
"expression": {
"root": {
"type": "binary",
"operator": "and",
"left": {
"type": "binary",
"operator": "and",
"left": {
"type": "binary",
"operator": "==",
"left": {
"type": "reference",
"name": "_measurement",
"kind": "tag"
},
"right": {
"type": "stringLiteral",
"value": "cpu"
}
},
Query represented as DAG in JSON
A Data Language
Design Philosophy
UI for Many
because no one wants to actually write a query
Readability
over terseness
Flexible
add to language easily
Testable
new functions and user queries
Easy to Contribute
inspiration from Telegraf
Code Sharing & Reuse
no code > code
A few examples
// get the last value written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> last()
// get the last value written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> last()
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:53:00.000000000Z usage_system cpu server0 east 60.6284
Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:53:00.000000000Z usage_user cpu server0 east 39.3716
// get the last minute of data from a specific
// measurement & field & host
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
// get the last minute of data from a specific
// measurement & field & host
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:01:45.677502014Z, 2018-02-12T16:02:45.677502014Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T16:01:50.000000000Z usage_user cpu server0 east 50.549
2018-02-12T16:02:00.000000000Z usage_user cpu server0 east 35.4458
2018-02-12T16:02:10.000000000Z usage_user cpu server0 east 30.0493
2018-02-12T16:02:20.000000000Z usage_user cpu server0 east 44.3378
2018-02-12T16:02:30.000000000Z usage_user cpu server0 east 11.1584
2018-02-12T16:02:40.000000000Z usage_user cpu server0 east 46.712
// get the mean in 15m intervals of the last hour
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu")
|> range(start:-1h)
|> window(every:15m)
|> mean()
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:28:41.128654848Z usage_user cpu server0 east 50.72841444444444
2018-02-12T15:43:41.128654848Z usage_user cpu server0 east 51.19163333333333
2018-02-12T15:13:41.128654848Z usage_user cpu server0 east 45.5091088235294
2018-02-12T15:58:41.128654848Z usage_user cpu server0 east 49.65145555555555
2018-02-12T16:05:06.708945484Z usage_user cpu server0 east 46.41292368421052
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:28:41.128654848Z usage_system cpu server0 east 49.27158555555556
2018-02-12T15:58:41.128654848Z usage_system cpu server0 east 50.34854444444444
2018-02-12T16:05:06.708945484Z usage_system cpu server0 east 53.58707631578949
2018-02-12T15:13:41.128654848Z usage_system cpu server0 east 54.49089117647058
2018-02-12T15:43:41.128654848Z usage_system cpu server0 east 48.808366666666664
Elements of IFQL
Functional
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
Functional
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
built in functions
Functional
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
anonymous functions
Functional
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
pipe forward operator
Named Parameters
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
named parameters only!
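A quick illustration of the rule (the exact failure mode is an assumption here, not IFQL's actual error text):
from(db:"mydb") |> range(start:-1m) // named parameter: accepted
from(db:"mydb") |> range(-1m) // positional argument: rejected by the parser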
Readability
Flexibility
Functions have inputs &
outputs
Testability
Builder
Inputs
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
no input
Outputs
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
output is entire db
Outputs
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
pipe that output to filter
Filter function input
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
anonymous filter function
input is a single record
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Filter function input
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
A record looks like a flat object
or row in a table
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Record Properties
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
tag key
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Record Properties
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start:-1m)
same as before
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
starts with _
reserved for system
attributes
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
dot access works the other way too
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r._measurement == "cpu" and
r._field == "usage_user")
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
_measurement and _field
present for all InfluxDB data
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
_value exists in all series
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == “usage_user" and
r[“_value"] > 50.0)
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Filter function output
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
filter function output
is a boolean that determines whether the record is included in the set
Filter Operators
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
!=
=~
!~
in
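A sketch putting a couple of these operators to work, assuming regex literal syntax like /pattern/:
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] =~ /server[0-9]+/ and
r["region"] != "west")
|> range(start:-1m)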
Filter Boolean Logic
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => (r["host"] == "server0" or
r["host"] == "server1") and
r["_measurement"] == "cpu")
|> range(start:-1m)
parens for precedence
Function with explicit return
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => {return r["host"] == "server0"})
|> range(start:-1m)
longhand function definition
Outputs
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
filter output
is set of data matching filter function
Outputs
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
piped to range
which further filters by a time range
Outputs
// get the last 1 minute written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
range output is the final query result
Function Isolation
(but the planner may do otherwise)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
range and filter switched
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
results the same
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T17:52:02.322301856Z, 2018-02-12T17:53:02.322301856Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T17:53:02.322301856Z usage_user cpu server0 east 97.3174
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
is this the same as the top two?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
moving max to here
changes semantics
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
here it operates on
only the last minute of data
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
here it operates on
data for all time
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
then that result
is filtered down to
the last minute
(which will likely be empty)
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Planner Optimizes
maintains query semantics
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
this is more efficient
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
query DAG different
plan DAG same as one on left
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == “usage_user”
r[“_value"] > 22.0)
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == “usage_user"
r[“_value"] > 22.0)
|> max()
this does a full table scan
Variables & Closures
db = "mydb"
measurement = "cpu"
from(db:db)
|> filter(fn: (r) => r._measurement == measurement and
r.host == "server0")
|> last()
Variables & Closures
db = "mydb"
measurement = "cpu"
from(db:db)
|> filter(fn: (r) => r._measurement == measurement and
r.host == "server0")
|> last()
anonymous filter function
closure over surrounding context
User Defined Functions
db = "mydb"
measurement = "cpu"
fn = (r) => r._measurement == measurement and
r.host == "server0"
from(db:db)
|> filter(fn: fn)
|> last()
assign function to variable fn
User Defined Functions
from(db:"mydb")
|> filter(fn: (r) =>
r["_measurement"] == "cpu" and
r["_field"] == "usage_user" and
r["host"] == "server0")
|> range(start:-1h)
User Defined Functions
from(db:"mydb")
|> filter(fn: (r) =>
r["_measurement"] == "cpu" and
r["_field"] == "usage_user" and
r["host"] == "server0")
|> range(start:-1h)
get rid of some common boilerplate?
User Defined Functions
select = (db, m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
User Defined Functions
select = (db, m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(db: "mydb", m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
User Defined Functions
select = (db, m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
throws error
error calling function "select": missing required keyword argument "db"
Default Arguments
select = (db="mydb", m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
Default Arguments
select = (db="mydb", m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
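The default is just a fallback; passing db explicitly overrides it:
select(db: "otherdb", m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)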
Multiple Results to Client
data = from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and
r._field == "usage_user")
|> range(start: -4h)
|> window(every: 5m)
data |> min() |> yield(name: "min")
data |> max() |> yield(name: "max")
data |> mean() |> yield(name: "mean")
Multiple Results to Client
data = from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and
r._field == "usage_user")
|> range(start: -4h)
|> window(every: 5m)
data |> min() |> yield(name: "min")
data |> max() |> yield(name: "max")
data |> mean() |> yield(name: "mean")
Result: min
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:55:55.487457216Z, 2018-02-12T20:55:55.487457216Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
the yield name appears as the result name
User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()
User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()
takes a table
from a pipe forward
by default
User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()
calling it, then chaining
Passing as Argument
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
sending the from as argument
mf(m: "cpu", f: "usage_user", table: from(db:"mydb"))
|> filter(fn: (r) => r.host == "server0")
|> last()
Passing as Argument
mf = (m, f, table=<-) =>
filter(fn: (r) => r._measurement == m and r._field == f,
table: table)
rewrite the function to use argument
mf(m: "cpu", f: "usage_user", table: from(db:"mydb"))
|> filter(fn: (r) => r.host == "server0")
|> last()
Any pipe forward function can use arguments
min(table:
range(start: -1h, table:
filter(fn: (r) => r.host == "server0", table:
from(db: "mydb"))))
Make you a Lisp
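For comparison, the same query in pipe-forward form reads top to bottom:
from(db: "mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> min()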
Easy to add Functions
like plugins in Telegraf
code file
test file
package functions
import (
"fmt"
"github.com/influxdata/ifql/ifql"
"github.com/influxdata/ifql/query"
"github.com/influxdata/ifql/query/execute"
"github.com/influxdata/ifql/query/plan"
)
const CountKind = "count"
type CountOpSpec struct {
}
func init() {
ifql.RegisterFunction(CountKind, createCountOpSpec)
query.RegisterOpSpec(CountKind, newCountOp)
plan.RegisterProcedureSpec(CountKind, newCountProcedure, CountKind)
execute.RegisterTransformation(CountKind, createCountTransformation)
}
func createCountOpSpec(args map[string]ifql.Value, ctx ifql.Context) (query.OperationSpec, error) {
if len(args) != 0 {
return nil, fmt.Errorf(`count function requires no arguments`)
}
return new(CountOpSpec), nil
}
func newCountOp() query.OperationSpec {
return new(CountOpSpec)
}
func (s *CountOpSpec) Kind() query.OperationKind {
return CountKind
}
type CountProcedureSpec struct {
}
func newCountProcedure(query.OperationSpec) (plan.ProcedureSpec, error) {
return new(CountProcedureSpec), nil
}
func (s *CountProcedureSpec) Kind() plan.ProcedureKind {
return CountKind
}
func (s *CountProcedureSpec) Copy() plan.ProcedureSpec {
return new(CountProcedureSpec)
}
func (s *CountProcedureSpec) PushDownRule() plan.PushDownRule {
return plan.PushDownRule{
Root: SelectKind,
Through: nil,
}
}
func (s *CountProcedureSpec) PushDown(root *plan.Procedure, dup func() *plan.Procedure) {
selectSpec := root.Spec.(*SelectProcedureSpec)
if selectSpec.AggregateSet {
root = dup()
selectSpec = root.Spec.(*SelectProcedureSpec)
selectSpec.AggregateSet = false
selectSpec.AggregateType = ""
return
}
selectSpec.AggregateSet = true
selectSpec.AggregateType = CountKind
}
type CountAgg struct {
count int64
}
func createCountTransformation(id execute.DatasetID, mode execute.AccumulationMode, spec plan.ProcedureSpec, ctx execute.Context) (execute.Transformation, execute.Dataset, error) {
t, d := execute.NewAggregateTransformationAndDataset(id, mode, ctx.Bounds(), new(CountAgg))
return t, d, nil
}
func (a *CountAgg) DoBool(vs []bool) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoUInt(vs []uint64) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoInt(vs []int64) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoFloat(vs []float64) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoString(vs []string) {
a.count += int64(len(vs))
}
func (a *CountAgg) Type() execute.DataType {
return execute.TInt
}
func (a *CountAgg) ValueInt() int64 {
return a.count
}
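The test file mentioned earlier can stay just as small; a minimal sketch, exercising only the CountAgg type from the code above:
package functions

import "testing"

// verifies that CountAgg accumulates counts across typed batches
func TestCountAgg(t *testing.T) {
	agg := new(CountAgg)
	agg.DoFloat([]float64{1.5, 2.5, 3.5})
	agg.DoInt([]int64{1, 2})
	if got := agg.ValueInt(); got != 5 {
		t.Fatalf("expected count 5, got %d", got)
	}
}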
Defines parser, validation,
execution
Imports and Namespaces
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
// square the value
|> map(fn: (r) => r._value * r._value)
shortcut for this?
Imports and Namespaces
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
// square the value
|> map(fn: (r) => r._value * r._value)
square = (table=<-) =>
table |> map(fn: (r) => r._value * r._value)
Imports and Namespaces
import "github.com/pauldix/ifqlmath"
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> ifqlmath.square()
Imports and Namespaces
import "github.com/pauldix/ifqlmath"
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> ifqlmath.square()
namespace
MOAR EXAMPLES!
Math across measurements
foo = from(db: "mydb")
|> filter(fn: (r) => r._measurement == "foo")
|> range(start: -1h)
bar = from(db: "mydb")
|> filter(fn: (r) => r._measurement == "bar")
|> range(start: -1h)
join(
tables: {foo:foo, bar:bar},
fn: (t) => t.foo._value + t.bar._value)
|> yield(name: "foobar")
Having Query
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start:-1h)
|> window(every:10m)
|> mean()
// this is the having part
|> filter(fn: (r) => r._value > 90)
Grouping
// group - average utilization across regions
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and
r._field == "usage_system")
|> range(start: -1h)
|> group(by: ["region"])
|> window(every:10m)
|> mean()
Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> tagValues(key: "host")
Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> group(by: ["measurement"], keep: ["host"])
|> distinct(column: "host")
Get Metadata
tagValues = (key, table=<-) =>
table
|> group(by: ["_measurement"], keep: [key])
|> distinct(column: key)
Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> tagValues(key: "host")
|> count()
Functions Implemented as IFQL
// _sortLimit is a helper function, which sorts
// and limits a table.
_sortLimit = (n, desc, cols=["_value"], table=<-) =>
table
|> sort(cols:cols, desc:desc)
|> limit(n:n)
// top sorts a table by cols and keeps only the top n records.
top = (n, cols=["_value"], table=<-) =>
_sortLimit(table:table, n:n, cols:cols, desc:true)
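Its mirror image follows directly; a sketch, assuming a bottom function defined the same way:
// bottom sorts a table by cols and keeps only the bottom n records.
bottom = (n, cols=["_value"], table=<-) =>
_sortLimit(table:table, n:n, cols:cols, desc:false)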
Project Status and Timeline
API 2.0 Work
Lock down query request/response format
Apache Arrow
We’re contributing the Go
implementation!
https://github.com/influxdata/arrow
Finalize Language
(a few months or so)
Ship with Enterprise 1.6
(summertime)
Hack & workshop day
tomorrow!
Ask the registration desk today
Thank you!
Paul Dix

paul@influxdata.com

@pauldix

More Related Content

What's hot (20)

PDF
Inside the InfluxDB storage engine
InfluxData
 
PPTX
InfluxDB 1.0 - Optimizing InfluxDB by Sam Dillard
InfluxData
 
PDF
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData
InfluxData
 
PDF
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
InfluxData
 
PPTX
Taming the Tiger: Tips and Tricks for Using Telegraf
InfluxData
 
PDF
A Deeper Dive into EXPLAIN
EDB
 
PDF
Time Series Data with InfluxDB
Turi, Inc.
 
PDF
Finding OOMS in Legacy Systems with the Syslog Telegraf Plugin
InfluxData
 
PDF
Introduction to InfluxDB
Jorn Jambers
 
PPTX
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
InfluxData
 
PDF
Lessons Learned: Running InfluxDB Cloud and Other Cloud Services at Scale | T...
InfluxData
 
PDF
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and Telegraf
InfluxData
 
PDF
Flux and InfluxDB 2.0 by Paul Dix
InfluxData
 
PDF
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxData
 
PPTX
Scott Anderson [InfluxData] | InfluxDB Tasks – Beyond Downsampling | InfluxDa...
InfluxData
 
PDF
Downsampling your data October 2017
InfluxData
 
PPTX
9:40 am InfluxDB 2.0 and Flux – The Road Ahead Paul Dix, Founder and CTO | ...
InfluxData
 
PPTX
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
InfluxData
 
PDF
Write your own telegraf plugin
InfluxData
 
PDF
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
InfluxData
 
Inside the InfluxDB storage engine
InfluxData
 
InfluxDB 1.0 - Optimizing InfluxDB by Sam Dillard
InfluxData
 
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData
InfluxData
 
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
InfluxData
 
Taming the Tiger: Tips and Tricks for Using Telegraf
InfluxData
 
A Deeper Dive into EXPLAIN
EDB
 
Time Series Data with InfluxDB
Turi, Inc.
 
Finding OOMS in Legacy Systems with the Syslog Telegraf Plugin
InfluxData
 
Introduction to InfluxDB
Jorn Jambers
 
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
InfluxData
 
Lessons Learned: Running InfluxDB Cloud and Other Cloud Services at Scale | T...
InfluxData
 
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and Telegraf
InfluxData
 
Flux and InfluxDB 2.0 by Paul Dix
InfluxData
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxData
 
Scott Anderson [InfluxData] | InfluxDB Tasks – Beyond Downsampling | InfluxDa...
InfluxData
 
Downsampling your data October 2017
InfluxData
 
9:40 am InfluxDB 2.0 and Flux – The Road Ahead Paul Dix, Founder and CTO | ...
InfluxData
 
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
InfluxData
 
Write your own telegraf plugin
InfluxData
 
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
InfluxData
 

Similar to InfluxData Platform Future and Vision (20)

PDF
Flux and InfluxDB 2.0
InfluxData
 
PDF
Observability of InfluxDB IOx: Tracing, Metrics and System Tables
InfluxData
 
PDF
Influxdb and time series data
Marcin Szepczyński
 
PPTX
HBaseCon 2013: How (and Why) Phoenix Puts the SQL Back into NoSQL
Cloudera, Inc.
 
PDF
OPTIMIZING THE TICK STACK
InfluxData
 
PDF
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
MongoDB
 
PDF
Optimizing InfluxDB Performance in the Real World | Sam Dillard | InfluxData
InfluxData
 
PPTX
OPTIMIZING THE TICK STACK
InfluxData
 
PPTX
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB
 
PDF
5-minute Practical Streaming Techniques that can Save You Millions
HostedbyConfluent
 
PDF
Fluentd meetup #3
Treasure Data, Inc.
 
PDF
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
MongoDB
 
PDF
Dynamodb
Jean-Tiare LE BIGOT
 
KEY
From Batch to Realtime with Hadoop - Berlin Buzzwords - June 2012
larsgeorge
 
PPTX
High Performance Applications with MongoDB
MongoDB
 
PDF
IOT with PostgreSQL
EDB
 
PPTX
System insight without Interference
Tony Tam
 
PDF
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
festival ICT 2016
 
PDF
MongoDB Solution for Internet of Things and Big Data
Stefano Dindo
 
PDF
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
InfluxData
 
Flux and InfluxDB 2.0
InfluxData
 
Observability of InfluxDB IOx: Tracing, Metrics and System Tables
InfluxData
 
Influxdb and time series data
Marcin Szepczyński
 
HBaseCon 2013: How (and Why) Phoenix Puts the SQL Back into NoSQL
Cloudera, Inc.
 
OPTIMIZING THE TICK STACK
InfluxData
 
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
MongoDB
 
Optimizing InfluxDB Performance in the Real World | Sam Dillard | InfluxData
InfluxData
 
OPTIMIZING THE TICK STACK
InfluxData
 
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB
 
5-minute Practical Streaming Techniques that can Save You Millions
HostedbyConfluent
 
Fluentd meetup #3
Treasure Data, Inc.
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
MongoDB
 
From Batch to Realtime with Hadoop - Berlin Buzzwords - June 2012
larsgeorge
 
High Performance Applications with MongoDB
MongoDB
 
IOT with PostgreSQL
EDB
 
System insight without Interference
Tony Tam
 
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
festival ICT 2016
 
MongoDB Solution for Internet of Things and Big Data
Stefano Dindo
 
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir...
InfluxData
 
Ad

More from InfluxData (20)

PPTX
Announcing InfluxDB Clustered
InfluxData
 
PDF
Best Practices for Leveraging the Apache Arrow Ecosystem
InfluxData
 
PDF
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...
InfluxData
 
PDF
Power Your Predictive Analytics with InfluxDB
InfluxData
 
PDF
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base
InfluxData
 
PDF
Build an Edge-to-Cloud Solution with the MING Stack
InfluxData
 
PDF
Meet the Founders: An Open Discussion About Rewriting Using Rust
InfluxData
 
PDF
Introducing InfluxDB Cloud Dedicated
InfluxData
 
PDF
Gain Better Observability with OpenTelemetry and InfluxDB
InfluxData
 
PPTX
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...
InfluxData
 
PDF
How Delft University's Engineering Students Make Their EV Formula-Style Race ...
InfluxData
 
PPTX
Introducing InfluxDB’s New Time Series Database Storage Engine
InfluxData
 
PDF
Start Automating InfluxDB Deployments at the Edge with balena
InfluxData
 
PDF
Understanding InfluxDB’s New Storage Engine
InfluxData
 
PDF
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDB
InfluxData
 
PPTX
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...
InfluxData
 
PDF
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022
InfluxData
 
PDF
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
InfluxData
 
PDF
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
InfluxData
 
PDF
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
InfluxData
 
Announcing InfluxDB Clustered
InfluxData
 
Best Practices for Leveraging the Apache Arrow Ecosystem
InfluxData
 
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...
InfluxData
 
Power Your Predictive Analytics with InfluxDB
InfluxData
 
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base
InfluxData
 
Build an Edge-to-Cloud Solution with the MING Stack
InfluxData
 
Meet the Founders: An Open Discussion About Rewriting Using Rust
InfluxData
 
Introducing InfluxDB Cloud Dedicated
InfluxData
 
Gain Better Observability with OpenTelemetry and InfluxDB
InfluxData
 
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...
InfluxData
 
How Delft University's Engineering Students Make Their EV Formula-Style Race ...
InfluxData
 
Introducing InfluxDB’s New Time Series Database Storage Engine
InfluxData
 
Start Automating InfluxDB Deployments at the Edge with balena
InfluxData
 
Understanding InfluxDB’s New Storage Engine
InfluxData
 
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDB
InfluxData
 
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...
InfluxData
 
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022
InfluxData
 
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
InfluxData
 
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
InfluxData
 
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
InfluxData
 
Ad

Recently uploaded (20)

PPTX
Presentation on Social Media1111111.pptx
tanamlimbu
 
PPTX
1.10-Ruta=1st Term------------------------------1st.pptx
zk7304860098
 
PPTX
Slides ZPE - QFS Eco Economic Epochs.pptx
Steven McGee
 
PPTX
Finally, My Best IPTV Provider That Understands Movie Lovers Experience IPTVG...
Rafael IPTV
 
PDF
Digital Security in 2025 with Adut Angelina
The ClarityDesk
 
PDF
The Complete Guide to Chrome Net Internals DNS – 2025
Orage Technologies
 
PDF
How to Fix Error Code 16 in Adobe Photoshop A Step-by-Step Guide.pdf
Becky Lean
 
PPTX
Simplifying and CounFounding in egime.pptx
Ryanto10
 
PPTX
InOffensive Security_cybersecurity2.pptx
wihib17507
 
PDF
AiDAC – Custody Platform Overview for Institutional Use.pdf
BobPesakovic
 
PDF
DORA - MobileOps & MORA - DORA for Mobile Applications
Willy ROUVRE
 
PPTX
ipv6 very very very very vvoverview.pptx
eyala75
 
PDF
Azure Devops Introduction for CI/CD and agile
henrymails
 
PPTX
Template Timeplan & Roadmap Product.pptx
ImeldaYulistya
 
PPTX
Internet_of_Things_Presentation_KaifRahaman.pptx
kaifrahaman27593
 
PDF
Real Cost of Hiring a Shopify App Developer_ Budgeting Beyond Hourly Rates.pdf
CartCoders
 
PDF
The Power and Impact of Promotion most useful
RajaBilal42
 
PPTX
Random Presentation By Fuhran Khalil uio
maniieiish
 
PPTX
Internet Basics for class ix. Unit I. Describe
ASHUTOSHKUMAR1131
 
PPTX
Birth-after-Previous-Caesarean-Birth (1).pptx
fermann1
 
Presentation on Social Media1111111.pptx
tanamlimbu
 
1.10-Ruta=1st Term------------------------------1st.pptx
zk7304860098
 
Slides ZPE - QFS Eco Economic Epochs.pptx
Steven McGee
 
Finally, My Best IPTV Provider That Understands Movie Lovers Experience IPTVG...
Rafael IPTV
 
Digital Security in 2025 with Adut Angelina
The ClarityDesk
 
The Complete Guide to Chrome Net Internals DNS – 2025
Orage Technologies
 
How to Fix Error Code 16 in Adobe Photoshop A Step-by-Step Guide.pdf
Becky Lean
 
Simplifying and CounFounding in egime.pptx
Ryanto10
 
InOffensive Security_cybersecurity2.pptx
wihib17507
 
AiDAC – Custody Platform Overview for Institutional Use.pdf
BobPesakovic
 
DORA - MobileOps & MORA - DORA for Mobile Applications
Willy ROUVRE
 
ipv6 very very very very vvoverview.pptx
eyala75
 
Azure Devops Introduction for CI/CD and agile
henrymails
 
Template Timeplan & Roadmap Product.pptx
ImeldaYulistya
 
Internet_of_Things_Presentation_KaifRahaman.pptx
kaifrahaman27593
 
Real Cost of Hiring a Shopify App Developer_ Budgeting Beyond Hourly Rates.pdf
CartCoders
 
The Power and Impact of Promotion most useful
RajaBilal42
 
Random Presentation By Fuhran Khalil uio
maniieiish
 
Internet Basics for class ix. Unit I. Describe
ASHUTOSHKUMAR1131
 
Birth-after-Previous-Caesarean-Birth (1).pptx
fermann1
 

InfluxData Platform Future and Vision

  • 1. IFQL and the future of InfluxData Paul Dix Founder & CTO @pauldix paul@influxdata.com
  • 2. Evolution of a query language…
  • 5. Vaguely Familiar select percentile(90, value) from cpu where time > now() - 1d and “host” = ‘serverA’ group by time(10m)
  • 6. 0.8 -> 0.9 Breaking API change, addition of tags
  • 26. InfluxDB is batch interactive
  • 27. IFQL and unified API Building towards 2.0
  • 28. Project Goals Photo by Glen Carrie on Unsplash
  • 29. One Language to Unite!
  • 32. Iterate & deploy more frequently
  • 37. { "operations": [ { "id": "select0", "kind": "select", "spec": { "database": "foo", "hosts": null } }, { "id": "where1", "kind": "where", "spec": { "expression": { "root": { "type": "binary", "operator": "and", "left": { "type": "binary", "operator": "and", "left": { "type": "binary", "operator": "==", "left": { "type": "reference", "name": "_measurement", "kind": "tag" }, "right": { "type": "stringLiteral", "value": "cpu" } }, Query represented as DAG in JSON
  • 41. UI for Many because no one wants to actually write a query
  • 46. Code Sharing & Reuse no code > code
  • 48. // get the last value written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> last()
  • 49. // get the last value written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> last() Result: _result Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:53:00.000000000Z usage_system cpu server0 east 60.6284 Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:53:00.000000000Z usage_user cpu server0 east 39.3716
  • 50. // get the last minute of data from a specific // measurement & field & host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m)
  • 51. // get the last minute of data from a specific // measurement & field & host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:01:45.677502014Z, 2018-02-12T16:02:45.677502014Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T16:01:50.000000000Z usage_user cpu server0 east 50.549 2018-02-12T16:02:00.000000000Z usage_user cpu server0 east 35.4458 2018-02-12T16:02:10.000000000Z usage_user cpu server0 east 30.0493 2018-02-12T16:02:20.000000000Z usage_user cpu server0 east 44.3378 2018-02-12T16:02:30.000000000Z usage_user cpu server0 east 11.1584 2018-02-12T16:02:40.000000000Z usage_user cpu server0 east 46.712
  • 52. // get the mean in 10m intervals of last hour from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu") |> range(start:-1h) |> window(every:15m) |> mean() Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:28:41.128654848Z usage_user cpu server0 east 50.72841444444444 2018-02-12T15:43:41.128654848Z usage_user cpu server0 east 51.19163333333333 2018-02-12T15:13:41.128654848Z usage_user cpu server0 east 45.5091088235294 2018-02-12T15:58:41.128654848Z usage_user cpu server0 east 49.65145555555555 2018-02-12T16:05:06.708945484Z usage_user cpu server0 east 46.41292368421052 Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:28:41.128654848Z usage_system cpu server0 east 49.27158555555556 2018-02-12T15:58:41.128654848Z usage_system cpu server0 east 50.34854444444444 2018-02-12T16:05:06.708945484Z usage_system cpu server0 east 53.58707631578949 2018-02-12T15:13:41.128654848Z usage_system cpu server0 east 54.49089117647058 2018-02-12T15:43:41.128654848Z usage_system cpu server0 east 48.808366666666664
  • 54. Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m)
  • 55. Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) built in functions
  • 56. Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) anonymous functions
  • 57. Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) pipe forward operator
  • 58. Named Parameters // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) named parameters only!
  • 64. Inputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) no input
  • 65. Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) output is entire db
  • 66. Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) pipe that output to filter
  • 67. Filter function input // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) anonymous filter function input is a single record {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 68. Filter function input // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) A record looks like a flat object or row in a table {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 69. Record Properties // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) tag key {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 70. Record Properties // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start:-1m) same as before {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 71. Special Properties starts with _ reserved for system attributes from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 72. Special Properties works other way from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r._measurement == "cpu" and r._field == "usage_user") |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 73. Special Properties _measurement and _field present for all InfluxDB data from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 74. Special Properties _value exists in all series from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == “usage_user" and r[“_value"] > 50.0) |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}
  • 75. Filter function output // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) filter function output is a boolean to determine if record is in set
  • 76. Filter Operators // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) != =~ !~ in
  • 77. Filter Boolean Logic // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => (r[“host"] == “server0" or r[“host"] == “server1") and r[“_measurement”] == “cpu") |> range(start:-1m) parens for precedence
  • 78. Function with explicit return // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => {return r[“host"] == “server0"}) |> range(start:-1m) long hand function definition
  • 79. Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) filter output is set of data matching filter function
  • 80. Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) piped to range which further filters by a time range
  • 81. Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) range output is the final query result
  • 82. Function Isolation (but the planner may do otherwise)
  • 83. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max()
  • 84. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() range and filter switched
  • 85. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() results the same Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T17:52:02.322301856Z, 2018-02-12T17:53:02.322301856Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T17:53:02.322301856Z usage_user cpu server0 east 97.3174
  • 86. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() is this the same as the top two? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 87. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() moving max to here changes semantics from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 88. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() here it operates on only the last minute of data from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 89. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() here it operates on data for all time from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 90. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() then that result is filtered down to the last minute (which will likely be empty) from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 92. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max()
  • 93. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() this is more efficient
  • 94. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() query DAG different plan DAG same as one on left
  • 95. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == “usage_user” r[“_value"] > 22.0) |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == “usage_user" r[“_value"] > 22.0) |> max() this does a full table scan
  • 96. Variables & Closures db = "mydb" measurement = "cpu" from(db:db) |> filter(fn: (r) => r._measurement == measurement and r.host == "server0") |> last()
  • 97. Variables & Closures db = "mydb" measurement = "cpu" from(db:db) |> filter(fn: (r) => r._measurement == measurement and r.host == "server0") |> last() anonymous filter function closure over surrounding context
  • 98. User Defined Functions db = "mydb" measurement = “cpu" fn = (r) => r._measurement == measurement and r.host == "server0" from(db:db) |> filter(fn: fn) |> last() assign function to variable fn
  • 99. User Defined Functions from(db:"mydb") |> filter(fn: (r) => r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["host"] == "server0") |> range(start:-1h)
  • 100. User Defined Functions from(db:"mydb") |> filter(fn: (r) => r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["host"] == "server0") |> range(start:-1h) get rid of some common boilerplate?
  • 101. User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) }
  • 102. User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(db: "mydb", m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)
  • 103. User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h) throws error error calling function "select": missing required keyword argument "db"
  • 104. Default Arguments select = (db="mydb", m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)
  • 105. Default Arguments select = (db="mydb", m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)
  • 106. Multiple Results to Client data = from(db:"mydb") |> filter(fn: (r) r._measurement == "cpu" and r._field == "usage_user") |> range(start: -4h) |> window(every: 5m) data |> min() |> yield(name: "min") data |> max() |> yield(name: "max") data |> mean() |> yield(name: "mean")
  • 107. Multiple Results to Client data = from(db:"mydb") |> filter(fn: (r) r._measurement == "cpu" and r._field == "usage_user") |> range(start: -4h) |> window(every: 5m) data |> min() |> yield(name: "min") data |> max() |> yield(name: "max") data |> mean() |> yield(name: "mean") Result: min Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:55:55.487457216Z, 2018-02-12T20:55:55.487457216Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- name
  • 108. User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
  return table
    |> filter(fn: (r) => r._measurement == m and r._field == f)
}

from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()
  • 109. User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
  return table
    |> filter(fn: (r) => r._measurement == m and r._field == f)
}

from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()

table=<- means the function takes a table from a pipe forward by default
  • 110. User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
  return table
    |> filter(fn: (r) => r._measurement == m and r._field == f)
}

from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()

calling it, then chaining off of it
  • 111. Passing as Argument
mf = (m, f, table=<-) => {
  return table
    |> filter(fn: (r) => r._measurement == m and r._field == f)
}

mf(m: "cpu", f: "usage_user", table: from(db:"mydb"))
|> filter(fn: (r) => r.host == "server0")
|> last()

the from() is passed as an argument
  • 112. Passing as Argument
mf = (m, f, table=<-) =>
  filter(fn: (r) => r._measurement == m and r._field == f, table: table)

mf(m: "cpu", f: "usage_user", table: from(db:"mydb"))
|> filter(fn: (r) => r.host == "server0")
|> last()

the function rewritten to pass the table on as an argument itself
  • 113. Any pipe forward function can use arguments
min(table:
  range(start: -1h,
    table: filter(fn: (r) => r.host == "server0",
      table: from(db: "mydb"))))
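Since |> simply feeds the table on its left into the table argument of the function on its right, the nested call above is equivalent to this pipe-forward chain:

from(db: "mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> min()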
  • 114. Make you a Lisp
  • 115. Easy to add Functions like plugins in Telegraf
  • 118. package functions

import (
	"fmt"

	"github.com/influxdata/ifql/ifql"
	"github.com/influxdata/ifql/query"
	"github.com/influxdata/ifql/query/execute"
	"github.com/influxdata/ifql/query/plan"
)

const CountKind = "count"

type CountOpSpec struct{}

func init() {
	ifql.RegisterFunction(CountKind, createCountOpSpec)
	query.RegisterOpSpec(CountKind, newCountOp)
	plan.RegisterProcedureSpec(CountKind, newCountProcedure, CountKind)
	execute.RegisterTransformation(CountKind, createCountTransformation)
}

func createCountOpSpec(args map[string]ifql.Value, ctx ifql.Context) (query.OperationSpec, error) {
	if len(args) != 0 {
		return nil, fmt.Errorf(`count function requires no arguments`)
	}
	return new(CountOpSpec), nil
}

func newCountOp() query.OperationSpec {
	return new(CountOpSpec)
}

func (s *CountOpSpec) Kind() query.OperationKind {
	return CountKind
}
  • 119.
type CountProcedureSpec struct{}

func newCountProcedure(query.OperationSpec) (plan.ProcedureSpec, error) {
	return new(CountProcedureSpec), nil
}

func (s *CountProcedureSpec) Kind() plan.ProcedureKind {
	return CountKind
}

func (s *CountProcedureSpec) Copy() plan.ProcedureSpec {
	return new(CountProcedureSpec)
}

func (s *CountProcedureSpec) PushDownRule() plan.PushDownRule {
	return plan.PushDownRule{
		Root:    SelectKind,
		Through: nil,
	}
}

func (s *CountProcedureSpec) PushDown(root *plan.Procedure, dup func() *plan.Procedure) {
	selectSpec := root.Spec.(*SelectProcedureSpec)
	if selectSpec.AggregateSet {
		// an aggregate is already pushed down into this select,
		// so operate on a duplicate of the procedure instead
		root = dup()
		selectSpec = root.Spec.(*SelectProcedureSpec)
		selectSpec.AggregateSet = false
		selectSpec.AggregateType = ""
		return
	}
	selectSpec.AggregateSet = true
	selectSpec.AggregateType = CountKind
}
  • 120.
type CountAgg struct {
	count int64
}

func createCountTransformation(id execute.DatasetID, mode execute.AccumulationMode, spec plan.ProcedureSpec, ctx execute.Context) (execute.Transformation, execute.Dataset, error) {
	t, d := execute.NewAggregateTransformationAndDataset(id, mode, ctx.Bounds(), new(CountAgg))
	return t, d, nil
}

// each Do method just counts the values it is handed, regardless of type
func (a *CountAgg) DoBool(vs []bool)     { a.count += int64(len(vs)) }
func (a *CountAgg) DoUInt(vs []uint64)   { a.count += int64(len(vs)) }
func (a *CountAgg) DoInt(vs []int64)     { a.count += int64(len(vs)) }
func (a *CountAgg) DoFloat(vs []float64) { a.count += int64(len(vs)) }
func (a *CountAgg) DoString(vs []string) { a.count += int64(len(vs)) }

func (a *CountAgg) Type() execute.DataType { return execute.TInt }
func (a *CountAgg) ValueInt() int64        { return a.count }
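Once registered via the init function (a sketch, assuming this functions package is compiled into the IFQL build), the new aggregate is callable like any built-in:

from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -1h)
|> count()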
  • 122. Imports and Namespaces
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
// square the value
|> map(fn: (r) => r._value * r._value)

shortcut for this?
  • 123. Imports and Namespaces
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
// square the value
|> map(fn: (r) => r._value * r._value)

square = (table=<-) =>
  table |> map(fn: (r) => r._value * r._value)
  • 124. Imports and Namespaces
import "github.com/pauldix/ifqlmath"

from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> ifqlmath.square()
  • 125. Imports and Namespaces
import "github.com/pauldix/ifqlmath"

from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> ifqlmath.square()

ifqlmath is the namespace
  • 127. Math across measurements
foo = from(db: "mydb")
  |> filter(fn: (r) => r._measurement == "foo")
  |> range(start: -1h)
bar = from(db: "mydb")
  |> filter(fn: (r) => r._measurement == "bar")
  |> range(start: -1h)

join(
  tables: {foo:foo, bar:bar},
  fn: (t) => t.foo._value + t.bar._value)
|> yield(name: "foobar")
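The same join shape works for any binary operation across the joined tables; as a hypothetical variation (the measurement names here are made up), a ratio of two series:

errors = from(db: "mydb")
  |> filter(fn: (r) => r._measurement == "errors")
  |> range(start: -1h)
requests = from(db: "mydb")
  |> filter(fn: (r) => r._measurement == "requests")
  |> range(start: -1h)

join(
  tables: {errors:errors, requests:requests},
  fn: (t) => t.errors._value / t.requests._value)
|> yield(name: "error_rate")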
  • 128. Having Query
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start:-1h)
|> window(every:10m)
|> mean()
// this is the having part
|> filter(fn: (r) => r._value > 90)
  • 129. Grouping
// group - average utilization across regions
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
|> range(start: -1h)
|> group(by: ["region"])
|> window(every:10m)
|> mean()
  • 130. Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> tagValues(key: "host")
  • 131. Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> group(by: ["measurement"], keep: ["host"])
|> distinct(column: "host")
  • 132. Get Metadata
tagValues = (key, table=<-) =>
  table
  |> group(by: ["measurement"], keep: [key])
  |> distinct(column: key)
  • 133. Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> tagValues(key: "host")
|> count()
  • 134. Functions Implemented as IFQL
// _sortLimit is a helper function, which sorts
// and limits a table.
_sortLimit = (n, desc, cols=["_value"], table=<-) =>
  table
  |> sort(cols:cols, desc:desc)
  |> limit(n:n)

// top sorts a table by cols and keeps only the top n records.
top = (n, cols=["_value"], table=<-) =>
  _sortLimit(table:table, n:n, cols:cols, desc:true)
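Presumably the complement follows the same pattern; a sketch of a bottom function under that assumption (bottom is not shown in the deck):

// bottom sorts a table by cols and keeps only the bottom n records.
bottom = (n, cols=["_value"], table=<-) =>
  _sortLimit(table:table, n:n, cols:cols, desc:false)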
  • 135. Project Status and Timeline
  • 136. API 2.0 Work: lock down the query request/response format
  • 138. We’re contributing the Go implementation! https://github.com/influxdata/arrow
  • 139. Finalize Language (a few months or so)
  • 140. Ship with Enterprise 1.6 (summertime)
  • 141. Hack & workshop day tomorrow! Ask the registration desk today