Expression

Computes an arbitrary expression on time series fed to the node. It supports arithmentic, relational and logical operations.

Note

When combining arithmetic and relational/logical ops, we currently treat values > 0 as true and <= 0 as false. At some point we need to make it configurable.

Fields include:

Name

Data Type

Required

Description

Default

Example

expression

String

Required

The expression string to compute.

null

m2 / (m1 + m2)

as

String

Optional

A name to substitute for the metric name for time series emitted from this node. If the as config is missing, the expression ID is substituted as the metric name. Tags are preserved.

null

Percent

joinConfig

Object

Required

The join config to use when combining series.

null

See the Join section below.

interpolatorConfig

List

Required for now

A list of interpolator configs for the downsampler to deal with empty buckets.

null

See Interpolators

infectiousNan

boolean

Optional

Whether or not NaNs from the source data should infect each timestamp when aggregating values. E.g. if one value out of 20 are NaN for a timestamp and this is true, the timestamp will return a NaN. If all values for the timestamp are NaN then the result will be NaN regardless.

false

true

Variable Names

Variable names in expressions can be one of two values:

  • A TimeSeriesDataSource node ID such as m1 if the node had that ID. This is also the data source ID.

  • The full metric name such as sys.cpu.busy.

Literals

Literals may also be used such as:

  • Integers

  • Double precision floating point values

  • true or false

Operators

Currently supported operators include:

  • + - Addition

  • - - Subtraction

  • * - Multiplication

  • / - Division

  • % - Modulo

  • >, <, ==, !=, <=, >= - Conditionals

  • AND, OR, NOT - Relationals

  • ? : - Ternary/Conditional Expression

Substitution

When a value is missing at a given timestamp for either side of an expression and the interpolator does not return a literal value, a value may be subsituted based on the operator in order to provide a useful result.

  • For addition and subtraction, missing values are treated as zero.

  • For all other operations, missing values are treated as infectious NaNs.

Ternary or Conditional Expressions

A ternary expression is a simple if/else statement that evaluates a condition and returns one value if the result is true and a different value if the result is false. V3 supports single level ternary expressions at this time (nesting to come in the future).

The condition of a ternary can be one of:

  • A relational condition such as m1 > 1 AND m2 > 1 ? 1 : 0

  • A logical condition such as m1 > 1 ? 1 : 0

  • A single metric such as m1 ? m1 : NaN in which case the value of the condition metric is treated as boolean according to the rules at the top of this document.

Note that NaN can be used as a literal in a ternary operand but not in the condition (yet).

Joining Time Series

Because multiple time series are received in an expression node from multiple sources, the time series must be grouped so that time series from source A match up with those from source B based on the tag values. Additionally, when two time series are matched, the data points at each timestamp must be aligned as well.

Note

We highly recommend that you apply a downsampling operator to all data sources before linking them into an expression node so that the values align cleanly and interpolation is skipped.

If we take the example query below, we’ll see time series like the following:

Number

Source

Metric

Tags

TS1

m1

sys.if.in

host=web01, dc=PHX

TS2

m1

sys.if.in

host=web02, dc=PHX

TS3

m1

sys.if.in

host=web01, dc=DEN

TS4

m2

sys.if.out

host=web01, dc=PHX

TS5

m2

sys.if.out

host=web02, dc=PHX

TS6

m2

sys.if.out

host=web01, dc=DEN

We have 6 total time series with 3 from each time series data source.

Similar to a relational database, there are a number of join types that you can choose from. The most common join is the NATURAL_OUTER join that will attempt a one-to-one match using all of the tags in a time series and for those that do not align it will use substitution rules to handle the missing series. Using a NATURAL_OUTER join (or even an INNER join) we would match TS1 <=> TS3, TS2 <=> TS4 and TS3 <=> TS5. The result of the expression would have 3 time series:

Number

Metric

Tags

TS7

if.in.pct_of_total

host=web01, dc=PHX

TS8

if.in.pct_of_total

host=web02, dc=PHX

TS9

if.in.pct_of_total

host=web01, dc=DEN

As another example, lets assume that we are using the INNER join and TS2 and TS6 are both missing values for our query time range. In this case our output would only have 1 time series, that of TS7 in the table above because an inner join requires that both time series on either side of an expression be present in order for it to be evaluated.

When processing multi-variate expressions, the expression is broken into a tree of binary expressions. If a pair of time series at one level of the tree fails to satisfy join requirements, the rest of the tree is not evaluated.

For simple expressions where a single variable is combined with a literal value, join configurations are essentially ignored.

For more details on join configs see Join.

Joining On Time

Once two time series are joined on time, we must then proceede to compute the expression for each data point in each series. Expressions will follow the same logic as downsamplers and group by nodes in that when particular values are missing at a timestamp, an interpolated value is used.

Example Query

{
    "start": "1h-ago",
    "executionGraph": [{
                    "id": "m1",
                    "type": "TimeSeriesDataSource",
                    "metric": {
                            "type": "MetricLiteral",
                            "metric": "sys.if.in"
                    }
            },
            {
                    "id": "ds1",
                    "type": "downsample",
                    "aggregator": "sum",
                    "interval": "1m",
                    "runAll": false,
                    "fill": true,
                    "interpolatorConfigs": [{
                            "dataType": "numeric",
                            "fillPolicy": "NAN",
                            "realFillPolicy": "NONE"
                    }],
                    "sources": ["m1"]
            },
            {
                    "id": "m2",
                    "type": "TimeSeriesDataSource",
                    "metric": {
                            "type": "MetricLiteral",
                            "metric": "sys.if.out"
                    }
            },
            {
                    "id": "ds2",
                    "type": "downsample",
                    "aggregator": "sum",
                    "interval": "1m",
                    "runAll": false,
                    "fill": true,
                    "interpolatorConfigs": [{
                            "dataType": "numeric",
                            "fillPolicy": "NAN",
                            "realFillPolicy": "NONE"
                    }],
                    "sources": ["m2"]
            }, {
                    "id": "e1",
                    "as": "if.in.pct_of_total",
                    "type": "expression",
                    "expression": "(m1 / (m1 + m2)) * 100",
                    "join": {
                            "type": "Join",
                            "joinType": "NATURAL_OUTER"
                    },
                    "interpolatorConfigs": [{
                            "dataType": "numeric",
                            "fillPolicy": "NAN",
                            "realFillPolicy": "NONE"
                    }],
                    "sources": ["ds1", "ds2"]
            }
    ]
}