How does OpenTSDB work?
OpenTSDB requires you to run one or more Time Series Daemons (TSDs). Each TSD is independent. There is no master, no shared state. The TSD uses HBase to store and retrieve time-series data. Users of the TSD never need to access HBase directly. You can communicate with the TSD via a simple telnet-style protocol, and via HTTP. An additional proper binary RPC protocol is in the works. All communications happen on the same port (the TSD figures out the protocol of the client by looking at the first few bytes it receives).
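To make the single-port idea concrete, here is a minimal sketch of how a server could tell an HTTP client from a telnet-style client by peeking at the first bytes. This is an illustration only, not the TSD's actual code: the function name `detect_protocol` and the list of HTTP methods are assumptions, and the text above does not say which bytes the TSD really inspects.

```python
# Hypothetical sketch: guess the client protocol from the first bytes received.
# HTTP requests start with an uppercase method name followed by a space;
# anything else is treated here as the telnet-style protocol.
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE ", b"OPTIONS ")


def detect_protocol(first_bytes: bytes) -> str:
    """Return "http" if the bytes look like an HTTP request line, else "telnet"."""
    if first_bytes.startswith(HTTP_METHODS):
        return "http"
    return "telnet"
```

A real server would also have to handle the case where fewer bytes have arrived than it needs to decide, which this sketch ignores.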
You need to write little scripts that collect data from your systems (e.g.
by reading interesting metrics from /proc on Linux, collecting counters from
your network gear via SNMP, or pulling other interesting data from your
applications, for instance via JMX for Java applications) and push data points
to one of the TSDs every few seconds.
StumbleUpon wrote a Python framework called tcollector that we use to collect
thousands of metrics from Linux 2.6, Apache's HTTPd, MySQL, HBase, memcached,
Varnish and more. This framework comes with a number of Linux collectors and
an HBase collector, and we're planning on releasing more collectors later,
so stay tuned.
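A collector script of the kind described above could look roughly like the following sketch. It reads the 1-minute load average from /proc/loadavg and pushes it to a TSD over the telnet-style protocol. Several things here are assumptions that this text does not specify: the line format `put <metric> <timestamp> <value> <tag=value ...>`, the TSD address `localhost:4242`, the metric name `proc.loadavg.1min`, and the 15-second interval. It is not taken from tcollector.

```python
import socket
import time


def parse_loadavg(text: str) -> float:
    """Extract the 1-minute load average from the contents of /proc/loadavg."""
    return float(text.split()[0])


def format_put(metric: str, timestamp: int, value, tags: dict) -> str:
    """Build one telnet-style line (assumed format, see the note above)."""
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_str}\n"


def collect_forever(tsd_host="localhost", tsd_port=4242, interval=15):
    """Push one data point every `interval` seconds (address is an assumption)."""
    hostname = socket.gethostname()
    with socket.create_connection((tsd_host, tsd_port)) as sock:
        while True:
            with open("/proc/loadavg") as f:
                load = parse_loadavg(f.read())
            line = format_put("proc.loadavg.1min", int(time.time()), load,
                              {"host": hostname})
            sock.sendall(line.encode("ascii"))
            time.sleep(interval)
```

Calling `collect_forever()` on a Linux machine with a TSD listening at the assumed address would start the loop. Parsing and formatting are kept in separate functions so they can be checked without a network connection.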
In OpenTSDB, a data point is made of:
- A metric name.
- A UNIX timestamp (seconds since Epoch).
- A value (a 64-bit integer or a double-precision floating-point value).
- A set of tags (key-value pairs) that annotate this data point.
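For example, the metric name could be mysql.bytes_received or mysql.bytes_sent,
and tags such as host and schema could identify where each data point comes from.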
A data point must have at least one tag. It is not recommended to have
more than 6-7 tags per data point, as the cost associated with storing
new data points quickly becomes dominated by the number of tags beyond
that point.
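The structure of a data point and the tag rules above can be sketched as a small Python class. This is an illustrative client-side model, not OpenTSDB's internal representation: the names `DataPoint`, `MAX_RECOMMENDED_TAGS`, and `has_too_many_tags` are hypothetical, and the threshold of 7 simply mirrors the "6-7 tags" guidance.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Mirrors the "6-7 tags" recommendation above; a guideline, not a hard limit.
MAX_RECOMMENDED_TAGS = 7


@dataclass
class DataPoint:
    metric: str                 # e.g. "mysql.bytes_sent"
    timestamp: int              # UNIX timestamp, seconds since the Epoch
    value: Union[int, float]    # 64-bit integer or double-precision float
    tags: Dict[str, str]        # key-value pairs annotating the point

    def __post_init__(self):
        if not self.metric:
            raise ValueError("metric name must not be empty")
        if self.timestamp < 0:
            raise ValueError("timestamp must be non-negative")
        if isinstance(self.value, bool) or not isinstance(self.value, (int, float)):
            raise TypeError("value must be an int or a float")
        if isinstance(self.value, int) and not -2**63 <= self.value < 2**63:
            raise ValueError("integer value does not fit in 64 bits")
        if not self.tags:
            raise ValueError("a data point must have at least one tag")

    def has_too_many_tags(self) -> bool:
        """True when the point exceeds the recommended number of tags."""
        return len(self.tags) > MAX_RECOMMENDED_TAGS
```

For instance, `DataPoint("mysql.bytes_sent", 1287333217, 1024, {"host": "db1"})` is accepted, while the same call with an empty tag dictionary raises `ValueError`.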
With the tags in the example above, it will be easy to create graphs and dashboards that show the network activity of MySQL on a per-host and/or per-schema basis.
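To illustrate how tags enable per-host or per-schema views, the sketch below groups a handful of data points by one tag key and sums their values. The sample values are made up for illustration, the helper name `sum_by_tag` is hypothetical, and the code only shows the idea on the client side; it does not describe how the TSD aggregates data internally.

```python
from collections import defaultdict

# Illustrative data only: (metric, timestamp, value, tags).
points = [
    ("mysql.bytes_sent", 1287333217, 100, {"host": "db1", "schema": "foo"}),
    ("mysql.bytes_sent", 1287333217, 250, {"host": "db1", "schema": "bar"}),
    ("mysql.bytes_sent", 1287333217, 40, {"host": "db2", "schema": "foo"}),
    ("mysql.bytes_received", 1287333217, 7, {"host": "db2", "schema": "foo"}),
]


def sum_by_tag(points, metric, tag_key):
    """Sum the values of one metric, grouped by the value of one tag."""
    totals = defaultdict(int)
    for name, _timestamp, value, tags in points:
        if name == metric and tag_key in tags:
            totals[tags[tag_key]] += value
    return dict(totals)
```

With this sample data, `sum_by_tag(points, "mysql.bytes_sent", "host")` gives `{"db1": 350, "db2": 40}`, and grouping by `"schema"` instead gives `{"foo": 140, "bar": 250}`.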