Getting Started
This page will walk you through the setup process to get OpenTSDB running. It assumes you've read and understood the overview. With no prior experience, it should take about 15 minutes to get OpenTSDB running, including the time needed to set up HBase on a single node.
Setting up OpenTSDB
The compile-time dependencies for OpenTSDB include:
- GWT 2.4 (ASLv2)
The runtime dependencies for OpenTSDB are:
- JDK 1.6
- asynchbase 1.3.0 (BSD)
- Guava 12.0 (ASLv2)
- logback 1.0 (LGPLv2.1 / EPL)
- Netty 3.4 (ASLv2)
- SLF4J 1.6 (MIT) with Log4J and JCL adapters
- suasync 1.2 (BSD)
- ZooKeeper 3.3 (ASLv2)
Gnuplot must also be installed on your PATH: version 4.2 minimum, 4.4 recommended.
Before getting started, you need an instance of HBase 0.94 (ASLv2) up and running. If you don't already have one, you can get started quickly with a single-node HBase instance. Earlier versions of HBase will work too, but being on the latest major release is strongly recommended.
Almost all the following instructions can be copy-pasted directly into a
terminal on a Linux or Mac OS X (or otherwise POSIXy) machine. You will need
to edit the placeholders which are typeset like-this.
A Bourne shell (such as bash or zsh) is assumed.
No special privileges are required.
Checkout, compile & start OpenTSDB
OpenTSDB uses the usual build process, which consists of running ./bootstrap (only once, when you first check out the code),
followed by ./configure and make. There is a
handy shell script named build.sh that will take care of all
of that for you, and build OpenTSDB in a new subdirectory named
build:
Once the build is done, you can run the command-line tool as ./build/tsdb, or you can run make install to install
OpenTSDB on your system. Should you ever change your mind, there is also
make uninstall, so there are no strings attached.
If it's the first time you run OpenTSDB with your HBase instance, you first need to create the necessary HBase tables:
This creates two tables: tsdb and tsdb-uid.
If you're just evaluating OpenTSDB, don't worry about compression for now. In
production / at scale, make sure you use COMPRESSION=lzo and have
LZO enabled.
Now start a TSD (Time Series Daemon):
Use the --zkquorum flag to specify the comma-separated list of hosts
serving your ZooKeeper quorum. The --cachedir can be purged
periodically, e.g. by a cron job.
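A cron job that purges the cache directory could call something like the sketch below. The helper name purge_cachedir, the /tmp/tsd path and the 60-minute threshold are all assumptions made for illustration; adjust them to whatever you passed to --cachedir.

```shell
#!/bin/sh
# Hypothetical helper: delete regular files under a cache directory that were
# last modified more than a given number of minutes ago.
# $1: cache directory, $2: maximum age in minutes.
purge_cachedir() {
  dir=$1
  max_age_minutes=$2
  # Refuse to run on an empty or missing path so a typo cannot wipe "/".
  [ -n "$dir" ] && [ -d "$dir" ] || return 1
  # -mmin is a GNU/BSD find extension; it is not part of POSIX.
  find "$dir" -type f -mmin +"$max_age_minutes" -exec rm -f {} +
}

# Example call (path is an assumption): drop cached files older than an hour.
# purge_cachedir /tmp/tsd 60
```

Deleting by age rather than emptying the directory keeps files that may still be in use by recent requests.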
At this point you can access the TSD's web interface through 127.0.0.1:4242 (if it's running on your local machine).
Using OpenTSDB
Create your first metrics
Metrics need to be registered before you can start storing data points for them. The two metrics used in this guide are mysql.bytes_received and
mysql.bytes_sent.
New tags, on the other hand, are automatically registered whenever they're used for the first time. Right now OpenTSDB only allows you to have up to 2^24 = 16777216 different metrics, 16777216 different tag names and 16777216 different tag values. This is because each one of those is assigned a UID on 3 bytes. Metric names, tag names and tag values have their own UID spaces, which is why you can have 16777216 of each kind. The size of each space is configurable but there is no knob that exposes this configuration parameter right now. So bear in mind that using a user ID or an event ID as a tag value will not work right now if you have a large site.
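The arithmetic behind that limit is easy to check in the shell. This is just a sketch; uid_space is a hypothetical helper name, not something that ships with OpenTSDB.

```shell
# Number of distinct UIDs that fit in a given width in bytes.
uid_space() {
  echo $(( 1 << ($1 * 8) ))
}

uid_space 3   # prints 16777216
```

A site with more distinct user IDs than that would exhaust the tag value space if each ID were used as a tag value.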
Start collecting data
So now that we have our 2 metrics, we can start sending data to the TSD. Let's write a little shell script to collect some data off of MySQL and send it to the TSD (note: this is just an example; in practice you can use tcollector's MySQL collector):
What does the script do? If you're not a big fan of shell and
awk scripting, it may not be obvious how this works.
But it's simple. The set -e command simply instructs
bash to exit with an error if any of the commands fail.
This simplifies error handling. The script then enters an infinite loop.
In this loop, we query MySQL to retrieve 2 of its status variables:
The --batch -N flags ask the mysql command to
remove the human friendly fluff so we don't have to filter it out ourselves.
Then the output is piped to awk, which is told to split fields on
tabs (-F"\t") because with the --batch flag that's
what mysql will use. We also create a couple of variables, one
named now and initialize it to the current timestamp, the other
named host and set to the hostname of the local machine. Then,
for every line, we print "put mysql.", followed by the lower-case
form of the first word, then by a space, then by the current timestamp, another
space, then the second word (the value), another space, and finally host=
and the current hostname. Rinse and repeat every 15 seconds. The -w
30 parameter given to nc simply sets a timeout on the
connection to the TSD.
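To see the awk step in isolation, here is a minimal sketch that runs it on canned input instead of a live MySQL server. The function name format_mysql_status, the sample values, the timestamp and the host name web01 are all made up for illustration; in a real collector you would pass the current timestamp and the local hostname instead.

```shell
# Turn tab-separated "Name<TAB>value" lines on stdin into TSD "put" lines.
# $1: Unix timestamp, $2: value for the host tag.
format_mysql_status() {
  awk -F"\t" -v now="$1" -v host="$2" \
    '{ print "put mysql." tolower($1) " " now " " $2 " host=" host }'
}

# Canned sample standing in for the output of the mysql query.
printf 'Bytes_received\t1024\nBytes_sent\t2048\n' \
  | format_mysql_status 1288900000 web01
# prints:
# put mysql.bytes_received 1288900000 1024 host=web01
# put mysql.bytes_sent 1288900000 2048 host=web01
```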
Bear in mind that this is just an example; in practice you can use tcollector's MySQL collector.
If you don't have a MySQL server to monitor, you can try this instead to collect basic load metrics from your Linux servers.
Batch imports
Let's imagine that you have a cron job that crunches gigabytes of application logs every day or every hour to extract profiling data. For instance, you could be logging the time taken to process a request and your cron job would compute an average for every 30 second window. Maybe you're particularly interested in 2 types of requests handled by your application, so you'll compute separate averages for those requests, and another average for every other request type. So your cron job may produce an output file containing these averages. To import this data into OpenTSDB, associate each data point with the name of a metric (for example myservice.latency.avg) and name the tag that represents the
request type. If each server has its own logs and you process them
separately, you may want to add another tag to each line like the
host=foo tag we saw in the previous section. This way you'll
be able to plot the latency of each server individually, in addition to your
average latency across the board and/or per request type.
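As a rough sketch of that adjustment, the function below rewrites raw lines into the import format. It assumes the cron job emits space-separated "timestamp value request_type" lines, which is only a guess at your log format; the tag name reqtype, the function name to_import_format and the sample values are likewise hypothetical.

```shell
# Convert "timestamp value request_type" lines on stdin into
# "metric timestamp value tags" lines.
# $1: metric name, $2 (optional): host name to add as a host= tag.
to_import_format() {
  awk -v metric="$1" -v host="$2" '{
    line = metric " " $1 " " $2 " reqtype=" $3
    if (host != "") line = line " host=" host
    print line
  }'
}

printf '1288900000 42 foo\n1288900000 51 bar\n' \
  | to_import_format myservice.latency.avg web01
# prints:
# myservice.latency.avg 1288900000 42 reqtype=foo host=web01
# myservice.latency.avg 1288900000 51 reqtype=bar host=web01
```

Leaving out the second argument omits the host tag, which suits the case where logs from all servers are processed together.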
In order to import a data file in the format above (metric timestamp
value tags), simply pass it to the tsdb import command.
If the file is large, consider gzip'ing it first. This can
be as simple as piping the output of your cron job to
gzip -9 >output.gz instead of writing directly to a file. The
import command is able to read gzip'ed files and it
greatly helps performance for large batch imports.
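Here is what that pipeline might look like, with a quick sanity check before you hand the file to the import command. emit_datapoints is a hypothetical stand-in for your cron job, and the output path, tag name and values are assumptions.

```shell
# Hypothetical stand-in for the cron job: emits two data points.
emit_datapoints() {
  printf 'myservice.latency.avg 1288900000 42 reqtype=foo\n'
  printf 'myservice.latency.avg 1288900030 51 reqtype=foo\n'
}

# Write compressed output directly instead of a plain file.
outfile="${TMPDIR:-/tmp}/output.$$.gz"
emit_datapoints | gzip -9 >"$outfile"

# Check that the archive is intact, then count the data points it holds.
gzip -t "$outfile"
gzip -dc "$outfile" | wc -l
```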
Self monitoring
Each TSD exports some stats about itself through the simple stats
command. You can collect those stats and feed them back to the TSD every few
seconds. First, create the necessary metrics:
Then you can use this simple script to collect stats and store them in OpenTSDB:
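The two text transformations involved can be sketched on canned input. This assumes each stats line already has the "metric timestamp value tags" shape, so that prefixing it with "put " yields a line the TSD can store; the function names and the sample metrics below are invented for illustration and are not real TSD output.

```shell
# Prefix every stats line with "put " so it can be fed back to the TSD.
stats_to_put() {
  sed 's/^/put /'
}

# List the unique metric names (first field) found in stats output,
# i.e. the names that would need to be registered first.
stats_metric_names() {
  awk '{ print $1 }' | sort -u
}

# Canned sample (metric names and values invented).
sample='tsd.example.requests 1288900000 7 host=web01
tsd.example.requests 1288900000 3 host=web02
tsd.example.uptime 1288900000 120 host=web01'

printf '%s\n' "$sample" | stats_metric_names
printf '%s\n' "$sample" | stats_to_put
```

In a real setup the input would come from the TSD itself rather than from a shell variable, and the loop would repeat every few seconds.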