Getting Started

This page will walk you through the setup process to get OpenTSDB running. It assumes you've read and understood the overview. With no prior experience, it should take about 15 minutes to get OpenTSDB running, including the time needed to set up HBase on a single node.
Setting up OpenTSDB

The runtime dependencies for OpenTSDB are:
- JDK 1.6
- asynchbase 1.3.0 (BSD)
- Guava 12.0 (ASLv2)
- logback 1.0 (LGPLv2.1 / EPL)
- Netty 3.4 (ASLv2)
- SLF4J 1.6 (MIT) with Log4J and JCL adapters
- suasync 1.2 (BSD)
- ZooKeeper 3.3 (ASLv2)

Building OpenTSDB additionally requires GWT 2.4 (ASLv2). To use the optional web UI for graphing, you also need Gnuplot, version 4.2 minimum, 4.4 recommended; the gnuplot binary must be in your PATH.
Before getting started, you need an instance of HBase 0.94 (ASLv2) up and running. If you don't already have one, you can get started quickly with a single-node HBase instance. Earlier versions of HBase will work too, but staying on the latest major release is strongly recommended.
Almost all of the following instructions can be copy-pasted directly into a terminal on a Linux or Mac OS X (or otherwise POSIXy) machine. You will need to edit the placeholder values (host names, paths, credentials) to match your environment. A Bourne-compatible shell (such as bash or zsh) is assumed. No special privileges are required.
Checkout, compile & start OpenTSDB

OpenTSDB uses the usual build process: run ./bootstrap (only once, when you first check out the code), followed by make. There is a handy shell script named build.sh that will take care of all of that for you and build OpenTSDB in a new subdirectory named build. You can then run the command-line tool as ./build/tsdb, or run make install to install OpenTSDB on your system. Should you ever change your mind, there is also make uninstall, so there are no strings attached.
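A minimal checkout-and-build session might look like this, assuming the standard GitHub repository location:

```shell
# Fetch the source (repository URL assumed; adjust if you use a mirror).
git clone git://github.com/OpenTSDB/opentsdb.git
cd opentsdb

# build.sh runs ./bootstrap and make for you, leaving the build
# artifacts in the ./build subdirectory.
./build.sh
```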
If it's the first time you run OpenTSDB with your HBase instance, you first need to create the necessary HBase tables: tsdb and tsdb-uid. If you're just evaluating OpenTSDB, don't worry about compression for now. In production / at scale, make sure you use COMPRESSION=lzo and have LZO enabled.
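The tables can be created with the create_table.sh script shipped in the source tree; a sketch, where path/to/hbase is a placeholder for your HBase installation:

```shell
# COMPRESSION=NONE is fine for evaluation; use lzo in production.
env COMPRESSION=NONE HBASE_HOME=path/to/hbase ./src/create_table.sh
```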
Now start a TSD (Time Series Daemon):
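A sketch of the launch command, assuming you built with build.sh and want the TSD to listen on port 4242:

```shell
# Pick a scratch directory for the TSD's cache of generated files.
tsdtmp=${TMPDIR:-/tmp}/tsd
mkdir -p "$tsdtmp"

# Start the Time Series Daemon.
./build/tsdb tsd --port=4242 --staticroot=build/staticroot --cachedir="$tsdtmp"
```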
If your ZooKeeper quorum isn't running on localhost, use the --zkquorum flag to specify the comma-separated list of hosts serving it. The directory given to --cachedir holds only temporary files and can be purged periodically, e.g. by a cron job.
At this point you can access the TSD's web interface at http://127.0.0.1:4242 (if it's running on your local machine).
Create your first metrics

Metrics need to be registered before you can start storing data points for them.
New tags, on the other hand, are automatically registered whenever they're used for the first time. Right now OpenTSDB only allows you to have up to 2^24 = 16777216 different metrics, 16777216 different tag names and 16777216 different tag values. This is because each one of those is assigned a UID on 3 bytes. Metric names, tag names and tag values have their own UID spaces, which is why you can have 16777216 of each kind. The size of each space is configurable, but no knob exposes this configuration parameter right now. So bear in mind that using a user ID or event ID as a tag value will not work right now if you have a large site.
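Metrics are registered with the tsdb mkmetric subcommand; for example (the two metric names below are illustrative, matching the MySQL example in the next section):

```shell
# Register two metrics ahead of time; tags need no registration.
./build/tsdb mkmetric mysql.bytes_received mysql.bytes_sent
```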
Start collecting data

So now that we have our 2 metrics, we can start sending data to the TSD. Let's write a little shell script to collect some data off of MySQL and send it to the TSD.
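A sketch of such a collector, reconstructed from the walkthrough below; the MySQL credentials, the exact SHOW STATUS pattern, and the TSD host/port are placeholders to replace with your own:

```shell
cat >mysql-collector.sh <<\EOF
#!/bin/bash
set -e
while true; do
  # Grab the byte counters from MySQL in raw, tab-separated form.
  mysql -u USER -pPASSWORD --batch -N \
        --execute "SHOW STATUS LIKE 'Bytes%'" \
  | awk -F"\t" -v now=`date +%s` -v host=`hostname` \
      '{ print "put mysql." tolower($1), now, $2, "host=" host }'
  sleep 15
done | nc -w 30 host.name.of.tsd 4242
EOF
chmod +x mysql-collector.sh
nohup ./mysql-collector.sh &
```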
What does the script do? If you're not a big fan of shell and
awk scripting, it may not be obvious how this works.
But it's simple. The
set -e command simply instructs
bash to exit with an error if any of the commands fail.
This simplifies error handling. The script then enters an infinite loop.
In this loop, we query MySQL to retrieve 2 of its status variables. The
--batch -N flags ask the mysql command to remove the human-friendly fluff
so we don't have to filter it out ourselves. Then the output is piped to
awk, which is told to split fields on tabs (-F"\t") because with the
--batch flag that's what mysql will use. We also create a couple of
variables, one named now initialized to the current timestamp, the other
named host set to the hostname of the local machine. Then, for every line,
we print put mysql., followed by the lower-case form of the first word,
then by a space, then by the current timestamp, then by the second word
(the value), another space, and finally host= and the current hostname.
Rinse and repeat every 15 seconds. The -w 30 parameter given to nc simply
sets a timeout on the connection to the TSD.
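To see what the pipeline produces, you can feed the awk stage a fake line of mysql --batch output (the counter value and timestamp below are made up):

```shell
printf 'Bytes_received\t123\n' \
  | awk -F"\t" -v now=1288900000 -v host=foo \
      '{ print "put mysql." tolower($1), now, $2, "host=" host }'
# Prints: put mysql.bytes_received 1288900000 123 host=foo
```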
Bear in mind, this is just an example, in practice you can use tcollector's MySQL collector.
If you don't have a MySQL server to monitor, you can try this instead to collect basic load metrics from your Linux servers.
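For instance, here is a hypothetical loadavg collector in the same style; the metric names (proc.loadavg.1m, proc.loadavg.15m) and the TSD host/port are assumptions to replace with your own:

```shell
cat >loadavg-collector.sh <<\EOF
#!/bin/bash
set -e
while true; do
  # /proc/loadavg starts with the 1-, 5- and 15-minute load averages.
  awk -v now=`date +%s` -v host=`hostname` \
      '{ print "put proc.loadavg.1m", now, $1, "host=" host;
         print "put proc.loadavg.15m", now, $3, "host=" host }' /proc/loadavg
  sleep 15
done | nc -w 30 host.name.of.tsd 4242
EOF
chmod +x loadavg-collector.sh
nohup ./loadavg-collector.sh &
```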
Batch imports

Let's imagine that you have a cron job that crunches gigabytes of application logs every day or every hour to extract profiling data. For instance, you could be logging the time taken to process a request and your cron job would compute an average for every 30 second window. Maybe you're particularly interested in 2 types of requests handled by your application, so you'll compute separate averages for those requests, and another average for every other request type. So your cron job may produce an output file that looks like this:
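A few illustrative lines (the timestamps, values, and the reqtype tag name are made up for the example):

```
myservice.latency.avg 1288900000 42 reqtype=foo
myservice.latency.avg 1288900000 1234 reqtype=bar
myservice.latency.avg 1288900000 9999 reqtype=other
```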
Notice that every line uses a single metric name (myservice.latency.avg) together with a tag naming the request type. If each server has its own logs and you process them separately, you may want to add another tag to each line, like the host=foo tag we saw in the previous section. This way you'll be able to plot the latency of each server individually, in addition to your average latency across the board and/or per request type.
In order to import a data file in the format above (metric timestamp value tags), simply run the following command:
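A sketch of the import invocation, where your-file is a placeholder for the data file produced by your cron job:

```shell
./build/tsdb import your-file
```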
If your data file is large, consider gzip'ing it first. This can be as simple as piping the output of your cron job to gzip -9 >output.gz instead of writing directly to a file. The import command is able to read gzip'ed files, which greatly helps performance for large batch imports.
Self monitoring

Each TSD exports some stats about itself through the simple stats command. You can collect those stats and feed them back to the TSD every few seconds. First, create the necessary metrics:
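One way to do this (a sketch, assuming the TSD is listening on localhost:4242) is to take the metric names straight out of the stats output:

```shell
# Ask the TSD for its stats, keep the metric name column,
# and register each unique name.
echo stats | nc -w 1 localhost 4242 \
  | awk '{ print $1 }' | sort -u \
  | xargs ./build/tsdb mkmetric
```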
Then you can use this simple script to collect stats and store them in OpenTSDB:
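A minimal sketch of such a script, again assuming the TSD runs on localhost:4242; each stats line is already in metric timestamp value tags form, so prefixing it with put turns it into a valid write command:

```shell
#!/bin/bash
# Every 15 seconds, fetch the TSD's stats and write them back to it.
set -e
while true; do
  echo stats | nc -w 1 localhost 4242 \
    | sed 's/^/put /' \
    | nc -w 30 localhost 4242
  sleep 15
done
```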