OpenTSDB

Scalability

Can OpenTSDB scale to multiple data centers?

Yes. It is recommended that you run one set of Time Series Daemons (TSDs) per HBase cluster and one HBase cluster per physical data center. It is not recommended to have HBase clusters spanning data centers. Instead you can use HBase replication to replicate tables across data centers. HBase replication is still considered experimental and is being actively developed at StumbleUpon with help from the rest of the HBase community.

Right now the TSD doesn't provide any assistance for using multiple HBase clusters together at the same time, so you can't easily plot time series that come from different data centers. This will be fixed.

How much write throughput can I get with OpenTSDB?

It depends mostly on two things:
  1. The size of your HBase cluster.
  2. The CPUs you're using.
If your HBase cluster is reasonably sized, it's unlikely that OpenTSDB will max it out as the TSDs tend to be CPU bound before that happens (unless you run many TSDs). A TSD can easily handle 2000 new data points per second per core on an old dual-core Intel Xeon CPU from 2006. More modern CPUs will get you more throughput.
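As a rough sizing aid, the per-core figure above can be turned into a back-of-the-envelope estimate. This is only a sketch: the function name is made up for illustration, and the 2000 points per second per core default is the conservative number quoted above for old hardware, not a measured limit for your machines.

```python
import math

# Conservative figure quoted above for an old dual-core Xeon from 2006.
POINTS_PER_SECOND_PER_CORE = 2000


def tsd_cores_needed(points_per_second: float,
                     per_core: float = POINTS_PER_SECOND_PER_CORE) -> int:
    """Estimate how many TSD cores are needed to absorb a given write rate.

    Hypothetical helper: it only divides the incoming rate by the
    per-core throughput and rounds up, with a minimum of one core.
    """
    if points_per_second < 0 or per_core <= 0:
        raise ValueError("rates must be positive")
    return max(1, math.ceil(points_per_second / per_core))


if __name__ == "__main__":
    # 50,000 new data points per second spread across all TSDs.
    print(tsd_cores_needed(50_000))
```

Treat the result as an upper bound on modern CPUs and measure your own TSDs before committing to hardware.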

How much read throughput can I get with OpenTSDB?

This is not very well documented right now. The answer certainly depends on the size of the queries (e.g. generating graphs with many millions of data points is obviously more expensive than generating graphs with only a few tens of thousands). The read path has plenty of room for optimizations and the TSD only does some very basic caching right now.

What type of hardware should I run the TSDs on?

There are no strict requirements. The recommended configuration, however, is a 4-core machine with at least 4GB of RAM, and a tmpfs partition for the cache directory used by the TSD. Having more RAM helps the TSD ride over transient HBase outages by allowing it to buffer more incoming data before getting to the point where it must start discarding data.

How much disk space do I need?

The answer depends mostly on the average number of tags per data point. At StumbleUpon, we use 4.5 tags on average and our 100+ billion data points take just over a terabyte of disk space (pre-HDFS 3x replication). We use LZO, which is strongly recommended in a production setting. In our case, each data point ends up taking about 12 bytes of disk space (or actually 36 if you include the 3x replication factor of HDFS). We also find that, on average, LZO is able to achieve a compression factor of 4.2x on our table, but your mileage will vary. Without LZO, a data point costs roughly: 16 bytes of HBase overhead, 3 bytes for the metric, 4 bytes for the timestamp, 6 bytes per tag, 2 bytes of OpenTSDB overhead, 8 bytes for the value.
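The per-point costs listed above can be added up to get a rough estimate for your own data. The sketch below is only that: the function names are made up for illustration, and the compression factor and replication factor are inputs you have to supply from your own cluster. With 4.5 tags it gives 60 bytes uncompressed, or a bit over 14 bytes at a 4.2x compression factor, which is in the same range as the roughly 12 bytes we observe.

```python
# Byte costs per data point, taken from the list above (no compression).
HBASE_OVERHEAD = 16
METRIC = 3
TIMESTAMP = 4
PER_TAG = 6
OPENTSDB_OVERHEAD = 2
VALUE = 8


def uncompressed_bytes_per_point(avg_tags: float) -> float:
    """Approximate uncompressed size of one data point, in bytes."""
    return (HBASE_OVERHEAD + METRIC + TIMESTAMP
            + PER_TAG * avg_tags + OPENTSDB_OVERHEAD + VALUE)


def estimated_disk_bytes(num_points: float, avg_tags: float,
                         compression_factor: float = 1.0,
                         hdfs_replication: int = 1) -> float:
    """Rough disk usage estimate (hypothetical helper).

    compression_factor is what your codec achieves on your table (we see
    about 4.2 with LZO); hdfs_replication is typically 3 on HDFS.
    """
    per_point = uncompressed_bytes_per_point(avg_tags) / compression_factor
    return num_points * per_point * hdfs_replication


if __name__ == "__main__":
    print(uncompressed_bytes_per_point(4.5))
    print(estimated_disk_bytes(100e9, 4.5, compression_factor=4.2))
```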

We expect that those storage costs, even though they're already pretty low, will come down significantly. Right now all the values are stored in 8 bytes, but the storage format was designed to support variable-length encoding. This simply hasn't been implemented yet.

In addition to variable-length encoding, a new backwards-compatible storage format has been designed but not implemented yet either. We expect that this new format will reduce the storage requirements by 5x assuming that data points come 15 seconds apart on average (our default interval at StumbleUpon).

Reliability

What are the Single Points of Failure of OpenTSDB?

OpenTSDB itself doesn't have any specific SPoF as there is no central component and you can run multiple TSDs on different machines. The TSDs need HBase to run, and HBase doesn't have any SPoF* either as HBase only really needs a ZooKeeper quorum to keep serving. A ZooKeeper quorum is typically made of 5 different machines, out of which you can afford to lose 2 before the system goes down. Note that although HBase has a master, it is not actually needed for HBase to keep serving. Not having a master running will prevent HBase from starting or recovering from machine failures but, in a steady state, losing the master doesn't impede HBase's ability to serve.

* Fine print: if your HBase cluster is backed by HDFS, which is most likely the case for production clusters at the time of this writing, then you have a SPoF because of the NameNode of HDFS. If you run HBase on top of a reliable distributed filesystem, then you don't have any SPoF.
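The "5 machines, lose 2" figure above follows from ZooKeeper needing a strict majority of its ensemble to stay up. A minimal sketch of that arithmetic (the function name is ours, not part of any ZooKeeper API):

```python
def tolerable_failures(ensemble_size: int) -> int:
    """Number of machines that can fail while a strict majority survives."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one machine")
    majority = ensemble_size // 2 + 1  # smallest strict majority
    return ensemble_size - majority


if __name__ == "__main__":
    for size in (3, 4, 5, 6, 7):
        print(size, tolerable_failures(size))
```

Note that an even-sized ensemble tolerates no more failures than the odd size just below it, which is why odd sizes such as 3 or 5 are the usual choice.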

What are the failure modes of OpenTSDB?

The TSD eventually becomes unhealthy when HBase itself is down or broken for an extended period of time. Right now, the TSD doesn't handle prolonged HBase outages very well and will discard incoming data points once its buffers are full if it's unable to flush them to HBase. This is going to be improved.

At StumbleUpon we've had a number of cases where a collector that runs on hundreds of machines goes crazy and generates a DDoS on the TSDs. The TSD doesn't do a good job of handling such DDoS situations by penalizing offending clients, so its performance will degrade once the machine it's running on is unable to keep up with the load.

What is the recommended deployment for OpenTSDB?

We recommend that you run multiple TSDs. At StumbleUpon, we found it useful to dedicate one or more TSDs to read queries (human users using the web UI to generate graphs or to view dashboards) and let other TSDs handle the write queries (new data points coming in from production machines). For the "read-only" TSDs, we recommend Varnish for load balancing. Read more about Varnish and TSDs.

What data durability guarantees does OpenTSDB make?

By default the TSD buffers data points for about 1 second before persisting them in HBase (configurable via the --flush-interval flag). If the TSD were to crash without getting a chance to run its shutdown hook, you could lose up to 1 second's worth of data points. In practice we've found this trade-off to be acceptable given the performance benefits that deferred flushes offer in terms of write throughput. Once a data point has been stored in HBase, data durability is guaranteed if you're running HBase on top of a distributed filesystem that provides the necessary data durability guarantees.

If you use HDFS, we recommend that you run Cloudera's Distribution for Hadoop (CDH), version 3 or above preferably, as this version comes with all the necessary patches to make HDFS less unreliable and has better performance.

Data Model

How can I increment a counter in OpenTSDB?

OpenTSDB doesn't work with counters. It simply records (timestamp, value) pairs. Data points are independent from each other. Say you want to keep track of clicks on an ad in OpenTSDB. You wouldn't send a "+1" to the TSD for every click. Instead, if your application doesn't already keep track of click counts, you'd need to increment a counter for every click and periodically send the value of that counter to the TSD. You can later query the TSD and ask for the rate of change of the counter, which will give you clicks per second.
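The click-counting pattern described above can be sketched as follows. Everything named here is hypothetical and for illustration only: the ClickCounter class, the ads.clicks metric and the ad=banner42 tag. The line format assumes the TSD's text-based "put" command; check the format your TSD version expects before relying on it. The rate function shows the kind of computation a rate query performs between two consecutive samples; it is not the TSD's actual code.

```python
class ClickCounter:
    """Application-side counter: incremented per click, never reset."""

    def __init__(self) -> None:
        self.count = 0

    def click(self) -> None:
        self.count += 1


def format_put(metric: str, timestamp: int, value: int, tags: dict) -> str:
    """Build one line in the assumed 'put' text format.

    Tags are sorted only to make the output deterministic.
    """
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_str}"


def rate_per_second(earlier: tuple, later: tuple) -> float:
    """Rate of change between two (timestamp, value) samples."""
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (v1 - v0) / (t1 - t0)


if __name__ == "__main__":
    counter = ClickCounter()
    for _ in range(42):
        counter.click()
    # Periodically (say every 15 seconds) send the current counter value.
    print(format_put("ads.clicks", 1300000000, counter.count,
                     {"ad": "banner42"}))
    # Two samples taken 15 seconds apart: 30 more clicks in between.
    print(rate_per_second((1300000000, 100), (1300000015, 130)))
```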

Can I store sub-second precision timestamps in OpenTSDB?

No. Right now timestamps are encoded in 4 bytes so this is not possible. Note that this is not typically needed for OpenTSDB's main use case, which consists of monitoring large clusters of commodity machines. If you think you really need sub-second precision, please reach out to our mailing list for advice: opentsdb@googlegroups.com
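To see why 4 bytes leave no room for sub-second precision, here is a small illustration. It assumes an unsigned big-endian layout purely for the sake of the example; it is not meant to describe the exact on-disk encoding used by the TSD.

```python
import struct


def encode_timestamp(seconds_since_epoch: float) -> bytes:
    """Pack a timestamp into 4 bytes, dropping any fractional part.

    The unsigned big-endian format (">I") is an assumption made for
    illustration; the point is only that 4 bytes hold whole seconds.
    """
    return struct.pack(">I", int(seconds_since_epoch))


def decode_timestamp(raw: bytes) -> int:
    """Inverse of encode_timestamp."""
    return struct.unpack(">I", raw)[0]


if __name__ == "__main__":
    raw = encode_timestamp(1300000000.75)
    print(len(raw), decode_timestamp(raw))
```

The fractional part of the input is lost on the round trip, and values above 2**32 - 1 seconds do not fit at all.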

Can I use another storage backend than HBase?

No. OpenTSDB was designed specifically for a storage backend that follows the Bigtable data model (a distributed, strongly consistent, sorted multi-dimensional hash map). At the time of writing, HBase is the only such system that's both open-source and usable in production, so the code was written specifically for HBase. Technically it would be feasible to port the code to other systems that follow the Bigtable data model. Systems that differ by not storing data in a sorted fashion (such as distributed hash tables) or that do not offer a strong consistency guarantee will simply not work with the current design.

edit (Q1'13): Due to popular demand, the next major version of OpenTSDB is likely to contain an interface behind which calls to HBase will be abstracted away, in order to allow integration with other storage backends.

Misc

How do the TSDs handle DST changes or leap seconds?

The TSD doesn't assign timestamps to your data points; your collectors do. It is strongly recommended that you use UNIX timestamps in your collectors, so all your timestamps will be based on the Epoch. This way you will not be affected by timezone adjustments or DST changes on your machines.
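A short sketch of why UNIX timestamps are immune to timezone and DST settings: the same instant yields the same number of seconds since the Epoch no matter which UTC offset it is expressed in. The helper name and the offsets below are only examples.

```python
from datetime import datetime, timedelta, timezone


def epoch_seconds(moment: datetime) -> int:
    """Seconds since the Epoch for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("use timezone-aware datetimes to avoid ambiguity")
    return int(moment.timestamp())


if __name__ == "__main__":
    in_utc = datetime(2011, 3, 13, 10, 0, 0, tzinfo=timezone.utc)
    # The same instant seen from a machine at UTC-8 and one at UTC-7.
    in_minus_8 = in_utc.astimezone(timezone(timedelta(hours=-8)))
    in_minus_7 = in_utc.astimezone(timezone(timedelta(hours=-7)))
    print(epoch_seconds(in_utc) == epoch_seconds(in_minus_8)
          == epoch_seconds(in_minus_7))
```

In a collector, calling int(time.time()) gives the same kind of value directly.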

The TSD always renders timestamps in local time, to make it easier for us humans to understand and correlate events based on the timezone we live in. So you should make sure you give the TSD the correct timezone setting (e.g. via the TZ environment variable). When the TSD starts, it computes its offset from UTC and will then keep that offset forever. In case of a DST change, for instance, it would then appear that the TSD is 1 hour behind. There are plans to periodically re-compute the offset from UTC to avoid that situation, but right now you have to restart the TSD in order to adjust the offset. Note that this doesn't prevent the TSD from working properly; it only affects anything that parses dates from local time or renders them in local time. Dashboards and alerting systems should use relative time (e.g. "1d ago") and should thus be unaffected.

When leap seconds occur, UNIX timestamps go back by one second. The TSD should handle this situation gracefully (although this hasn't been tested yet). Unless you're collecting data every second, you won't notice anything except that the interval between the two data points where the leap second occurred is one second less than it should have been. If you do collect data every second, the second data point that attempts to overwrite the previous one during the leap second will be discarded with an error message.

The graphs are ugly, can they be made prettier?

Ugliness is a subjective thing :)

There are a lot of knobs that aren't exposed yet that would allow the TSD to generate nicer, antialiased, smoothed graphs. It's just a matter of exposing those Gnuplot knobs. Also, recent versions of Gnuplot can generate graphs in HTML5 canvas. We plan to use this to build pretty graphs you can interact with from your web browser.

Please contribute to help make the UI sexier.

Can I use OpenTSDB to generate graphs for my customers / end-users?

Yes, but you have to be careful with that. OpenTSDB was written for internal use only, to help engineers and operations staff understand and manage large computer systems. It hasn't been through any security review.

We don't recommend that you give direct access to the TSD to untrusted users. If you really want to leverage the TSD's graphing features, we recommend that you put the TSD behind a secured HTTP proxy that only allows specific requests to go through. Alternatively, you could use the TSD to periodically pre-generate a fixed set of graphs and serve them as static images to your customers.

Why does OpenTSDB return more data than I asked for in my query?

All queries specify a start time and an end time (if the end time isn't specified, it is assumed to be "now"). OpenTSDB's goal is to plot a sensible graph covering that time span. However it needs to retrieve data before and after the times you actually specified, in order to know how to properly compute the values near the "edges" of the graph. Because having extra values past the times actually requested is required to draw accurate graphs, OpenTSDB also returns the extra data based on the assumption that if you want to plot your own graphs or do your own processing, you will also need the extra data to get the correct behavior near the edges.

The amount of extra data that OpenTSDB attempts to retrieve is proportional to the time span covered by your query. It is a known usability issue that it sometimes returns more extra data than is actually required to correctly plot the edges of the graph. This is being fixed.

I don't understand the data points returned for my query

Sometimes the results of a query don't match people's expectations. This is often because it's not necessarily obvious what steps are involved in a query, why OpenTSDB uses interpolation, or when aggregators kick in. There is now a dedicated page to explain this: OpenTSDB query execution.

Meta

Why was OpenTSDB written in Java?

Mostly because OpenTSDB lives around the HBase community, which is a Java community. OpenTSDB also started by using HBase's library, which only exists in Java. Eventually, however, OpenTSDB started to use an alternative library to access HBase (asynchbase) but sadly this one too is in Java.

Has OpenTSDB anything to do with OpenBSD?

While the author of OpenTSDB admires the work done on OpenBSD, the fact that the names of the projects are so close is just a coincidence. "TSDB" alone was too ambiguous, and the author miserably failed to come up with a better name.

Who is behind OpenTSDB?

OpenTSDB was originally designed and implemented at StumbleUpon by Benoît Sigoure. Berk D. Demir contributed ideas and feedback during the early design stages. tcollector was designed and implemented by Mark Smith and Dave Barr, with contributions from Benoît Sigoure.

How to contribute to OpenTSDB?

The easiest way is to fork the project on GitHub. Make whatever changes you want to your own fork, then send a pull request. You can also send your patches to the mailing list. Be prepared to go through a couple of iterations as the code is being reviewed before getting accepted in the main repository. If you are familiar with how the Linux kernel is developed, then this is pretty similar.

Who commits to OpenTSDB?

Anyone can commit to OpenTSDB, provided that the changes are accepted in the main repository after getting reviewed. There is no notion of "commit access", and no list of committers.

Why does OpenTSDB use the LGPL?

One of the most frequent "holy wars" that plague open-source communities is that of what licenses to use, and which ones are better or "more free" than others. OpenTSDB uses the GNU LGPLv2.1+ for maximum compatibility with its dependencies and other licenses, and because its author thinks that the LGPL strikes the right balance between the goals of free software and the legal restrictions often present in corporate environments.

Let's stress the following points:

With this out of the way, we hope that those afraid of the 3 letters "GPL" will acknowledge the importance of using the LGPL in OpenTSDB and will overcome their fear of the license.
Disclaimer: This page doesn't provide any formal legal advice. Information given here is given in good faith. If you have any doubt, talk to a lawyer first. In the text above "LGPL" or "GPL" refers to the version 2.1 of the license, or (at your option) any later version.

Who supports OpenTSDB?

StumbleUpon supported the initial development of OpenTSDB as well as its open-source release.

YourKit is kindly supporting open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: YourKit Java Profiler and YourKit .NET Profiler.