To use OpenTSDB, you need HBase up and running.
This page will help you get started with a simple, single-node HBase
setup, which is good enough to evaluate OpenTSDB or monitor small
installations. If you need scalability and reliability, you will
need to set up a full HBase cluster.
You can copy-paste all the following instructions directly into a terminal.
Set up a single-node HBase instance
If you already have an HBase cluster, skip this step.
If you plan to use fewer than 5 to 10 nodes, stick to a single node.
Deploying HBase on a single node is easy and can help get you started
with OpenTSDB quickly. You can always scale to a real cluster and migrate
your data later.
tar xfz hbase-0.98.10.1-hadoop1-bin.tar.gz
At this point, you are ready to start HBase (without HDFS) on a single
node. But before starting it, I recommend using the following configuration:
iface=lo`uname | sed -n s/Darwin/0/p`
cat >conf/hbase-site.xml <<EOF
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
Make sure to adjust the value of hbase_rootdir if you want HBase to store
its data somewhere more durable than a temporary directory. The default is
to use /tmp, which means you'll lose all your data whenever your server
reboots. The remaining settings are less important and simply force HBase
to stick to the loopback interface (lo0 on Mac OS X, or just lo on Linux),
which simplifies things when you're just testing HBase on a single node.
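To make the two kinds of settings described above concrete, here is a minimal sketch of a script that generates such a file. The property names are standard HBase settings, but the exact set of properties, the value of hbase_rootdir, and the conf_out output path are assumptions made for illustration. Writing to a scratch location lets you inspect the result before copying it to conf/hbase-site.xml.

```shell
# Sketch only: the values and the output location are assumptions.
hbase_rootdir=${TMPDIR-/tmp}/tsdhbase              # assumed data directory
iface=lo`uname | sed -n s/Darwin/0/p`              # lo0 on Mac OS X, lo elsewhere
conf_out=${conf_out-$(mktemp -d)/hbase-site.xml}   # hypothetical scratch path

cat >"$conf_out" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Where HBase keeps its data on the local filesystem. -->
  <property>
    <name>hbase.rootdir</name>
    <value>file://$hbase_rootdir/hbase</value>
  </property>
  <!-- Pin ZooKeeper, the region server and the master to loopback. -->
  <property>
    <name>hbase.zookeeper.dns.interface</name>
    <value>$iface</value>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>$iface</value>
  </property>
  <property>
    <name>hbase.master.dns.interface</name>
    <value>$iface</value>
  </property>
</configuration>
EOF
echo "wrote $conf_out"
```

To keep your data across reboots, point hbase_rootdir at a directory outside /tmp before running the script.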
Now start HBase:
There is no reason not to use LZO with HBase. Except in rare cases, the CPU
cycles spent on LZO compression and decompression pay for themselves by
saving the time you would otherwise waste on extra I/O. This is certainly
true for OpenTSDB, where LZO can easily compress OpenTSDB's binary data by
a factor of 3 to 4. Installing LZO is simple and is done as follows.
In order to build hadoop-lzo, you need to have Ant installed as well as
liblzo2 with development headers:
apt-get install ant liblzo2-dev # Debian/Ubuntu
yum install ant ant-nodeps lzo-devel.x86_64 # RedHat/CentOS/Fedora
brew install lzo # Mac OS X
Compile & Deploy
Thanks to our friends at Cloudera for maintaining the Hadoop-LZO package:
git clone https://github.com/cloudera/hadoop-lzo.git
CLASSPATH=path/to/hadoop-core-1.0.4.jar CFLAGS=-m64 CXXFLAGS=-m64 ant compile-native tar
mkdir -p $hbasedir/lib/native
cp build/hadoop-lzo-0.4.14/hadoop-lzo-0.4.14.jar $hbasedir/lib
cp -a build/hadoop-lzo-0.4.14/lib/native/* $hbasedir/lib/native
Restart HBase and make sure you create your tables with COMPRESSION => 'LZO'.
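As an illustration of where that option goes, the sketch below builds an HBase shell create statement with LZO enabled on every column family. The helper name lzo_create_stmt and the table and family names (mytable, cf) are placeholders invented for this example, not the names OpenTSDB itself uses.

```shell
# Prints an HBase shell "create" statement with LZO compression enabled
# on each column family. All names used here are hypothetical.
lzo_create_stmt() {
  table=$1; shift
  stmt="create '$table'"
  for family in "$@"; do
    stmt="$stmt, {NAME => '$family', COMPRESSION => 'LZO'}"
  done
  printf '%s\n' "$stmt"
}

lzo_create_stmt mytable cf
# To run it for real, pipe the statement into the HBase shell:
#   lzo_create_stmt mytable cf | "$hbasedir"/bin/hbase shell
```

Printing the statement first lets you review it before anything touches HBase.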
- Where to find
hadoop-core-1.0.4.jar? On a normal, production
HBase install, it will be under HBase's
lib/ directory. In your
development environment it may be stashed under HBase's
find to locate it.
- On Mac OS X, you may get
error: Native java headers not found. Is
$JAVA_HOME set correctly? when
configure is looking for
jni.h, in which case you need to insert
CLASSPATH on the 3rd command above (the one that invokes
- On RedHat/CentOS/Fedora you may have to specify where Java is, by adding
JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk.x86_64 (or similar)
ant command-line, before the
- On RedHat/CentOS/Fedora, if you get the weird error message that "Cause:
the class org.apache.tools.ant.taskdefs.optional.Javah was not found." then
you need to install the
- The build may fail with
[javah] Error: Class
org.apache.hadoop.conf.Configuration could not be found. in which case
you need to apply
- On Ubuntu, the build may fail to compile the code with
LzoCompressor.c:125:37: error: expected expression before ','
token. As per
the solution is to add
LDFLAGS='-Wl,--no-as-needed' to the
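Several of the problems listed above come down to a missing tool or a wrong JAVA_HOME, so it can help to check for them before invoking ant. The following is a rough sketch, not part of hadoop-lzo. The function name preflight is made up, and it assumes that jni.h lives under $JAVA_HOME/include, which is true of most current JDKs but not of every older Mac OS X install.

```shell
# Pre-flight check before building hadoop-lzo. Illustrative only: it
# covers just the ant and JAVA_HOME problems described above.
preflight() {
  status=0
  # Ant must be on the PATH to run the build.
  command -v ant >/dev/null 2>&1 || { echo "missing: ant"; status=1; }
  # JAVA_HOME must point at a JDK that ships the JNI header
  # (assumed location: $JAVA_HOME/include/jni.h).
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"; status=1
  elif [ ! -f "$JAVA_HOME/include/jni.h" ]; then
    echo "no jni.h under $JAVA_HOME/include"; status=1
  fi
  return $status
}

preflight || echo "fix the items above before running ant"
```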
Migrating to a real HBase cluster
TBD. In short:
- Shut down all your TSDs.
- Shut down your single-node HBase cluster.
- Copy the directories named
from your local filesystem to the HDFS cluster backing your real HBase
./bin/hbase org.jruby.Main ./bin/add_table.rb
/hdfs/path/to/hbase/tsdb and again for the
- Restart your real HBase cluster (sorry).
- Restart your TSDs after making sure they now use your real HBase
Putting HBase in production
TBD. In short:
- Stay on a single node unless you can deploy HBase on at least 5 machines,
preferably at least 10.
- Make sure you have LZO installed
and make sure it's enabled for the tables used by OpenTSDB.
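One way to check the first half of that advice is HBase's bundled CompressionTest class, which tries to write and read a small file with a given codec. The wrapper below is a sketch: the function name check_lzo and the scratch file path are arbitrary, and it assumes $hbasedir points at your HBase install.

```shell
# Sketch: run HBase's CompressionTest with the lzo codec on a scratch file.
# Returns 2 if there is no HBase launcher at the given location.
check_lzo() {
  hbase_bin="$1/bin/hbase"
  if [ ! -x "$hbase_bin" ]; then
    echo "no executable at $hbase_bin" >&2
    return 2
  fi
  "$hbase_bin" org.apache.hadoop.hbase.util.CompressionTest \
    "file://${TMPDIR-/tmp}/lzo-check" lzo
}

check_lzo "${hbasedir-.}" && echo "LZO codec works" || echo "LZO check failed"
```

This only tells you the codec loads. To confirm it is enabled on a given table, run describe on that table in the HBase shell and look for COMPRESSION => 'LZO' in the output.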