Thursday, September 19, 2013

YCSB on HBase

* This post covers YCSB on HBase 0.94.11 and Hadoop 1.2.1. For YCSB on HBase 0.96 and Hadoop 2.2, please go to this post.

YCSB (Yahoo Cloud Serving Benchmark) is a benchmark tool with a common set of workloads for evaluating the performance of different “key-value” and “cloud” serving stores. HBase is one of the targets that can be benchmarked using YCSB.

The first step in using this benchmark is to download the source from the YCSB git repository:
git clone http://github.com/brianfrankcooper/YCSB.git

Although the site mentions that you can download a prebuilt binary, that binary will not work when your HBase server version differs from the HBase client version bundled in the YCSB binary. You will most likely get an error like the one below:

java.lang.IllegalArgumentException: Not a host:port pair: 

Once cloning finishes, cd into the newly created YCSB directory and edit the following files using your favorite editor.

-YCSB/hbase/pom.xml
Edit the hbase and hadoop version entries to match the versions in your environment. In my case, hbase is 0.94.11 and hadoop is 1.2.1.


-YCSB/pom.xml
Edit the slf4j version entry and change it to 1.4.3.


-YCSB/elasticsearch/pom.xml
Edit the slf4j version entry and change it to 1.4.3.


The changes to the last two pom.xml files make sure hbase and ycsb use the same version of slf4j. Without them, you might hit the problem shown below when running ycsb.

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoopuser/ycsb-0.1.4/hbase-binding/lib/hbase-binding-0.1.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoopuser/ycsb-0.1.4/hbase-binding/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: slf4j-api 1.6.x (or later) is incompatible with this binding.
SLF4J: Your binding is version 1.5.5 or earlier.
SLF4J: Upgrade your binding to version 1.6.x. or 2.0.x
Exception in thread "Thread-1" java.lang.NoSuchMethodError: org.slf4j.impl.StaticLoggerBinder.getSingleton()Lorg/slf4j/impl/StaticLoggerBinder;
        at org.slf4j.LoggerFactory.bind(LoggerFactory.java:128)
        at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:108)
        at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:279)
        at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:252)
        at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:265)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:94)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:98)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:127)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:153)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:127)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1507)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:716)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:986)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:961)
        at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:227)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:170)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:129)
        at com.yahoo.ycsb.db.HBaseClient.getHTable(HBaseClient.java:118)
        at com.yahoo.ycsb.db.HBaseClient.update(HBaseClient.java:302)
        at com.yahoo.ycsb.db.HBaseClient.insert(HBaseClient.java:357)
        at com.yahoo.ycsb.DBWrapper.insert(DBWrapper.java:148)
        at com.yahoo.ycsb.workloads.CoreWorkload.doInsert(CoreWorkload.java:461)
        at com.yahoo.ycsb.ClientThread.run(Client.java:269)
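
Before building, it can help to confirm what each of the three pom.xml files currently declares. The snippet below is only a rough sketch: show_pom_versions is a helper name I made up (it is not part of YCSB), and it assumes you run it from the directory that contains the YCSB checkout. It prints every line mentioning hbase, hadoop or slf4j together with the two lines after it, which is usually where the version tag sits.

```shell
# show_pom_versions is a hypothetical helper, not part of YCSB.
# It prints each line that mentions hbase, hadoop or slf4j, plus the
# two lines that follow, so a nearby <version> tag is visible too.
show_pom_versions() {
  grep -n -i -A 2 -E 'hbase|hadoop|slf4j' "$@"
}

# Assumes the current directory contains the YCSB checkout.
for pom in YCSB/hbase/pom.xml YCSB/pom.xml YCSB/elasticsearch/pom.xml; do
  if [ -f "$pom" ]; then
    echo "== $pom"
    show_pom_versions "$pom" || echo "(no matching lines)"
  fi
done
```

If the versions printed for slf4j differ between files, that is the mismatch the error above complains about.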

Change into the YCSB directory and run mvn clean package to build the package. When you see the following output, the build was successful.

[INFO] YCSB Root ......................................... SUCCESS [40.653s]
[INFO] Core YCSB ......................................... SUCCESS [46.852s]
[INFO] Cassandra DB Binding .............................. SUCCESS [44.413s]
[INFO] HBase DB Binding .................................. SUCCESS [1:49.114s]
[INFO] Hypertable DB Binding ............................. SUCCESS [45.091s]
[INFO] DynamoDB DB Binding ............................... SUCCESS [38.011s]
[INFO] ElasticSearch Binding ............................. SUCCESS [3:22.121s]
[INFO] Infinispan DB Binding ............................. SUCCESS [2:43.266s]
[INFO] JDBC DB Binding ................................... SUCCESS [13.182s]
[INFO] Mapkeeper DB Binding .............................. SUCCESS [8.313s]
[INFO] Mongo DB Binding .................................. SUCCESS [5.941s]
[INFO] OrientDB Binding .................................. SUCCESS [15.621s]
[INFO] Redis DB Binding .................................. SUCCESS [4.171s]
[INFO] Voldemort DB Binding .............................. SUCCESS [14.630s]
[INFO] YCSB Release Distribution Builder ................. SUCCESS [13.381s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:45.433s
[INFO] Finished at: Thu Sep 12 18:25:37 SGT 2013
[INFO] Final Memory: 65M/165M
[INFO] ------------------------------------------------------------------------


You should find the ycsb-0.1.4.tar.gz file inside the YCSB/distribution/target directory. Copy this file to a directory where you have write permission and untar it. Once it is unpacked, copy the hbase-site.xml file from your hbase conf directory to your ycsb-0.1.4/hbase-binding/conf/ directory.
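
The unpack-and-copy step can be wrapped in a small shell function. This is just a sketch under my own assumptions: unpack_ycsb is a made-up helper name, and the three paths are whatever applies in your environment.

```shell
# unpack_ycsb is a hypothetical helper, not part of YCSB.
#   $1: path to ycsb-0.1.4.tar.gz (from YCSB/distribution/target)
#   $2: path to your hbase-site.xml (from the hbase conf directory)
#   $3: destination directory you can write to
unpack_ycsb() {
  tar -xzf "$1" -C "$3" &&
  mkdir -p "$3/ycsb-0.1.4/hbase-binding/conf" &&
  cp "$2" "$3/ycsb-0.1.4/hbase-binding/conf/"
}
```

For example, assuming HBASE_HOME points at your HBase install: unpack_ycsb ~/YCSB/distribution/target/ycsb-0.1.4.tar.gz "$HBASE_HOME/conf/hbase-site.xml" ~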

Before you can run the test, you need to start your hdfs (start-dfs.sh) and hbase (start-hbase.sh). Go into the hbase shell and create a table called usertable with a column family called family.
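
The table can also be created without opening the shell interactively. A minimal sketch, assuming the hbase launcher is on your PATH and that your version of hbase shell reads commands from standard input; the variable names are mine. If hbase is not found, it only prints the statement so you can type it into the hbase shell yourself.

```shell
TABLE=usertable        # table name expected by the commands in this post
COLUMN_FAMILY=family   # column family passed later as columnfamily=family

# HBase shell statement that creates the table with one column family.
CREATE_STMT="create '${TABLE}', '${COLUMN_FAMILY}'"

if command -v hbase >/dev/null 2>&1; then
  # Assumption: hbase shell accepts statements on stdin.
  echo "$CREATE_STMT" | hbase shell
else
  echo "hbase not found on PATH; run this in the hbase shell: $CREATE_STMT"
fi
```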


After creating the table and the column family, you can start loading data into your database:

$ ~/ycsb-0.1.4/bin/ycsb load hbase -P ~/ycsb-0.1.4/workloads/workloada -p columnfamily=family -p recordcount=10000 -p threadcount=4 -s | tee -a workloada_load.dat

Start running the benchmark with the command below:

$ ~/ycsb-0.1.4/bin/ycsb run hbase -P ~/ycsb-0.1.4/workloads/workloada -p columnfamily=family -p operationcount=10000 -p recordcount=10000 -p threadcount=4 -s | tee -a workloada_run.dat

The steps above are a very simple validation using workloada, with only 10000 records loaded into the database and 10000 operations during the run. Note that for the run phase (especially for read and update tests), you also need to specify recordcount to match the size of your test database. If you do not, YCSB uses the default value specified in the workload file (the default is 1000), so the 10000 operations will hit the same 1000 records again and again and the remaining 9000 records will never be accessed.
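
One way to keep recordcount from drifting between the load and run phases is to define it once and build both command lines from it. This is a sketch, not something YCSB ships: ycsb_cmd and the variable names are made up, YCSB_HOME is assumed to point at the unpacked ycsb-0.1.4 directory, and the function only prints the command line instead of executing it.

```shell
YCSB_HOME="${YCSB_HOME:-$HOME/ycsb-0.1.4}"   # assumed install location
RECORDCOUNT=10000      # size of the test database, shared by load and run
OPERATIONCOUNT=10000   # number of operations in the run phase
THREADS=4

# ycsb_cmd is a hypothetical helper: it prints the ycsb command line for
# the given phase (load or run) so both phases share the same recordcount.
ycsb_cmd() {
  phase="$1"
  args="-P $YCSB_HOME/workloads/workloada -p columnfamily=family"
  args="$args -p recordcount=$RECORDCOUNT -p threadcount=$THREADS -s"
  if [ "$phase" = "run" ]; then
    args="$args -p operationcount=$OPERATIONCOUNT"
  fi
  echo "$YCSB_HOME/bin/ycsb $phase hbase $args"
}

ycsb_cmd load
ycsb_cmd run
```

Copy the printed lines into your terminal and append the same | tee -a redirection used earlier if you want to keep the output.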

For more details on the available workloads, you can refer to the official git site.
