Upgrading Apache Phoenix in HDP Cluster

On the new Hadoop cluster we set up, the Phoenix version bundled with the HDP distribution (4.7) had some bugs that made it impossible to use for BI queries. HDP provided no way to upgrade Phoenix, as we were already using the latest version. Looking around on the internet, I found that we could manually replace the related jars and bins to put a new version in place. So that’s what I tried....

November 18, 2017 · 2 min · Sanket

HBase YouAreDeadException: Dead RegionServer due to GC Pause

So the CDH cluster was replaced by an HDP cluster, and everything went smoothly for a while. Then I started getting a dead RegionServer. Frequently. So a deep dive was needed to find out what was actually happening. And it turned out to be a long dive. The following was the log line: 2017-05-23 06:59:22,173 FATAL [regionserver/<hostname>/10.10.205.55:16020] regionserver.HRegionServer: ABORTING region server <hostname>,16020,1493962926376: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing <hostname>,16020,1493962926376 as dead server This alone did not tell much....

May 26, 2017 · 3 min · Sanket

Migrating OpenTSDB to Another HBase Cluster

As part of the migration from the CDH cluster to the HDP cluster, we also had to migrate OpenTSDB, which was running on the CDH cluster. There are many methods to copy or transfer data between clusters; what we used here was ExportSnapshot. Roughly, these are the steps: (1) stop the TSDs, (2) take snapshot(s), (3) transfer the snapshots, (4) restore the snapshots, (5) modify and start the TSDs. Steps 1 and 5 are self-explanatory. We will look at how to take, transfer and restore snapshots....

April 24, 2017 · 2 min · Sanket

Stuff You Can Do While Tuning HBase

So you are setting up HBase! Congratulations! When it comes to tuning HBase, there are so many things you can do. Most of them depend on the type of data you will be storing and its access patterns. So I will be saying this a lot: ‘the value of this parameter depends on your workload’. Here I will try to list some of the variables that you can tweak while tuning HBase....

April 9, 2017 · 4 min · Sanket

HBase Benchmarking

Currently I am working on a new Apache HBase cluster setup, on top of the HDP distribution, to query data using Phoenix. After setting up the cluster, the values for heap, cache and timeouts were all defaults. Now I needed to know how good the cluster is in its current shape and how it can be improved. For the improvement part, an understanding of HBase internals is needed. How does a write work in HBase....

March 5, 2017 · 5 min · Sanket