I am currently working on a new Apache HBase cluster setup, queried through Phoenix on top of the HDP distribution. After the cluster was set up, the values for heap, cache, and timeouts were all at their defaults. I needed to know how well the cluster performs in its current shape and how it can be improved. Continue reading HBase Benchmarking
The other day I had to create a CentOS 6 AMI for an HDP installation, as the Hue package was available only for CentOS 6. I launched a CentOS 6 instance with a 10 GB EBS volume attached, then created an AMI from it with an EBS size of 100 GB.
All of this went well, and I proceeded to launch the instances for the HDP cluster (12 in total). The installation completed without problems. Only later did Ambari Server start throwing warnings about disk space, despite the 100 GB EBS volume. Continue reading Resize EBS Root Volume of CentOS 6 AMI
The other day I faced a problem with a monitoring setup: the web UI was not responding. I SSHed into the server and checked whether the process was running. It was. I checked whether the port was open. It was. As it turned out, the process was running and listening on the port, but it was stuck somewhere and not accepting connections. So there it was, a running but stuck process. Continue reading Debugging Stuck Process in Linux
I was reading about HDFS (Hadoop’s distributed file system) and its internals. How does it store data? What is the read path? What is the write path? How does replication work? To understand it better, my mentor suggested that I implement the same thing myself. And so I made PyDFS. (Screenshots at the bottom of the post.) Continue reading Simple Distributed File System in Python : PyDFS