Learning HBase

Learning HBase: a nice book from my friend that I can recommend to you. Shashwat Shriparv, November 2014. Learn the fundamentals of HBase administration and development with the help of real-time scenarios. About This Book: Learn how HBase works…

Failed to create a table in Hive backed by HBase?

Failed to create a table in Hive backed by HBase. Will copying hbase-site.xml into /etc/hive/conf/ fix the error? Often, when you use Impala with HBase, you first need to create the metastore from Hive. $ hive --auxpath /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/zookeeper.jar,/opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/hbase.jar,/opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.5.0.jar,/opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/guava-11.0.2.jar…

Hadoop. In Excel.

Explore and analyze Big Data without IT overhead: https://datanitro.com/hadoop_in_excel.html

Apache Hadoop 2 & Apache Hadoop YARN videos

FYI. Session Title (Watch / View):
Unlocking Hadoop’s Potential (Video)
Enterprise Hadoop for Pools, Ponds, Clouds and Beyond (Video)
Apache Hadoop YARN: Present and Future (Video, Slides)
YARN: The Key to Overcoming the Challenges of Broad-based Hadoop Adoption (Video, Slides)
One…

Hadoop Cluster set-up document

Running Hadoop on RHEL Linux (Multi-Node Cluster). Here we’ll see how to set up a multi-node Apache Hadoop cluster backed by the Hadoop Distributed File System (HDFS), running on RHEL Linux. Hadoop is a framework written in Java for running applications on…

Hadoop Streaming with Python

Hadoop Streaming. Hadoop Streaming is a utility that comes with the Hadoop distribution. The utility allows developers to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer. For example: hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.5.0.jar \…
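To make the mapper/reducer idea concrete, here is a minimal word-count sketch in Python. It is not taken from the original post: the function names and the single-script layout with a "map" or "reduce" argument are assumptions chosen for illustration. It relies only on the Streaming contract that both steps read lines from standard input, write tab-separated key/value lines to standard output, and that the reducer receives its input sorted by key.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer step: sum the counts for each word.

    Assumes the input is already sorted by key, which is what the
    shuffle phase provides to a streaming reducer.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Run as a script only when a role ("map" or "reduce") is given explicitly.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved under a hypothetical name such as wordcount.py, the same file could be passed to the streaming jar's -mapper option with the argument "map" and to its -reducer option with the argument "reduce". The two functions can also be tested locally by sorting the mapper output and feeding it to the reducer.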

Monitoring Hadoop from the browser

Hadoop provides two web interfaces that you should become familiar with, one for HDFS and the other for MapReduce. Both are useful in pseudo-distributed mode and are critical tools when you have a fully distributed setup. The HDFS web UI…
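As a small illustration of locating the two interfaces, the sketch below builds their URLs and checks whether they answer. The ports are an assumption not stated in the excerpt: 50070 for the NameNode (HDFS) and 50030 for the JobTracker (MapReduce) are the usual Hadoop 1.x / MR1 defaults, and a given cluster may be configured differently. The helper names are hypothetical.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Assumed Hadoop 1.x / MR1 default web UI ports; adjust for your cluster.
DEFAULT_PORTS = {"hdfs": 50070, "mapreduce": 50030}


def web_ui_urls(host="localhost", ports=DEFAULT_PORTS):
    """Return the web UI URL for each service on the given host."""
    return {name: f"http://{host}:{port}/" for name, port in ports.items()}


def is_reachable(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        return False
```

For example, iterating over web_ui_urls("localhost").items() and calling is_reachable on each URL gives a quick check that both daemons are up in pseudo-distributed mode.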