Learning HBase
Learning HBase: a nice book from my friend that I can recommend to you. Shashwat Shriparv, November 2014. Learn the fundamentals of HBase administration and development with the help of real-time scenarios. About This Book: Learn how HBase works…
Failed to create a table in Hive backed by HBase?
Failed to create a table in Hive backed by HBase. Will copying hbase-site.xml into /etc/hive/conf/ fix the error? Often, when you use Impala with HBase, you first need to create the metastore from Hive. $ hive --auxpath /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/zookeeper.jar,/opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/hbase.jar,/opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.5.0.jar,/opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p251.30/lib/hive/lib/guava-11.0.2.jar…
Hadoop Streaming
Hadoop Streaming. Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows you to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer. For example: $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar…
Hadoop Archives & MR
Overview. Hadoop archives are special-format archives. A Hadoop archive maps to a file system directory. A Hadoop archive always has a *.har extension. A Hadoop archive directory contains metadata (in the form of _index and _masterindex) and data…
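The layout described in this excerpt can be checked with a few lines of Python. This is only a sketch: it assumes the archive directory has already been copied to the local file system, and the function name `missing_har_metadata` is hypothetical, not part of Hadoop.

```python
from pathlib import Path

# Metadata file names taken from the excerpt above.
REQUIRED_METADATA = ("_index", "_masterindex")


def missing_har_metadata(archive_dir):
    """Return the metadata file names missing from a local copy of a .har directory.

    Hypothetical helper: it inspects a local directory only and does not talk to HDFS.
    """
    path = Path(archive_dir)
    if path.suffix != ".har":
        raise ValueError(f"{path.name!r} does not have a .har extension")
    return [name for name in REQUIRED_METADATA if not (path / name).is_file()]
```

An empty list means both metadata files are present. The check says nothing about whether the data files themselves are intact.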
MongoDB Connector for Hadoop
Purpose The MongoDB Connector for Hadoop is a library which allows MongoDB (or backup files in its data format, BSON) to be used as an input source, or output destination, for Hadoop MapReduce tasks. It is designed to allow greater…
Hadoop.In Excel.
Explore and analyze Big Data, without IT overhead https://datanitro.com/hadoop_in_excel.html
Apache Hadoop 2 & Apache Hadoop YARN videos
FYI. Sessions (with available media): Unlocking Hadoop’s Potential (Video); Enterprise Hadoop for Pools, Ponds, Clouds and Beyond (Video); Apache Hadoop YARN: Present and Future (Video, Slides); YARN: The Key to Overcoming the Challenges of Broad-based Hadoop Adoption (Video, Slides); One…
Hadoop Cluster set-up document
Running Hadoop on RHEL Linux (Multi-Node Cluster). Here we’ll see how to set up a multi-node Apache Hadoop cluster backed by the Hadoop Distributed File System (HDFS), running on RHEL Linux. Hadoop is a framework written in Java for running applications on…
Hadoop Streaming with Python
Hadoop Streaming. Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows developers to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer. For example: hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.5.0.jar \…
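To give a feel for what a Python mapper and reducer might look like, here is a minimal word-count sketch. It is an illustration, not the code from the post: the function names `mapper`, `reducer` and `run_locally` are hypothetical, and the map, sort and reduce steps are simulated in-process rather than run on a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated "word<TAB>1" record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts per word; assumes records arrive sorted by key."""
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Simulate map -> sort -> reduce in one process (no Hadoop involved)."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_locally(["to be or not to be"]):
        print(record)
```

In an actual streaming job the two functions would live in separate scripts that read records from standard input and write them to standard output, since that is how the streaming utility talks to an executable. The local runner is only there to check the logic before submitting anything.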
Monitoring Hadoop from the browser
Hadoop provides two web interfaces that you should become familiar with, one for HDFS and the other for MapReduce. Both are useful in pseudo-distributed mode and are critical tools when you have a fully distributed setup. The HDFS web UI…