Apache Hadoop

Posted on January 22, 2012. Filed under: Uncategorized

Apache Hadoop is a framework for running applications on large clusters built of commodity hardware. The framework transparently provides applications with both reliability and data motion. Hadoop implements a computational paradigm named MapReduce, in which the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system, the Hadoop Distributed File System (HDFS), that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both MapReduce and HDFS are designed so that the framework handles node failures automatically.
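The re-execution idea can be illustrated with a toy scheduler. What follows is a minimal sketch in plain Python, not Hadoop's scheduler or API: the function `run_fragments`, the node names, and the simulated failure set are all hypothetical. It only aims to show why independent fragments make failure handling simple: a fragment that fails on one node is run again on another.

```python
from typing import Callable, Dict, List, Set, Tuple


def run_fragments(
    fragments: List[List[int]],
    work: Callable[[List[int]], int],
    nodes: List[str],
    failing: Set[Tuple[str, int]],
) -> Tuple[Dict[int, int], List[Tuple[int, str, str]]]:
    """Run each fragment on some node, retrying on another node after a failure.

    `failing` is a hypothetical set of (node, fragment_index) pairs that
    simulates a node crashing while it processes that fragment.
    """
    results: Dict[int, int] = {}
    log: List[Tuple[int, str, str]] = []
    for index, fragment in enumerate(fragments):
        for attempt in range(len(nodes)):
            # Pick a different node on each attempt (simple round-robin).
            node = nodes[(index + attempt) % len(nodes)]
            if (node, index) in failing:
                log.append((index, node, "failed"))
                continue  # re-execute the same fragment elsewhere
            results[index] = work(fragment)
            log.append((index, node, "ok"))
            break
        else:
            raise RuntimeError(f"fragment {index} failed on every node")
    return results, log


fragments = [[1, 2, 3], [4, 5], [6]]
nodes = ["node-a", "node-b", "node-c"]
failing = {("node-b", 1)}  # node-b "crashes" while running fragment 1

results, log = run_fragments(fragments, sum, nodes, failing)
total = sum(results.values())
```

Because each fragment depends only on its own input, rerunning fragment 1 on `node-c` gives the same result it would have produced on `node-b`, and the final total is unaffected by the failure.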

General Information

  • HBase, a Bigtable-like structured storage system for HDFS
  • Pig, a high-level data-flow language and execution framework for parallel computation, built on top of Hadoop Core
  • Hive, a data warehouse infrastructure that allows SQL-like ad hoc querying of data (in any format) stored in Hadoop
  • ZooKeeper, a high-performance coordination service for distributed applications
  • Hama, a distributed computing framework modeled on Google’s Pregel and based on BSP (Bulk Synchronous Parallel) computing techniques, for massive scientific computations
  • Mahout, scalable machine learning algorithms that run on Hadoop

User Documentation

Setting up a Hadoop Cluster

Tutorials

MapReduce

MapReduce is the foundational programming model of Hadoop, and understanding it is essential to using the framework.
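To make the model concrete, here is a minimal single-process sketch of a word count, the usual introductory MapReduce example. It is written in plain Python and does not use Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are illustrative assumptions. On a real cluster the map and reduce calls would run on many nodes and the shuffle would move data between them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["the quick brown fox", "the lazy dog", "The fox"])
```

Each call to `map_phase` looks at one line only, and each call to `reduce_phase` looks at one key only, which is what allows the work to be split into independent fragments.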

Contributed parts of the Hadoop codebase

  • These are independent modules that are in the Hadoop codebase but are not yet tightly integrated with the main project.

Developer Documentation
