Facebook recently deployed Facebook Messages, its first ever
user-facing application built on the Apache Hadoop platform.
Apache HBase is a database-like layer built on Hadoop designed
to support billions of messages per day. This paper describes the
reasons why Facebook chose Hadoop and HBase over other
systems such as Apache Cassandra and Voldemort and discusses
the application’s requirements for consistency, availability,
partition tolerance, data model and scalability. We explore the
enhancements made to Hadoop to make it a more effective
Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.
Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:
We have been using HBase for around a year in our development and projects, from 0.17.x to
0.19.x. We and all in the community know the critical performance and reliability issues of these
releases.
Now, the great news is that HBase‐0.20.0 will be released soon. Jonathan Gray from Streamy,
Ryan Rawson from StumbleUpon, Michael Stack from Powerset/Microsoft, Jean‐Daniel Cryans
from OpenPlaces, and other contributors had done a great job to redesign and rewrite many