Flying Yellow Elephant: Predictable and Efficient MapReduce in the Cloud

Authors: 
Schad, J

Today, growing datasets require new technologies, as standard
technologies such as parallel DBMSs do not easily scale to such
levels. On the one side, there is the MapReduce paradigm, which
allows non-expert users to easily define large distributed jobs. On
the other side, there is Cloud Computing, which provides a
pay-as-you-go infrastructure for such computations. This PhD project
aims at improving the combination of both technologies, especially
with respect to the following issues: (i) predictability of
performance, (ii) runtime optimization, and (iii) Cloud-aware
scheduling. These issues can result in significant runtime overhead
or non-optimal use of computing resources, which in a Cloud setting
directly correlates with high monetary cost. We present preliminary
results that confirm a significant improvement in performance when
addressing some of these issues. Further, we discuss research
challenges and initial ideas for the above-mentioned issues.

Year: 
2010
Venue: 
VLDB 2010 PhD workshop
URL: 
http://infosys.cs.uni-saarland.de/publications/Sch10.pdf
Citations range: 
n/a
Attachment: 
Schad2010FlyingYellowElephantPredictableandEfficientMapReducein.pdf (473.57 KB)