yahoo-inc.com

Adaptively parallelizing distributed range queries

Thu, 04/15/2010 - 15:09 — kolb

Authors:

Vigfusson, Ymir; Silberstein, Adam; Cooper, Brian F.; Fonseca, Rodrigo

We consider the problem of how to best parallelize range queries
in a massive scale distributed database. In traditional systems
the focus has been on maximizing parallelism, for example by
laying out data to achieve the highest throughput. However, in a
massive scale database such as our PNUTS system [11] or BigTable
[10], maximizing parallelism is not necessarily the best strategy:
the system has more than enough servers to saturate a single
client by returning results faster than the client can consume
them, and when there are multiple concurrent queries, maximizing par-

Year:

2009

Pig Latin: A not-so-foreign language for data processing

Fri, 10/16/2009 - 13:30 — admin

Authors:

C. Olston, B. Reed, U. Srivastava, R. Kumar, A. Tomkins

There is a growing need for ad-hoc analysis of extremely large
data sets, especially at internet companies where innovation
critically depends on being able to analyze terabytes of data
collected every day. Parallel database products, e.g., Teradata,
offer a solution, but are usually prohibitively expensive at this
scale. Besides, many of the people who analyze this data are
entrenched procedural programmers, who find the declarative, SQL
style to be unnatural. The success of the more procedural
map-reduce programming model, and

Year:

2008

Cloud Computing publication categorizer
