Unit 5 Lecture 5
1. MapReduce
2. MapReduce Model
3. Fault Tolerance
4. Efficiency
5. Important Questions
6. References
Datastore
• Google and Amazon offer simple transactional <Key, Value> pair database stores
– Google App Engine’s Datastore
– Amazon's SimpleDB
• All entities (objects) in Datastore reside in one BigTable table
– Does not exploit column-oriented storage
• Entities table: stores data as one column family
Datastore
• Multiple index tables are used to support efficient queries
• BigTable:
– Horizontally partitioned (also called sharded) across disks
– Sorted lexicographically by the key values
• Besides lexicographic sorting, Datastore enables:
– Efficient execution of prefix and range queries on key values
• Entities are ‘grouped’ for transaction purpose
– Keys are ordered lexicographically by group ancestry
• Entities in the same group: stored close together on disk
• Index tables: support a variety of queries
– Uses values of entity attributes as keys
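The slides do not show Datastore/BigTable internals; as a rough illustration only, the sketch below uses a Java sorted map to stand in for the single "entities" table, with keys that encode group ancestry as a prefix (the kind/id naming is made up). It shows why lexicographically sorted keys make prefix and range queries cheap contiguous scans and why entities of one group end up stored close together.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative sketch only: a sorted map stands in for the entities table,
// keyed lexicographically. Keys encode group ancestry as a prefix
// ("ParentKind/parentId/ChildKind/childId"), so entities in the same group
// sort (and would be stored) next to each other.
public class SortedKeyScan {
    public static void main(String[] args) {
        NavigableMap<String, String> entities = new TreeMap<>();
        entities.put("Customer/17/Order/001", "{total: 40}");
        entities.put("Customer/17/Order/002", "{total: 15}");
        entities.put("Customer/42/Order/001", "{total: 99}");

        // Prefix query: all entities in customer 17's entity group.
        // Because keys are sorted, this is one contiguous range scan.
        String prefix = "Customer/17/";
        entities.subMap(prefix, true, prefix + Character.MAX_VALUE, false)
                .forEach((k, v) -> System.out.println(k + " -> " + v));

        // Range query directly on key values.
        entities.subMap("Customer/17/Order/001", true, "Customer/17/Order/002", true)
                .forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```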
Datastore
• Automatically created indexes:
– Single-Property indexes
• Support efficient lookup of records matching a WHERE clause on a single property
– ‘Kind’ indexes
• Support efficient execution of queries that select all entities of a kind (SELECT * FROM <kind>)
• Configurable indexes
– Composite index:
• Serves more complex queries, e.g., those that filter or sort on multiple properties
• Query execution
– The index with the highest selectivity is chosen
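As a rough illustration (not the real implementation), a single-property index can be pictured as yet another sorted table whose key begins with the attribute value; an equality or range WHERE clause then becomes a prefix/range scan over that index, yielding the keys of matching entities. The row encoding below is made up for the sketch.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustrative sketch: a single-property index modelled as a sorted set of
// "kind|property|value|entityKey" rows. A WHERE clause on the property
// becomes a prefix scan over this index, returning matching entity keys.
public class SinglePropertyIndex {
    public static void main(String[] args) {
        NavigableSet<String> index = new TreeSet<>();
        index.add("Order|status|OPEN|Customer/17/Order/002");
        index.add("Order|status|SHIPPED|Customer/17/Order/001");
        index.add("Order|status|SHIPPED|Customer/42/Order/001");

        // SELECT * FROM Order WHERE status = 'SHIPPED'
        String prefix = "Order|status|SHIPPED|";
        for (String row : index.subSet(prefix, true, prefix + Character.MAX_VALUE, false)) {
            String entityKey = row.substring(prefix.length());
            System.out.println("match: " + entityKey);   // then fetch from the entities table
        }
    }
}
```

If several indexes could serve a query, scanning the one expected to return the fewest rows (the most selective one) minimizes the work.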
MapReduce
• MapReduce is a processing technique and a programming model for distributed computing; the widely used Hadoop implementation is written in Java.
• The MapReduce algorithm contains two important tasks, namely Map and Reduce.
• Map takes a set of data and converts it into another set of data, where individual elements are broken
down into tuples (key/value pairs).
• The reduce task takes the output of the map task as its input and combines those data tuples into a smaller set of tuples. As the name MapReduce implies, the reduce task is always performed after the map task.
• The major advantage of MapReduce is that it is easy to scale data processing over multiple computing
nodes.
• Under the MapReduce model, the data processing primitives are called mappers and reducers.
• Decomposing a data processing application into mappers and reducers is sometimes nontrivial.
• But, once we write an application in the MapReduce form, scaling the application to run over hundreds,
thousands, or even tens of thousands of machines in a cluster is merely a configuration change.
• This simple scalability is what has attracted many programmers to use the MapReduce model.
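To make the mapper/reducer decomposition concrete, here is a minimal single-process sketch (no framework involved) using word count as the example: map emits (word, 1) pairs, a grouping step collects values per key, and reduce sums them. Input lines are made up for illustration.

```java
import java.util.*;

// Minimal single-JVM sketch of the mapper/reducer decomposition described above.
// It only shows how a problem is expressed as map (emit key/value pairs)
// and reduce (combine the values that share a key).
public class MiniMapReduce {
    // Map: one input record (a line) -> list of (word, 1) pairs
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) out.add(Map.entry(word, 1));
        }
        return out;
    }

    // Reduce: (word, list of counts) -> total count for that word
    static int reduce(String word, List<Integer> counts) {
        return counts.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<String> input = List.of("the quick brown fox", "the lazy dog", "the fox");

        // "Shuffle": group all emitted values by key
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (String line : input)
            for (Map.Entry<String, Integer> kv : map(line))
                groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());

        groups.forEach((word, counts) -> System.out.println(word + "\t" + reduce(word, counts)));
    }
}
```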
The Algorithm
• The MapReduce paradigm is generally based on sending the computation to where the data resides.
• A MapReduce program executes in three stages, namely the map stage, the shuffle stage, and the reduce stage.
• Map stage − The mapper's job is to process the input data. Generally the input data is in the form of a file or directory and is stored in the Hadoop Distributed File System (HDFS). The input file is passed to the mapper function line by line. The mapper processes the data and creates several small chunks of data.
• Reduce stage − This stage combines the shuffle stage and the reduce stage. The reducer's job is to process the data that comes from the mappers. After processing, it produces a new set of output, which is stored in HDFS.
• During a MapReduce job, Hadoop sends the Map and Reduce tasks to the appropriate servers in the cluster.
• The framework manages all the details of data-passing such as issuing tasks, verifying task completion, and
copying data around the cluster between the nodes.
• Most of the computing takes place on nodes with the data on their local disks, which reduces network traffic.
• After completion of the given tasks, the cluster collects and reduces the data to form an appropriate result, and
sends it back to the Hadoop server.
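The three stages are typically expressed with the Hadoop MapReduce Java API as in the classic WordCount job sketched below; a working Hadoop/HDFS setup is assumed, and the input/output paths are taken from the command line.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map stage: the framework feeds the mapper one line of the HDFS input at a time.
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            for (String token : line.toString().split("\\s+")) {
                if (token.isEmpty()) continue;
                word.set(token);
                ctx.write(word, ONE);          // emit (word, 1)
            }
        }
    }

    // Shuffle + reduce stage: all counts for a given word arrive at one reducer.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) sum += c.get();
            ctx.write(word, new IntWritable(sum));   // final output written back to HDFS
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The framework, not the user code, handles splitting the input, scheduling the map and reduce tasks on appropriate nodes, and moving the intermediate data between them.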
MapReduce Model
• Map phase:
– Each mapper reads approximately 1/M of the input from the global file system, using
locations given by the master
– The map operation transforms one set of key-value pairs into another: map(k1, v1) → list(k2, v2)
• Reduce phase:
– The master informs the reducers where the partial computations have been stored
on local files of respective mappers
– Reducers make remote procedure call requests to the mappers to fetch the files
– Each reducer groups the map outputs that share the same key and applies a function f to the list of values for that key: (k2, list(v2)) → (k2, f(list(v2)))
• Example (figure): a job with 3 mappers and 2 reducers, showing the map and reduce functions applied to the input splits
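The slides do not say how keys are divided among the reducers; a common choice (assumed here) is hash partitioning, sketched below: each mapper places key k into partition hash(k) mod R, and reducer r later fetches partition r from every mapper.

```java
// Hash partitioning (an assumption; the slides do not specify the scheme):
// key k goes to partition hash(k) mod R, where R is the number of reducers.
public class Partitioner {
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;  // non-negative bucket
    }

    public static void main(String[] args) {
        int R = 2;  // 2 reducers, as in the example above
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> reducer " + partition(key, R));
        }
    }
}
```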
MapReduce: Fault Tolerance
• Heartbeat communication
– Updates are exchanged regarding the status of tasks assigned to workers
– If communication exists but no progress is made, the master duplicates those tasks and assigns them to processors that have already completed their own tasks
• If a mapper fails, the master reassigns the key-range designated to it to another working node for re-execution
– Re-execution is required because the partial computations are written to local files rather than to the global file system (GFS)
• If a reducer fails, only the remaining tasks are reassigned to another node, since the completed tasks are already
written back into GFS
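A toy sketch of the master-side bookkeeping implied above is given below (timeout value, worker and task names are made up): each worker's last heartbeat is recorded, and tasks on a worker that has gone silent are marked for re-execution elsewhere.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch only: the master records each worker's last heartbeat time and,
// if a worker stays silent longer than a timeout, reassigns its in-progress task.
public class HeartbeatMonitor {
    static final long TIMEOUT_MS = 10_000;                       // illustrative value
    final Map<String, Long> lastHeartbeat = new HashMap<>();     // worker -> last report time
    final Map<String, String> runningTask = new HashMap<>();     // worker -> task id

    void onHeartbeat(String worker, long now) {
        lastHeartbeat.put(worker, now);
    }

    void checkWorkers(long now) {
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (now - e.getValue() > TIMEOUT_MS && runningTask.containsKey(e.getKey())) {
                String task = runningTask.remove(e.getKey());
                // Map tasks must be re-executed in full (their output lives on the dead
                // worker's local disk); completed reduce output is already safe in GFS.
                System.out.println("Reassigning " + task + " from silent worker " + e.getKey());
            }
        }
    }
}
```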
MapReduce: Efficiency
• General computation task on a volume of data D
• Takes wD time on a uniprocessor (time to read the data from disk + time to perform the computation + time to write the results back to disk)
• Time to read/write one word from/to disk = c
• Now, the computational task is decomposed into map and reduce stages as follows:
– Map stage:
• Mapping time = cmD
• Data produced as output = σD
– Reduce stage:
• Reducing time = crσD
• Data produced as output = σµD
MapReduce: Efficiency
• Assuming no overheads in decomposing the task into map and reduce stages, we have the following relation:
wD = cD + cmD + crσD + cσµD
• Now, we use P processors that serve as both mappers and reducers in the respective phases to solve the problem
• Additional overhead:
– Each mapper writes to its local disk followed by each reducer remotely reading from the local disk of
each mapper
• For analysis purposes, the time to read a word locally or remotely is assumed to be the same
• Time to read data from disk by each mapper = cD/P
• Data produced by each mapper = σD/P
MapReduce: Efficiency
• Time required to write into local disk = cσD/P
• Data read by each reducer from its partition in each of the P mappers = σD/P²
• The entire exchange can be executed in P steps, with each reducer r reading from mapper (r + i) mod P in step i
• Transfer time from the mappers' local disks to each reducer = cσD/P² × P = cσD/P
• Total overhead in the parallel implementation due to intermediate disk reads and writes = 2cσD/P
• Total time per processor in the parallel implementation = wD/P + 2cσD/P
• Parallel efficiency of the MapReduce implementation:
εMR = wD / [P × (wD/P + 2cσD/P)] = 1 / (1 + 2cσ/w)
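For illustration only (the numbers below are assumed, not taken from the slides), plugging c/w = 0.1 and σ = 0.5 into the efficiency formula:

```latex
% Illustrative values: c/w = 0.1 (disk access is one tenth of the per-word work)
% and \sigma = 0.5 (the map stage halves the data volume).
\[
  \varepsilon_{MR} \;=\; \frac{1}{1 + \frac{2c\sigma}{w}}
                   \;=\; \frac{1}{1 + 2(0.1)(0.5)}
                   \;=\; \frac{1}{1.1} \;\approx\; 0.91
\]
% The overhead term grows with \sigma: the more intermediate data the mappers
% emit, the lower the parallel efficiency.
```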
MapReduce: Applications
• Indexing a large collection of documents (a minimal sketch follows this slide)
– The map task emits a (word, document/record-id) pair for each word in a document
– The reduce step groups the pairs by word and creates an index entry (the list of documents containing it) for each word
• Relational operations using MapReduce
– Execute SQL statements (relational joins/group by) on large data sets
– Advantages over parallel databases:
• Large scale
• Fault-tolerance
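Below is a minimal single-process sketch of the indexing application referenced above (no framework; the document contents are made up): map emits a (word, docId) pair per word, the grouping step collects the pairs by word, and reduce turns each group into an inverted-index entry.

```java
import java.util.*;

// Minimal sketch of inverted-index construction expressed as map + group + reduce.
public class InvertedIndex {
    public static void main(String[] args) {
        Map<String, String> docs = Map.of(
                "doc1", "cloud data store",
                "doc2", "cloud map reduce");

        // Map + shuffle: word -> set of document ids containing it
        Map<String, Set<String>> index = new TreeMap<>();
        docs.forEach((docId, text) -> {
            for (String word : text.split("\\s+"))
                index.computeIfAbsent(word, w -> new TreeSet<>()).add(docId);   // emit (word, docId)
        });

        // Reduce: each (word, list of docIds) group becomes one index entry
        index.forEach((word, postings) -> System.out.println(word + " -> " + postings));
    }
}
```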
Important Questions