MapReduce BigData 09
Presented By:
Aashraful Islam Souraov
(ID 2025351009)
What is MapReduce?
• During a MapReduce job, Hadoop sends the Map and Reduce tasks to the
appropriate servers in the cluster.
• The framework manages all the details of data-passing, such as issuing tasks,
verifying task completion, and copying data between the nodes of the
cluster.
• Most of the computing takes place on nodes that hold the data on their local
disks, which reduces network traffic.
• After completion of the given tasks, the cluster collects and reduces the data to
form an appropriate result, and sends it back to the Hadoop server.
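The flow described above can be sketched in plain Python. This is a minimal single-process simulation of the map, shuffle, and reduce steps for a word count, not Hadoop's actual API; the function names (map_phase, shuffle, reduce_phase, run_job) and the sample input splits are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(line):
    # Map task: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle step: group all emitted values by key, as the framework
    # would do when copying map output to the reducers.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce task: collapse all values for one key into a single result.
    return key, sum(values)

def run_job(splits):
    # Each split stands in for a block of data stored on one node.
    mapped = []
    for split in splits:
        for line in split:
            mapped.extend(map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items()))

# Hypothetical input: two splits, as if stored on two different nodes.
splits = [["big data big"], ["data map reduce"]]
result = run_job(splits)
print(result)
```

In a real cluster the map calls would run in parallel on the nodes holding each split, and only the grouped intermediate pairs would travel over the network.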
What are the Advantages & Disadvantages of
MapReduce?
Advantages:
• Scalable (due to simple design)
• Runs on cheap commodity hardware
• Procedural control, i.e., we can control the execution of every step
Disadvantages:
• It is not flexible: every computation must fit the rigid map-then-reduce structure.
• A lot of manual coding is required, even for common operations such as
join, filter, projection, aggregates, sorting, distinct.
• Semantics are hidden inside the map and reduce functions, so it is difficult
to maintain, extend, and optimize them.
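To illustrate the manual coding mentioned above, here is a rough sketch of a hand-written reduce-side join in plain Python. It simulates the pattern in a single process and does not use Hadoop; the record layouts, the "U"/"O" tags, and all function names are hypothetical choices for this example.

```python
from collections import defaultdict

def map_users(record):
    # Tag each user record so the reducer can tell the two inputs apart.
    user_id, name = record
    return user_id, ("U", name)

def map_orders(record):
    # Re-key each order by user_id, the join key.
    order_id, user_id, amount = record
    return user_id, ("O", (order_id, amount))

def reduce_join(user_id, tagged_values):
    # The join logic lives entirely inside the reducer: separate the
    # tagged values, then pair every user name with every order.
    names = [v for tag, v in tagged_values if tag == "U"]
    orders = [v for tag, v in tagged_values if tag == "O"]
    return [(user_id, name, order_id, amount)
            for name in names
            for (order_id, amount) in orders]

def run_join(users, orders):
    # Simulated shuffle: group the tagged map output by join key.
    groups = defaultdict(list)
    mapped = [map_users(u) for u in users] + [map_orders(o) for o in orders]
    for key, value in mapped:
        groups[key].append(value)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_join(key, groups[key]))
    return joined

# Hypothetical sample data: order 102 has no matching user.
users = [(1, "Ada"), (2, "Lin")]
orders = [(100, 1, 25.0), (101, 1, 10.0), (102, 3, 5.0)]
joined = run_join(users, orders)
print(joined)
```

An inner join that is a single clause in a declarative query language takes several functions here, and nothing outside reduce_join reveals that the job is a join at all, which is why such code is hard for a framework to optimize.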