Question : Hadoop will start transferring data as soon as a Mapper finishes its task; it will not wait until the last Map Task has finished. 1. True 2. False
Explanation: In practice, Hadoop starts transferring data from Mappers to Reducers as the Mappers finish their work.
The developer can specify the percentage of Mappers that must finish before Reducers start retrieving data.
The developer's reduce method still does not start until all intermediate data has been transferred and sorted.
Refer HadoopExam.com Recorded Training Module : 3 and 4
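The two thresholds in the explanation above can be sketched as a toy model. In Hadoop 2 and later the percentage is commonly set through the property `mapreduce.job.reduce.slowstart.completedmaps` (older releases used `mapred.reduce.slowstart.completed.maps`); treat the property name and the rounding rule below as assumptions to verify against your version. The function names are hypothetical and exist only for illustration.

```python
import math

def reducers_may_start_fetching(completed_maps, total_maps, slowstart):
    """Toy model: Reducers may begin copying map output once the
    configured fraction of Map Tasks has completed."""
    # Assumption: the fraction is rounded up to a whole number of Map Tasks.
    threshold = math.ceil(slowstart * total_maps)
    return completed_maps >= threshold

def reduce_method_may_run(completed_maps, total_maps):
    """The reduce method itself needs ALL intermediate data, so every
    Map Task must have finished (and its output been copied and sorted)."""
    return completed_maps == total_maps

# With a slow-start fraction of 0.5 and 10 Map Tasks, copying may begin
# after 5 maps, but reduce() still waits for all 10.
early_copy = reducers_may_start_fetching(5, 10, 0.5)
early_reduce = reduce_method_may_run(5, 10)
```

Setting the fraction to 1.0 in this model makes Reducers wait for every Mapper before copying anything, which trades a later shuffle start for not holding reduce slots idle.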
Question : What happens if a Mapper runs slowly relative to the others?
1. No Reducer can start until the last Mapper has finished 2. If a Mapper is running slowly, Hadoop starts another instance of that Mapper on another machine 3. [option text missing in source] 4. The result of the first Mapper to finish is used 5. All of the above
Explanation: It is possible for one Map Task to run more slowly than the others.
-- Perhaps due to faulty hardware, or just a very slow machine.
-- The reduce method in the Reducer cannot start until every Mapper has finished.
Hadoop uses Speculative Execution:
-- If a Mapper appears to be running significantly more slowly than the others, a new instance of the Mapper is started on another machine, operating on the same data.
-- The result of the first Mapper to finish is used.
-- Hadoop kills the Mapper that is still running.
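The speculative execution steps above can be illustrated with a small deterministic simulation. This is only a sketch under stated assumptions: the straggler rule (a task is "slow" once it has run longer than `slow_factor` times the median duration) and all names are hypothetical, and Hadoop's real heuristics are more involved.

```python
from statistics import median

def speculate(durations, backup_duration, slow_factor=2.0):
    """Return (finish_time, winner) for each Map Task under a toy rule."""
    typical = median(durations)
    # Assumption: a task is judged a straggler at this point in time.
    launch_at = slow_factor * typical
    results = []
    for d in durations:
        if d <= launch_at:
            # Finished before anyone considered it slow.
            results.append((d, "original"))
            continue
        # A second attempt starts on another machine, on the same input.
        backup_finish = launch_at + backup_duration
        if backup_finish < d:
            results.append((backup_finish, "backup"))  # original is killed
        else:
            results.append((d, "original"))            # backup is killed
    return results

def reduce_can_start_at(results):
    """The reduce method waits for the last Map Task to finish."""
    return max(finish for finish, _ in results)

# Three healthy tasks and one straggler that would take 60 time units.
results = speculate([10, 12, 11, 60], backup_duration=10)
start = reduce_can_start_at(results)
```

In this model the backup attempt wins, so the reduce phase can begin at time 33.0 instead of waiting until time 60 for the slow machine.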
Often, Mappers produce large amounts of intermediate data.
- The data must be passed to the Reducers.
- This can result in a lot of network traffic.
You can specify a Combiner, which can be considered a mini-reducer.
- The Combiner runs locally on a single Mapper's output.
- Output from the Combiner is sent to the Reducers.
- The Combiner's input and output data types must be identical, and must match the Reducer's input types (that is, the Mapper's output types).
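To make the traffic saving concrete, here is a minimal word-count sketch in plain Python (not Hadoop API code). It assumes two input splits, one Mapper per split, and hypothetical function names; the point is only that the Combiner shrinks what is sent to the Reducers without changing the final counts.

```python
from collections import Counter, defaultdict

def map_words(line):
    """Mapper: emit (word, 1) for every word in the split."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner: sum counts per word on ONE Mapper's output.
    Input and output are both lists of (word, int) pairs."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def reduce_counts(shuffled):
    """Reducer: sum all counts received for each word."""
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals)

splits = ["to be or not to be", "to do is to be"]
mapper_outputs = [map_words(s) for s in splits]

# Records sent over the network without and with a Combiner.
without_combiner = [p for out in mapper_outputs for p in out]
with_combiner = [p for out in mapper_outputs for p in combine(out)]
```

Here 11 records are shuffled without the Combiner and 8 with it, and `reduce_counts` returns the same dictionary in both cases.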
A Combiner can be applied only when the operation performed is commutative and associative.
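A short worked example shows why this matters. Taking a mean is not associative, so using it directly as a Combiner gives the wrong answer, while summing (sum, count) pairs is safe. The values and function names below are illustrative assumptions, not part of any Hadoop API.

```python
def mean(values):
    return sum(values) / len(values)

# Values for one key, as seen by two different Mappers.
mapper_a = [10, 20, 30]
mapper_b = [40]

true_mean = mean(mapper_a + mapper_b)

# Unsafe: each "Combiner" emits a local mean, the Reducer averages them.
mean_of_means = mean([mean(mapper_a), mean(mapper_b)])

def combine_pairs(pairs):
    """Safe Combiner: merge (sum, count) pairs into one (sum, count) pair.
    Addition is commutative and associative, so grouping does not matter."""
    return (sum(s for s, _ in pairs), sum(c for _, c in pairs))

def reduce_mean(pairs):
    total, count = combine_pairs(pairs)
    return total / count

# Mappers emit (value, 1) so Combiner input and output types match.
a_pairs = [(v, 1) for v in mapper_a]
b_pairs = [(v, 1) for v in mapper_b]
safe_mean = reduce_mean([combine_pairs(a_pairs), combine_pairs(b_pairs)])
```

The true mean is 25.0, the mean of means is 30.0, and the (sum, count) version recovers 25.0.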
Note : The Combiner may run once, more than once, or not at all on the output from any given Mapper.
Do not put code in the Combiner that could influence your results if it runs more than once.
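The following sketch illustrates the note by applying a Combiner zero, one, or two times to the same Mapper output before the Reducer sums the values. Both combiners and the helper `apply_n_times` are hypothetical; the second combiner is deliberately wrong to show the failure mode.

```python
def sum_combiner(values):
    """Safe: summing already-summed values gives the same total."""
    return [sum(values)]

def count_combiner(values):
    """Unsafe: counts records, which is only right on raw (word, 1) output."""
    return [len(values)]

def apply_n_times(combiner, values, n):
    """Model the framework running the Combiner n times on one Mapper's output."""
    for _ in range(n):
        values = combiner(values)
    return values

def reducer(values):
    return sum(values)

ones = [1, 1, 1, 1]  # a word that one Mapper saw four times

safe = [reducer(apply_n_times(sum_combiner, ones, n)) for n in (0, 1, 2)]
unsafe = [reducer(apply_n_times(count_combiner, ones, n)) for n in (0, 1, 2)]
```

The safe combiner yields 4 for every run count, while the record-counting combiner yields 4, 4, and then 1 once it runs a second time on its own output.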