Question : A Combiner reduces the amount of data sent to the Reducer?
1. True 2. False
Correct Answer : 1
Explanation: Mappers often produce large amounts of intermediate data. This data must be passed to the Reducers, which can result in a lot of network traffic.
You can specify a Combiner, which acts as a mini-reducer. The Combiner runs locally on a single Mapper's output, and its output is sent to the Reducers. The input and output data types of the Combiner and the Reducer must be identical.
A Combiner can be applied only when the operation performed is commutative and associative.
Note : The Combiner may run once, or more than once, on the output from any given Mapper.
Do not put code in the Combiner that could influence your results if it runs more than once.
Refer HadoopExam.com Recorded Training Module : 3
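The data reduction described above can be sketched with a small, self-contained simulation (a hypothetical in-memory model, not the real Hadoop API): the Mapper emits one (word, 1) pair per word, and the Combiner locally sums counts per key before anything is shipped over the network.

```python
# Sketch: how a Combiner shrinks one Mapper's output before the shuffle.
# Hypothetical in-memory simulation; not the actual Hadoop API.
from collections import defaultdict

def map_phase(words):
    # Mapper emits one (word, 1) pair per input word.
    return [(w, 1) for w in words]

def combine(pairs):
    # Mini-reducer: locally sums the counts for each key
    # on a single Mapper's output.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

mapper_output = map_phase(["the", "cat", "sat", "on", "the", "mat", "the"])
combined = combine(mapper_output)
# 7 intermediate pairs collapse to 5 pairs sent to the Reducers.
```

Because the Combiner only pre-aggregates, the Reducer produces the same final counts whether or not the Combiner ran, which is exactly why it is safe for it to run zero, one, or many times.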
Question :
A Combiner reduces the network traffic but increases the amount of work needed to be done by the Reducer?
1. True 2. False
Correct Answer : 2
Explanation: A Combiner decreases the amount of network traffic required during the shuffle and sort phase, and often also decreases the amount of work needed to be done by the Reducer.
Refer HadoopExam.com Recorded Training Module : 3
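The commutative/associative requirement mentioned above can be made concrete with a short sketch (illustrative values only): summing per-partition partial results and then summing those gives the same answer as summing everything at once, so SUM is safe as a Combiner; averaging the per-partition averages does not, so AVERAGE cannot be used as a naive Combiner.

```python
# Why the Combiner's operation must be commutative and associative.
# Illustrative data only: two partitions stand in for two Mappers' outputs.
values = [1, 2, 3, 4, 5]
partitions = [[1, 2, 3], [4, 5]]

# SUM: safe as a Combiner -- combining partial sums equals the full sum.
partial_sums = [sum(p) for p in partitions]
combined_sum = sum(partial_sums)

# AVERAGE: unsafe as a naive Combiner -- the average of partial
# averages is not the true average.
partial_avgs = [sum(p) / len(p) for p in partitions]
naive_avg = sum(partial_avgs) / len(partial_avgs)
true_avg = sum(values) / len(values)
```

The standard workaround in MapReduce is to have the Combiner emit (sum, count) pairs and let the Reducer compute the final average from the aggregated sums and counts.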
Question :
Which statement is correct for Pseudo-Distributed mode of Hadoop?
1. This is a single-machine cluster 2. All daemons run on the same machine 3. It is not required to run all the daemons in this mode 4. All of 1, 2 and 3 are correct 5. Only 1 and 2 are correct
Correct Answer : 5
Explanation: A developer will typically configure their machine to run in Pseudo-Distributed mode.
This effectively creates a single-machine cluster. All five Hadoop daemons run on the same machine. This is very useful for testing code before it is deployed to the real cluster.
Refer HadoopExam.com Recorded Training Module : 14 and 16
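As a sketch of how Pseudo-Distributed mode is typically enabled, the standard Apache Hadoop single-node setup points the default filesystem at a local HDFS and sets the replication factor to 1 (exact ports and additional properties vary by Hadoop version):

```xml
<!-- core-site.xml: all daemons talk to HDFS on localhost -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml: single machine, so only one replica per block -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

With this configuration every daemon runs as a separate JVM on the one machine, which mimics a real cluster closely enough to test jobs before deployment.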
1. While the job is running, the intermediate data is deleted 2. Reducers write their final output to HDFS 3. Intermediate data is never deleted; HDFS stores it for history tracking 4. All of 1, 2 and 3 are correct 5. None of the above