
MapR (HPE) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : Hadoop will start transferring data as soon as a Mapper finishes its task; it will not wait until the last Map Task has finished.
1. True
2. False

Correct Answer : Get Latest Questions and Answers


Explanation: In practice, Hadoop will start transferring data from Mappers to Reducers as individual Mappers finish their work.

The developer can specify the percentage of Mappers which should finish before the Reducers start retrieving data.

The developer's reduce method still does not start until all intermediate data has been transferred and sorted.
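
As a hedged illustration (not from the original material): in Hadoop 2.x this threshold is controlled by the mapreduce.job.reduce.slowstart.completedmaps property. The driver fragment below assumes a standard MapReduce job; the 0.80 value is an example, meaning Reducers begin copying map output once 80% of map tasks have completed.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SlowStartDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Fraction of map tasks that must finish before reducers begin the copy phase
        conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 0.80f);
        Job job = Job.getInstance(conf, "slow-start demo");
        // ... set Mapper, Reducer, input/output paths as usual ...
    }
}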

Refer to HadoopExam.com Recorded Training Module : 3 and 4






Question : If a Mapper runs slowly relative to the others, then?


1. No Reducer can start until the last Mapper has finished
2. If a Mapper is running slowly, another instance of the Mapper will be started by Hadoop on another machine
3. Access Mostly Uused Products by 50000+ Subscribers
4. The result of the first Mapper to finish will be used
5. All of the above

Correct Answer : Get Latest Questions and Answers


Explanation: It is possible for one Map Task to run more slowly than the others.

-- Perhaps due to faulty hardware, or just a very slow machine.
-- The reduce method in the Reducer cannot start until every Mapper has finished.

Hadoop uses Speculative Execution :
-- If a Mapper appears to be running significantly more slowly than the others, a new instance of the Mapper will be
started on another machine, operating on the same data.

-- The result of the first Mapper to finish will be used.
-- Hadoop will kill off the Mapper which is still running.
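
A minimal sketch (assuming Hadoop 2.x / MRv2 property names; the job name is hypothetical) showing how Speculative Execution can be toggled per job:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SpeculationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Allow Hadoop to launch duplicate attempts for straggling map tasks
        conf.setBoolean("mapreduce.map.speculative", true);
        // Speculation is controlled independently for reduce tasks
        conf.setBoolean("mapreduce.reduce.speculative", false);
        Job job = Job.getInstance(conf, "speculation demo");
        // ... configure Mapper, Reducer and I/O paths as usual ...
    }
}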

Refer to HadoopExam.com Recorded Training Module : 3 and 4





Question : What is the Combiner ?

1. Runs locally on a single Mapper's output
2. Using a Combiner can reduce network traffic
3. Access Mostly Uused Products by 50000+ Subscribers
4. None of 1, 2 and 3
5. All of 1, 2 and 3 apply to the Combiner


Correct Answer : Get Latest Questions and Answers

Explanation: Often, Mappers produce large amounts of intermediate data
- The data must be passed to the Reducers
- This can result in a lot of network traffic.

You can specify a Combiner, which can be considered a mini-reducer
- The Combiner runs locally on a single Mapper's output.
- Output from the Combiner is sent to the Reducers.
- Input and output data types for the Combiner and Reducer must be identical.

A Combiner can be applied only when the operation performed is commutative and associative.

Note : The Combiner may run zero times, once, or more than once on the output from any given Mapper.

Do not put any logic in the Combiner which could influence your results if it runs more than once.
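
A minimal word-count-style sketch (an illustration, not from the original material): because integer summation is commutative and associative, the same Reducer class can safely be registered as the Combiner.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();  // summation is safe even if the Combiner runs zero or many times
        }
        result.set(sum);
        context.write(key, result);
    }
}

// In the driver:
// job.setCombinerClass(IntSumReducer.class);  // mini-reduce on each Mapper's output
// job.setReducerClass(IntSumReducer.class);   // identical input/output types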

Refer to HadoopExam.com Recorded Training Module : 3



Related Questions


Question :
You have the following key-value pairs as output from your Map task:
(HadoopExam, 1)
(Is, 1)
(the, 1)
(best, 1)
(material, 1)
(provider, 1)
(for, 1)
(the, 1)
(Hadoop, 1)
How many keys will be passed to the Reducer's reduce() method?


1. 9
2. 8
3. Access Mostly Uused Products by 50000+ Subscribers
4. 6
5. 5
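
As a hedged aside (a reasoning aid, not the official answer key): the shuffle groups intermediate pairs by key before reduce() is invoked, so repeated keys collapse into a single reduce() call. A quick local check of the distinct keys:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class DistinctKeyCount {
    public static void main(String[] args) {
        // Map output keys from the question; the shuffle groups by key only
        String[] keys = {"HadoopExam", "Is", "the", "best", "material",
                         "provider", "for", "the", "Hadoop"};
        Set<String> distinct = new HashSet<>(Arrays.asList(keys));
        System.out.println(distinct.size());  // "the" appears twice, so 9 pairs yield 8 keys
    }
}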


Question : While processing a file using the MapReduce framework, the output of the Mapper is called
intermediate key-value pairs. Select the correct statement about this output of the Mappers.
1. Intermediate key-value pairs are written to the HDFS of the machines running the map tasks, and then copied to the machines running the reduce tasks.
2. Intermediate key-value pairs are written to the local disks of the machines running the reduce tasks.
3. Access Mostly Uused Products by 50000+ Subscribers
4. Intermediate key-value pairs are written to the local disks of the machines running the map tasks, and then read by the machines running the reduce tasks.


Question : HadoopExam stores, every day, each user's IP address+location as a string in a file, as well as the
total number of clicks as an integer (incremented for each click). This is quite a huge file,
where the keys are strings (address+location) and the values are integers (clicks).
For each unique key, you want to identify the largest integer. In writing a MapReduce program to accomplish this,
is using a Combiner advantageous?
1. Yes
2. No
3. Access Mostly Uused Products by 50000+ Subscribers
4. Yes, if configured during cluster setup
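
A hedged sketch (an illustration under the assumption that a per-key maximum is wanted; the class name is hypothetical): max is commutative and associative, so a max-finding Reducer can usually double as a Combiner.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxClicksReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int max = Integer.MIN_VALUE;
        for (IntWritable v : values) {
            max = Math.max(max, v.get());  // order of application does not matter
        }
        context.write(key, new IntWritable(max));
    }
}

// Driver sketch: job.setCombinerClass(MaxClicksReducer.class);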


Question : A MapReduce program has two components: one that implements the Mapper, and another that implements the Reducer. You have to implement the
map() method for the Mapper and the reduce() method for the Reducer. When is the earliest that the reduce() method of any reduce task of your submitted
job will be called?
1. Not until all map tasks have completed
2. As soon as the first map task has completed
3. Access Mostly Uused Products by 50000+ Subscribers
4. It can be started at any time during the job; there is no particular time


Question : While processing time-series data from the QuickTechie Inc log file using a MapReduce ETL batch job, you have set the number of reducers
to 1 (one). Select the correct statement which applies.
1. A single reducer gathers and processes all the output from all the mappers. The output is written to multiple files in HDFS.
2. The number of reducers cannot be configured; it is determined by the NameNode at runtime.
3. Access Mostly Uused Products by 50000+ Subscribers
4. A single reducer will process all the output from all the mappers. The output is written to a single file in HDFS.


Question : You have created a MapReduce job to process a TimeSeries Market Data file, with a driver class called
HadoopDriver (in the default package) packaged into a jar called HadoopExam.jar. What is the appropriate way to submit this job to the cluster?
1. hadoop jar HadoopExam.jar HadoopDriver outputdir inputdir
2. hadoop inputdir outputdir jar HadoopExam.jar HadoopDriver
3. Access Mostly Uused Products by 50000+ Subscribers
4. hadoop jar HadoopExam.jar HadoopDriver inputdir outputdir
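
For reference (general hadoop CLI usage; the paths below are hypothetical, not the answer key): the jar subcommand expects the jar file first, then the main class, then the program's own arguments in the order the driver reads them:

hadoop jar <jar-file> [main-class] <args...>
hadoop jar HadoopExam.jar HadoopDriver /user/hadoop/inputdir /user/hadoop/outputdir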