
MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : Default separator between key and value is tab
1. True
2. False

Correct Answer : 1
Explanation: By default, TextOutputFormat writes each key-value pair separated by a tab character; KeyValueTextInputFormat likewise uses a tab as the default separator when splitting an input line into key and value.
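
As a quick illustration, the tab comes from TextOutputFormat and can be overridden with a job property. Below is a minimal driver sketch (the class name SeparatorDemo and the comma separator are illustrative assumptions; the property key is the standard Hadoop 2.x name):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SeparatorDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // TextOutputFormat writes "key<TAB>value" lines by default;
        // this property replaces the tab with a comma.
        conf.set("mapreduce.output.textoutputformat.separator", ",");
        Job job = Job.getInstance(conf, "separator-demo");
        // ... remaining job setup (mapper, reducer, paths) ...
    }
}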




Question : Select correct statement regarding Reducer

1. Each reducer takes one partition, generated and assigned by the Hadoop framework, as its input, and processes one iterable list of key-value
pairs at a time.

2. Reducer generates output as partitioned files in the format part-r-0000x

3.

4. 1,2
5. 1,2,3

Correct Answer :
Explanation: Output of the mapper is first written to local disk for the sort and shuffle process. It is also in the form of key-value
pairs, and it is then merged and finally given to the reducer.

MapReduce makes the guarantee that the input to every reducer is sorted by key. The process by which the system performs the sort, and transfers the map
outputs to the reducers as inputs, is known as the shuffle. It is said that the shuffle is the heart of MapReduce and is where the "magic" happens.

Output from every mapper can go to every reducer: each map task's output is partitioned, and each partition is fetched by exactly one reducer. During the
reduce phase, the reduce function is invoked once for each key in the sorted output. The output of this phase is written to the output filesystem, typically
HDFS. The key-value pairs produced by the reducer are passed to the OutputFormat, which then writes them to HDFS. The OutputFormat also provides a
RecordWriter class that writes individual records to a file under the directory set by setOutputPath(). Each reducer writes a separate file in the output
directory, and these files are named part-r-00000, part-r-00001, and so on.
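
To make the part-r-0000x naming concrete, here is a minimal reducer sketch using the new (org.apache.hadoop.mapreduce) API; the class name SumReducer and the summing logic are illustrative assumptions. With the default TextOutputFormat, each reducer writes its own part-r-0000x file under the directory passed to FileOutputFormat.setOutputPath():

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // reduce() is called once per key, with the iterable list of all
        // values that were shuffled to this reducer for that key.
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        // Each record lands in this reducer's own part-r-0000x file.
        context.write(key, new IntWritable(sum));
    }
}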

One can also use a combiner for optimization. The combiner is conceptually placed after the map phase and reduces the output of the particular map task it
runs on; it is often described as a mini-reducer. Because less intermediate data crosses the network, it also reduces network traffic.
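
Wiring in a combiner takes one extra line on the driver. A sketch, reusing the hypothetical SumReducer above as the combiner (safe here because summing is associative and commutative; TokenMapper and the class name CombinerDemo are likewise assumed names):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CombinerDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "combiner-demo");
        job.setJarByClass(CombinerDemo.class);
        job.setMapperClass(TokenMapper.class);      // assumed mapper class
        job.setCombinerClass(SumReducer.class);     // mini-reducer on map-side output
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}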




Question : Select correct statement regarding input key-values of a Mapper class

1. Whatever you have configured as an input key and value type must match in the Mapper class

2. Input key and value type defined on the Mapper class level must match in map() method arguments

3. Output key and value class type must match with the input class of the Mapper class

4. 1,2
5. 1,2,3


Correct Answer : 5
Explanation: All three points below are correct regarding the input key and value types for a Mapper class.
A. Whatever you have configured as an input key and value type must match in the Mapper class
B. Input key and value type defined on the Mapper class level must match in map() method arguments
C. Output key and value class type must match with the input class of the Mapper class
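
A minimal mapper sketch illustrating these points (the class name TokenMapper and the tokenizing logic are illustrative assumptions): the four generic parameters on Mapper are input key, input value, output key, and output value, and both the map() signature and the context.write() calls must agree with them.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Mapper<input key, input value, output key, output value>
public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // key/value here must match the first two type parameters above;
        // context.write() must match the last two.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}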



Related Questions


Question : You've written a MapReduce job based on the HadoopExam website's log file named MAIN.PROFILE.log, resulting in an extremely
large amount of output data. Which of the following cluster resources will your job stress?
1. network I/O and disk I/O
2. network I/O and RAM
3.
4. RAM , network I/O and disk I/O


Question : You have written a Mapper which invokes the following five calls to the OutputCollector.collect method:

output.collect(new Text("Flag"), new Text("Rahul"));
output.collect(new Text("Shirt"), new Text("Yakul"));
output.collect(new Text("Shoe"), new Text("Rahul"));
output.collect(new Text("Flag"), new Text("Gemini"));
output.collect(new Text("Socks"), new Text("Yakul"));

How many times will the Reducer's reduce() method be invoked?

1. 5
2. 4
3.
4. 7
5. 8
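
For intuition about how the shuffle groups these pairs before reduce() runs, here is a plain-Java sketch (not Hadoop code; the class name GroupingSketch is an illustrative assumption) that mimics the grouping step: reduce() is invoked once per distinct key.

import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class GroupingSketch {
    public static void main(String[] args) {
        String[][] pairs = {
            {"Flag", "Rahul"}, {"Shirt", "Yakul"}, {"Shoe", "Rahul"},
            {"Flag", "Gemini"}, {"Socks", "Yakul"}
        };
        // The shuffle sorts by key and collects all values for each key.
        TreeMap<String, List<String>> groups = new TreeMap<>();
        for (String[] p : pairs) {
            groups.computeIfAbsent(p[0], k -> new ArrayList<>()).add(p[1]);
        }
        // Prints one line per distinct key, e.g. Flag = [Rahul, Gemini]
        groups.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}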


Question : ___________ is an optimization technique where a computer system performs some task that may not be actually needed. The main idea is to
do work before it is known whether that work will be needed at all, so as to prevent a delay that would have to be incurred by doing the work after it
is known whether it is needed. If it turns out the work was not needed after all, the results are ignored. The Hadoop framework also provides a
mechanism to handle machine issues such as faulty configuration or hardware failure. The JobTracker detects that one or more
machines are performing poorly and starts additional copies of a map or reduce task. This behaviour is known as ________________

1. Task Execution
2. Job Execution
3.
4. Speculative Execution
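
For reference, speculative execution can be toggled per job through standard configuration properties; a minimal sketch (the class name SpeculativeDemo is an illustrative assumption; the property keys are the Hadoop 2.x names):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SpeculativeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Speculative execution is on by default; these properties
        // disable it for map and reduce tasks respectively.
        conf.setBoolean("mapreduce.map.speculative", false);
        conf.setBoolean("mapreduce.reduce.speculative", false);
        Job job = Job.getInstance(conf, "speculative-demo");
        // ... remaining job setup ...
    }
}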


Question : You are working in the HadoopExam consultancy team and have written a MapReduce and a Pig job. Which of the following is a correct statement?

1. Pig comes with additional capabilities to MapReduce. Pig programs are executed as MapReduce jobs via the Pig interpreter.
2. Pig comes with no additional capabilities to MapReduce. Pig programs are executed as MapReduce jobs via the Pig interpreter.
3.
4.


Question : Every day HadoopExam gains a good number of subscribers, but the file created from this information is
smaller than 64MB, and the same 64MB is configured as the block size on the cluster.
You are running a job that will process this file as a single input split on a cluster which has no other jobs currently running,
and with all settings at their default values. Each node has an equal number of open Map slots.
On which node will Hadoop first attempt to run the Map task?

1. The node containing the first TaskTracker to heartbeat into the JobTracker, regardless of the location of the input split
2. The node containing the first JobTracker to heartbeat into the Namenode, regardless of the location of the input split
3.
4. The node containing nearest location of the input split


Question : You are working on a project of HadoopExam client where you need to chain together MapReduce and Pig jobs.
You also need the ability to use forks, decision points, and path joins.
Which of the following ecosystem projects allows you to accomplish this?

1. Oozie
2. MapReduce chaining
3.
4. Zookeeper
5. Hue