MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)

Question : Using the Hadoop MapReduce framework, you have to use the Unix /bin/cat command as the mapper and /bin/wc as the reducer. Select the correct command from the options below.

1. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapR /bin/cat -reducer /bin/wc
2. $HADOOP_HOME/bin/hadoop \
-input myInputDirs \
-output myOutputDir \
-mapper /bin/cat \
-reducer /bin/wc
3. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-map /bin/cat \
-red /bin/wc
4. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper /bin/cat \
-reducer /bin/wc

Correct Answer : 4
Explanation: Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows you to create and run Map/Reduce jobs with
any executable or script as the mapper and/or the reducer. For example:

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper /bin/cat \
-reducer /bin/wc

In the above example, both the mapper and the reducer are executables that read the input from stdin (line by line) and emit the output to stdout.
The utility will create a Map/Reduce job, submit the job to an appropriate cluster, and monitor the progress of the job until it completes.

When an executable is specified for mappers, each mapper task launches the executable as a separate process when the mapper is initialized.
As the mapper task runs, it converts its inputs into lines and feeds the lines to the stdin of the process. In the meantime,
the mapper collects the line-oriented outputs from the stdout of the process and converts each line into a key/value pair, which is collected as the output of the mapper.
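
The flow described above can be approximated on a single machine with an ordinary shell pipeline. This is only a rough sketch: the function name simulate_streaming is hypothetical, it assumes one mapper and one reducer, and it uses cat and wc from the PATH rather than the /bin paths in the question. A real job splits the input across many mapper tasks and partitions the mapper output across reducers.

```shell
# Hypothetical helper: approximate a one-mapper, one-reducer streaming job locally.
# Input lines arrive on stdin; $1 is the mapper command, $2 is the reducer command.
simulate_streaming() {
  mapper=$1
  reducer=$2
  # The mapper reads lines from stdin and writes lines to stdout.
  # Its output is sorted (standing in for the shuffle and sort phase)
  # before the reducer reads it on stdin.
  $mapper | LC_ALL=C sort | $reducer
}

# Example: cat as the mapper and wc as the reducer, as in the question.
printf 'We are learning BigData\nTraining provided by HadoopExam.com\nMapReduce is nice to learn\n' \
  | simulate_streaming cat wc
```

With cat as the mapper, every input line passes through unchanged, so wc simply reports the line, word, and byte counts of the sorted input.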

Question : The snippet below submits a new streaming job to Hadoop MapReduce.

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper /bin/cat \
-reducer /bin/wc

The following input lines are submitted.

We are learning BigData
Training provided by HadoopExam.com
MapReduce is nice to learn

While processing the above lines, what will the keys be?

1. [W,T,M]

2. Size of each line in Bytes e.g. [1000, 1050, 900]

4. Entire content of the line [We are learning BigData, Training provided by HadoopExam.com, MapReduce is nice to learn]

Correct Answer : 4
Explanation: When an executable is specified for mappers, each mapper task launches the executable as a separate process when the mapper is initialized. As the mapper task runs, it converts its inputs into lines and feeds the lines to the stdin of the process. In the meantime, the mapper collects the line-oriented outputs from the stdout of the process and converts each line into a key/value pair, which is collected as the output of the mapper. By default, the prefix of a line up to the first tab character is the key and the rest of the line (excluding the tab character) is the value. If there is no tab character in the line, then the entire line is considered the key and the value is null. However, this can be customized. None of the three input lines here contains a tab, so each entire line becomes a key.
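
The default key rule can be checked locally with a small filter. This is a sketch only: the function name streaming_keys is hypothetical, and it reimplements the documented default rule (key is everything before the first tab) with awk rather than calling anything in Hadoop.

```shell
# Hypothetical helper: print the key that the default streaming rule
# would assign to each line read from stdin.
streaming_keys() {
  # Everything before the first tab is the key.
  # A line with no tab is treated entirely as the key.
  awk '{
    i = index($0, "\t")
    if (i > 0) print substr($0, 1, i - 1)
    else       print $0
  }'
}

# One line without a tab, one line with a tab.
printf 'We are learning BigData\nMap\tReduce is nice to learn\n' | streaming_keys
```

The first line has no tab, so the whole line is printed as the key; the second line prints only Map.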

Question : The snippet below submits a new streaming job to Hadoop MapReduce.

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper /bin/cat \
-reducer /bin/wc

The following input lines are submitted.

We [tab]are learning BigData
Training [tab] provided by HadoopExam.com
Map[tab]Reduce is nice to learn

While processing the above lines, what will the keys be?

1. [W,T,M]

2. Size of each line in Bytes e.g. [1000, 1050, 900]

4. Entire content of the line [We are learning BigData, Training provided by HadoopExam.com, MapReduce is nice to learn]

5. [We, Training, Map]

Correct Answer : 5
Explanation: When an executable is specified for mappers, each mapper task launches the executable as a separate process when the mapper is initialized. As the mapper task runs, it converts its inputs into lines and feeds the lines to the stdin of the process. In the meantime, the mapper collects the line-oriented outputs from the stdout of the process and converts each line into a key/value pair, which is collected as the output of the mapper. By default, the prefix of a line up to the first tab character is the key and the rest of the line (excluding the tab character) is the value. If there is no tab character in the line, then the entire line is considered the key and the value is null. However, this can be customized. Here every input line contains a tab, so the keys are the prefixes before it: We, Training, Map.

When an executable is specified for reducers, each reducer task launches the executable as a separate process when the reducer is initialized. As the reducer task runs, it converts its input key/value pairs into lines and feeds the lines to the stdin of the process. In the meantime, the reducer collects the line-oriented outputs from the stdout of the process and converts each line into a key/value pair, which is collected as the output of the reducer. By default, the prefix of a line up to the first tab character is the key and the rest of the line (excluding the tab character) is the value.

Related Questions

Question : Which daemons control a Hadoop MapReduce job?

1. TaskTracker
2. NameNode
4. JobTracker

Question : Arrange the steps in the life cycle of a MapReduce job, based on the options below.
1. Each node runs a software daemon known as the TaskTracker
2. Clients submit the MapReduce job to the JobTracker
3. The JobTracker assigns Map and Reduce tasks to the other nodes on the cluster
4. The TaskTracker is responsible for actually instantiating the Map and Reduce tasks
5. The TaskTrackers report task progress back to the JobTracker

1. 1,2,3,4,5
2. 2,1,3,4,5
4. 1,3,2,4,5

Question : How is a Job defined in Hadoop?

1. It is the execution of a Mapper or Reducer instance
2. A Mapper and Reducer pair that works on the same file block
4. None of the above

Question : Distributing the values associated with each key, in sorted order, to the reducer is defined as?

1. Map and Reduce
2. Shuffle and Sort
4. None of the above

Question : You have written a Mapper which invokes the following five calls to the OutputCollector.collect method:
output.collect (new Text ("Apple"), new Text ("Red") ) ;
output.collect (new Text ("Banana"), new Text ("Yellow") ) ;
output.collect (new Text ("Apple"), new Text ("Yellow") ) ;
output.collect (new Text ("Cherry"), new Text ("Red") ) ;
output.collect (new Text ("Apple"), new Text ("Green") ) ;
How many times will the Reducer's reduce method be invoked?

1. 6
2. 3
4. 0
5. 5
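
One way to reason about this question is that reduce is called once per distinct key, not once per emitted pair. The sketch below counts distinct keys in tab-separated key/value lines; the function name count_reduce_calls is hypothetical, and the printf input is a hand-written stand-in for the five collect calls, not real job output.

```shell
# Hypothetical helper: count the distinct keys in tab-separated
# key/value lines on stdin, which equals the number of reduce() calls
# when each distinct key is reduced once.
count_reduce_calls() {
  cut -f1 | LC_ALL=C sort -u | wc -l | tr -d ' '
}

# The five (key, value) pairs from the question, one per line.
printf 'Apple\tRed\nBanana\tYellow\nApple\tYellow\nCherry\tRed\nApple\tGreen\n' \
  | count_reduce_calls
```

The five pairs contain the keys Apple, Banana, and Cherry, so the count is 3.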

Question : What data does a Reducer's reduce method process?

1. All the data in a single input file.
2. All data produced by a single mapper.
4. All data for a given value, regardless of which mapper(s) produced it.