Premium

MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : During streaming job submission, you want to set an environment variable EXAMPLE_DIR=/home/example/dictionaries/. Which is the correct option?

1. -env EXAMPLE_DIR=/home/example/dictionaries/

2. -e EXAMPLE_DIR=/home/example/dictionaries/

3. Access Mostly Uused Products by 50000+ Subscribers

4. -sysenv EXAMPLE_DIR=/home/example/dictionaries/

5. -cmdenv EXAMPLE_DIR=/home/example/dictionaries/

Correct Answer : 5
Explanation: To set an environment variable in a streaming command, use:

-cmdenv EXAMPLE_DIR=/home/example/dictionaries/
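For context, a complete streaming invocation using this flag could look like the sketch below; the input and output paths and the /bin/cat mapper and /bin/wc reducer are only placeholders:

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-cmdenv EXAMPLE_DIR=/home/example/dictionaries/ \
-input myInputDirs \
-output myOutputDir \
-mapper /bin/cat \
-reducer /bin/wc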




Question : Which daemons are considered master daemons?

1. NameNode
2. Secondary NameNode
3. JobTracker
4. 1,2 and 3 are correct
5. 1 and 3 are correct



Correct Answer : 4


Explanation: We can consider nodes to be in two different categories:
Master Nodes
- Run the NameNode, Secondary NameNode, JobTracker daemons
- Only one of each of these daemons runs on the cluster

Slave Nodes
- Run the DataNode and TaskTracker daemons
- A slave node will run both of these daemons

Refer to HadoopExam.com Recorded Training Modules 2 and 3.
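A quick way to check which daemons a node is actually running is the jps command that ships with the JDK. On a typical master node it would list the three master daemons, while a slave node would list DataNode and TaskTracker; the process IDs below are only illustrative:

jps
2481 NameNode
2563 SecondaryNameNode
2695 JobTracker
2990 Jps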





Question : Which nodes are considered slave nodes?

1. Secondary NameNode
2. DataNode
3. TaskTracker
4. 1,2 and 3 are correct
5. 2 and 3 are correct




Correct Answer : 5


Explanation: We can consider nodes to be in two different categories:
Master Nodes
- Run the NameNode, Secondary NameNode, JobTracker daemons
- Only one of each of these daemons runs on the cluster

Slave Nodes
- Run the DataNode and TaskTracker daemons
- A slave node will run both of these daemons

Refer to HadoopExam.com Recorded Training Modules 2 and 3.



Related Questions


Question : You have to run a MapReduce job where the Mapper is a Java class and the Reducer is the Unix command "/bin/wc". After the entire job completes, you want only two output partitions to be created. Select the correct option which fulfills this requirement.

1. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-reducer=2 \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer /bin/wc
2. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-D mapred.reduce.tasks=2 \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer /bin/wc
3. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-D mapred.reduce.count=2 \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer /bin/wc
4. As the default file count would always be 2, no specific configuration is required.
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer /bin/wc


Question : You have the following data in a file called HadoopExam.txt
Learning.Hadoop.From.HadoopExam.com
Learning.Spark.From.QuickTechie.com
Learning.Cassandra.From.Training4Exam.com
Learning.HBase.From.AWSExam.blogspot.com

Now, from the above data, while running a Hadoop MapReduce streaming job, you want to create the key set shown below.

[Learning.Hadoop,Learning.Spark,Learning.Cassandra,Learning.HBase]
Which of the following is the correct code snippet?
1. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-D stream.map.output.field.separator=. \
-D stream.num.map.output.key.fields=15 \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer
2. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-D stream.map.output.field.separator=. \
-D stream.num.map.output.key.fields=2 \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer
3. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-D stream.map.output.field.separator=. \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer
4. $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-D stream.map.output.field.separator=. \
-D stream.num.map.output.key.counts=2 \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer
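For background, these two stream.* properties control where streaming splits each map output line into a key and a value: with the separator set to "." and the number of key fields set to 2, everything up to the second "." becomes the key and the rest becomes the value. For a line such as Learning.Hadoop.From.HadoopExam.com the split is:

Key   : Learning.Hadoop
Value : From.HadoopExam.com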


Question : The ________________ options allow you to make files and archives available to the tasks. The argument is a URI to the file or archive
that you have already uploaded to HDFS. These files and archives are cached across jobs. You can retrieve the host and fs_port values from the fs.default.name
config variable.

1. -files and -archives

2. -file and -archive

3. Access Mostly Uused Products by 50000+ Subscribers

4. -archives
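For illustration, these options take an HDFS URI, and a trailing #name fragment can be used to create a differently named symlink in the task's working directory; the host, fs_port and file paths below are placeholders:

-files hdfs://host:fs_port/user/testfile.txt
-archives hdfs://host:fs_port/user/testfile.jar#testlink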



Question : In a Hadoop streaming job, can we use the following as the mapper?

-mapper "cut -f1 | sed s/foo/bar/g"
1. True
2. False


Question : Is the command below valid for a Hadoop streaming job?
hadoop jar hadoop-streaming.jar -input '/user/foo/dir1' -input '/user/foo/dir2'
1. Yes
2. No


Question : Select the correct statements with regard to Hadoop streaming applications
A. A streaming process can use stderr to emit counter information. A line of the form reporter:counter:<group>,<counter>,<amount> should be sent to stderr to update the counter.
B. A streaming process can use stderr to emit status information. To set a status, reporter:status:<message> should be sent to stderr.
C. You can use the record reader StreamXmlRecordReader to process XML documents.
D. During the execution of a streaming job, the names of the "mapred" parameters are transformed. The dots ( . ) become underscores ( _ ). For example, mapred.job.id becomes
mapred_job_id and mapred.jar becomes mapred_jar. In your code, use the parameter names with the underscores.
1. A,B,C
2. B,C,D
3. Access Mostly Uused Products by 50000+ Subscribers
4. A,B,D
5. A,B,C,D
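As a sketch of points A, B and D above, a streaming mapper written in bash could update a counter and report a status message by writing to stderr, and read the transformed job-id variable from its environment; the counter group, counter name and messages here are made up for illustration:

#!/bin/bash
# Pass input through unchanged while updating a counter for every line read (point A).
while read line; do
  echo "reporter:counter:HadoopExam,LinesRead,1" >&2
  echo "$line"
done
# Status also goes to stderr (point B); mapred.job.id is exposed as mapred_job_id (point D).
echo "reporter:status:finished reading input for job $mapred_job_id" >&2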