
MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : You have defined a Mapper class as below:
public class HadoopExamMapper extends Mapper {
    public void map(XXXXX key, YYYYY value, Context context)
}
What is the correct replacement for XXXXX and YYYYY?



1. LongWritable, Text

2. LongWritable, IntWritable

3. Access Mostly Uused Products by 50000+ Subscribers

4. IntWritable, Text


Correct Answer : Get Latest Questions and Answers :
Explanation: A. Whatever key and value types you have configured as the input (via the InputFormat) must match those declared on the Mapper class.
B. The input key and value types declared at the Mapper class level must match the arguments of the map() method.
C. The output key and value types of the Mapper must match the input key and value types of the Reducer class.
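To make rules A and B concrete, here is a minimal sketch of a Mapper whose input types match the default TextInputFormat (byte offset as LongWritable, line contents as Text). The class name HadoopExamMapper comes from the question; the word-count style output types (Text, IntWritable) and the tokenizing logic are assumptions for illustration only.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Class-level input types (LongWritable, Text) match TextInputFormat;
// the map() arguments repeat those same input types.
public class HadoopExamMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // key = byte offset of the line, value = the line itself.
        for (String word : value.toString().split("\\s+")) {
            context.write(new Text(word), ONE);
        }
    }
}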





Question : Which of the following is a correct statement regarding the input key and value for the Reducer class?


1. Both input key and value type of Reducer must match the output key and value type of a defined Mapper class

2. The output key class and output value class in the Reducer must match those defined in the job configuration

3. Access Mostly Uused Products by 50000+ Subscribers

4. 1,3

5. 1,2


Correct Answer : Get Latest Questions and Answers :
Explanation: The input to the Mapper depends on which InputFormat is used. The InputFormat is responsible for reading the incoming data and
shaping it into whatever format the Mapper expects. The default InputFormat is TextInputFormat, which extends FileInputFormat.
If you do not change the InputFormat, using a Mapper whose key-value type signature differs from what the InputFormat produces will cause a type-mismatch error.
If you expect a different kind of input, you will have to choose an appropriate InputFormat. You can set the InputFormat in the job setup:
job.setInputFormatClass(MyInputFormat.class);
By default this is set to TextInputFormat.
Now, suppose your input data is a set of newline-separated records delimited by a comma:
"A,value1"
"B,value2"
If you want the input to the mapper to be the pairs ("A", "value1") and ("B", "value2"), you will have to implement a custom InputFormat and RecordReader
with a matching key-value type signature.
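For reference, here is a minimal new-API driver sketch showing where the InputFormat and the output key/value classes are configured. The class name HadoopExamDriver is an assumption for illustration, and the word-count Mapper/Reducer pair is the one sketched alongside these questions.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HadoopExamDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "custom-input-demo");
        job.setJarByClass(HadoopExamDriver.class);
        // By default the InputFormat is TextInputFormat; to plug in a custom one
        // (whose key-value types must match the Mapper's input types), you would call:
        // job.setInputFormatClass(MyInputFormat.class);
        job.setMapperClass(HadoopExamMapper.class);
        job.setReducerClass(HadoopExamReducer.class);
        // The Reducer's output types must match these two settings.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}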



Here are a few rules regarding input and output keys and values for the Reducer class:

The input key class and input value class in the Reducer must match the output key class and output value class defined in the Mapper class.
The output key class and output value class in the Reducer must match those defined in the job configuration. The behavior of the cleanup(), run(), and
setup() methods is identical to that described for the Mapper class.
Now that you have a basic understanding of the MapReduce API, including framework functionality, the Mapper and Reducer, Mapper input, the record reader,
Reducer output data processing, and the Mapper, Reducer, and Job class APIs, I suggest that you dive into some additional training.
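To make these rules concrete, here is a minimal Reducer sketch that pairs with the HadoopExamMapper shown earlier: its input types (Text, IntWritable) match the Mapper's output types, and its output types match the job.setOutputKeyClass()/job.setOutputValueClass() calls in the driver sketch above. The summing logic is an assumption for illustration.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Input types (Text, IntWritable) match the Mapper's output types;
// output types (Text, IntWritable) match the job configuration.
public class HadoopExamReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}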










Question : You have the following Reducer class defined:
public class HadoopExamReducer extends Reducer {
    public void reduce(XXXXX key, YYYYY value, Context context) ....
}
What is the correct replacement for XXXXX and YYYYY?

1. Text, Iterable

2. Text, IntWritable

3. Access Mostly Uused Products by 50000+ Subscribers

4. IntWritable, List


Correct Answer : Get Latest Questions and Answers :
Explanation: The input to the Mapper depends on which InputFormat is used. The InputFormat is responsible for reading the incoming data and
shaping it into whatever format the Mapper expects. The default InputFormat is TextInputFormat, which extends FileInputFormat.

If you do not change the InputFormat, using a Mapper whose key-value type signature differs from what the InputFormat produces will cause a type-mismatch error.
If you expect a different kind of input, you will have to choose an appropriate InputFormat. You can set the InputFormat in the job setup:

job.setInputFormatClass(MyInputFormat.class);
By default this is set to TextInputFormat.

Now, suppose your input data is a set of newline-separated records delimited by a comma:

"A,value1"
"B,value2"
If you want the input to the mapper to be the pairs ("A", "value1") and ("B", "value2"), you will have to implement a custom InputFormat and RecordReader with a
matching key-value type signature.

In short, add a class which extends FileInputFormat and a class which extends RecordReader. Override the
FileInputFormat#createRecordReader method, and have it return an instance of your custom RecordReader.

Then you will have to implement the required RecordReader logic. The simplest way to do this is to create an instance of LineRecordReader in your custom
RecordReader and delegate all basic responsibilities to that instance. In the getCurrentKey and getCurrentValue methods you then implement the logic for
extracting the comma-delimited Text contents by calling LineRecordReader#getCurrentValue and splitting it on the comma.
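A minimal sketch of this delegation approach, assuming (Text, Text) key-value types and the hypothetical class names MyInputFormat and MyRecordReader; it wraps a LineRecordReader and splits each line on the first comma.

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

// Hypothetical InputFormat that turns lines like "A,value1" into ("A", "value1") pairs.
public class MyInputFormat extends FileInputFormat<Text, Text> {
    @Override
    public RecordReader<Text, Text> createRecordReader(InputSplit split, TaskAttemptContext context) {
        return new MyRecordReader();
    }
}

class MyRecordReader extends RecordReader<Text, Text> {
    // Delegate the basic file and line handling to a LineRecordReader.
    private final LineRecordReader lineReader = new LineRecordReader();

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context)
            throws IOException, InterruptedException {
        lineReader.initialize(split, context);
    }

    @Override
    public boolean nextKeyValue() throws IOException, InterruptedException {
        return lineReader.nextKeyValue();
    }

    @Override
    public Text getCurrentKey() throws IOException, InterruptedException {
        // The part of the line before the first comma becomes the key.
        String line = lineReader.getCurrentValue().toString();
        return new Text(line.split(",", 2)[0]);
    }

    @Override
    public Text getCurrentValue() throws IOException, InterruptedException {
        // The part of the line after the first comma becomes the value.
        String[] parts = lineReader.getCurrentValue().toString().split(",", 2);
        return new Text(parts.length > 1 ? parts[1] : "");
    }

    @Override
    public float getProgress() throws IOException, InterruptedException {
        return lineReader.getProgress();
    }

    @Override
    public void close() throws IOException {
        lineReader.close();
    }
}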


Related Questions


Question : To analyze the website clicks of HadoopExam.com you have written a MapReduce job which
will produce click reports for each week, e.g. 53 reports for the whole year. Which of the following Hadoop API classes must you use
so that one output file is generated per week and the output data goes into the corresponding output file?
1. Hive
2. MapReduce Chaining
3. Access Mostly Uused Products by 50000+ Subscribers
4. Partitioner


Question : Reducers are generally helpful for writing the job output data to a desired location or database.
In your ETL MapReduce job you set the number of reducers to zero. Select the correct statement which applies.
1. You cannot configure the number of reducers
2. No reduce tasks execute. The output of each map task is written to a separate file in HDFS
3. Access Mostly Uused Products by 50000+ Subscribers
4. You cannot configure the number of reducers; it is decided by the TaskTracker at runtime


Question : In the QuickTechie website log file named MAIN.PROFILES.log the keys are (ipaddress+location) and the values are the number of clicks (int).
For each unique key (string), you want to find the average of all values associated with that key. In writing a MapReduce program to accomplish this, can you take advantage of a
combiner?
1. No, the best way to accomplish this is to use Apache Pig
2. No, the best way to accomplish this is to use MapReduce chaining.
3. Access Mostly Uused Products by 50000+ Subscribers
4. Yes


Question : On our website www.HadoopExam.com we have Million profiles and have created ETL jobs for processing this file.
You have submitted an ETL MapReduce job to analyze the HadoopExam.com website's log files and combine them with the profile data in Hadoop,
and you notice in the JobTracker's Web UI that the Mappers are 80% complete
while the Reducers are 20% complete. What is the best explanation for this?
1. The progress attributed to the reducer refers to the transfer of data from completed Mappers.
2. The progress attributed to the reducer refers to the transfer of data from Mappers that is still going on.
3. Access Mostly Uused Products by 50000+ Subscribers
4. The progress attributed to the reducer refers to the transfer of data from Mappers and cannot be predicted.


Question : In your MapReduce job, you have three configuration parameters.
What is the correct or best way to pass these three configuration parameters to a mapper or reducer?
1. As key pairs in the Configuration object.
2. As value pairs in the Configuration object.
3. Access Mostly Uused Products by 50000+ Subscribers
4. Not possible


Question : In the word count MapReduce algorithm, why might using a combiner (a combiner runs after the Mapper and before the Reducer)
reduce the overall job running time?
1. combiners perform local filtering of repeated words, thereby reducing the number of key-value pairs that need to be shuffled across the network to the reducers.
2. combiners perform global aggregation of word counts, thereby reducing the number of key-value pairs that need to be shuffled across the network to the reducers.
3. Access Mostly Uused Products by 50000+ Subscribers
4. combiners perform local aggregation of word counts, thereby reducing the number of key-value pairs that need to be shuffled across the network to the reducers.