
MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : Which of the following classes is responsible for committing the output of a job?

1. OutputFormat

2. Job

3. (option available to subscribers only)

4. Context

Correct Answer : 1
Explanation: OutputFormat describes the output specification of a Map-Reduce job.
The Map-Reduce framework relies on the OutputFormat of the job to:

Validate the output specification of the job, e.g., check that the output directory does not already exist.
Provide the RecordWriter implementation used to write the output files of the job. Output files are stored in a FileSystem.
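To make that concrete, here is a minimal driver sketch (the WordCountDriver class name and the argument paths are hypothetical, and the mapper/reducer wiring is omitted) showing the two OutputFormat hooks named above: at submission the framework calls checkOutputSpecs(), which is what rejects a pre-existing output directory, and inside each task it calls getRecordWriter() to write the output files.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);
        // setMapperClass/setReducerClass would go here in a real job.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // TextOutputFormat is the default; set it explicitly to show the hook.
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // checkOutputSpecs() will reject args[1] if the directory already exists.
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}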






Question : You are running a word count MapReduce job, but the job does not complete successfully and fails after the reducers have processed 90% of their input.
Which statement is correct in this case?

1. It will generate only 90% of the output

2. It will generate only a _logs directory as output

3. (option available to subscribers only)

4. 1,2

5. 2,3


Correct Answer : 2
Explanation: A job that does not complete successfully generates only a _logs directory as output; no partial reducer output is written.
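A small sketch of how you might verify this (the ListJobOutput class name is hypothetical): listing the job's output directory after a failed run would show the _logs directory but no part-r-NNNNN files, because reducer output is only promoted from temporary attempt directories when the task, and the job, succeed.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListJobOutput {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // args[0] is the job's output directory, e.g. the failed word count run.
        for (FileStatus status : fs.listStatus(new Path(args[0]))) {
            System.out.println(status.getPath().getName()); // expect only _logs
        }
    }
}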




Question : Select the correct statements

1. RecordWriter writes the key-value pairs to the output files

2. The TextOutputFormat.LineRecordWriter implementation requires a java.io.DataOutputStream
object to write the key-value pairs to the HDFS/MapR-FS file system

3. (option available to subscribers only)

4. 1,2
5. 1,2,3

Correct Answer : (available to subscribers only)
Explanation: RecordWriter writes the output pairs to an output file.
RecordWriter implementations write the job outputs to the FileSystem.
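To make statements 1 and 2 concrete, here is a minimal custom RecordWriter sketch in the spirit of TextOutputFormat.LineRecordWriter (the TabSeparatedRecordWriter class name is hypothetical): it is handed a java.io.DataOutputStream for a file in HDFS/MapR-FS and writes one tab-separated key-value pair per line.

import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class TabSeparatedRecordWriter<K, V> extends RecordWriter<K, V> {
    private final DataOutputStream out; // stream onto the job's output file

    public TabSeparatedRecordWriter(DataOutputStream out) {
        this.out = out;
    }

    @Override
    public void write(K key, V value) throws IOException {
        // One key-value pair per line, separated by a tab.
        out.write(key.toString().getBytes(StandardCharsets.UTF_8));
        out.write('\t');
        out.write(value.toString().getBytes(StandardCharsets.UTF_8));
        out.write('\n');
    }

    @Override
    public void close(TaskAttemptContext context) throws IOException {
        out.close();
    }
}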


Related Questions


Question : A map or reduce task can crash because of an OutOfMemoryError. How does Hadoop MapReduce v1 (MRv1)
handle JVMs when a new MapReduce job is started on a cluster? (A configuration sketch follows the options.)
1. The TaskTracker may or may not use the same JVM for each task it manages on that node
2. The TaskTracker reuses the same JVM for each task it manages on that node
3. (option available to subscribers only)
4. The TaskTracker spawns a new JVM for each task it manages on that node
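For reference, a hedged sketch of the MRv1 setting behind this question (assuming MRv1's JobConf API): by default the TaskTracker launches a fresh JVM per task, so an OutOfMemoryError kills only that task; raising mapred.job.reuse.jvm.num.tasks above 1 (or setting it to -1 for unlimited) allows a JVM to be reused across tasks of the same job.

import org.apache.hadoop.mapred.JobConf;

public class JvmReuseConfig {
    public static void main(String[] args) {
        JobConf conf = new JobConf();
        // 1 = new JVM per task (the default); -1 = reuse without limit.
        conf.setInt("mapred.job.reuse.jvm.num.tasks", 1);
        System.out.println(conf.get("mapred.job.reuse.jvm.num.tasks"));
    }
}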


Question : You have configured a Hadoop cluster with MapReduce, and you have a directory called HadoopExam in HDFS containing two files: Exam1 and Exam2.
You submit a job to the cluster, using that directory as the input directory.
A few seconds after you have submitted the job, a user starts copying a large file, Exam3,
into the directory. Select the correct statement. (A sketch follows the options.)
1. All files Exam1, Exam2 and Exam3 will be processed by the job
2. Only files Exam1 and Exam2 will be processed by the job
3. (option available to subscribers only)
4. Only file Exam3 will be processed by the job
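A sketch of why the timing matters (the ShowSplits class name and the path are hypothetical): input splits are computed from the files present in the input directory at job submission, so a file copied in afterwards, like Exam3 here, is not part of the job.

import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class ShowSplits {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "show splits");
        FileInputFormat.addInputPath(job, new Path("/user/hadoopexam/HadoopExam"));
        // getSplits() snapshots the directory contents at this moment;
        // files added later are invisible to the running job.
        List<InputSplit> splits = new TextInputFormat().getSplits(job);
        for (InputSplit split : splits) {
            System.out.println(split);
        }
    }
}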


Question :
As you know, a Hadoop cluster is made up of multiple nodes, and each file is divided into multiple blocks that are stored on different nodes.
For this you need to be able to serialize your data, and you use the Writable interface to do so. Select the correct statement about the
Writable interface. (An implementation sketch follows the options.)
1. Writable is a class that all keys and values in MapReduce must extend. Classes extending this interface must implement methods for serializing and deserializing themselves.
2. Writable is a class that all keys and values in MapReduce must extend. Classes extending this interface need not implement methods for serializing and deserializing themselves unless they want to customize it.
3. (option available to subscribers only)
4. Writable is an interface that all values in MapReduce must implement. Classes implementing this interface must implement methods for serializing and deserializing themselves.
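A minimal sketch of a class implementing the Writable interface (the SubscriberWritable name and its fields are hypothetical): the class serializes and deserializes itself via write() and readFields(). Note that keys must additionally implement WritableComparable so they can be sorted.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class SubscriberWritable implements Writable {
    private int id;
    private long joinedAtMillis;

    @Override
    public void write(DataOutput out) throws IOException {
        // Serialize the fields in a fixed order.
        out.writeInt(id);
        out.writeLong(joinedAtMillis);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Deserialize in exactly the same order.
        id = in.readInt();
        joinedAtMillis = in.readLong();
    }
}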


Question :
You have written a MapReduce job, and in the reducer you want the data to be adjusted across multiple reducers before
writing to HDFS. Is it possible for reduce tasks to communicate and talk to each other? (A sketch follows the options.)
1. Yes, all reducer tasks can share data with proper configuration
2. Yes, each reduce task runs independently and in isolation, but by creating a shared file reducers can communicate with each other
3. (option available to subscribers only)
4. It all depends on the size of the file created: if it is smaller than the block size, then it is possible
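A hedged sketch of the isolation in question (the CountingReducer class and its counter are hypothetical): reduce tasks run independently with no channel to one another; the closest supported mechanism for sharing small aggregates is a counter, which each task increments on its own and the framework sums for the driver.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class CountingReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    public enum Stats { KEYS_SEEN } // aggregated across tasks by the framework

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        // No reducer-to-reducer communication happens here; the counter is
        // merged by the framework, not shared between running tasks.
        context.getCounter(Stats.KEYS_SEEN).increment(1);
        context.write(key, new IntWritable(sum));
    }
}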


Question : All HadoopExam website subscriber information is stored in a MySQL database.
Which tool is best suited to import a portion of the subscriber information every day as files into HDFS
and to generate Java classes to interact with that imported data?
1. Hive
2. Pig
3. (option available to subscribers only)
4. Flume


Question : A client application of HadoopExam creates an HDFS file named HadoopExam.txt with a replication factor of 5.
Identify which best describes the file access rules in HDFS if the file has a single block that is stored on DataNodes C1, C2, C3, C4 and C5. (A sketch follows the options.)
1. The file cannot be accessed if at least one of the DataNodes storing the block is unavailable.
2. The file can be accessed if at least one of the DataNodes storing the block is available and the client connects to that node only.
3. (option available to subscribers only)
4. The file can be accessed if at least one of the DataNodes storing the block is available, even if the NameNode has crashed.
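A sketch of the setup described in the question (the CreateReplicatedFile class name and the path are hypothetical): the file is created with an explicit replication factor of 5. For reads, any one live DataNode holding the block suffices, but the client locates blocks through the NameNode, so the NameNode must be reachable.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateReplicatedFile {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/hadoopexam/HadoopExam.txt");
        // Request 5 replicas of every block of this file.
        try (FSDataOutputStream out = fs.create(file, (short) 5)) {
            out.writeUTF("subscriber data goes here");
        }
    }
}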