
Cloudera Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question :

Which of the following is a correct way to disable speculative execution?

A. In Command Line:
bin/hadoop jar -D mapreduce.map.speculative=false \
-D mapreduce.reduce.speculative=false <jar>

B. In JobConfiguration:
jobconf.setBoolean("mapreduce.map.speculative", false);
jobconf.setBoolean("mapreduce.reduce.speculative ", false);

C. In JobConfiguration:
jobconf.setBoolean("mapreduce.speculative", false);

D. In JobConfiguration:
jobconf.setBoolean("mapreduce.mapred.speculative", false);



  :
1. A,B
2. B,C
3. Access Mostly Uused Products by 50000+ Subscribers
4. A,D

Correct Answer : Get Latest Questions and Answers :

Speculative execution of MapReduce tasks is on by default. For HBase clusters it is generally advised to turn speculative execution off at the system level unless you need it for a specific case, in which case it can be configured per job by setting the properties mapreduce.map.speculative and mapreduce.reduce.speculative to false.
With the old API you can disable speculative execution for the mappers and reducers by setting the mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution JobConf options to false, respectively; with the newer API, set mapreduce.map.speculative and mapreduce.reduce.speculative instead.
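For reference, a minimal per-job sketch using the newer (mapreduce) API could look like the following. The class name and job name are placeholders and the rest of the job setup is omitted; only the two property keys come from the explanation above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class NoSpeculationJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Turn speculative execution off for both phases of this job only
        conf.setBoolean("mapreduce.map.speculative", false);
        conf.setBoolean("mapreduce.reduce.speculative", false);
        // Placeholder job; mapper, reducer, input and output paths must still be set
        Job job = Job.getInstance(conf, "job-without-speculative-execution");
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}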





Question :

You have written a word count MapReduce program for a big file, almost 5 TB in size. After the job completes, you want to create a single
file from the output of all the reducers. Assuming all of the job's output files are written to the output directory below,
which is the best option?

/data/weblogs/weblogs_md5_groups.bcp


  :
1. hadoop fs -getmerge weblogs_md5_groups.bcp /data/weblogs/weblogs_md5_groups.bcp
2. hadoop fs -getmerge /data/weblogs/weblogs_md5_groups.bcp/*
3. Access Mostly Uused Products by 50000+ Subscribers
4. hadoop fs -getmerge /data/weblogs/weblogs_md5_groups.bcp weblogs_md5_groups.bcp

Correct Answer : Get Latest Questions and Answers :


Explanation: The getmerge command can be used to merge all of the part files in the job's output directory and copy the single merged file to the local filesystem.
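For reference, the command in option 4 would be run from the client machine as shown below; the second argument is the local destination file and can be any local path you choose.

hadoop fs -getmerge /data/weblogs/weblogs_md5_groups.bcp weblogs_md5_groups.bcp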




Question : Which statement is correct for the code snippet below?

public class TokenCounterMapper
        extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
        }
        context.write(word, one);
    }
}


  :
1. All key value pair will be written to context
2. Some key value pair will be written to context
3. Access Mostly Uused Products by 50000+ Subscribers
4. No key value pair will be written to context

Correct Answer : Get Latest Questions and Answers :
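Note that in the snippet above context.write() sits outside the while loop, so at most one key-value pair per input record (the last token) is emitted. For comparison, the standard word count mapper emits every token by writing inside the loop; a minimal sketch of that map() method, with the rest of the class unchanged, would be:

public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one); // one pair per token, written inside the loop
    }
}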




Related Questions


Question : You are working on a project for a HadoopExam client where you need to chain together MapReduce and Pig jobs.
You also need the ability to use forks, decision points, and path joins.
Which of the following ecosystem projects allows you to accomplish this?

 :
1. Oozie
2. MapReduce chaining
3. Access Mostly Uused Products by 50000+ Subscribers
4. Zookeeper
5. Hue


Question :
You have the following key-value pairs as output from your Map task:
(HadoopExam, 1)
(Is, 1)
(the, 1)
(best, 1)
(material, 1)
(provider, 1)
(for, 1)
(the, 1)
(Hadoop, 1)
How many keys will be passed to the Reducer's reduce() method?


 :
1. 9
2. 8
3. Access Mostly Uused Products by 50000+ Subscribers
4. 6
5. 5


Question : While processing a file using the MapReduce framework, the output of the Mapper is called the
intermediate key-value pairs. Select the correct statement about this output of the mappers.
 :
1. Intermediate key-value pairs are written to the HDFS of the machines running the map tasks, and then copied to the machines running the reduce tasks.
2. Intermediate key-value pairs are written to the local disks of the machines running the reduce tasks.
3. Access Mostly Uused Products by 50000+ Subscribers
4. Intermediate key-value pairs are written to the local disks of the machines running the map tasks, and then read by the machines running the reduce tasks.


Question : Every day, HadoopExam stores each user's IP address+location as a string, along with the total
number of clicks as an integer (incremented for each click), in a very large file,
where the keys are strings (address+location) and the values are integers (clicks).
For each unique key, you want to identify the largest integer. In writing a MapReduce program to accomplish this,
is using a combiner advantageous?
 :
1. Yes
2. No
3. Access Mostly Uused Products by 50000+ Subscribers
4. Yes, if configured while cluster setup


Question : A MapReduce program has two components: one that implements the mapper, and another that implements the reducer. You have to implement
the map() method for the Mapper and the reduce() method for the Reducer. When is the earliest that the reduce() method of any reduce task of your submitted
job will be called?
 :
1. Not until all map tasks have completed
2. As soon as first map tasks have completed
3. Access Mostly Uused Products by 50000+ Subscribers
4. It can be started any time during the Job no particular time


Question : While processing time-series data from the QuickTechi Inc log file using a MapReduce ETL batch job, you have set the number of reducers
to 1 (one). Select the correct statement which applies.
 :
1. A single reducer gathers and processes all the output from all the mappers. The output is written to multiple files in HDFS.
2. The number of reducers cannot be configured; it is determined by the NameNode at runtime.
3. Access Mostly Uused Products by 50000+ Subscribers
4. A single reducer will process all the output from all the mappers. The output is written to a single file in HDFS.