
Cloudera Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : The key output of the Mapper must be identical to the reducer input key.
  :
1. True
2. False



Correct Answer : 1 (True)

The reducer's input key and value types must match the mapper's output key and value types; the framework hands the mapper's output directly to the reducers.
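For illustration, a minimal sketch with the standard word count types (class names are illustrative, not from the question); note that the Reducer's first two type parameters repeat the Mapper's last two:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper output types: Text key, IntWritable value ...
class WCMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    // map() omitted; only the type parameters matter here
}

// ... which must reappear as the Reducer input types.
class WCReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // reduce() omitted
}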

Question : One key is processed by exactly one reducer?
  :
1. True
2. False



Correct Answer : 1 (True)

All values associated with a given key are sent to the same reducer; a key never spans two reducers, although one reducer typically handles many keys.
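This routing is done by the partitioner: the default HashPartitioner maps every occurrence of a key to the same partition. A minimal sketch of that rule (class name illustrative):

import org.apache.hadoop.mapreduce.Partitioner;

// Mirrors HashPartitioner: a key always hashes to the same partition.
class HashLikePartitioner<K, V> extends Partitioner<K, V> {
    @Override
    public int getPartition(K key, V value, int numReduceTasks) {
        // Mask the sign bit, then take the remainder modulo the reducer count.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}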

Question : The number of mappers is defined in the JobConf object?

  :
1. True
2. False



Correct Answer : 2 (False)

The number of mappers is decided by the Hadoop framework, based on the number of input splits; JobConf.setNumMapTasks() is only a hint to the framework.
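A short sketch of the contrast in the old-style API the question refers to (class name illustrative):

import org.apache.hadoop.mapred.JobConf;

public class MapperCountSketch {
    public static void main(String[] args) {
        JobConf conf = new JobConf(MapperCountSketch.class);
        conf.setNumMapTasks(10);    // only a hint; the input splits decide the real count
        conf.setNumReduceTasks(4);  // honored exactly by the framework
    }
}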


Related Questions


Question :

Select the correct statements that apply to the below DDL

CREATE TABLE page_view(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
COLLECTION ITEMS TERMINATED BY '\002'
MAP KEYS TERMINATED BY '\003'
STORED AS SEQUENCEFILE

1. The page_view table is bucketed (clustered by) userid and within each bucket the data is sorted in increasing order of viewTime
2. This organization allows the user to do efficient sampling on the clustered column - in this case userid (illustrated after the answer choices).
3. …
4. The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table - only how it is read.


  :
1. 1,2,3
2. 2,3,4
3. …
4. 1,2,3,4
5. 1,4
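For reference, bucketing on userid is what statement 2 relies on: because rows are hashed into 32 buckets by userid, a query such as SELECT * FROM page_view TABLESAMPLE(BUCKET 1 OUT OF 32 ON userid) can read a single bucket instead of scanning the whole table.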


Question :

Which of the following is a correct code snippet of a Mapper
for implementing the word count example?
  :
1. A
2. B
3. …


Question : Select the correct statement(s) about reading/writing data in an RDBMS using MapReduce
  :
1. In order to use DBInputFormat you need to write a class that deserializes the columns from the database record into individual data fields to work with (see the sketch after this question)
2. The DBOutputFormat writes to the database by generating a set of INSERT statements in each reducer
3. …
4. If you want to export a very large volume of data, you may be better off generating the INSERT statements into a text file, and then using a bulk data import tool provided by your database to do the database import.
5. All of the above
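For reference, a minimal sketch of what statement 1 describes: a record class implementing both Writable and DBWritable. The table layout (an id and a name column) is hypothetical, chosen only for illustration:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

// Hypothetical record type for DBInputFormat/DBOutputFormat.
public class UserRecord implements Writable, DBWritable {
    private long id;
    private String name;

    // DBWritable: deserialize one database row into fields (used by DBInputFormat).
    public void readFields(ResultSet rs) throws SQLException {
        id = rs.getLong(1);
        name = rs.getString(2);
    }

    // DBWritable: bind fields into the generated INSERT statement (used by DBOutputFormat).
    public void write(PreparedStatement ps) throws SQLException {
        ps.setLong(1, id);
        ps.setString(2, name);
    }

    // Writable: serialization used when records move between map and reduce.
    public void readFields(DataInput in) throws IOException {
        id = in.readLong();
        name = in.readUTF();
    }

    public void write(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
    }
}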


Question :

Which of the following is a correct way to disable speculative execution?

A. In Command Line:
bin/hadoop jar -D mapreduce.map.speculative=false \
-D mapreduce.reduce.speculative=false <jar>

B. In JobConfiguration:
jobconf.setBoolean("mapreduce.map.speculative", false);
jobconf.setBoolean("mapreduce.reduce.speculative", false);

C. In JobConfiguration:
jobconf.setBoolean("mapreduce.speculative", false);

D. In JobConfiguration:
jobconf.setBoolean("mapreduce.mapred.speculative", false);



  :
1. A,B
2. B,C
3. …
4. A,D


Question :

You have written a word count MapReduce program for a big file, almost 5 TB in size. After the job completes,
you want to create a single file from all the reducers' output. Which is the best option? Assume all the output
files of the job are written to the output directory

/data/weblogs/weblogs_md5_groups.bcp


  :
1. hadoop fs -getmerge weblogs_md5_groups.bcp /data/weblogs/weblogs_md5_groups.bcp
2. hadoop fs -getmerge /data/weblogs/weblogs_md5_groups.bcp/*
3. …
4. hadoop fs -getmerge /data/weblogs/weblogs_md5_groups.bcp weblogs_md5_groups.bcp
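For reference, the syntax is hadoop fs -getmerge <HDFS source> <local destination>: it concatenates all files under the HDFS source directory into a single file on the local filesystem. Option 4 follows that order (HDFS directory first, local file second); option 1 has the arguments reversed.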


Question : Which statement is correct for the below code snippet?

public class TokenCounterMapper
        extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
        }
        context.write(word, one);
    }
}


  :
1. All key-value pairs will be written to context
2. Some key-value pairs will be written to context
3. …
4. No key-value pairs will be written to context
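Note that context.write(word, one) sits outside the while loop, so each call to map() emits only the last token of its input line; only some key-value pairs reach the context (option 2). The standard word count mapper, as in the Hadoop Mapper javadoc, writes inside the loop:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenCounterMapper
        extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);   // inside the loop: one pair per token
        }
    }
}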