Cloudera Hadoop Developer Certification Questions and Answer (Dumps and Practice Questions)

Question : Which of the following are responsbilities of the ApplicationMater

1. Before starting any task, create job's output directory for job's OutputCommitter.
2. Both map tasks and reduce tasks are created by Application Master.
3. Access Mostly Uused Products by 50000+ Subscribers
4. If job doesn't qualify as Uber task, Application Master requests containers for all map tasks and reduce tasks.

1. 1,2,3
2. 2,3,4
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1,2,4
5. 1,2,3,4

Correct Answer : Get Lastest Questions and Answer :

Explanation: Role of an Application Master:
o Before starting any task, Job setup method is called to create job's output directory for job's OutputCommitter.
o As noted above, Both map tasks and reduce tasks are created by Application Master.
o If the submitted job is small, then Application Master runs the job in the same JVM on which Application Master is running. It reduces the overhead of creating new container and running tasks in parallel. These small jobs are called as Uber tasks.
o Uber tasks are decided by three configuration parameters, number of mappers "less than and equal to" 10, number of reducers "less than and equal to" 1 and Input file size is less than or equal to an HDFS block size. These parameters can be configured via mapreduce.job.ubertask.maxmaps , mapreduce.job.ubertask.maxreduces , and mapreduce.job.ubertask.maxbytes properties in mapred-site.xml.
o If job doesn't qualify as Uber task, Application Master requests containers for all map tasks and reduce tasks.

You can also Refer/Consider Advance Hadoop YARN Training by HadoopExam.com

Question :
A _____ is the basic unit of processing capacity in YARN, and is an encapsulation of resource elements (memory, cpu etc.)

1. Node Manager
2. Container
3. Access Mostly Uused Products by 50000+ Subscribers
4. DataNode

Correct Answer : Get Lastest Questions and Answer :
A Container is the basic unit of processing capacity in YARN, and is an encapsulation of resource elements (memory, cpu etc.).

You can also Refer/Consider Advance Hadoop YARN Training by HadoopExam.com

Question : __________ are responsible for local monitoring of resource availability, fault reporting,
and container life-cycle management (e.g., starting and killing
jobs).

1. NodeManagers
2. Application Manager
3. Access Mostly Uused Products by 50000+ Subscribers
4. Resource Manager

Correct Answer : Get Lastest Questions and Answer :

Explanation: The central ResourceManager runs as a standalone daemon on a dedicated machine and acts as the central authority for allocating resources to the
various competing applications in the cluster. The ResourceManager has a central and global view of all cluster resources and, therefore, can provide
fairness, capacity, and locality across all users. Depending on the application demand, scheduling priorities, and resource availability, the
ResourceManager dynamically allocates resource containers to applications to run on particular nodes. A container is a logical bundle of resources (e.g.,
memory, cores) bound to a particular cluster node. To enforce and track such assignments, the ResourceManager interacts with a special system daemon
running on each node called the NodeManager. Communications between the ResourceManager and NodeManagers are heartbeat based for scalability.
NodeManagers are responsible for local monitoring of resource availability, fault reporting, and container life-cycle management (e.g., starting and killing
jobs). The ResourceManager depends on the NodeManagers for its "global view" of the cluster.

User applications are submitted to the ResourceManager via a public protocol and go through an admission control phase during which security
credentials are validated and various operational and administrative checks are performed. Those applications that are accepted pass to the scheduler and
are allowed to run. Once the scheduler has enough resources to satisfy the request, the application is moved from an accepted state to a running state.
Aside from internal bookkeeping, this process involves allocating a container for the ApplicationMaster and spawning it on a node in the cluster. Often
called 'container 0,' the ApplicationMaster does not get any additional resources at this point and must request and release additional containers.

You can also Refer/Consider Advance Hadoop YARN Training by HadoopExam.com

Related Questions

Question :

Select the correct statement which applies for below DDL

CREATE TABLE page_view(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
COLLECTION ITEMS TERMINATED BY '\002'
MAP KEYS TERMINATED BY '\003'
STORED AS SEQUENCEFILE

1. the page_view table is bucketed (clustered by) userid and within each bucket the data is sorted in increasing order of viewTime
2. an organization allows the user to do efficient sampling on the clustered column - in this case userid.
3. Access Mostly Uused Products by 50000+ Subscribers
4. The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table - only how it is read.

1. 1,2,3
2. 2,3,4
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1,2,3,4
5. 1,4

Question :

Which of the given is a correct code snippet of the Mapper,
for implementing word count example.

1. A
2. B
3. Access Mostly Uused Products by 50000+ Subscribers

Question : Select the correct statement while reading/writing the data in RDBMS using MapReduce

1. In order to use DBInputFormat you need to write a class that deserializes the columns from the database record into individual data fields to work with
2. The DBOutputFormat writes to the database by generating a set of INSERT statements in each reducer
3. Access Mostly Uused Products by 50000+ Subscribers
4. If you want to export a very large volume of data, you may be better off generating the INSERT statements into a text file, and then using a bulk data import tool provided by your database to do the database import.
5. All of the above

Question :

Which of the following is a correct way to disable the Speculative-execution

A. In Command Line
bin/hadoop jar -Dmapreduce.map.speculative =false \
-D mapreduce.reduce.speculative=false jar>

B. In JobConfiguration:
jobconf.setBoolean("mapreduce.map.speculative", false);
jobconf.setBoolean("mapreduce.reduce.speculative ", false);

C. In JobConfiguration:
jobconf.setBoolean("mapreduce.speculative", false);

D. In JobConfiguration:
jobconf.setBoolean("mapreduce.mapred.speculative", false);

1. A,B
2. B,C
3. Access Mostly Uused Products by 50000+ Subscribers
4. A,D

Question :

You have written a word count MapReduce program for a big file, almost 5TB in size. Now you want after completion of the job,
you want to create a single file from all the reducers output. Which is the best option. Assuming all the output files of
jobs are written in the output directory

/data/weblogs/weblogs_md5_groups.bcp

1. hadoop fs -getmerge weblogs_md5_ groups.bcp /data/weblogs/weblogs_md5_groups.bcp
2. hadoop fs -getmerge /data/weblogs/weblogs_md5_groups.bcp/*
3. Access Mostly Uused Products by 50000+ Subscribers
4. hadoop fs -getmerge /data/weblogs/weblogs_md5_groups.bcp weblogs_md5_ groups.bcp

Question : Which statement is correct for below code snippet

public class TokenCounterMapper
extends Mapper [Object, Text, Text, IntWritable>{

private final static IntWritable one = new IntWritable(1);
private Text word = new Text();

public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
}
context.write(word, one);

}
}

1. All key value pair will be written to context
2. Some key value pair will be written to context
3. Access Mostly Uused Products by 50000+ Subscribers
4. No key value pair will be written to context