
MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : Which one of the following statements describes the relationship between the ResourceManager and the ApplicationMaster?

1. The ApplicationMaster requests resources from the ResourceManager
2. The ApplicationMaster starts a single instance of the ResourceManager
3. Access Mostly Uused Products by 50000+ Subscribers
4. The ApplicationMaster starts an instance of the ResourceManager within each Container

Correct Answer : Get Latest Questions and Answer :
Explanation: The per-application ApplicationMaster negotiates resources with the ResourceManager: it requests containers from the ResourceManager, which allocates them, and then works with the NodeManagers to launch those containers. The ResourceManager is never started by, or run inside, an ApplicationMaster or a Container.




Question : Which YARN component is responsible for monitoring the success or failure of a Container?
1. ResourceManager
2. ApplicationMaster
3. Access Mostly Uused Products by 50000+ Subscribers
4. JobTracker

Correct Answer : Get Latest Questions and Answer :
Explanation: ResourceManager (RM) is the master that arbitrates all the available cluster resources and thus helps manage the distributed applications running on the YARN
system. It works together with the per-node NodeManagers (NMs) and the per-application ApplicationMasters (AMs).

NodeManagers take instructions from the ResourceManager and manage the resources available on a single node.
ApplicationMasters are responsible for negotiating resources with the ResourceManager, for working with the NodeManagers to start the containers, and for tracking the success or failure of those containers.

ResourceManager is the central authority that manages resources and schedules applications running atop YARN. Hence, it is potentially a single point of failure in an Apache YARN cluster. ResourceManager Restart is a feature that enhances the ResourceManager to keep functioning across restarts and also makes ResourceManager downtime invisible to end users.

The ResourceManager Restart feature is divided into two phases:

ResourceManager Restart Phase 1 (Non-work-preserving RM restart): enhance the RM to persist application/attempt state and other credential information in a pluggable state-store. The RM reloads this information from the state-store upon restart and re-starts the previously running applications. Users are not required to re-submit their applications.

ResourceManager Restart Phase 2 (Work-preserving RM restart): focus on reconstructing the running state of the ResourceManager by combining the container statuses reported by the NodeManagers and the container requests from the ApplicationMasters upon restart. The key difference from Phase 1 is that previously running applications are not killed after the RM restarts, so applications do not lose their work because of an RM outage.
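
To make the restart phases concrete, the sketch below shows the standard YARN properties that enable ResourceManager restart. This is only an illustration: the state-store URI is a made-up value, and in a real cluster these entries belong in yarn-site.xml on the ResourceManager host rather than being set in code.

import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmRestartConfig {
    // Sketch only: in practice these properties are configured in yarn-site.xml.
    public static YarnConfiguration rmRestartConf() {
        YarnConfiguration conf = new YarnConfiguration();
        // Turn on RM recovery (Phase 1 behaviour).
        conf.setBoolean("yarn.resourcemanager.recovery.enabled", true);
        // Pluggable state-store that persists application/attempt state across restarts.
        conf.set("yarn.resourcemanager.store.class",
                "org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore");
        // Illustrative location for the file-system state-store.
        conf.set("yarn.resourcemanager.fs.state-store.uri", "hdfs:///rmstore");
        return conf;
    }
}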





Question : When can a reduce class also serve as a combiner without affecting the output of a MapReduce program?

1. When the types of the reduce operation's input key and input value match the types of the reducer's output key and output value and when the reduce operation is both commutative and associative.
2. When the signature of the reduce method matches the signature of the combine method.
3. Access Mostly Uused Products by 50000+ Subscribers
4. Always. The point of a combiner is to serve as a mini-reducer directly after the map phase to increase performance.
5. Never. Combiners and reducers must be implemented separately because they serve different purposes.

Correct Answer : Get Latest Questions and Answer :
Explanation: You can use your reducer code as a combiner if the operation it performs is commutative and associative and its input key/value types match its output key/value types.
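
For illustration, below is a minimal sketch (assuming the standard word count job; the class name SumReducer is ours, not from the question) of a reducer that can double as the combiner: summing integers is commutative and associative, and its input key/value types match its output key/value types.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        // Adding the partial counts in any order and any grouping gives the same total,
        // which is exactly why this class is safe to use as a combiner.
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

The same class would then be registered with both job.setCombinerClass(SumReducer.class) and job.setReducerClass(SumReducer.class).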


Related Questions


Question :

Map<String, List<String>> wordCountMap = new HashMap<String, List<String>>(); // It holds each word as a key and all occurrences of the same word are in the list

In a word count Mapper class, you are emitting the key-value pair as

Case 1 : context.write(new Text(word), new IntWritable(1))

and

Case 2 : context.write(new Text(word), new IntWritable(wordCountMap.get(word).size()))

Select the correct statement about the example code snippet above.


1. In both cases the network bandwidth consumption would be the same
2. In Case 1 the network bandwidth consumption would be low
3. Access Mostly Uused Products by 50000+ Subscribers
4. Cannot be determined
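
To make the difference visible, here is a rough sketch (class and field names are illustrative, not from the question) of the two emit patterns. Case 1 writes one (word, 1) pair per occurrence; Case 2 aggregates inside the Mapper and writes one pair per distinct word, so fewer intermediate records reach the shuffle.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class EmitStyleMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private final Map<String, Integer> wordCountMap = new HashMap<String, Integer>();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String word : value.toString().split("\\s+")) {
            // Case 1 would emit here: context.write(new Text(word), new IntWritable(1));
            // Case 2 buffers the counts locally and emits once per word in cleanup().
            Integer count = wordCountMap.get(word);
            wordCountMap.put(word, count == null ? 1 : count + 1);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        for (Map.Entry<String, Integer> e : wordCountMap.entrySet()) {
            context.write(new Text(e.getKey()), new IntWritable(e.getValue()));
        }
    }
}

With the Case 2 style, the shuffle carries at most one record per distinct word per map task instead of one record per occurrence.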


Question : Suppose you have a file in the HDFS directory as below
/myapp/map.zip

And you will use the following API call to add this file to the DistributedCache:

JobConf job = new JobConf();
DistributedCache.addCacheArchive(new URI("/myapp/map.zip"), job);

Which is the best place to read this file in a MapReduce job?

1. Inside the map() method of the Mapper
2. You can randomly read this file as needed in the Mapper code
3. Access Mostly Uused Products by 50000+ Subscribers
4. All of the above statements are correct
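
As a reference point, here is a minimal sketch (old mapred API, to match the JobConf usage above; the class name is ours) that looks up the localized archive once per task in configure() instead of on every map() call.

import java.io.IOException;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class CacheAwareMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    private Path[] localArchives;

    @Override
    public void configure(JobConf job) {
        try {
            // The archive added with addCacheArchive() is localized (and unpacked) on each
            // task node; locate it once per task here rather than repeatedly inside map().
            localArchives = DistributedCache.getLocalCacheArchives(job);
        } catch (IOException e) {
            throw new RuntimeException("Could not read the distributed cache", e);
        }
    }

    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output,
                    Reporter reporter) throws IOException {
        // ... use the lookup data loaded from localArchives[0] while processing records ...
    }
}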


Question : You have added the below files to the Distributed Cache

JobConf job = new JobConf();
DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"),
job);
DistributedCache.addCacheArchive(new URI("/myapp/map.zip", job);
DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job);
DistributedCache.addCacheArchive(new URI("/myapp/mytar.tar", job);
DistributedCache.addCacheArchive(new URI("/myapp/mytgz.tgz", job);
DistributedCache.addCacheArchive(new URI("/myapp/mytargz.tar.gz", job);

Which of the following is the correct way to get all the paths of the Distributed Cache files in an array?


1. Iterate over the DistributedCache instance in the Mapper and add all the cached file paths to an array.
2. There is a direct method available on the DistributedCache: getAllFilePath()
3. Access Mostly Uused Products by 50000+ Subscribers
4. All of the above
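
For reference, the classic DistributedCache API exposes the localized paths as arrays; below is a small sketch (the helper name printCachedPaths is ours) of retrieving them inside a task.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;

public class CachePaths {
    public static void printCachedPaths(Configuration conf) throws IOException {
        // Files added with addCacheFile(), e.g. lookup.dat
        Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
        // Archives added with addCacheArchive(), e.g. map.zip, mytar.tar, mytgz.tgz
        Path[] localArchives = DistributedCache.getLocalCacheArchives(conf);
        for (Path p : localFiles) {
            System.out.println("cached file: " + p);
        }
        for (Path p : localArchives) {
            System.out.println("cached archive: " + p);
        }
    }
}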



Question :

Which of the given snippets is a correct Mapper implementation for the word count example?
1. A
2. B
3. Access Mostly Uused Products by 50000+ Subscribers
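
The answer snippets A and B are not reproduced in this dump. For orientation, a typical word count Mapper (a sketch, not necessarily identical to either option) looks like this:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);   // emit (word, 1) for every occurrence
        }
    }
}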


Question : Select the correct statement about reading/writing data in an RDBMS using MapReduce
1. In order to use DBInputFormat you need to write a class that deserializes the columns from the database record into individual data fields to work with
2. The DBOutputFormat writes to the database by generating a set of INSERT statements in each reducer
3. Access Mostly Uused Products by 50000+ Subscribers
4. If you want to export a very large volume of data, you may be better off generating the INSERT statements into a text file, and then using a bulk data import tool provided by your database to do the database import.
5. All of the above
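
As a sketch of what option 1 describes (table and field names are illustrative), a record class used with DBInputFormat/DBOutputFormat implements DBWritable to map database columns to Java fields, and Writable so the record can move through the MapReduce framework.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class EmployeeRecord implements Writable, DBWritable {
    private int id;
    private String name;

    // DBInputFormat calls this to deserialize a database row into the record's fields.
    public void readFields(ResultSet rs) throws SQLException {
        id = rs.getInt("id");
        name = rs.getString("name");
    }

    // DBOutputFormat calls this to bind the fields into its generated INSERT statement.
    public void write(PreparedStatement stmt) throws SQLException {
        stmt.setInt(1, id);
        stmt.setString(2, name);
    }

    // The Writable methods let the record travel between map and reduce tasks.
    public void readFields(DataInput in) throws IOException {
        id = in.readInt();
        name = in.readUTF();
    }

    public void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeUTF(name);
    }
}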


Question :

You have the following data in a Hive table

ID:INT,COLOR:TEXT,WIDTH:INT
1,green,190
2,blue,300
3,yellow,299
4,blue,199
5,green,199
6,yellow,299
7,green,799
8,red,800

Select the correct MapReduce program that can produce output similar to the Hive query below.

Select `(green|blue)?+.+` from table;

1. 1
2. 2
3. Access Mostly Uused Products by 50000+ Subscribers
4. 4
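
For context: in Hive, a back-quoted regular expression in the SELECT list matches column names, and `(green|blue)?+.+` is the usual idiom for "all columns except any named green or blue". The original answer programs are not shown in this dump, but a rough sketch of a Mapper doing the equivalent column projection over the comma-separated rows above (class and constant names are ours) could look like this:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class RegexColumnMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    // Table schema in order, matching ID:INT, COLOR:TEXT, WIDTH:INT
    private static final String[] COLUMNS = {"id", "color", "width"};
    // Column names dropped by the Hive regex (none of the columns above match it)
    private static final String EXCLUDE = "green|blue";

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        List<String> projected = new ArrayList<String>();
        for (int i = 0; i < COLUMNS.length && i < fields.length; i++) {
            if (!COLUMNS[i].matches(EXCLUDE)) {   // keep columns not named green or blue
                projected.add(fields[i]);
            }
        }
        context.write(new Text(String.join(",", projected)), NullWritable.get());
    }
}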