Cloudera Hadoop Administrator Certification Questions and Answers (Dumps and Practice Questions)

Question : Match the following

1. ClientRMProtocol
2. AMRMProtocol
3. ContainerManager

A. The protocol used by the ApplicationMaster to talk to the NodeManager to start/stop containers and get status updates on the containers if needed.
B. The protocol for a client that wishes to communicate with the ResourceManager to launch a new application (i.e. the ApplicationMaster), check on the status of the application or kill the application. For example, a job-client (a job launching program from the gateway) would use this protocol.
C. The protocol used by the ApplicationMaster to register/unregister itself to/from the ResourceManager as well as to request resources from the Scheduler to complete its tasks.

1. 1-C, 2-A, 3-B
2. 1-A, 2-B, 3-C
3. 1-B, 2-C, 3-A
4. 1-A, 2-C, 3-B

Correct Answer : 3
The interfaces you'd most likely be concerned with are:

ClientRMProtocol - Client --> ResourceManager
The protocol for a client that wishes to communicate with the ResourceManager to launch a new application (i.e. the ApplicationMaster), check on the status of the application or kill the application. For example, a job-client (a job launching program from the gateway) would use this protocol.
AMRMProtocol - ApplicationMaster --> ResourceManager
The protocol used by the ApplicationMaster to register/unregister itself to/from the ResourceManager as well as to request resources from the Scheduler to complete its tasks.
ContainerManager - ApplicationMaster --> NodeManager
The protocol used by the ApplicationMaster to talk to the NodeManager to start/stop containers and get status updates on the containers if needed.
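
The caller/callee pairs above can be written down as a small lookup table. This is only an illustrative sketch in plain Python: it does not call any Hadoop API, and the `Protocol` tuple and `protocol_between` helper are hypothetical names introduced here.

```python
from typing import NamedTuple, Optional


class Protocol(NamedTuple):
    """Which component initiates the call and which YARN daemon answers it."""
    caller: str
    callee: str


# Caller/callee pairs as described in the explanation above.
PROTOCOLS = {
    "ClientRMProtocol": Protocol("Client", "ResourceManager"),
    "AMRMProtocol": Protocol("ApplicationMaster", "ResourceManager"),
    "ContainerManager": Protocol("ApplicationMaster", "NodeManager"),
}


def protocol_between(caller: str, callee: str) -> Optional[str]:
    """Return the name of the protocol used between two components, if any."""
    for name, proto in PROTOCOLS.items():
        if proto == Protocol(caller, callee):
            return name
    return None


if __name__ == "__main__":
    print(protocol_between("ApplicationMaster", "NodeManager"))
```

Looking up a pair that has no direct protocol in this table, such as Client to NodeManager, returns `None`.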

Question : The ApplicationReport received from the ResourceManager consists of:

1. General application information: ApplicationId, queue to which the application was submitted,
user who submitted the application and the start time for the application.
2. ApplicationMaster details: the host on which the ApplicationMaster is running, the rpc port (if any)
on which it is listening for requests from clients and a token that the client needs to communicate with the ApplicationMaster.
3. Application tracking information: If the application supports some form of progress tracking,
it can set a tracking url which is available via ApplicationReport#getTrackingUrl that a client can look at to monitor progress.
4. ApplicationStatus: The state of the application as seen by the ResourceManager is available via ApplicationReport#getYarnApplicationState.
If the YarnApplicationState is set to FINISHED, the client should refer to ApplicationReport#getFinalApplicationStatus to check for the actual
success/failure of the application task itself. In case of failures, ApplicationReport#getDiagnostics may be useful to shed some more light on the failure.
5. All of the above

Correct Answer : 5

The ApplicationReport received from the ResourceManager consists of the following:

General application information: ApplicationId, queue to which the application was submitted, user who submitted the application and the start time for the application.
ApplicationMaster details: the host on which the ApplicationMaster is running, the rpc port (if any) on which it is listening for requests from clients and a token that the client needs to communicate with the ApplicationMaster.
Application tracking information: If the application supports some form of progress tracking, it can set a tracking url which is available via ApplicationReport#getTrackingUrl that a client can look at to monitor progress.
ApplicationStatus: The state of the application as seen by the ResourceManager is available via ApplicationReport#getYarnApplicationState. If the YarnApplicationState is set to FINISHED, the client should refer to ApplicationReport#getFinalApplicationStatus to check for the actual success/failure of the application task itself. In case of failures, ApplicationReport#getDiagnostics may be useful to shed some more light on the failure.

If the ApplicationMaster supports it, a client can directly query the ApplicationMaster itself for progress updates via the host:rpcport information obtained from the ApplicationReport. It can also use the tracking url obtained from the report if available.
In certain situations, if the application is taking too long or due to other factors, the client may wish to kill the application. The ClientRMProtocol supports the forceKillApplication call that allows a client to send a kill signal to the ApplicationMaster via the ResourceManager. An ApplicationMaster, if so designed, may also support an abort call via its rpc layer that a client may be able to leverage.
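
A client's handling of the report can be sketched as follows. This is a minimal model in plain Python, not the Hadoop client API: the `Report` class is a hypothetical stand-in for the few ApplicationReport fields discussed here, the state strings are assumed to match the names of the YarnApplicationState and FinalApplicationStatus values, and the timeout policy in `should_kill` is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical stand-in for the fields a client reads from an ApplicationReport."""
    yarn_application_state: str    # e.g. "ACCEPTED", "RUNNING", "FINISHED", "FAILED", "KILLED"
    final_application_status: str  # e.g. "UNDEFINED", "SUCCEEDED", "FAILED", "KILLED"
    diagnostics: str = ""


def outcome(report: Report) -> str:
    """Classify a report as 'running', 'succeeded' or 'failed: <diagnostics>'."""
    state = report.yarn_application_state
    if state == "FINISHED":
        # FINISHED only means the application ended; the final status
        # says whether the work itself succeeded.
        if report.final_application_status == "SUCCEEDED":
            return "succeeded"
        return f"failed: {report.diagnostics}"
    if state in ("FAILED", "KILLED"):
        return f"failed: {report.diagnostics}"
    return "running"


def should_kill(report: Report, elapsed_s: float, timeout_s: float) -> bool:
    """Assumed policy: kill only an application that is still running past its timeout."""
    return outcome(report) == "running" and elapsed_s > timeout_s
```

The point of the two-level check is that a FINISHED state alone does not tell the client whether the job succeeded.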

Question :

Select the correct statements about YARN

1. The ApplicationMaster is the actual owner of the job. It is launched by the ResourceManager
and, via the client, is provided with all the necessary information and resources about the job that it has been tasked to oversee and complete.
2. As the ApplicationMaster is launched within a container that may (likely will) be sharing a physical
host with other containers, given the multi-tenancy nature, amongst other issues, it cannot make any assumptions about things like
pre-configured ports that it can listen on.
3. When the ApplicationMaster starts up, several parameters are made available to it via the environment.
These include the ContainerId for the ApplicationMaster container, the application submission time and details about
the NodeManager host running the ApplicationMaster. See ApplicationConstants for the parameter names.
4. 1 and 2
5. 1,2 and 3

Correct Answer : 5

Explanation: The ApplicationMaster is the actual owner of the job. It is launched by the ResourceManager and, via the client, is provided with all the necessary information and resources about the job that it has been tasked to oversee and complete.
As the ApplicationMaster is launched within a container that may (likely will) be sharing a physical host with other containers, given the multi-tenancy nature, amongst other issues, it cannot make any assumptions about things like pre-configured ports that it can listen on.
When the ApplicationMaster starts up, several parameters are made available to it via the environment. These include the ContainerId for the ApplicationMaster container, the application submission time and details about the NodeManager host running the ApplicationMaster. See ApplicationConstants for the parameter names.
All interactions with the ResourceManager require an ApplicationAttemptId (there can be multiple attempts per application in case of failures). The ApplicationAttemptId can be obtained from the ApplicationMaster's ContainerId. There are helper APIs to convert the value obtained from the environment into objects.
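
The relationship between a container ID and its application attempt ID can be illustrated with plain string handling. The sketch below assumes the textual layouts `container_<clusterTimestamp>_<appId>_<attemptId>_<containerId>` (optionally with an `e<epoch>_` part after `container_`) and `appattempt_<clusterTimestamp>_<appId>_<attemptId>`; the `AttemptId` class, the function name and the sample ID are hypothetical. A real ApplicationMaster should use the Hadoop helper APIs rather than parsing strings itself.

```python
import re
from typing import NamedTuple


class AttemptId(NamedTuple):
    """Hypothetical model of an application attempt ID."""
    cluster_timestamp: int
    app_id: int
    attempt_id: int

    def __str__(self) -> str:
        # Assumed layout: 4-digit application number, 6-digit attempt number.
        return (f"appattempt_{self.cluster_timestamp}"
                f"_{self.app_id:04d}_{self.attempt_id:06d}")


# Assumed container ID layout; the "e<epoch>_" part is optional.
_CONTAINER_RE = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_(\d+)_(\d+)$")


def attempt_id_from_container(container_id: str) -> AttemptId:
    """Extract the attempt ID embedded in a container ID string."""
    match = _CONTAINER_RE.match(container_id)
    if match is None:
        raise ValueError(f"not a container ID: {container_id!r}")
    timestamp, app, attempt, _container = match.groups()
    return AttemptId(int(timestamp), int(app), int(attempt))


if __name__ == "__main__":
    # Sample value invented for illustration.
    print(attempt_id_from_container("container_1410901177871_0001_01_000005"))
```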

After an ApplicationMaster has initialized itself completely, it needs to register with the ResourceManager via AMRMProtocol#registerApplicationMaster. The ApplicationMaster always communicates via the Scheduler interface of the ResourceManager.

Related Questions

Question : The Fair Scheduler works best when

1. there is a need for higher memory
2. there is a lot of variability between queues
3. [option missing]
4. there is a need for higher CPU
5. all the jobs need to be processed in submission order

Question : Select the correct statement regarding Capacity Scheduler

1. The Capacity Scheduler permits sharing a cluster while giving each user or group certain minimum capacity guarantees.
2. The Capacity Scheduler currently supports memory-intensive applications, where an application can optionally specify higher memory resource requirements than the default.
3. [option missing]
4. 1 and 3
5. 1 and 2

Question :

Which of the following properties can exist only in the hdfs-site.xml

1. fs.default.name
2. hadoop.http.staticuser.user
3. [option missing]
4. 1 and 2
5. 1 and 3

Question : Which of the following properties can be configured in mapred-site.xml

1. yarn --> mapreduce.framework.name
2. $mr_hist:10020 --> mapreduce.jobhistory.address
3. [option missing]
4. 2 and 3
5. 1,2 and 3

Question : Which of the following properties are configured in the yarn-site.xml
1. mapreduce.shuffle --> yarn.nodemanager.aux-services
2. org.apache.hadoop.mapred.ShuffleHandler --> yarn.nodemanager.aux-services.mapreduce.shuffle.class
3. [item missing]
4. $rmgr:8030 --> yarn.resourcemanager.scheduler.address
5. $rmgr:8031 --> yarn.resourcemanager.resource-tracker.address
6. $rmgr:8032 --> yarn.resourcemanager.address
7. $rmgr:8033 --> yarn.resourcemanager.admin.address
8. $rmgr:8088 --> yarn.resourcemanager.webapp.address

1. 1,2,3,6,7,8
2. 2,3,4,5,7,8
3. [option missing]
4. 1,2,3,4,7,8
5. All 1,2,3,4,5,6,7,8
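
For orientation, the value --> property pairs listed in this question can be rendered in the usual Hadoop `<configuration>` / `<property>` layout. This is only a sketch: the values are copied from the list as-is and should be checked against the Hadoop release in use, `$rmgr` is the list's own placeholder for the ResourceManager host, and `render_site_xml` and the host name `rm.example.com` are hypothetical.

```python
from string import Template
from xml.sax.saxutils import escape

# (property name, value) pairs copied from the list above.
YARN_SITE = [
    ("yarn.nodemanager.aux-services", "mapreduce.shuffle"),
    ("yarn.nodemanager.aux-services.mapreduce.shuffle.class",
     "org.apache.hadoop.mapred.ShuffleHandler"),
    ("yarn.resourcemanager.scheduler.address", "$rmgr:8030"),
    ("yarn.resourcemanager.resource-tracker.address", "$rmgr:8031"),
    ("yarn.resourcemanager.address", "$rmgr:8032"),
    ("yarn.resourcemanager.admin.address", "$rmgr:8033"),
    ("yarn.resourcemanager.webapp.address", "$rmgr:8088"),
]


def render_site_xml(props, **hosts):
    """Render (name, value) pairs as a Hadoop-style *-site.xml document."""
    lines = ['<?xml version="1.0"?>', "<configuration>"]
    for name, value in props:
        # Replace placeholders such as $rmgr; unknown ones are left untouched.
        resolved = Template(value).safe_substitute(**hosts)
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(resolved)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_site_xml(YARN_SITE, rmgr="rm.example.com"))
```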

Question : Select the correct statement which applies to "Fair Scheduler"

1. Fair scheduling is a method of assigning resources to applications such that all apps get, on average, an equal share of resources over time
2. By default, the Fair Scheduler bases scheduling fairness decisions only on CPU
3. [option missing]
4. 1 and 3
5. 1, 2 and 3
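
The idea of giving every application an equal share can be illustrated with a generic max-min fair split of a single resource. This is a textbook sketch, not the Fair Scheduler's actual implementation; the function name, the single-resource model and the sample demands are all assumptions.

```python
from typing import Dict


def fair_shares(capacity: float, demands: Dict[str, float]) -> Dict[str, float]:
    """Split `capacity` so that no app gets more than it asks for and
    leftover capacity is divided equally among apps that still want more."""
    shares = {app: 0.0 for app in demands}
    remaining = float(capacity)
    active = {app for app, demand in demands.items() if demand > 0}
    while active and remaining > 1e-9:
        equal = remaining / len(active)
        # Apps whose outstanding demand fits inside an equal split.
        satisfied = {a for a in active if demands[a] - shares[a] <= equal}
        if not satisfied:
            # Everyone wants more than an equal split: give each the split.
            for a in active:
                shares[a] += equal
            break
        for a in satisfied:
            remaining -= demands[a] - shares[a]
            shares[a] = float(demands[a])
        active -= satisfied
    return shares


if __name__ == "__main__":
    print(fair_shares(12, {"a": 2, "b": 10, "c": 10}))
```

In the sample call, app `a` asks for less than an equal third, so the capacity it leaves unused is split between `b` and `c`.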