Question : When you submit a MapReduce job on the YARN framework, which of the following components is responsible for monitoring resource usage (e.g. CPU, memory, disk, network) on individual nodes?
1. Resource Manager
2. Application Master
3. Node Manager
4. NameNode
Correct Answer : 3
Explanation: In YARN, the TaskTracker is replaced by the Node Manager, a per-machine framework agent responsible for containers, monitoring their resource usage (CPU, memory, disk, network), and reporting it to the Resource Manager. The Application Master negotiates with the Resource Manager to get resources across the cluster and works with the Node Managers to execute and monitor the tasks.
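To illustrate the Application Master side of this negotiation, here is a minimal Java sketch using the public AMRMClient API (this is not the actual MRAppMaster code; the class name AmNegotiationSketch and the 1024 MB / 1 vcore request are illustrative assumptions). The Node Manager on whichever host satisfies the request is what enforces and monitors the granted limits:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AmNegotiationSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();

    // The Application Master registers itself with the Resource Manager.
    AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
    rm.init(conf);
    rm.start();
    rm.registerApplicationMaster("", 0, ""); // host, RPC port, tracking URL (empty in this sketch)

    // Negotiate resources: ask the RM for one container with 1024 MB and 1 vcore.
    Resource capability = Resource.newInstance(1024, 1);
    rm.addContainerRequest(new ContainerRequest(capability, null, null, Priority.newInstance(0)));

    // Heartbeat to the RM; allocated containers come back in the response.
    AllocateResponse response = rm.allocate(0.0f);
    for (Container container : response.getAllocatedContainers()) {
      System.out.println("Allocated on node: " + container.getNodeId());
    }
  }
}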
Explanation: Components of the MapReduce job flow: a MapReduce job on YARN involves the following components. A client node, which submits the MapReduce job. The YARN Resource Manager, which allocates cluster resources to jobs. The YARN Node Managers, which launch and monitor the tasks of jobs. The MapReduce Application Master, which coordinates the tasks running in the MapReduce job. The Application Master and the MapReduce tasks run in containers that are scheduled by the Resource Manager and managed by the Node Managers. The HDFS file system is used for sharing job files between the above entities.
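To make the client-node side concrete, here is a minimal, self-contained driver sketch in Java. It uses the identity Mapper and Reducer base classes so it compiles on its own; the class name PassThroughDriver is illustrative. The waitForCompletion() call on the client node is what kicks off the submission to the Resource Manager:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class PassThroughDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "pass-through");
    job.setJarByClass(PassThroughDriver.class);  // the job JAR that gets copied to HDFS
    job.setMapperClass(Mapper.class);            // identity mapper, keeps the sketch self-contained
    job.setReducerClass(Reducer.class);          // identity reducer
    job.setOutputKeyClass(LongWritable.class);   // default TextInputFormat keys: byte offsets
    job.setOutputValueClass(Text.class);         // default TextInputFormat values: lines
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // waitForCompletion(true) submits the job to the Resource Manager and polls/logs progress.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}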
Question : A developer has submitted a YARN job by calling the submitApplication() method on the Resource Manager. Select the correct order of the following steps after that:
1. The container is managed by the Node Manager after job submission. 2. The Resource Manager triggers its sub-component, the Scheduler, which allocates containers for MapReduce job execution. 3. The Resource Manager starts the Application Master in the container.
1. 2,3,1 2. 1,2,3 3. 2,1,3 4. 1,3,2
Correct Answer : 1
Explanation: Job start-up: The call to Job.waitForCompletion() in the main driver class is where all the execution starts. The driver is the only piece of code that runs on the local machine, and this call starts the communication with the Resource Manager. The client retrieves a new job ID (application ID) from the Resource Manager. The client node copies the job resources specified via the -files, -archives, and -libjars command-line arguments, as well as the job JAR file, onto HDFS. Finally, the job is submitted by calling the submitApplication() method on the Resource Manager. The Resource Manager triggers its sub-component, the Scheduler, which allocates containers for MapReduce job execution. The Resource Manager then starts the Application Master in the container provided by the Scheduler. This container is managed by the Node Manager from here onwards.
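The same sequence can be seen at the YARN API level. Below is a hedged Java skeleton using the YarnClient API, where createApplication() fetches the new application ID and submitApplication() hands the job to the Resource Manager. The application name, AM container size, and the placeholder "sleep 30" command are illustrative assumptions; a real submission would populate the ContainerLaunchContext with the Application Master's actual resources and classpath:

import java.util.Collections;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class YarnSubmitSketch {
  public static void main(String[] args) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();

    // Ask the Resource Manager for a new application; this yields the application ID.
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
    ApplicationId appId = ctx.getApplicationId();

    // In a real job, the JAR and -files/-archives/-libjars resources are copied to
    // HDFS here and wired into the launch context; this placeholder command stands
    // in for launching the Application Master.
    ctx.setApplicationName("yarn-submit-sketch");
    ctx.setResource(Resource.newInstance(512, 1)); // container size for the AM
    ctx.setAMContainerSpec(ContainerLaunchContext.newInstance(
        null, null, Collections.singletonList("sleep 30"), null, null, null));

    // submitApplication(): the Scheduler allocates a container, the RM starts the
    // Application Master in it, and a Node Manager manages it from then on.
    yarnClient.submitApplication(ctx);
    System.out.println("Submitted application " + appId);
  }
}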
1. Setting -Djava.library.path on the command line while launching a container 2. Use LD_LIBRARY_PATH 3. Setting -Dnative.library.path on the command line while launching a container 4. By adding the JARs to the Hadoop job JAR