Question : A developer has submitted a YARN job by calling the submitApplication() method on the Resource Manager. Select the correct order of the steps below that follow the submission.
1. The container is managed by the Node Manager after job submission. 2. The Resource Manager triggers its sub-component, the Scheduler, which allocates containers for MapReduce job execution. 3. The Resource Manager starts the Application Master in the container.
1. 2,3,1 2. 1,2,3 3. 2,1,3 4. 1,3,2
Correct Answer : 1
Explanation: Job startup: Execution starts with the call to Job.waitForCompletion() in the main driver class. The driver is the only piece of code that runs on the local machine, and this call starts the communication with the Resource Manager. The client first retrieves a new Job ID (Application ID) from the Resource Manager. The client node then copies the job resources specified via the -files, -archives, and -libjars command-line arguments, as well as the job JAR file, to HDFS. Finally, the job is submitted by calling the submitApplication() method on the Resource Manager. The Resource Manager triggers its sub-component, the Scheduler, which allocates containers for MapReduce job execution. The Resource Manager then starts the Application Master in the container provided by the Scheduler. From this point onwards, the container is managed by the Node Manager.
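The ordering in the explanation above can be sketched as a toy model. This is only an illustration of the sequence of events, not the YARN API: the class ToyResourceManager, its method names, and the identifiers "app_1" and "container_1" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for the Resource Manager; it only records events in order."""
    events: list = field(default_factory=list)

    def submit_application(self, app_id: str) -> list:
        # Step 0: the client submits the job.
        self.events.append(f"{app_id}: submitApplication() received")
        # Step 2 in the question: the Scheduler allocates a container.
        container = self._schedule(app_id)
        # Step 3 in the question: the Application Master starts in that container.
        self._start_application_master(app_id, container)
        # Step 1 in the question: the Node Manager manages the container from here on.
        self.events.append(f"{app_id}: Node Manager manages {container}")
        return self.events

    def _schedule(self, app_id: str) -> str:
        container = "container_1"  # hypothetical container id
        self.events.append(f"{app_id}: Scheduler allocated {container}")
        return container

    def _start_application_master(self, app_id: str, container: str) -> None:
        self.events.append(f"{app_id}: Application Master started in {container}")


events = ToyResourceManager().submit_application("app_1")
for event in events:
    print(event)
```

The printed order (Scheduler, then Application Master, then Node Manager) corresponds to answer 2,3,1.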
Question : Which of the following are responsibilities of the Application Master?
1. Before starting any task, create the job's output directory for the job's OutputCommitter. 2. Both map tasks and reduce tasks are created by the Application Master. 3. If the submitted job is small, the Application Master runs the job in the same JVM in which the Application Master itself is running. 4. If the job doesn't qualify as an Uber task, the Application Master requests containers for all map tasks and reduce tasks.
1. 1,2,3 2. 2,3,4 3. 1,3,4 4. 1,2,4 5. 1,2,3,4
Correct Answer : 5
Explanation: Role of an Application Master:
o Before any task starts, the job setup method is called to create the job's output directory for the job's OutputCommitter.
o Both map tasks and reduce tasks are created by the Application Master.
o If the submitted job is small, the Application Master runs the job in the same JVM in which the Application Master itself is running. For such jobs, the overhead of allocating new containers and running tasks in parallel outweighs the benefit, so the tasks run sequentially instead. These small jobs are called Uber tasks.
o Whether a job qualifies as an Uber task is decided by three configuration parameters: by default, the number of mappers must be fewer than 10 (limit of 9), the number of reducers must be at most 1, and the input size must be less than or equal to one HDFS block. These limits can be configured via the mapreduce.job.ubertask.maxmaps, mapreduce.job.ubertask.maxreduces, and mapreduce.job.ubertask.maxbytes properties in mapred-site.xml.
o If the job doesn't qualify as an Uber task, the Application Master requests containers for all map tasks and reduce tasks.
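The Uber task decision described above can be sketched as a simple threshold check. This is a minimal illustration, not Hadoop's actual code: UberLimits and qualifies_as_uber are hypothetical names, the limit values passed in below are chosen for illustration only, and the sketch covers just the three thresholds named in the explanation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UberLimits:
    """Hypothetical holder for the three thresholds named in the explanation."""
    max_maps: int      # corresponds to mapreduce.job.ubertask.maxmaps
    max_reduces: int   # corresponds to mapreduce.job.ubertask.maxreduces
    max_bytes: int     # corresponds to mapreduce.job.ubertask.maxbytes


def qualifies_as_uber(num_maps: int, num_reduces: int, input_bytes: int,
                      limits: UberLimits) -> bool:
    """Return True only if the job stays within all three limits."""
    return (num_maps <= limits.max_maps
            and num_reduces <= limits.max_reduces
            and input_bytes <= limits.max_bytes)


# Illustrative limits only: 9 maps, 1 reducer, and an assumed 128 MiB block size.
limits = UberLimits(max_maps=9, max_reduces=1, max_bytes=128 * 1024 * 1024)

small_job = qualifies_as_uber(num_maps=3, num_reduces=1,
                              input_bytes=10 * 1024 * 1024, limits=limits)
large_job = qualifies_as_uber(num_maps=50, num_reduces=4,
                              input_bytes=10 * 1024 ** 3, limits=limits)
print(small_job, large_job)
```

A job that fails any one of the checks falls through to the last bullet: the Application Master requests separate containers for its map and reduce tasks.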
Question : Which of the following steps are part of task execution?
1. Once containers are assigned to tasks, the Application Master starts the containers by notifying the corresponding Node Manager. 2. The Application Master copies job resources (such as the job JAR file) from the HDFS distributed cache and runs the map or reduce tasks. 3. The Node Manager copies job resources (such as the job JAR file) from the HDFS distributed cache and runs the map or reduce tasks. 4. Running tasks keep reporting their progress and status (including counters) to the Application Master, which collects this information from all tasks and propagates the aggregated values to the client node or user.
1. 1,2,3 2. 2,3,4 3. 3,4,1 4. 1,3,4 5. 1,2,3,4
Correct Answer : 4
Explanation: Task execution: Once containers are assigned to tasks, the Application Master starts the containers by notifying the corresponding Node Manager. The Node Manager copies job resources (such as the job JAR file) from the HDFS distributed cache and runs the map or reduce tasks. Running tasks keep reporting their progress and status (including counters) to the Application Master, which collects this information from all tasks and propagates the aggregated values to the client node or user.
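The last step, in which the Application Master aggregates per-task reports, can be sketched as follows. This is a rough illustration under stated assumptions: the report format (a dict with "progress" and "counters" keys), the function name aggregate_status, and the use of a plain average for overall progress are all hypothetical and not taken from Hadoop.

```python
from collections import Counter


def aggregate_status(task_reports: list) -> dict:
    """Combine per-task reports into one job-level summary.

    Each report is assumed to look like:
        {"progress": 0.5, "counters": {"RECORDS_READ": 100}}
    """
    totals = Counter()
    for report in task_reports:
        # Counter.update adds the values of matching counter names.
        totals.update(report["counters"])

    # Assumption: overall progress is the plain average of task progress.
    if task_reports:
        progress = sum(r["progress"] for r in task_reports) / len(task_reports)
    else:
        progress = 0.0

    return {"progress": progress, "counters": dict(totals)}


reports = [
    {"progress": 0.5, "counters": {"RECORDS_READ": 100, "RECORDS_WRITTEN": 40}},
    {"progress": 1.0, "counters": {"RECORDS_READ": 250}},
]
summary = aggregate_status(reports)
print(summary)
```

In this sketch the summary is what would be propagated to the client node or user.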
1. YARN takes into account all of the available compute resources on each machine in the cluster. 2. Based on the available resources, YARN negotiates resource requests from applications (such as MapReduce) running in the cluster. 3. YARN then provides processing capacity to each application by allocating Containers. 4. 1 and 3 5. 1,2 and 3