Question : Which of the following are responsibilities of the ApplicationMaster?
1. Before starting any task, create the job's output directory for the job's OutputCommitter.
2. Both map tasks and reduce tasks are created by the ApplicationMaster.
3. Access Mostly Uused Products by 50000+ Subscribers
4. If the job doesn't qualify as an Uber task, the ApplicationMaster requests containers for all map tasks and reduce tasks.
Explanation: Role of an ApplicationMaster:
o Before starting any task, the job setup method is called to create the job's output directory for the job's OutputCommitter.
o As noted above, both map tasks and reduce tasks are created by the ApplicationMaster.
o If the submitted job is small, the ApplicationMaster runs the job in the same JVM in which the ApplicationMaster itself is running. This reduces the overhead of creating new containers and running tasks in parallel. Such small jobs are called Uber tasks.
o Whether a job qualifies as an Uber task is decided by three configuration parameters: the number of mappers is less than or equal to 10, the number of reducers is less than or equal to 1, and the input file size is less than or equal to an HDFS block size. These thresholds can be configured via the mapreduce.job.ubertask.maxmaps, mapreduce.job.ubertask.maxreduces, and mapreduce.job.ubertask.maxbytes properties in mapred-site.xml.
o If the job doesn't qualify as an Uber task, the ApplicationMaster requests containers for all map tasks and reduce tasks.
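The uber-task thresholds described above can be set in mapred-site.xml. A minimal sketch follows; the property names are those named in the explanation, while the values shown are illustrative, not cluster defaults:

```xml
<configuration>
  <!-- Allow small jobs to run as uber tasks inside the AM's JVM. -->
  <property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
  </property>
  <!-- Maximum number of map tasks for a job to qualify. -->
  <property>
    <name>mapreduce.job.ubertask.maxmaps</name>
    <value>10</value>
  </property>
  <!-- Maximum number of reduce tasks for a job to qualify. -->
  <property>
    <name>mapreduce.job.ubertask.maxreduces</name>
    <value>1</value>
  </property>
  <!-- Maximum total input size in bytes; typically at most one HDFS block. -->
  <property>
    <name>mapreduce.job.ubertask.maxbytes</name>
    <value>134217728</value>
  </property>
</configuration>
```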
You can also Refer/Consider Advance Hadoop YARN Training by HadoopExam.com
Question : A _____ is the basic unit of processing capacity in YARN, and is an encapsulation of resource elements (memory, cpu etc.)
Correct Answer : A Container is the basic unit of processing capacity in YARN, and is an encapsulation of resource elements (memory, CPU, etc.).
Question : __________ are responsible for local monitoring of resource availability, fault reporting, and container life-cycle management (e.g., starting and killing jobs).
Explanation: The central ResourceManager runs as a standalone daemon on a dedicated machine and acts as the central authority for allocating resources to the various competing applications in the cluster. The ResourceManager has a central and global view of all cluster resources and, therefore, can provide fairness, capacity, and locality across all users. Depending on the application demand, scheduling priorities, and resource availability, the ResourceManager dynamically allocates resource containers to applications to run on particular nodes. A container is a logical bundle of resources (e.g., memory, cores) bound to a particular cluster node. To enforce and track such assignments, the ResourceManager interacts with a special system daemon running on each node called the NodeManager. Communications between the ResourceManager and NodeManagers are heartbeat based for scalability. NodeManagers are responsible for local monitoring of resource availability, fault reporting, and container life-cycle management (e.g., starting and killing jobs). The ResourceManager depends on the NodeManagers for its "global view" of the cluster.
User applications are submitted to the ResourceManager via a public protocol and go through an admission control phase during which security credentials are validated and various operational and administrative checks are performed. Those applications that are accepted pass to the scheduler and are allowed to run. Once the scheduler has enough resources to satisfy the request, the application is moved from an accepted state to a running state. Aside from internal bookkeeping, this process involves allocating a container for the ApplicationMaster and spawning it on a node in the cluster. Often called 'container 0,' the ApplicationMaster does not get any additional resources at this point and must request and release additional containers.
Question : Select the correct statement about reading/writing data in an RDBMS using MapReduce
1. In order to use DBInputFormat you need to write a class that deserializes the columns from the database record into individual data fields to work with.
2. The DBOutputFormat writes to the database by generating a set of INSERT statements in each reducer.
3. Access Mostly Uused Products by 50000+ Subscribers
4. If you want to export a very large volume of data, you may be better off generating the INSERT statements into a text file, and then using a bulk data import tool provided by your database to do the database import.
5. All of the above
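Option 4's approach can be sketched as follows: instead of letting DBOutputFormat issue per-record INSERTs, a reducer writes INSERT statements to a text file, and a bulk loader supplied by the database performs the actual import. The class and method names below are illustrative, not part of Hadoop's API:

```java
import java.util.List;

public class InsertFileExporter {
    // Build a single INSERT statement for one record. In a real job, a
    // reducer would emit one such line per record into a text file that a
    // bulk import tool later loads into the database.
    static String buildInsert(String table, List<String> cols, List<String> vals) {
        StringBuilder sb = new StringBuilder("INSERT INTO ").append(table)
            .append(" (").append(String.join(", ", cols)).append(") VALUES (");
        for (int i = 0; i < vals.size(); i++) {
            if (i > 0) sb.append(", ");
            // Naive single-quote doubling for the sketch; production code
            // should use proper escaping for the target database.
            sb.append("'").append(vals.get(i).replace("'", "''")).append("'");
        }
        return sb.append(");").toString();
    }

    public static void main(String[] args) {
        System.out.println(buildInsert("employees",
            List.of("id", "name"), List.of("1", "O'Brien")));
    }
}
```

Writing statements to a flat file keeps the MapReduce job decoupled from the database's transaction load, which is why bulk import tends to outperform per-record INSERTs for very large exports.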