Question : Which statement is true with respect to MapReduce 2.0 or YARN?
1. It is the newer version of MapReduce; using it, the performance of data processing can be increased.
2. The fundamental idea of MRv2 is to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons.
3. (option text not available in the source)
4. All of the above
5. Only 2 and 3 are correct
Ans : 5
Exp : MapReduce underwent a complete overhaul in hadoop-0.23, and the result is what is called MapReduce 2.0 (MRv2) or YARN. The fundamental idea of MRv2 is to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job in the classical sense of MapReduce jobs or a DAG of jobs.
You can also Refer/Consider Advance Hadoop YARN Training by HadoopExam.com
Question : Which statement is true about the ApplicationsManager?
1. It is responsible for accepting job submissions.
2. It negotiates the first container for executing the application-specific ApplicationMaster and provides the service for restarting the ApplicationMaster container on failure.
3. (option text not available in the source)
4. All of the above
5. 1 and 2 are correct
Ans : 5
Exp : The ApplicationsManager is responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure.
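The three responsibilities named in the explanation can be pictured with a small toy model. This is only an illustrative sketch, not the YARN API: the class `ToyApplicationsManager`, its methods, the ID formats, and the `max_am_attempts` limit are all hypothetical names chosen for this example.

```python
from itertools import count


class ToyApplicationsManager:
    """Illustrative stand-in only; none of these names come from YARN."""

    def __init__(self, max_am_attempts=2):
        # Hypothetical cap on how many times the ApplicationMaster is started.
        self.max_am_attempts = max_am_attempts
        self._ids = count(1)
        self.apps = {}

    def submit(self, name):
        """Accept a job submission and launch its first AM container."""
        app_id = f"app_{next(self._ids):04d}"
        self.apps[app_id] = {"name": name, "attempts": 0, "state": "ACCEPTED"}
        self._launch_am(app_id)
        return app_id

    def _launch_am(self, app_id):
        """Negotiate a container for the application-specific AM."""
        app = self.apps[app_id]
        app["attempts"] += 1
        app["am_container"] = f"{app_id}_attempt{app['attempts']}_c1"
        app["state"] = "RUNNING"

    def on_am_failure(self, app_id):
        """Restart the AM container if attempts remain, else fail the app."""
        app = self.apps[app_id]
        if app["attempts"] < self.max_am_attempts:
            self._launch_am(app_id)
        else:
            app["state"] = "FAILED"
        return app["state"]


manager = ToyApplicationsManager(max_am_attempts=2)
app = manager.submit("wordcount")
print(manager.apps[app]["state"])   # state after the first AM launch
print(manager.on_am_failure(app))   # state after one AM failure
print(manager.on_am_failure(app))   # state after a second AM failure
```

The model deliberately leaves out everything the ApplicationMaster does after it starts; it only shows where submission, first-container negotiation, and restart-on-failure sit relative to each other.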
Question : Which tool is used to list all the blocks of a file?
Explanation: Each slave node in a cluster configured to run MapReduce v2 (MRv2) on YARN typically runs a DataNode daemon (for HDFS functions) and NodeManager daemon (for YARN functions). The NodeManager handles communication with the ResourceManager, oversees application container lifecycles, monitors CPU and memory resource use of the containers, tracks the node health, and handles log management. It also makes available a number of auxiliary services to YARN applications.
Question : How does the Hadoop framework determine the number of Mappers required for a MapReduce job on a cluster running MapReduce v2 (MRv2) on YARN?
1. The number of Mappers is equal to the number of InputSplits calculated by the client submitting the job
2. The ApplicationMaster chooses the number based on the number of available nodes
Correct Answer : 1
Explanation : Each Mapper task processes a single InputSplit. The client calculates the InputSplits before submitting the job to the cluster. The developer may specify how the input split is calculated, with a single HDFS block being the most common split. This is true for both MapReduce v1 (MRv1) and YARN MapReduce implementations.
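To make the "one Mapper per InputSplit" rule concrete, the following sketch counts splits for a few files. It is a simplified paraphrase of the split logic in Hadoop's `FileInputFormat` (split size clamped between a minimum and a maximum, and a final split allowed to be up to 10% oversized). It assumes non-empty, splittable files; the file sizes and the 128 MB block size are made-up inputs for illustration.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Clamp the block size between the configured min and max split sizes."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, split_size):
    """Count the splits for one non-empty, splittable file."""
    splits = 0
    remaining = file_size
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # the tail (possibly slightly oversized) is one more split
    return splits


def count_mappers(file_sizes, block_size):
    """One mapper per split, summed over all input files."""
    split_size = compute_split_size(block_size)
    return sum(count_splits(size, split_size) for size in file_sizes)


# Assumed example input: three files and a 128 MB block size.
files = [300 * MB, 130 * MB, 1024 * MB]
print(count_mappers(files, 128 * MB))
```

With these assumed inputs, the 300 MB file yields 3 splits, the 130 MB file yields 1 (its tail is within the 10% slop), and the 1024 MB file yields 8, so the job would start 12 mappers.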
With YARN, each mapper runs in a container, which consists of a specific amount of CPU and memory resources. The ApplicationMaster requests a container for each mapper. The ResourceManager schedules the resources and informs the ApplicationMaster of the available NodeManagers where the container may be launched.
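The request-and-placement flow described above can be sketched as a toy allocator. This is an assumption-laden illustration, not how YARN's schedulers actually decide: it uses plain first-fit placement, and `Node`, `allocate`, and the node capacities are hypothetical names and numbers.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A NodeManager's free capacity (hypothetical, simplified)."""
    name: str
    free_mem_mb: int
    free_vcores: int


def allocate(nodes, num_mappers, mem_mb, vcores):
    """Place one container per mapper using first-fit.

    Returns (assignments, pending): a list of (mapper_id, node_name)
    pairs and the number of requests that could not be placed yet.
    """
    assignments = []
    for mapper_id in range(num_mappers):
        for node in nodes:
            if node.free_mem_mb >= mem_mb and node.free_vcores >= vcores:
                node.free_mem_mb -= mem_mb
                node.free_vcores -= vcores
                assignments.append((mapper_id, node.name))
                break
    pending = num_mappers - len(assignments)
    return assignments, pending


# Assumed cluster: two NodeManagers with different free capacity.
cluster = [Node("nm1", 4096, 4), Node("nm2", 2048, 2)]
placed, waiting = allocate(cluster, num_mappers=7, mem_mb=1024, vcores=1)
print(placed)
print(waiting)
```

In this example the cluster has room for six 1024 MB, 1-vcore containers, so one of the seven mapper requests stays pending until resources are released.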
With MRv1, each TaskTracker (slave node) is configured to handle a maximum number of concurrent map tasks. The JobTracker (master node) assigns a TaskTracker a specific InputSplit to process as a single map task.
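Because each TaskTracker has a fixed number of map slots, a rough way to reason about an MRv1 job is to count how many "waves" of map tasks it needs. The sketch below is a back-of-the-envelope estimate only: it assumes every slot is free and all map tasks take about the same time, and the function name and the numbers are hypothetical.

```python
import math


def map_waves(num_splits, num_tasktrackers, map_slots_per_tracker):
    """Estimate how many rounds of map tasks are needed to cover all splits."""
    total_slots = num_tasktrackers * map_slots_per_tracker
    return math.ceil(num_splits / total_slots)


# Assumed example: 1000 InputSplits, 10 TaskTrackers, 8 map slots each.
print(map_waves(1000, 10, 8))
```

With 80 concurrent map slots, 1000 splits need 13 waves under these assumptions; the last wave runs only 40 tasks.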
Question : You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn't optimized for storing and processing many small files, you decide to do the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop streaming
Which data serialization system gives you the flexibility to do this?
A. CSV
B. XML
C. HTML
D. Avro
E. Sequence Files
F. JSON
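The scale of the small-files problem in this question can be checked with some quick arithmetic. The sketch below assumes 1 KB = 1024 bytes and a 128 MB target size per grouped file (neither is stated in the question), and it ignores any per-record overhead a container format would add.

```python
import math

KB = 1024
MB = 1024 * KB

num_images = 60_000_000
image_size = 25 * KB       # approximate size per image, from the question
target_file = 128 * MB     # assumed size of each grouped file

total_bytes = num_images * image_size
images_per_file = target_file // image_size
grouped_files = math.ceil(num_images / images_per_file)

print(total_bytes)       # total raw image data in bytes
print(images_per_file)   # whole images that fit in one grouped file
print(grouped_files)     # grouped files needed to hold every image
```

Under these assumptions the data set is about 1.5 TB, each grouped file holds 5,242 images, and roughly 11,447 grouped files replace 60,000,000 small ones, which is the reduction the first step in the question is after.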