Question : In the reducer, the MapReduce API provides you with an iterator over Writable values. What does calling the next() method return?
1. It returns a reference to a different Writable object each time.
2. It returns a reference to a Writable object from an object pool.
3. It returns a reference to the same Writable object each time, but populated with different data.
4. It returns a reference to a Writable object. The API leaves unspecified whether this is a reused object or a new object.
5. It returns a reference to the same Writable object if the next value is the same as the previous value, or a new Writable object otherwise.
Correct Answer : 3. Explanation: Calling Iterator.next() always returns the same Writable instance (e.g., the same IntWritable object), with the contents of that instance replaced by the next value. The framework reuses the object to avoid allocating a new one per value, so any value that must be retained beyond the current iteration has to be copied.
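To make the reuse behavior concrete, here is a minimal sketch of a reducer; the class name and the idea of buffering values into lists are illustrative assumptions, not from the source. Storing the iterated object directly aliases one reused instance, while copying its contents is the safe pattern:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer showing the object-reuse pitfall: the framework
// hands back the SAME IntWritable instance on every iteration, with its
// contents overwritten by each successive value.
public class CollectValuesReducer
    extends Reducer<Text, IntWritable, Text, IntWritable> {

  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    List<IntWritable> buggy = new ArrayList<>();
    List<IntWritable> correct = new ArrayList<>();
    for (IntWritable value : values) {
      buggy.add(value);                           // WRONG: every element aliases the one reused object
      correct.add(new IntWritable(value.get()));  // RIGHT: defensive copy of the current contents
    }
    // After the loop, every entry in "buggy" reports the LAST value seen,
    // while "correct" preserves each distinct value.
    context.write(key, new IntWritable(correct.size()));
  }
}
```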
Question : MapReduce v2 (MRv2/YARN) splits which major functions of the JobTracker into separate daemons? Select two.
A. Health status checks (heartbeats)
B. Resource management
C. Job scheduling/monitoring
D. Job coordination between the ResourceManager and NodeManager
E. Launching tasks
F. Managing file system metadata
G. MapReduce metric reporting
H. Managing tasks
1. B, C
2. A, D
3. …
4. C, H
5. B, G
Correct Answer : 1 (B, C). Explanation: The fundamental idea of MRv2 is to split the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job in the classical MapReduce sense or a DAG of jobs.
The central goal of YARN is to cleanly separate two concerns that are entangled in the JobTracker:
- Monitoring the status of the cluster with respect to which nodes have which resources available. Under YARN, this is handled globally by the ResourceManager.
- Managing the parallel execution of any specific job. Under YARN, this is handled separately for each job by its ApplicationMaster.
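As a small illustration of the split, the sketch below uses the YarnClient API to ask the global ResourceManager for cluster-wide node reports; per-job scheduling would live in each job's ApplicationMaster instead. The class name and the bare-bones setup are assumptions for the example:

```java
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Hypothetical client: queries the ResourceManager, which under YARN is the
// single daemon that tracks which nodes have which resources available.
public class ClusterNodes {
  public static void main(String[] args) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();
    // The RM owns cluster-wide resource state; scheduling/monitoring of a
    // specific job is delegated to that job's ApplicationMaster.
    for (NodeReport node : yarnClient.getNodeReports()) {
      System.out.println(node.getNodeId() + " capability=" + node.getCapability());
    }
    yarnClient.stop();
  }
}
```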
Question : For each input key-value pair, mappers can emit:
1. As many intermediate key-value pairs as designed. There are no restrictions on the types of those key-value pairs (i.e., they can be heterogeneous).
2. As many intermediate key-value pairs as designed, but they cannot be of the same type as the input key-value pair.
3. …
4. One intermediate key-value pair, but of the same type.
5. As many intermediate key-value pairs as designed, as long as all the keys have the same type and all the values have the same type.
Correct Answer : 5. Explanation: A Mapper maps input key/value pairs to a set of intermediate key/value pairs; maps are the individual tasks that transform input records into intermediate records. The intermediate records need not be of the same type as the input records, but all intermediate keys must share one declared type and all intermediate values another. A given input pair may map to zero or many output pairs.
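A minimal sketch of this contract, using a hypothetical TokenCountMapper: the input pair is (LongWritable, Text), the output pairs are (Text, IntWritable), and a single input line may yield zero, one, or many outputs, all with the declared output types:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// The class signature fixes the intermediate types: every emitted key is a
// Text and every emitted value is an IntWritable, though neither matches
// the input types (LongWritable, Text).
public class TokenCountMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {

  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    StringTokenizer tokens = new StringTokenizer(line.toString());
    while (tokens.hasMoreTokens()) {  // an empty line emits zero pairs
      word.set(tokens.nextToken());
      context.write(word, ONE);       // one intermediate pair per token
    }
  }
}
```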
Question : What are map files and why are they important?
1. Map files are stored on the NameNode and capture the metadata for all blocks on a particular rack. This is how Hadoop is "rack aware."
2. Map files are the files that show how the data is distributed in the Hadoop cluster.
3. …
4. Map files are sorted SequenceFiles that also have an index. The index allows fast lookup of data by key.
Correct Answer : 4. Explanation: A MapFile is a sorted SequenceFile accompanied by an index file; the index is held in memory and lets a reader seek close to the requested key instead of scanning the entire file.
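To illustrate, here is a small sketch using Hadoop's MapFile API; the path, key/value choices, and class name are assumptions for the example. The writer requires keys to be appended in sorted order and maintains the index; the reader uses that index for fast keyed lookups:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path dir = new Path("demo.map"); // hypothetical output directory

    // Keys must be appended in sorted order; the Writer builds the index
    // file alongside the sorted data file.
    try (MapFile.Writer writer = new MapFile.Writer(conf, dir,
        MapFile.Writer.keyClass(IntWritable.class),
        MapFile.Writer.valueClass(Text.class))) {
      for (int i = 0; i < 100; i++) {
        writer.append(new IntWritable(i), new Text("value-" + i));
      }
    }

    // The in-memory index lets get() seek near the key instead of
    // scanning the whole file; get() returns null if the key is absent.
    try (MapFile.Reader reader = new MapFile.Reader(dir, conf)) {
      Text value = new Text();
      reader.get(new IntWritable(42), value);
      System.out.println(value); // prints "value-42"
    }
  }
}
```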