Question : You have the following key-value pairs as output from your Map task: (the, 1) (fox, 1) (faster, 1) (than, 1) (the, 1) (dog, 1) How many keys will be passed to the Reducer's reduce method?
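Correct Answer : Five. Explanation: The reduce method is called once per distinct key. The map output contains five distinct keys (the, fox, faster, than, dog); the two (the, 1) pairs are grouped into a single call for the key "the" with the values [1, 1].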
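The grouping step described above can be sketched in plain Python. This is only an illustration of the idea, not Hadoop code; the helper name `group_by_key` is hypothetical.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect all values that share a key, roughly what the shuffle
    does before the reduce method is called (illustrative helper)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# Map output taken from the question.
map_output = [("the", 1), ("fox", 1), ("faster", 1),
              ("than", 1), ("the", 1), ("dog", 1)]

grouped = group_by_key(map_output)

# reduce is invoked once per distinct key.
print(len(grouped))     # 5
print(grouped["the"])   # [1, 1]
```

Each entry of `grouped` corresponds to one call of reduce, with the key and the list of its values.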
Question : You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt, .third.txt and #data.txt. How many files will be processed by the FileInputFormat.setInputPaths() method when it is given a Path object representing this directory?
1. Four, all files will be processed 2. Three, the pound sign is an invalid character for HDFS file names 3. [option text missing] 4. None, the directory cannot be named jobdata 5. One, no special characters can prefix the name of an input file
Correct Answer : Two. Explanation: FileInputFormat skips files whose names start with '_' or '.', treating them as hidden, much like Unix files starting with '.'. The '#' character is allowed in HDFS file names, so only second.txt and #data.txt are processed.
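The filtering rule from the explanation can be mimicked in a few lines of Python. This is a sketch of the rule only, not the Hadoop implementation; `is_visible` is a hypothetical name.

```python
def is_visible(name):
    """Return True unless the file name starts with '_' or '.'
    (the hidden-file rule described in the explanation)."""
    return not name.startswith(("_", "."))


# Directory contents taken from the question.
files = ["_first.txt", "second.txt", ".third.txt", "#data.txt"]

processed = [f for f in files if is_visible(f)]
print(processed)   # ['second.txt', '#data.txt']
```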
Question : On a cluster running MapReduce v1 (MRv1), a TaskTracker heartbeats into the JobTracker on your cluster and alerts the JobTracker that it has an open map task slot. What determines how the JobTracker assigns each map task to a TaskTracker?
1. The amount of RAM installed on the TaskTracker node. 2. The amount of free disk space on the TaskTracker node. 3. [option text missing] 4. The average system load on the TaskTracker node over the past fifteen (15) minutes. 5. The location of the InputSplit to be processed in relation to the location of the node.
Correct Answer : 5. Explanation: The TaskTrackers send heartbeat messages to the JobTracker, usually every few seconds, to reassure the JobTracker that they are still alive. These messages also report the number of available slots, so the JobTracker stays up to date on where in the cluster work can be delegated. When the JobTracker schedules a map task, it first looks for an empty slot on the same server that hosts the DataNode containing the data; if none is available, it looks for an empty slot on a machine in the same rack.
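The locality preference described above can be sketched as a small Python function, seen from the moment a TaskTracker reports a free slot. This is a simplified illustration, not the JobTracker's actual scheduler: the names `pick_task`, `pending` and `rack_of` are hypothetical, and the final off-rack fallback is an assumption added so the function always returns a task when one is pending.

```python
def pick_task(tracker_host, pending, rack_of):
    """Choose a pending map task for a tracker that reported a free slot.

    pending: list of (task_id, hosts_holding_the_split) tuples.
    rack_of: dict mapping host name -> rack name.
    Returns (task_id, locality), or (None, None) if nothing is pending.
    """
    # 1. Node-local: the split is stored on the tracker's own host.
    for task_id, hosts in pending:
        if tracker_host in hosts:
            return task_id, "node-local"

    # 2. Rack-local: the split is stored on another host in the same rack.
    tracker_rack = rack_of[tracker_host]
    for task_id, hosts in pending:
        if any(rack_of[h] == tracker_rack for h in hosts):
            return task_id, "rack-local"

    # 3. Assumed fallback: take any remaining task.
    if pending:
        return pending[0][0], "off-rack"
    return None, None


# Hypothetical three-node cluster spread over two racks.
rack_of = {"node1": "rackA", "node2": "rackA", "node3": "rackB"}
pending = [("map_0", ["node3"]), ("map_1", ["node2"]), ("map_2", ["node1"])]

print(pick_task("node1", pending, rack_of))   # ('map_2', 'node-local')
```

With `map_2` removed from `pending`, the same call would pick `map_1` as rack-local, since node2 shares rackA with node1.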