Question : You have the following sample Mapper class and its map() method.
public class ProjectionMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
  private Text word = new Text();
  private LongWritable count = new LongWritable();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Split the tab-separated input line into fields.
    String[] split = value.toString().split("\t+");
    word.set(split[0]);
    if (split.length > 2) {
      try {
        count.set(Long.parseLong(split[2]));
        context.write(word, count);
      } catch (NumberFormatException e) {
        // Ignore lines whose occurrence field is not a number.
      }
    }
  }
}

Now, select the correct statement based on the above code.
1. The four type arguments to the Mapper class in angle brackets are for the input key and value as well as the output key and value.
Correct Answer : 1 : Explanation: The map() method is called once per input record, so it pays to avoid unnecessary object creation. The body of map() is straightforward: it splits the tab-separated input line into fields, uses the first field as the word, and the third as the count. The map output is written using the write() method on Context. For simplicity, this code ignores lines with an occurrence field that is not a number, but there are other actions you could take, such as incrementing a MapReduce counter to track how many lines it affects (see the getCounter() method on Context for details).
The map() method is called for each record of the input split, and each call emits a key-value pair. Sorting, shuffling, merging, and partitioning are then the Hadoop framework's responsibility.
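As a hedged sketch of the counter suggestion above (the class name CountingProjectionMapper and the enum ParseErrors are hypothetical, not part of the original code), a variant of the mapper could count malformed lines instead of silently dropping them:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CountingProjectionMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
  // Hypothetical counter enum; the enum constant just names the counter.
  public enum ParseErrors { MALFORMED_COUNT }

  private Text word = new Text();
  private LongWritable count = new LongWritable();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String[] split = value.toString().split("\t+");
    word.set(split[0]);
    if (split.length > 2) {
      try {
        count.set(Long.parseLong(split[2]));
        context.write(word, count);
      } catch (NumberFormatException e) {
        // getCounter() is the standard Context method mentioned in the
        // explanation; the counter appears in the job's counter report.
        context.getCounter(ParseErrors.MALFORMED_COUNT).increment(1);
      }
    }
  }
}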
Question : We have a reducer class example as below.
public class LongSumReducer<KEY> extends Reducer<KEY, LongWritable, KEY, LongWritable> {
  private LongWritable result = new LongWritable();

  public void reduce(KEY key, Iterable<LongWritable> values, Context context)
      throws IOException, InterruptedException {
    // Sum all values grouped under this key.
    long sum = 0;
    for (LongWritable val : values) {
      sum += val.get();
    }
    result.set(sum);
    // Emit the input key together with the summed value.
    context.write(key, result);
  }
}
Select the correct option:
1. The reduce method emits the final results as key and value. Both will be saved on HDFS.
2. The reduce method emits only the final result value, which will be saved on HDFS.
Correct Answer : 1 : Explanation: The reduce() method signature differs from map()'s because it has an iterator over the values, rather than a single value. This reflects the grouping that the framework performs on the values for a key. In LongSumReducer, the implementation is very simple: it sums the values, then writes the total out using the same key as the input.
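To see where the reducer's key-value output lands on HDFS, here is a minimal driver sketch wiring the two classes above into a job. The class name WordCountDriver and the command-line paths are hypothetical; the Job calls are the standard org.apache.hadoop.mapreduce API.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "projection-sum");
    job.setJarByClass(WordCountDriver.class);
    job.setMapperClass(ProjectionMapper.class);
    job.setReducerClass(LongSumReducer.class);
    // Output key and value types of the reduce phase.
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    // Paths are hypothetical; everything the reducer passes to
    // context.write() is persisted as part-r-* files under the
    // output path on HDFS.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}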