The ChainMapper class allows multiple Mapper classes to be used within a single Map task. The Mappers are invoked in a chained (or piped) fashion: the output of the first becomes the input of the second, and so on until the last Mapper, whose output is written to the task's output.
The ChainReducer class allows multiple Mapper classes to be chained after a Reducer within the Reduce task.
For each record output by the Reducer, the Mappers are invoked in a chained (or piped) fashion: the output of the first becomes the input of the second, and so on until the last Mapper, whose output is written to the task's output.
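The chaining described above can be sketched outside Hadoop as plain function composition: records are piped through the pre-mappers, grouped and reduced, and the reducer's output is piped through the post-mappers. This is a conceptual simulation, not the Hadoop API; all function names here are illustrative.

```python
# Conceptual simulation of ChainMapper/ChainReducer (not the Hadoop API).
from itertools import groupby
from operator import itemgetter

def run_chain(records, pre_mappers, reducer, post_mappers):
    # Pipe each record through the pre-mappers, as ChainMapper does.
    pairs = []
    for rec in records:
        out = [rec]
        for m in pre_mappers:
            out = [r for x in out for r in m(x)]
        pairs.extend(out)
    # Group by key, reduce, then pipe reducer output through the
    # post-mappers, as ChainReducer does.
    pairs.sort(key=itemgetter(0))
    results = []
    for key, group in groupby(pairs, key=itemgetter(0)):
        out = [reducer(key, [v for _, v in group])]
        for m in post_mappers:
            out = [r for x in out for r in m(x)]
        results.extend(out)
    return results

# Word count with a lower-casing pre-mapper and a formatting post-mapper.
tokenize = lambda line: [(w, 1) for w in line.split()]
lower = lambda kv: [(kv[0].lower(), kv[1])]
total = lambda key, values: (key, sum(values))
fmt = lambda kv: [f"{kv[0]}\t{kv[1]}"]

print(run_chain(["Hadoop hadoop MapReduce"], [tokenize, lower], total, [fmt]))
```

Because the whole pipeline runs inside one job, no intermediate files are materialized between the stages.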
Note: Running all the pre- and post-processing in a single job leaves no intermediate files and dramatically reduces I/O.
Question:
Is data joining (like an RDBMS join) possible in Hadoop MapReduce? 1. Yes 2. No
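Joins are indeed possible in MapReduce; a standard pattern is the reduce-side join, where each mapper tags its records with the source table and the reducer pairs rows that share a join key. A minimal sketch of the idea (table names and fields are illustrative):

```python
# Reduce-side join sketch: tag records by source, group by join key,
# then cross the rows from the two sources in the reduce phase.
from collections import defaultdict

users = [(1, "alice"), (2, "bob")]              # (user_id, name)
orders = [(1, "book"), (1, "pen"), (2, "mug")]  # (user_id, item)

# Map phase: emit (join_key, (source_tag, payload)) from each input.
shuffled = defaultdict(list)
for uid, name in users:
    shuffled[uid].append(("U", name))
for uid, item in orders:
    shuffled[uid].append(("O", item))

# Reduce phase: for each key, pair every user row with every order row.
joined = []
for uid, tagged in sorted(shuffled.items()):
    names = [v for tag, v in tagged if tag == "U"]
    items = [v for tag, v in tagged if tag == "O"]
    joined.extend((uid, n, i) for n in names for i in items)

print(joined)  # [(1, 'alice', 'book'), (1, 'alice', 'pen'), (2, 'bob', 'mug')]
```

The shuffle phase plays the role of the grouping dictionary here: all records with the same key reach the same reducer.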
1. Hive is a part of the Apache Hadoop project that provides a SQL-like interface for data processing 2. Hive is a component of the Hadoop framework that allows data to be collected together into an external repository 3. Access Mostly Uused Products by 50000+ Subscribers 4. Hive is part of the Apache Hadoop project that enables in-memory analysis of real-time streams of data
1. The Hadoop administrator has to set the number of reducer slots to zero on all slave nodes. This will disable the reduce step. 2. It is impossible to disable the reduce step, since it is a critical part of the Map-Reduce abstraction. 3. Access Mostly Uused Products by 50000+ Subscribers 4. While you cannot completely disable reducers, you can set the output to one. There needs to be at least one reduce step in the Map-Reduce abstraction.
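For context: in practice a developer runs a map-only job by setting the number of reduce tasks to zero on the job (`job.setNumReduceTasks(0)` in the Java API), which skips the shuffle/sort entirely and writes mapper output directly. A conceptual sketch of that behavior (function names here are illustrative, not Hadoop's):

```python
# Sketch of a map-only job: with zero reduce tasks the framework skips
# the shuffle/sort and emits mapper output directly, in input order.
def run_job(records, mapper, num_reduce_tasks):
    mapped = [pair for rec in records for pair in mapper(rec)]
    if num_reduce_tasks == 0:
        return mapped          # map-only: no sort, no reduce phase
    return sorted(mapped)      # otherwise output is shuffled/sorted by key

upper = lambda line: [(line.upper(), 1)]
print(run_job(["b", "a"], upper, num_reduce_tasks=0))  # [('B', 1), ('A', 1)]
```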
1. The default input format is XML. The developer can specify other input formats as appropriate if XML is not the correct input. 2. There is no default input format; the input format always has to be specified. 3. Access Mostly Uused Products by 50000+ Subscribers 4. The default input format is TextInputFormat, with the byte offset as the key and the entire line as the value.
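The behavior of TextInputFormat's key/value scheme can be sketched as follows; this is a simulation of the concept, not the Hadoop class itself:

```python
# TextInputFormat sketch: each record's key is the byte offset of the
# line within the file, and the value is the line text without its
# trailing newline.
def text_input_format(data: bytes):
    offset = 0
    for raw in data.splitlines(keepends=True):
        yield (offset, raw.rstrip(b"\n").decode())
        offset += len(raw)

print(list(text_input_format(b"hello\nworld\n")))  # [(0, 'hello'), (6, 'world')]
```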
1. In order to override the default input format, the Hadoop administrator has to change the default settings in the config file. 2. In order to override the default input format, a developer has to set the new input format on the job config before submitting the job to the cluster. 3. Access Mostly Uused Products by 50000+ Subscribers 4. None of these answers are correct. Solution: 21