Question : You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each one of these characters, you will emit the character as a key and an IntWritable as the value. As this will produce proportionally more intermediate data than input data, which two resources should you expect to be bottlenecks?
1. Processor and network I/O
2. Disk I/O and network I/O
3. Processor and RAM
4. Processor and disk I/O
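Correct Answer : 2 Explanation: Map output is written to the local disk of each mapper node and then copied over the network to the reducers during the shuffle. A job that emits more intermediate data than it reads therefore stresses disk I/O and network I/O first.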
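To make the "proportionally more intermediate data" point concrete, here is a minimal plain-Python sketch of the map step described in the question. It is not the Hadoop API: `map_line` and `intermediate_ratio` are hypothetical helper names, and the size estimate assumes each key serializes as its UTF-8 bytes and each value as a 4-byte integer, ignoring any record framing overhead.

```python
def map_line(line):
    """Emit one (character, 1) pair per character of the input line."""
    for ch in line:
        yield (ch, 1)


def intermediate_ratio(lines, value_bytes=4):
    """Return (input_bytes, estimated_intermediate_bytes) for the given lines.

    Assumption: a key costs its UTF-8 length and a value costs `value_bytes`;
    real serialization adds framing overhead that is ignored here.
    """
    input_bytes = sum(len(line.encode("utf-8")) for line in lines)
    intermediate_bytes = 0
    for line in lines:
        for key, _value in map_line(line):
            intermediate_bytes += len(key.encode("utf-8")) + value_bytes
    return input_bytes, intermediate_bytes


if __name__ == "__main__":
    in_bytes, out_bytes = intermediate_ratio(["abc", "de"])
    print(in_bytes, out_bytes)
```

Under these assumptions every input byte of ASCII text becomes roughly five bytes of map output, all of which has to be spilled to disk and shuffled to the reducers.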
Question : You use the hadoop fs -put command to write a MB file using an HDFS block size of MB . Just after this command has finished writing MB of this file, what would another user see when trying to access this file?
1. They would see Hadoop throw a ConcurrentFileAccessException when they try to access this file.
2. They would see the current state of the file, up to the last bit written by the command.
3. They would see the current state of the file through the last completed block.
4. They would see no content until the whole file is written and closed.
Correct Answer : 3 Explanation:
Question : Which statement is true?
1. Output of the reducer could be zero
2. Output of the reducer is written to the HDFS
3. In practice, the reducer usually emits a single key-value pair for each input key
4. All of the above
Correct Answer : 4
Explanation: A reducer can emit zero or more final key-value pairs, and whatever it emits is written to HDFS.
In practice, the reducer usually emits a single key-value pair for each input key.
Refer to HadoopExam.com Recorded Training Module : 1 and 3
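The reducer behaviour in the explanation can be illustrated with a small plain-Python sketch. It only simulates the grouping and reduce steps in memory and does not use the Hadoop API or write to HDFS; `shuffle`, `reduce_counts`, `run_reduce`, and the `min_count` threshold are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def shuffle(pairs):
    """Group values by key and return the groups in sorted key order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_counts(key, values, min_count=1):
    """Emit at most one (key, total) pair for this key.

    With the default threshold every key yields exactly one pair;
    a higher `min_count` shows that a reducer may also emit nothing.
    """
    total = sum(values)
    if total >= min_count:
        yield (key, total)


def run_reduce(pairs, min_count=1):
    """Apply reduce_counts to every key group and collect the output."""
    output = []
    for key, values in shuffle(pairs).items():
        output.extend(reduce_counts(key, values, min_count))
    return output


if __name__ == "__main__":
    pairs = [("a", 1), ("b", 1), ("a", 1)]
    print(run_reduce(pairs))
    print(run_reduce(pairs, min_count=3))
```

With the default threshold each input key produces a single output pair, which is the common case the explanation mentions; raising `min_count` above every total makes the reducer produce no output at all, which is also valid.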