Correct Answer :
Explanation:
1. Filesystem counters
Filesystem counters are used to analyze experimental results. The typical built-in filesystem counters are:
Local file system: FILE_BYTES_READ, FILE_BYTES_WRITTEN
HDFS: HDFS_BYTES_READ, HDFS_BYTES_WRITTEN
FILE_BYTES_READ is the number of bytes read from the local file system. Assuming all the map input data comes from HDFS, FILE_BYTES_READ should be zero in the map phase. The reducers' input, on the other hand, is data on the reduce-side local disks that was fetched from the map-side disks. Therefore, FILE_BYTES_READ denotes the total bytes read by reducers.
FILE_BYTES_WRITTEN consists of two parts. The first part comes from mappers. All the mappers will spill intermediate output to disk. All the bytes that mappers write to disk will be included in FILE_BYTES_WRITTEN. The second part comes from reducers. In the shuffle phase, all the reducers will fetch intermediate data from mappers and merge and spill to reducer-side disks. All the bytes that reducers write to disk will also be included in FILE_BYTES_WRITTEN.
HDFS_BYTES_READ denotes the bytes read by mappers from HDFS when the job starts. This data includes not only the contents of the source files but also metadata about the splits.
HDFS_BYTES_WRITTEN denotes the bytes written to HDFS. It is the number of bytes of the final output.
Note that because HDFS and the local file system are different file systems, the data counted for one never overlaps with the data counted for the other.
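The accounting described above can be illustrated with a small sketch. It is a simplified model, not Hadoop code: the TaskIO class, the job_counters function and all byte figures are hypothetical and invented for illustration. It assumes, as the explanation does, that mappers read only from HDFS and that only reducers write the final output to HDFS.

```python
from dataclasses import dataclass


@dataclass
class TaskIO:
    """Bytes moved by one task, split by file system (hypothetical model)."""
    hdfs_read: int = 0      # input split data and split metadata read from HDFS
    hdfs_written: int = 0   # final output written to HDFS
    local_read: int = 0     # bytes read from the task's local disk
    local_written: int = 0  # bytes spilled or merged to the task's local disk


def job_counters(map_tasks, reduce_tasks):
    """Sum per-task I/O into the four job-level filesystem counters."""
    tasks = list(map_tasks) + list(reduce_tasks)
    return {
        "FILE_BYTES_READ": sum(t.local_read for t in tasks),
        "FILE_BYTES_WRITTEN": sum(t.local_written for t in tasks),
        "HDFS_BYTES_READ": sum(t.hdfs_read for t in tasks),
        "HDFS_BYTES_WRITTEN": sum(t.hdfs_written for t in tasks),
    }


# Invented figures: two mappers spill to local disk, one reducer fetches,
# merges and then writes the final output to HDFS.
mappers = [
    TaskIO(hdfs_read=1000, local_written=400),
    TaskIO(hdfs_read=1200, local_written=500),
]
reducers = [
    TaskIO(local_read=900, local_written=900, hdfs_written=300),
]

counters = job_counters(mappers, reducers)
for name, value in counters.items():
    print(f"{name} = {value}")
```

In this model FILE_BYTES_WRITTEN is the sum of the mapper spills and the reducer-side merge output, while each HDFS counter only ever receives bytes from one side of the job, which mirrors the point that the local and HDFS figures do not overlap.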
Question : Label-based scheduling helps us override the default scheduling algorithm and run tasks on specific nodes. 1. True 2. False
Question : When running an MRUnit test, you provide the input key and value as well as the expected output. What happens if the actual output does not match the expected output?
1. The test case will fail and the driver will throw an exception
2. The test case will fail and the driver will not throw an exception