1. Yes, there is a special data type that will suppress job output 2. No, a MapReduce job will always generate output 4. Yes, but only during map execution when reducers have been set to zero
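Explanation: There are a number of scenarios where no output is required from the reducers. For instance, a web-crawling or image-processing job may do all of its work as side effects, such as fetching pages or writing results to an external system, and have nothing to emit as job output.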
Explanation: Map files are just a variation of sequence files. They store data in sorted order, keyed so that individual records can be looked up.
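To make the "sorted order" point concrete, here is a minimal in-memory sketch of why sorting helps lookups. It is not the Hadoop MapFile API: the class name `ToyMapFile`, the `index_interval` parameter, and the `get` method are hypothetical names chosen for illustration. The idea is that a sparse index over sorted keys lets a lookup jump close to the record and then scan only a short run.

```python
import bisect


class ToyMapFile:
    """Hypothetical in-memory stand-in for a sorted key/value file with a sparse index."""

    def __init__(self, records, index_interval=4):
        # Records are kept sorted by key, as a map file requires.
        self.data = sorted(records)
        # Sparse index: remember every index_interval-th key and its position.
        self.index_keys = [k for k, _ in self.data[::index_interval]]
        self.index_pos = list(range(0, len(self.data), index_interval))

    def get(self, key):
        # Find the last indexed key <= key, then scan forward from that position.
        i = bisect.bisect_right(self.index_keys, key) - 1
        if i < 0:
            return None  # key sorts before the first record
        for k, v in self.data[self.index_pos[i]:]:
            if k == key:
                return v
            if k > key:
                break  # sorted order lets the scan stop early
        return None
```

A plain sequence file would need a full scan for the same lookup, which is the practical difference the explanation is pointing at.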
Question : What is the most important feature of MapReduce?
1. Ability to store large amounts of data 2. Ability to process data on a cluster of machines without copying all the data over 4. Ability to process large amounts of data in parallel
Explanation: What sets the Hadoop framework apart is that many machines process the data in parallel, and the data is already available to them in the distributed file system, so it does not have to be copied to a central machine first.
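The parallel-processing idea can be sketched in a few lines. This is a toy word count, not Hadoop code: the function names `map_split` and `word_count` are hypothetical, and local threads stand in for cluster machines. Each input split is mapped independently, then the partial results are merged in a reduce step.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_split(lines):
    """Map phase: count the words in one input split."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def word_count(splits, workers=4):
    # Splits are independent of each other, so the map calls can run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_split, splits))
    # Reduce phase: merge the per-split counts into a single result.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)
```

For example, `word_count([["a b a"], ["b c"]])` counts each split separately and then combines them. In a real cluster the splits would be blocks already stored on the worker nodes, which is why the data does not need to be moved before processing.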
Question : In which of the following scenarios should we use HBase?
1. If it requires random reads, writes, or both 2. If it requires many thousands of operations per second on multiple TB of data 3. If the access pattern is well known and simple 4. All of the above
1. You only append to your dataset, and tend to read the whole thing 2. For ad-hoc analytics 3. If data volume is quite small 4. All of the above 5. None of the above