Question : Rows from HBase can be used directly as input to a MapReduce job
1. True 2. False
Correct Answer : 1
Explanation: A MapReduce (MR) job can run over data that is stored in HBase, reading table rows directly as its input.
Refer HadoopExam.com Recorded Training Module : 18
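To make the idea of "rows as MapReduce input" concrete, here is a minimal in-memory sketch in Python. It is not the HBase API: the `rows` dictionary, the `info:city` column and the function names are all hypothetical stand-ins. A real job would normally be written against HBase's Java MapReduce integration (for example the `TableInputFormat` class in `org.apache.hadoop.hbase.mapreduce`), which hands each table row to the mapper in a similar key/value shape.

```python
from collections import defaultdict

# Hypothetical stand-in for an HBase table: row key -> {"family:qualifier": value}.
rows = {
    "user1": {"info:city": "Pune", "info:name": "Asha"},
    "user2": {"info:city": "Delhi", "info:name": "Ravi"},
    "user3": {"info:city": "Pune", "info:name": "Meera"},
}


def mapper(row_key, columns):
    """Map step: receives one table row and emits (city, 1) if the row has a city."""
    city = columns.get("info:city")
    if city is not None:
        yield city, 1


def reducer(key, values):
    """Reduce step: sums the counts emitted for one key."""
    return key, sum(values)


def run_job(table):
    """Feed every row to the mapper, group the output by key, then reduce."""
    grouped = defaultdict(list)
    # Visit rows in row-key order, as an HBase scan would return them.
    for row_key in sorted(table):
        for key, value in mapper(row_key, table[row_key]):
            grouped[key].append(value)
    return dict(reducer(k, v) for k, v in grouped.items())


city_counts = run_job(rows)
print(city_counts)
```

The point of the sketch is only the data flow: each row (key plus its columns) is one input record for the map step, with no export to flat files in between.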
Question : In which of the following scenarios should we use HBase
1. If it requires random read, write or both 2. If it requires many thousands of operations per second on multiple TB of data 3. If the access pattern is well known and simple 4. All of the above
Correct Answer : 4
Apache HBase : Use Apache HBase when you need random, realtime read or write access to your Big Data. HBase's goal is to host very large tables (billions of rows by millions of columns) atop clusters of commodity hardware. If you know the access pattern in advance, you can put all the data that is used together in a single column family, which makes access faster.
Refer HadoopExam.com Recorded Training Module : 18
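The column-family point can be illustrated with a small Python model. This is a toy sketch under stated assumptions, not HBase client code: the table layout, the family names `profile` and `activity`, and the `get`/`put` helpers are hypothetical. It mirrors the fact that HBase stores each column family separately, so a read that needs only one family does not have to touch the others.

```python
# Hypothetical table: row key -> column family -> {qualifier: value}.
table = {
    "user1": {
        "profile": {"name": "Asha", "email": "asha@example.com"},
        "activity": {"clicks": "42"},
    },
}


def get(tbl, row_key, family):
    """Random read: fetch one column family of one row by its row key."""
    return tbl[row_key][family]


def put(tbl, row_key, family, qualifier, value):
    """Random write: set a single cell, creating the row or family if needed."""
    tbl.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value


# Columns that are read together (name and email) live in the same family,
# so a single lookup returns both without reading the "activity" family.
put(table, "user2", "profile", "name", "Ravi")
put(table, "user2", "profile", "email", "ravi@example.com")
profile = get(table, "user2", "profile")
print(profile)
```

If the name and email were split across two families, the same read would need two lookups, which is why a known access pattern helps when designing the families.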
Question : In which scenario should HBase not be used
1. You only append to your dataset, and tend to read the whole thing 2. For ad-hoc analytics 3. If data volume is quite small 4. All of the above 5. None of the above
Correct Answer : 4
When Should I Use HBase, and When Should I Not?
First, make sure you have enough data. If you have hundreds of millions or billions of rows, then HBase is a good candidate. If you only have a few thousand or a few million rows, then a traditional RDBMS might be a better choice, because all of your data might wind up on a single node (or two) while the rest of the cluster sits idle.
Second, make sure you can live without all the extra features that an RDBMS provides. HBase is also a poor fit for ad-hoc analysis, where queries can be slow.