Hadoop is scalable, but:
* MapReduce slow and difficult
* Does not support random writes
* Poor support for random reads
Question:
Your client application calls the following method for all puts to the single table notifications: put.setWriteToWAL(false); One region, region1, of the notifications table is assigned to RegionServer rs1. Which of the following statements describes the result of RegionServer rs1 crashing?
Explanation: HBase uses a write-ahead log (WAL). If you don't write to it, you will lose all data that exists only in the memstores when a RegionServer fails. Skipping the WAL is useful when importing a lot of data, because it speeds up writes at the cost of durability.
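The explanation above can be illustrated with a small toy model. This is not HBase code: the class WalToyRegion and all of its methods are hypothetical names invented for this sketch, and it only mimics the behaviour described (edits that skip the log and have not been flushed disappear after a crash).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a single region. NOT HBase code; names are hypothetical.
class WalToyRegion {
    private final Map<String, String> memstore = new HashMap<>(); // in memory, lost on crash
    private final List<String[]> wal = new ArrayList<>();          // durable log, survives a crash
    private final Map<String, String> flushed = new HashMap<>();   // stands in for files on disk

    // Apply an edit; optionally record it in the log first.
    void put(String row, String value, boolean writeToWal) {
        if (writeToWal) {
            wal.add(new String[] {row, value});
        }
        memstore.put(row, value);
    }

    // Flushing persists the memstore, so those edits no longer depend on the log.
    void flush() {
        flushed.putAll(memstore);
        memstore.clear();
        wal.clear();
    }

    // A crash wipes memory; recovery replays whatever the log still holds.
    void crashAndRecover() {
        memstore.clear();
        for (String[] edit : wal) {
            memstore.put(edit[0], edit[1]);
        }
    }

    String get(String row) {
        String value = memstore.get(row);
        return value != null ? value : flushed.get(row);
    }

    public static void main(String[] args) {
        WalToyRegion region = new WalToyRegion();
        region.put("row1", "flushed-before-crash", false);
        region.flush();
        region.put("row2", "logged", true);
        region.put("row3", "not-logged", false);
        region.crashAndRecover();
        System.out.println(region.get("row1")); // flushed-before-crash
        System.out.println(region.get("row2")); // logged
        System.out.println(region.get("row3")); // null
    }
}
```

In this model, row3 is the only edit lost: it was neither logged nor flushed before the crash, which is the situation every put in the question is in until its memstore is flushed.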
Question:
Which of the following configuration values determines automated splitting?
Consider going to larger regions to cut down on the total number of regions on your cluster. Generally, fewer regions to manage makes for a smoother-running cluster. (You can always manually split the big regions later, should one prove hot and you want to spread the request load over the cluster.) A lower number of regions is preferred, generally in the range of 20 to the low hundreds per RegionServer. Adjust the region size as appropriate to achieve this number.
For the 0.90.x codebase, the upper bound of the region size is about 4 GB, with a default of 256 MB. For the 0.92.x codebase, due to the HFile v2 change, much larger region sizes can be supported (e.g., 20 GB). You may need to experiment with this setting based on your hardware configuration and application needs. Adjust hbase.hregion.max.filesize in your hbase-site.xml. The region size can also be set on a per-table basis via HTableDescriptor.
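The sizing advice above comes down to simple arithmetic, sketched below. The class RegionSizing, the method minRegions, and the 2 TB of data per RegionServer are assumptions made up for illustration. The result is a lower bound, since it assumes every region has grown to the full hbase.hregion.max.filesize; a freshly split region is only about half that size, so real counts can be up to roughly double.

```java
// Back-of-the-envelope region count; names and the 2 TB figure are hypothetical.
class RegionSizing {
    // Fewest regions needed to hold dataBytes if each region is at most maxFileSizeBytes.
    static long minRegions(long dataBytes, long maxFileSizeBytes) {
        if (dataBytes < 0 || maxFileSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (dataBytes + maxFileSizeBytes - 1) / maxFileSizeBytes; // ceiling division
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long gb = 1024L * mb;
        long dataPerServer = 2048L * gb; // assumed 2 TB stored per RegionServer

        System.out.println(minRegions(dataPerServer, 256L * mb)); // 8192
        System.out.println(minRegions(dataPerServer, 4L * gb));   // 512
        System.out.println(minRegions(dataPerServer, 20L * gb));  // 103
    }
}
```

Under these assumed numbers, only the 20 GB setting lands in the range of 20 to the low hundreds of regions per RegionServer.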
1. It does not. Increasing block size does not improve scan performance.
2. It does not. Increasing block size means that fewer blocks fit into your block cache. This requires HBase to read each block from disk rather than from cache for each scan, thereby decreasing scan performance.
4. Increasing block size means fewer block indexes that need to be read from disk, thereby increasing scan performance.
1. A MultiGet must be issued for rows D, E, F, G, H.
2. The Scan class supports ranges via the start and stop rows.
4. In order to range scan, raw scan mode must be enabled.
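The start-row and stop-row idea in option 2 can be sketched with a sorted map from the Java standard library. This is not the HBase client API: RangeScanSketch and its scan method are hypothetical names, and the sketch assumes the usual convention that the start row is inclusive and the stop row is exclusive. It also compares Java strings, whereas HBase orders row keys by their bytes; for the single-letter ASCII keys used here, the two orderings agree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Range scan over sorted row keys; a stdlib stand-in, not the HBase Scan class.
class RangeScanSketch {
    // Returns the row keys in [startRow, stopRow): start inclusive, stop exclusive.
    static List<String> scan(TreeMap<String, String> table, String startRow, String stopRow) {
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        for (char c = 'A'; c <= 'J'; c++) {
            table.put(String.valueOf(c), "value-" + c);
        }
        // To cover rows D through H, the exclusive stop row must sort after H.
        System.out.println(scan(table, "D", "I")); // [D, E, F, G, H]
    }
}
```

Because the rows are kept in sorted order, one range request covers D through H without issuing a separate lookup per row.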
1. The client looks up the location of ROOT, in which it looks up the location of META, in which it looks up the location of the correct Users region.
2. The client looks up the location of the master, in which it looks up the location of META, in which it looks up the location of the correct Users region.
4. The client queries the master to find the location of the Users table.