You mostly run full table scans on your data. To improve performance, you increase your block size. Why does this improve your scan performance?
1. It does not. Increasing block size does not improve scan performance.
2. It does not. Increasing block size means that fewer blocks fit into your block cache. This requires HBase to read each block from disk rather than cache for each scan, thereby decreasing scan performance.
4. Increasing block size means fewer block indexes that need to be read from disk, thereby increasing scan performance.
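Increasing the HFile block size improves sequential scan performance at the cost of random-read performance. Each HFile keeps one index entry per data block, so larger blocks mean a smaller block index to read.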
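The effect on the block index can be estimated with simple arithmetic. The sketch below is a hypothetical model, not HBase code: `BlockIndexSketch` and `indexEntries` are illustrative names, the 1 GiB file size is an assumption, and it ignores compression and the fact that the configured block size is only a target. It compares HBase's default 64 KiB block size with a 256 KiB setting.

```java
// Hypothetical model: estimates how many data blocks (and therefore
// block index entries) one HFile of a given size would contain.
final class BlockIndexSketch {

    // One index entry per data block; round up for a partial last block.
    static long indexEntries(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long fileSize = 1L << 30;        // 1 GiB HFile (assumed for illustration)
        long defaultBlock = 64L * 1024;  // 64 KiB, the HBase default block size
        long largerBlock = 256L * 1024;  // 256 KiB, a scan-oriented setting

        System.out.println(indexEntries(fileSize, defaultBlock)); // 16384
        System.out.println(indexEntries(fileSize, largerBlock));  // 4096
    }
}
```

Under these assumptions, quadrupling the block size cuts the number of index entries to a quarter. The trade-off is that a random read must load a larger block to fetch a single row.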
Question :
Your client application connects to HBase for the first time and queries the .META. table. What information does the .META. table provide to your client application?
The .META. table keeps a list of all regions in the system. The .META. table structure is as follows:
Key: Region key of the format ([table],[region start key],[region id])
Values:
info:regioninfo (serialized HRegionInfo instance for this region)
info:server (server:port of the RegionServer containing this region)
info:serverstartcode (start-time of the RegionServer process containing this region)
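To make the key format concrete, here is a minimal sketch that splits a row key of the form `[table],[region start key],[region id]` into its three parts. `MetaKeySketch` is a hypothetical helper, not part of the HBase client, and the sample key is invented for illustration. It splits on the first and last comma, on the assumption that only the start key might itself contain commas.

```java
// Hypothetical helper: splits a .META. row key into its three fields.
final class MetaKeySketch {
    final String table;
    final String startKey;
    final String regionId;

    private MetaKeySketch(String table, String startKey, String regionId) {
        this.table = table;
        this.startKey = startKey;
        this.regionId = regionId;
    }

    static MetaKeySketch parse(String rowKey) {
        int first = rowKey.indexOf(',');     // ends the table name
        int last = rowKey.lastIndexOf(',');  // starts the region id
        if (first < 0 || first == last) {
            throw new IllegalArgumentException(
                "expected [table],[region start key],[region id]: " + rowKey);
        }
        return new MetaKeySketch(
            rowKey.substring(0, first),
            rowKey.substring(first + 1, last),
            rowKey.substring(last + 1));
    }

    public static void main(String[] args) {
        // Invented sample key, for illustration only.
        MetaKeySketch key = parse("usertable,row500,1339026417831");
        System.out.println(key.table + " | " + key.startKey + " | " + key.regionId);
    }
}
```

The first region of a table has an empty start key, so a key such as `usertable,,1339026417831` parses to an empty `startKey`.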
Question :
You have a table where keys range from "A" to "Z", and you want to scan from "D" to "H." Which of the following is true?
1. A MultiGet must be issued for rows D, E, F, G, H.
2. The scan class supports ranges via the stop and start rows.
4. In order to range scan, raw scan mode must be enabled.
Rather than specifying a single row, an optional startRow and stopRow may be defined. If rows are not specified, the Scanner will iterate over all rows.
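The range semantics can be illustrated without a cluster. The sketch below is a model built on a sorted `TreeMap`, not the HBase client API: `RangeScanSketch` and its `scan` method are hypothetical names. It mirrors the behavior of an HBase scan, where the start row is inclusive and the stop row is exclusive. Java compares strings by UTF-16 code unit while HBase compares raw bytes, which gives the same order for the ASCII keys used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of a range scan over lexicographically sorted row keys.
final class RangeScanSketch {

    // Returns row keys in [startRow, stopRow): start inclusive, stop exclusive.
    static List<String> scan(NavigableMap<String, String> rows,
                             String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (char c = 'A'; c <= 'Z'; c++) {
            table.put(String.valueOf(c), "value-" + c);
        }
        System.out.println(scan(table, "D", "H")); // [D, E, F, G]
        System.out.println(scan(table, "D", "I")); // [D, E, F, G, H]
    }
}
```

Because the stop row is exclusive, a scan meant to include row "H" needs a stop row that sorts just after it, such as "I" in this single-letter key space.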
1. ColumnFamilies can set a TTL length in seconds
2. Rows will automatically delete when expiration time is reached
4. Used in conjunction with minimum versions setting
5. All of the above
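The way TTL and the minimum versions setting interact can be sketched as a small retention rule. This is a hypothetical model, not HBase internals: `TtlSketch`, `retained`, and the sample timestamps are invented, and the exact boundary comparison HBase uses at the moment of expiry is an assumption. The idea is that a version older than the TTL is dropped unless keeping it is needed to reach the minimum number of versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of TTL expiry combined with a minimum-versions floor.
final class TtlSketch {

    // timestampsNewestFirst: version timestamps in ms, newest first.
    // A version survives if it is within the TTL, or if fewer than
    // minVersions versions have been kept so far.
    static List<Long> retained(List<Long> timestampsNewestFirst, long nowMs,
                               int ttlSeconds, int minVersions) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            boolean expired = nowMs - ts > ttlSeconds * 1000L; // assumed boundary
            if (!expired || kept.size() < minVersions) {
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> versions = Arrays.asList(90_000L, 30_000L, 10_000L);
        long now = 100_000L; // ages: 10 s, 70 s, 90 s
        System.out.println(retained(versions, now, 60, 0)); // [90000]
        System.out.println(retained(versions, now, 60, 2)); // [90000, 30000]
    }
}
```

With minimum versions left at 0, every version past the TTL is removed, including the most recent one; a non-zero value keeps that many of the newest versions even after they expire.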