Your client connects to HBase for the first time to read a row user_1234 located in a table Users. What process does your client use to find the correct RegionServer to which it should send the request?
1. The client looks up the location of ROOT, in which it looks up the location of META, in which it looks up the location of the correct Users region.
2. The client looks up the location of the master, in which it looks up the location of META, in which it looks up the location of the correct Users region.
3. [option text missing]
4. The client queries the master to find the location of the Users table.
*The general flow is that a new client first contacts the ZooKeeper quorum (a separate cluster of ZooKeeper nodes) to find a particular row key. It does so by retrieving from ZooKeeper the server name (i.e., host name) that hosts the -ROOT- region. With that information, it can query that server to get the server that hosts the .META. table. Both of these details are cached and looked up only once. Lastly, it can query the .META. server and retrieve the server that has the row the client is looking for.
*The HBase client HTable is responsible for finding the RegionServers that serve the particular row range of interest. It does this by querying the .META. and -ROOT- catalog tables. After locating the required region(s), the client contacts the RegionServer serving that region directly (i.e., it does not go through the master) and issues the read or write request. The region location is cached in the client so that subsequent requests need not go through the lookup process. Should a region be reassigned, either by the master load balancer or because a RegionServer has died, the client requeries the catalog tables to determine the new location of the user region.
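The lookup chain described above can be sketched with a toy model. This is only an illustration, not the HBase client API: the dictionaries `ZOOKEEPER`, `ROOT`, and `META`, the server names `rs1` to `rs4`, the split point `user_5000`, and the class `ToyClient` are all hypothetical. As a simplification, the sketch caches every region of a table at once rather than one region per lookup.

```python
from bisect import bisect_right

# Hypothetical cluster state; none of these names come from the HBase API.
ZOOKEEPER = {"root_server": "rs1"}        # ZooKeeper knows who hosts -ROOT-
ROOT = {"rs1": {"meta_server": "rs2"}}    # -ROOT- knows who hosts .META.
# .META. maps each table to its regions as (start key, server), sorted by start key.
META = {"rs2": {"Users": [("", "rs3"), ("user_5000", "rs4")]}}


class ToyClient:
    def __init__(self):
        self.meta_server = None   # cached after the first lookup
        self.region_cache = {}    # table -> list of (start_key, server)
        self.catalog_trips = 0    # number of ZooKeeper/catalog round trips

    def _find_meta_server(self):
        if self.meta_server is None:
            root_server = ZOOKEEPER["root_server"]   # step 1: ask ZooKeeper
            self.catalog_trips += 1
            self.meta_server = ROOT[root_server]["meta_server"]  # step 2: ask -ROOT-
            self.catalog_trips += 1
        return self.meta_server

    def locate(self, table, row):
        if table not in self.region_cache:
            meta_server = self._find_meta_server()
            self.region_cache[table] = list(META[meta_server][table])  # step 3: ask .META.
            self.catalog_trips += 1
        regions = self.region_cache[table]
        starts = [start for start, _ in regions]
        # The hosting region is the last one whose start key is <= row.
        return regions[bisect_right(starts, row) - 1][1]


client = ToyClient()
first = client.locate("Users", "user_1234")    # walks ZooKeeper, -ROOT-, .META.
second = client.locate("Users", "user_7000")   # answered from the client cache
```

In this model the first call costs three round trips and the second costs none, which mirrors the caching behaviour the explanation describes. The master is never consulted.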
Question :
Your data load application is maintaining a custom versioning scheme (not using the timestamp as the version number). You accidentally executed three writes to a given cell, all with the same version, during which time no flushes occurred. Which of the three data writes will HBase maintain?
1. None of the writes to the cell
2. The last write to the cell
3. [option text missing]
4. All of the writes to the cell
If multiple writes to a cell have the same version, are all versions maintained or just the last? Currently, only the last written is fetchable. In the discussion at https://issues.apache.org/jira/browse/HBASE-2406, Kevin Peterson answers: "If multiple writes to a cell have the same timestamp, one of those versions will be maintained, and it is undefined which version will be maintained."
Hence, none of the given options is strictly correct, so the best fit is option 2 (B), the last write to the cell.
Similarly, HBase: The Definitive Guide (page 384) states:
When using your own timestamp values, you need to test your solution thoroughly, as this approach has not been used widely in production. Be aware that negative timestamp values are untested and, while they have been discussed a few times in HBase developer circles, they have never been confirmed to work properly. Make sure to avoid collisions caused by using the same value for two separate updates to the same cell. Usually the last saved value is visible afterward.
Remark 8 in the HBase book (http://hbase.apache.org/book/versions.html#ftn.d4029e5669) says: Currently, only the last written is fetchable.
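A toy model may make the "only the last written is fetchable" behaviour concrete. It is a sketch, not HBase's storage engine: the class `ToyStore` and its methods are hypothetical, and it models only the usual outcome quoted above, not the undefined cases mentioned in the JIRA comment. The key idea is that a cell version is identified by (row, column, version), so a write that reuses the same version replaces the earlier value instead of adding a new one.

```python
class ToyStore:
    """Hypothetical in-memory cell store keyed by (row, column, version)."""

    def __init__(self):
        self.cells = {}

    def put(self, row, column, version, value):
        # Same (row, column, version) key: the new value replaces the old one.
        self.cells[(row, column, version)] = value

    def get_versions(self, row, column):
        # Return (version, value) pairs for the cell, newest version first.
        found = [
            (version, value)
            for (r, c, version), value in self.cells.items()
            if r == row and c == column
        ]
        return sorted(found, reverse=True)


store = ToyStore()
for value in ("first", "second", "third"):
    store.put("user_1234", "info:name", 5, value)   # same custom version each time

same_version = store.get_versions("user_1234", "info:name")

store.put("user_1234", "info:name", 6, "fourth")    # a distinct version is kept separately
two_versions = store.get_versions("user_1234", "info:name")
```

Here `same_version` holds a single entry with the value of the third write, while writing under a different version number adds a second entry.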
Question :
Your client application needs to write a row to a region that has recently split. Where will the row be written?
*With a roughly uniform data distribution and growth, eventually all the regions in the table will need to be split at the same time. Immediately following a split, compactions will run on the daughter regions to rewrite their data into separate files. This causes a large amount of disk I/O and network traffic.
*Splits run unaided on the RegionServer; i.e. the Master does not participate. The RegionServer splits a region, offlines the split region and then adds the daughter regions to META, opens daughters on the parent's hosting RegionServer and then reports the split to the Master.
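The split sequence above can be sketched as a toy model. Everything here is hypothetical and not taken from HBase: the `meta` dictionary layout, the region names, the `.a`/`.b` daughter suffixes, and the functions `split_region` and `route`. The sketch offlines the parent, records two daughters that open on the parent's RegionServer, and then routes a row to whichever online region covers its key.

```python
def split_region(meta, parent_name, split_key):
    """Offline the parent and register two daughters on the parent's server."""
    parent = meta[parent_name]
    parent["online"] = False  # the split region is taken offline
    daughters = (
        ("a", parent["start"], split_key),   # lower half: [start, split_key)
        ("b", split_key, parent["end"]),     # upper half: [split_key, end)
    )
    for suffix, start, end in daughters:
        meta[parent_name + "." + suffix] = {
            "start": start,
            "end": end,                      # None means "no upper bound"
            "server": parent["server"],      # daughters open on the parent's server
            "online": True,
        }


def route(meta, row):
    """Return (region name, server) of the online region whose range holds row."""
    for name, region in meta.items():
        in_range = region["start"] <= row and (
            region["end"] is None or row < region["end"]
        )
        if region["online"] and in_range:
            return name, region["server"]
    raise KeyError(row)


meta = {"Users,": {"start": "", "end": None, "server": "rs3", "online": True}}
split_region(meta, "Users,", "user_5000")
low = route(meta, "user_1234")
high = route(meta, "user_9000")
```

In this model a write issued after the split lands in one of the daughter regions, chosen by row key, and both daughters initially sit on the server that hosted the parent.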