
MapR (HP) HBase Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : You want to store clickstream data in HBase. Your data consists of the following:
the source id
the name of the cluster
the url of the click
the datetimestamp for each click
Which rowkey should you use if you want a scan to return the source ids sorted with the most recent first?
1. (source_id)(Long.MAX_VALUE - (Long)datetimestamp)
2. ((Long)datetimestamp)(source_id)

4. (source_id)(datetimestamp)(Long.MAX_VALUE)



Explanation: One design consideration for your rowkey is the table's access pattern. In this scenario, the access pattern is to retrieve the source ids with the most recent first.
HBase stores rows sorted by rowkey. With a reverse timestamp (Long.MAX_VALUE - (long) timestamp) after the source id, the latest entry for each source id sorts ahead of that source id's older entries and is therefore scanned first. This avoids scanning every row for a source id just to find the newest one.

A common problem in database processing is quickly finding the most recent version of a value. A technique using reverse timestamps as part of the key can help greatly with a special case of this problem. Also found in the HBase chapter of Tom White's book Hadoop: The Definitive Guide (O'Reilly), the technique involves appending (Long.MAX_VALUE - timestamp) to the end of any key, e.g., [key][reverse_timestamp].

The most recent value for [key] in a table can be found by performing a Scan for [key] and obtaining the first record. Since HBase keys are in sorted order, this key sorts before any older rowkeys for [key] and thus is first. If the most important access path is to pull the most recent events, then storing the timestamps as reverse timestamps (e.g., timestamp = Long.MAX_VALUE - timestamp) makes it possible to do a Scan on [hostname][log-event] and quickly obtain the most recently captured events.
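
The ordering argument above can be checked without an HBase cluster. The following is a minimal sketch in plain Java, not HBase client code: the class name ReverseTimestampKey and its helper methods are hypothetical, and it assumes fixed-width source ids and non-negative timestamps. It builds [source_id][Long.MAX_VALUE - timestamp] keys and sorts them by unsigned byte comparison, which is how HBase orders rowkeys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the HBase API.
class ReverseTimestampKey {

    // Builds [source_id][Long.MAX_VALUE - timestamp] as raw bytes.
    // Assumes source ids share a fixed width so the prefix is unambiguous.
    static byte[] rowKey(String sourceId, long timestampMillis) {
        byte[] id = sourceId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(id.length + Long.BYTES)
                .put(id)
                .putLong(Long.MAX_VALUE - timestampMillis) // big-endian by default
                .array();
    }

    // Recovers the original timestamp from the last 8 bytes of a key.
    static long timestampOf(byte[] rowKey) {
        long reversed = ByteBuffer
                .wrap(rowKey, rowKey.length - Long.BYTES, Long.BYTES)
                .getLong();
        return Long.MAX_VALUE - reversed;
    }

    // Sorts keys by unsigned, byte-by-byte comparison.
    static List<byte[]> sortLikeHBase(List<byte[]> keys) {
        List<byte[]> sorted = new ArrayList<>(keys);
        sorted.sort(Arrays::compareUnsigned);
        return sorted;
    }

    public static void main(String[] args) {
        List<byte[]> keys = List.of(
                rowKey("src42", 1_000L),
                rowKey("src42", 3_000L),
                rowKey("src42", 2_000L));
        // Prints 3000, 2000, 1000: the newest click comes first.
        for (byte[] key : sortLikeHBase(keys)) {
            System.out.println(timestampOf(key));
        }
    }
}
```

Because the subtraction turns larger timestamps into smaller byte values, a scan that starts at the source id prefix meets the newest row first and can stop early.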




Question : Given the following HBase code:
byte [] rowKey = Bytes.toBytes(65);
Put put = new Put(rowKey);
put.add("info".getBytes(), "FirstName".getBytes(), "Kimberly".getBytes());
put.add("info".getBytes(), "LastName".getBytes(), "Grant".getBytes());
What does "info" represent?

1. Primary key of the row
2. Column family name
4. Column value



Explanation: The Javadoc for Put.add lists the arguments in the order family, qualifier, value. In the code above, "info" is passed as the first argument, so it is the column family name.

public Put add(byte[] family, byte[] qualifier, byte[] value)
Add the specified column and value to this Put operation.
Parameters:
family - family name
qualifier - column qualifier
value - column value

public Put add(byte[] family, byte[] qualifier, long ts, byte[] value)
Add the specified column and value, with the specified timestamp as its version to this Put operation.
Parameters:
family - family name
qualifier - column qualifier
ts - version timestamp
value - column value
Returns:
this
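
To make the argument roles in the add signatures above concrete, here is a small plain-Java sketch. SketchPut is a hypothetical stand-in written for this illustration, not the real org.apache.hadoop.hbase.client.Put class; it only mirrors the (family, qualifier, value) argument order and the "returns this" behavior described in the Javadoc.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; NOT org.apache.hadoop.hbase.client.Put.
class SketchPut {
    private final byte[] row;
    // family -> (qualifier -> value); String keys are used only for readability.
    private final Map<String, Map<String, byte[]>> cells = new LinkedHashMap<>();

    SketchPut(byte[] row) {
        this.row = row.clone();
    }

    // Same argument order as the signature above: family, qualifier, value.
    SketchPut add(byte[] family, byte[] qualifier, byte[] value) {
        cells.computeIfAbsent(text(family), f -> new LinkedHashMap<>())
             .put(text(qualifier), value.clone());
        return this; // returning this allows chained calls
    }

    byte[] row() {
        return row.clone();
    }

    Set<String> families() {
        return cells.keySet();
    }

    // Returns the stored value as text, or null if the column is absent.
    String valueOf(String family, String qualifier) {
        Map<String, byte[]> columns = cells.get(family);
        byte[] value = (columns == null) ? null : columns.get(qualifier);
        return (value == null) ? null : text(value);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    private static String text(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A 4-byte big-endian encoding of the int 65, used as the row key.
        byte[] rowKey = ByteBuffer.allocate(Integer.BYTES).putInt(65).array();
        SketchPut put = new SketchPut(rowKey)
                .add(bytes("info"), bytes("FirstName"), bytes("Kimberly"))
                .add(bytes("info"), bytes("LastName"), bytes("Grant"));
        System.out.println(put.families());                    // [info]
        System.out.println(put.valueOf("info", "FirstName"));  // Kimberly
    }
}
```

In this sketch both cells land under the single family "info", with "FirstName" and "LastName" as separate qualifiers inside it.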




Question : Given the following HBase code:
byte [] rowKey = Bytes.toBytes(65);
Put put = new Put(rowKey);
put.add("info".getBytes(), "FirstName".getBytes(), "Kimberly".getBytes());
put.add("info".getBytes(), "LastName".getBytes(), "Grant".getBytes());
What does "FirstName" represent?

1. Primary key of the row
2. Column family name
4. Column value



Explanation: The Javadoc for Put.add lists the arguments in the order family, qualifier, value. In the code above, "FirstName" is passed as the second argument, so it is the column qualifier.

public Put add(byte[] family, byte[] qualifier, byte[] value)
Add the specified column and value to this Put operation.
Parameters:
family - family name
qualifier - column qualifier
value - column value

public Put add(byte[] family, byte[] qualifier, long ts, byte[] value)
Add the specified column and value, with the specified timestamp as its version to this Put operation.
Parameters:
family - family name
qualifier - column qualifier
ts - version timestamp
value - column value
Returns:
this


Related Questions


Question :

Your client application needs to scan a region for the row key value 104, given a store that contains the following list of RowKey values:

100,101,102,103,104,105,106,107

A Bloom filter returns which of the following?


1. Confirmation that 104 may be contained in the set
2. Confirmation that 104 is contained in the set
4. The file offset of the value 104
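
As background for this question, a Bloom filter can only report "possibly present" or "definitely absent"; it never stores the keys themselves or their file offsets. The sketch below is a minimal plain-Java illustration of that behavior. TinyBloomFilter, its sizes, and its hash mixing are all assumptions made for this example and are not HBase's implementation.

```java
import java.util.BitSet;

// Hypothetical minimal Bloom filter for illustration; not HBase's implementation.
class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    TinyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derives the i-th bit position for a key with a simple multiplicative mix.
    private int index(long key, int i) {
        long h = key * 0x9E3779B97F4A7C15L + (i + 1) * 0xC2B2AE3D27D4EB4FL;
        h ^= (h >>> 29);
        return (int) Math.floorMod(h, (long) size);
    }

    void add(long key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    // false means the key was definitely never added;
    // true means only that it may have been added (false positives are possible).
    boolean mightContain(long key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TinyBloomFilter filter = new TinyBloomFilter(256, 3);
        for (long rowKey = 100; rowKey <= 107; rowKey++) {
            filter.add(rowKey);
        }
        // Always true for a key that was added: no false negatives.
        System.out.println(filter.mightContain(104));
        // May be true or false for a key that was never added.
        System.out.println(filter.mightContain(999));
    }
}
```

A reader that gets true still has to look in the store file to confirm the row exists; a reader that gets false can skip the file entirely.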





Question : You want to do a full table scan on your data. You decide to disable block caching to see if this improves scan performance.
Will disabling block caching improve scan performance?
1. No, disabling block caching does not improve scan performance.
2. Yes, when you disable block caching, you free up that memory for other operations. With a full table scan, you cannot take
advantage of block caching anyway because your entire table would not fit into cache.
4. Yes, when you disable block caching, you free up memory for MemStore, which improves scan performance.





Question : Your organization has an HBase cluster with half the nodes in Geneva and half the nodes in Nevada. Which of the following is true?
1. There must be two NameNodes, one for Geneva and another for Nevada
2. As they are very far from each other, avoid replication of the data and set replication factor=1


4. Keep one datacenter as a backup and do not load any data into it.



Question : You have an AcmeLog table in HBase. The RowKeys are numbers.
You want to retrieve all entries that have row key 100.
Which shell command should you use?
1. get 'AcmeLog', (FILTER ='100')
2. get 'AcmeLog', '100'

4. scan 'AcmeLog', '100'




Question : You have an AcmeUsers table in HBase and you would like to insert a row that consists
of an AcmeID, jayesh2014, and an email address, john@acmeshell.com. The table has a single Column Family
named Meta and the row key will be the AcmeID. Which command helps in this case?
1. put 'AcmeUsers', 'jayesh2014', 'john@acmeshell.com'

2. put 'AcmeUsers', 'Meta:AcmeID', 'jayesh2014', 'Email, 'john@acmeshell.com'


4. put 'AcmeUsers', 'AcmeID:jayesh2014', 'Email:john@acmeshell.com'




Question : You are storing page view data for a large number of Web sites, each of which has
many subdomains (www.acmeshell.com, archive.acmeshell.com, beta.acmeshell.com, etc.). Your reporting tool needs
to retrieve the total number of page views for a given subdomain of a Web site. Which of the following rowkeys should you use?
1. The domain name followed by the URL

2. The URL followed by the reverse domain name


4. The URL

5. The URL including http