MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)
Question: At which level(s) can a SequenceFile be compressed?
1. Record Level
2. Block Level
3. (option withheld in the source)
4. 1,2
5. 1,2,3
Correct Answer: 4 (1 and 2)
Explanation: A SequenceFile can be compressed at record level (each value is compressed individually) or at block level (batches of keys and values are compressed together); block compression usually achieves the better ratio.
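For reference, here is a minimal sketch (hypothetical output path and key/value types) of creating a block-compressed SequenceFile with the Hadoop Java API; passing CompressionType.RECORD instead compresses each value individually, and CompressionType.NONE disables compression:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.Text;

public class SeqFileCompressionDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path("/tmp/demo.seq"); // hypothetical output path

        // BLOCK compresses batches of keys and values together,
        // which usually yields a better ratio than per-record compression.
        SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(IntWritable.class),
                SequenceFile.Writer.valueClass(Text.class),
                SequenceFile.Writer.compression(CompressionType.BLOCK));
        try {
            writer.append(new IntWritable(1), new Text("first record"));
            writer.append(new IntWritable(2), new Text("second record"));
        } finally {
            writer.close();
        }
    }
}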
Question: Sync markers in a SequenceFile are
1. boundaries for records
2. boundaries for blocks
3. boundaries for keys
4. boundaries for input split
Correct Answer: 1
Explanation: Sync markers are written between records in a SequenceFile; a reader can seek to any byte offset and scan forward to the next sync marker to land on a record boundary, which is what makes SequenceFiles splittable.
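A short illustration of what sync markers buy you: SequenceFile.Reader.sync(position) advances the reader to the first sync point at or after an arbitrary byte offset, so processing can safely resume at a record boundary. The file path and key/value types here are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SyncMarkerDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path("/tmp/demo.seq"); // hypothetical input file

        try (SequenceFile.Reader reader =
                 new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
            // Jump to an arbitrary byte offset, then let the reader
            // scan forward to the next sync marker (a record boundary).
            reader.sync(1024);

            IntWritable key = new IntWritable();
            Text value = new Text();
            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
        }
    }
}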
Question: Which statement is true regarding the DistributedCache?
1. Once a file is submitted to the DistributedCache, both map tasks and reduce tasks can use it.
2. You can submit a maximum of two files to the DistributedCache.
3. (option withheld in the source)
4. DistributedCache files can be used in mappers only.
Correct Answer: 1
Explanation: Files placed in the DistributedCache are copied to every task node before any task starts, and both map and reduce tasks can read them; there is no two-file limit, and their use is not restricted to mappers.
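A minimal sketch of both halves of option 1, assuming the newer org.apache.hadoop.mapreduce API and a hypothetical lookup file: the driver registers the file, and a task (a mapper here, but a reducer's setup() works the same way) reads the localized copy:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheFileMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Driver side: register the file before submitting the job.
    static void configure(Job job) throws Exception {
        job.addCacheFile(new URI("/shared/lookup.txt")); // hypothetical HDFS path
    }

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        URI[] cacheFiles = context.getCacheFiles();
        if (cacheFiles != null && cacheFiles.length > 0) {
            // The file is localized on every task node and symlinked into
            // the task's working directory under its base name.
            String localName = new Path(cacheFiles[0].getPath()).getName();
            try (BufferedReader in = new BufferedReader(new FileReader(localName))) {
                // ... load lookup data into memory ...
            }
        }
    }
}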
Related Questions
Question: Which of the following are correct options for passing different types of files to a MapReduce job?
1. hadoop jar --files
2. hadoop jar --libjars
3. (option withheld in the source)
4. 1,3
5. 1,2,3
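These switches are generic options (typically written with a single dash, e.g. -files, -libjars, -archives), and they are honored only when the driver parses its arguments through GenericOptionsParser, most commonly by running through ToolRunner. A minimal, hypothetical driver skeleton:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already stripped generic options such as
        // -files and -libjars; only job-specific args remain here.
        Job job = Job.getInstance(getConf(), "my-job"); // hypothetical job name
        // ... set mapper, reducer, input/output paths ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // e.g. hadoop jar my.jar MyDriver -files lookup.txt -libjars extra.jar in out
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
    }
}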
Question: Using the Java API, how do you submit a DistributedCache file to a job in the driver class?
1. DistributedCache.addCacheFile()
2. DistributedCache.addCacheArchive()
3. (option withheld in the source)
4. 1,2
5. 1,2,3
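Both methods named in options 1 and 2 exist on the classic org.apache.hadoop.filecache.DistributedCache class. A brief driver-side sketch with hypothetical HDFS paths (on recent Hadoop releases these calls are deprecated in favor of Job.addCacheFile() and Job.addCacheArchive()):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;

public class CacheDriverSnippet {
    static void addCacheEntries(Configuration conf) throws Exception {
        // Ships a single file to every task node.
        DistributedCache.addCacheFile(new URI("/shared/lookup.txt"), conf);      // hypothetical path
        // Ships an archive (zip/tar/jar) that is unpacked on each node.
        DistributedCache.addCacheArchive(new URI("/shared/geo-data.zip"), conf); // hypothetical path
    }
}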
Question: When you use HBase as both source and sink for your MapReduce job, which statement is true?
1. Data is split by region, and a map task is launched for each region.
2. After the map tasks, partitions are created and every occurrence of a key goes to the same partition; however, a partition can hold multiple keys.
3. (option withheld in the source)
4. 1,2
5. 1,2,3
Question: You have a MapReduce job that uses HBase as both source and sink. It reads stock market data, PRICE, DIVIDEND, and VOLUME (each stored in a different column family), supplied by vendors such as Bloomberg, Reuters, and Markit. The job filters out the most accurate data, marks those records as valid, and saves them back to the same table, "MARKET_DATA", with an updated flag value. You have written the following driver code and want to process data for the DIVIDEND column family.
Scan scan = new Scan();
scan.setMaxVersions();
scan.addFamily(Bytes.toBytes("AAAAA"));
XXXXX.initTableMapperJob(YYYYY, scan, CustomMapper.class, Text.class, LongWritable.class, job);
XXXXX.initTableReducerJob(YYYYY, CustomReducer.class, job);
Select the proper class name and required values to put in place of AAAAA, XXXXX, and YYYYY.
1. AAAAA->"DIVIDEND", XXXXX-> TableMapReduceUtil , YYYYY->"MARKET_DATA"
2. XXXXX->"DIVIDEND", AAAAA-> TableMapReduceUtil , YYYYY->"MARKET_DATA"
3. (option withheld in the source)
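Assuming option 1 is the intended answer, the completed driver would read as follows. CustomMapper and CustomReducer stand in for the question's own classes (empty stubs here), and the surrounding job setup is a sketch:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class MarketDataDriver {

    // Empty stubs standing in for the question's mapper and reducer.
    static class CustomMapper extends TableMapper<Text, LongWritable> { }
    static class CustomReducer extends TableReducer<Text, LongWritable, ImmutableBytesWritable> { }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(HBaseConfiguration.create(), "market-data-filter");
        job.setJarByClass(MarketDataDriver.class);

        Scan scan = new Scan();
        scan.setMaxVersions();
        scan.addFamily(Bytes.toBytes("DIVIDEND")); // restrict the scan to the DIVIDEND family

        // Read from and write back to the same table, MARKET_DATA.
        TableMapReduceUtil.initTableMapperJob("MARKET_DATA", scan,
                CustomMapper.class, Text.class, LongWritable.class, job);
        TableMapReduceUtil.initTableReducerJob("MARKET_DATA",
                CustomReducer.class, job);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}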
Question: When you use the mapred API to run your job, which statements are true?
1. JobClient.submitJob() is an asynchronous call
2. JobClient.runJob() is a synchronous call
3. (option withheld in the source)
4. 1,2
5. 2,3
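A short sketch of the distinction in the old org.apache.hadoop.mapred API (the JobConf setup is assumed): runJob() blocks, printing progress until the job finishes, while submitJob() returns a RunningJob handle immediately, leaving the caller to poll:

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class MapredSubmitDemo {
    static void run(JobConf conf) throws Exception {
        // Synchronous: returns only when the job has completed.
        RunningJob finished = JobClient.runJob(conf);

        // Asynchronous: returns immediately with a handle to poll.
        JobClient client = new JobClient(conf);
        RunningJob handle = client.submitJob(conf);
        while (!handle.isComplete()) {
            Thread.sleep(5000); // poll every five seconds
        }
    }
}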
Question: Which of the following is the ideal way to chain multiple jobs when the chain also includes non-MapReduce jobs?
1. JobControl
2. Oozie workflow
3. (option withheld in the source)
4. Streaming
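For contrast with option 1: JobControl can order plain MapReduce jobs by their dependencies, but every node in its graph must be a MapReduce job, whereas an Oozie workflow can also contain Pig, Hive, shell, and Java actions, which is why Oozie is the usual answer when non-MapReduce steps are in the chain. A minimal JobControl sketch (construction of job1 and job2 is assumed):

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class ChainDemo {
    static void chain(Job job1, Job job2) throws Exception {
        ControlledJob first = new ControlledJob(job1.getConfiguration());
        first.setJob(job1);

        // second runs only after first succeeds.
        ControlledJob second = new ControlledJob(job2.getConfiguration());
        second.setJob(job2);
        second.addDependingJob(first);

        JobControl control = new JobControl("two-step-chain");
        control.addJob(first);
        control.addJob(second);

        // JobControl is a Runnable; drive it from its own thread.
        new Thread(control).start();
        while (!control.allFinished()) {
            Thread.sleep(1000);
        }
        control.stop();
    }
}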