Cloudera Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)

Question : Can you suppress reducer output?

1. Yes, there is a special data type that will suppress job output
2. No, a MapReduce job will always generate output.
3. [option text unavailable]
4. Yes, but only during map execution when reducers have been set to zero

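Explanation: There are a number of scenarios where no output is required from the reducers. For instance, a web-crawling or image-processing job may do all of its useful work as a side effect (fetching external pages, processing images) and so has no key-value results to write as job output.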
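
The idea can be sketched without Hadoop at all. The toy Python below is a hypothetical in-memory stand-in, not the Hadoop API: `run_job`, `discard`, and the `fetched` list are names invented for illustration. The reducer does its work as a side effect and the sink drops every record it emits, which is roughly the role that `NullOutputFormat` (or emitting `NullWritable`) plays in a real job.

```python
from collections import defaultdict

def run_job(records, mapper, reducer, sink):
    """Toy map -> shuffle -> reduce pipeline (illustrative only, not Hadoop).

    Returns the number of records the reducers handed to the sink.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)      # "shuffle": group values by key
    emitted = 0
    for key in sorted(groups):
        for out in reducer(key, groups[key]):
            sink(out)
            emitted += 1
    return emitted

fetched = []  # stands in for an external side effect, e.g. pages saved elsewhere

def mapper(url):
    # Key by host so that all URLs of one host meet in the same reduce call.
    host = url.split("/")[2]
    yield host, url

def reducer(host, urls):
    fetched.extend(urls)       # the useful work happens as a side effect
    yield host, len(urls)      # a record is emitted, but the sink may drop it

def discard(_record):
    """Null sink: silently drop whatever the reducer emits."""
    return None

urls = ["http://a.example/1", "http://b.example/2", "http://a.example/3"]
emitted = run_job(urls, mapper, reducer, discard)
print(emitted, len(fetched))
```

Swapping `discard` for something like `results.append` would store the reducer records again; nothing else in the job has to change.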

Question : Is there a map input format?

1. Yes, but only in Hadoop 0.22+
2. Yes, there is a special format for map files
3. [option text unavailable]
4. Both 2 and 3 are correct answers.

Explanation: Map files are a variation of sequence files. They store key-value pairs sorted by key, together with an index that allows lookups by key.
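
Why sorted order matters can be shown with a minimal sketch. The class below is a hypothetical in-memory model, not Hadoop's `MapFile` API: `ToyMapFile` and `index_interval` are invented names, and the on-disk format is ignored. It keeps the records sorted by key plus a sparse index of every few keys, so a lookup is a binary search over the index followed by a short scan.

```python
import bisect

class ToyMapFile:
    """In-memory model of a sorted key-value file with a sparse index."""

    def __init__(self, items, index_interval=2):
        # The "data file": records sorted by key.
        self.data = sorted(items, key=lambda kv: kv[0])
        # The "index file": every index_interval-th key and its position.
        self.index_pos = list(range(0, len(self.data), index_interval))
        self.index_keys = [self.data[pos][0] for pos in self.index_pos]

    def get(self, key):
        # Find the last indexed key that is <= the key we want.
        slot = bisect.bisect_right(self.index_keys, key) - 1
        if slot < 0:
            return None                 # smaller than every stored key
        # Scan forward from that position; sorted order lets us stop early.
        for k, v in self.data[self.index_pos[slot]:]:
            if k == key:
                return v
            if k > key:
                break
        return None

mf = ToyMapFile([("pear", 3), ("apple", 1), ("fig", 2), ("kiwi", 5)])
print(mf.get("fig"), mf.get("banana"))
```

A larger `index_interval` makes the index smaller at the cost of a longer scan per lookup.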

Question : What is the most important feature of MapReduce?

1. Ability to store large amounts of data
2. Ability to process data on a cluster of machines without copying all the data over
3. [option text unavailable]
4. Ability to process large amounts of data in parallel

Explanation: What fundamentally distinguishes the Hadoop framework is that multiple machines process the same data set in parallel, and the data is already available to them in the distributed file system, so it does not have to be copied to a central location first.
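
A small word count gives a feel for this. The sketch below is an assumption-laden illustration, not Hadoop code: threads stand in for cluster nodes, each inner list stands in for a file split already stored on one node, and `map_split` and `word_count` are invented names. Each worker touches only its own split, and only the small partial counts are brought together at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def map_split(lines):
    """Count words in one split, independently of every other split."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts

def word_count(splits, workers=3):
    # Map phase: one task per split, run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_split, splits))
    # Reduce phase: merge the small per-split results.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

splits = [
    ["hadoop stores data", "hadoop processes data"],
    ["data stays local"],
    ["tasks run in parallel"],
]
totals = word_count(splits)
print(totals["data"], totals["hadoop"])
```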

Related Questions

Question : In HBase, a row is looked up by a single key only.

1. True
2. False

Question : HBase supports transactions for a single row.

1. True
2. False

Question : Which methods are supported for accessing data in HBase?

1. get
2. put
3. scan
4. 1, 2 and 3 are correct
5. It has to be accessed by query only

Question : Rows from HBase can be used directly as input to a MapReduce job.

1. True
2. False

Question : In which of the following scenarios should HBase be used?

1. If you require random reads, random writes, or both
2. If you need to perform many thousands of operations per second on multiple terabytes of data
3. If the access pattern is well known and simple
4. All of the above

Question : In which scenario should HBase not be used?

1. You only append to your dataset, and tend to read the whole thing
2. For ad-hoc analytics
3. If the data volume is quite small
4. All of the above
5. None of the above