Cloudera Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)

Question : In Pig, a schema can be defined at runtime

1. True
2. False

Correct Answer : 1

RDBMS and Pig

RDBMSs store data in tables, with tightly predefined schemas.
Pig is more relaxed about the data that it processes: you can define a schema at runtime, but it's optional.
Essentially, it will operate on any source of tuples.
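
The idea of an optional schema can be sketched in plain Java. This is only an illustrative model, not Pig's implementation or API: the class OptionalSchemaSketch and its methods byPosition and byName are hypothetical names. Without a schema, fields are reached by position (similar to $0, $1 in Pig Latin); once a schema is supplied at runtime, the same tuple can also be read by field name.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only; this is not Pig's implementation or API.
class OptionalSchemaSketch {
    // Positional access works on any tuple, with or without a schema.
    static Object byPosition(List<Object> tuple, int index) {
        return tuple.get(index);
    }

    // Named access needs a schema: a list of field names supplied at runtime.
    static Object byName(List<String> schema, List<Object> tuple, String field) {
        int index = schema.indexOf(field);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        return tuple.get(index);
    }

    public static void main(String[] args) {
        List<Object> tuple = Arrays.<Object>asList("alice", 30);

        // No schema: read the first field by position.
        System.out.println(byPosition(tuple, 0));

        // Schema attached afterwards: read the same tuple by field name.
        List<String> schema = Arrays.asList("name", "age");
        System.out.println(byName(schema, tuple, "age"));
    }
}
```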

Refer HadoopExam.com Recorded Training Module : 11

Question :
Which statement is true about HBase?

1. It is a column family database
2. It can store massive amounts of data, on the order of terabytes
3. It provides high throughput
4. All of the above

Correct Answer : 4

Apache HBase : HBase is the Hadoop database
- A NoSQL datastore
- Can store massive amounts of data
  - Gigabytes, terabytes, and even petabytes of data in a table
- Scales to provide very high write throughput
  - Hundreds of thousands of inserts per second
- Copes well with sparse data
  - Tables can have many thousands of columns
  - Even if most columns are empty for any given row
- Has a very constrained access model
  - Insert a row, retrieve a row, do a full or partial table scan
  - Only one column (the row key) is indexed
- Does not support multi-row transactions
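
The constrained access model above can be pictured with a small in-memory sketch. This is not the HBase client API: SparseTableSketch and its methods put, get, scan and scanAll are hypothetical names, and a sorted map stands in for the row-key index. Each row stores only the columns it actually has, which is how sparse data stays cheap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative in-memory model only; this is NOT the HBase client API.
class SparseTableSketch {
    // Rows are kept sorted by row key, the only indexed "column".
    private final NavigableMap<String, Map<String, String>> rows = new TreeMap<>();

    // Insert (or update) one cell; a row holds only the columns it actually has.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Retrieve a single row by key; returns an empty map if the row is absent.
    Map<String, String> get(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }

    // Partial scan: row keys in [startKey, stopKey), in sorted order.
    List<String> scan(String startKey, String stopKey) {
        return new ArrayList<>(rows.subMap(startKey, true, stopKey, false).keySet());
    }

    // Full scan: every row key, in sorted order.
    List<String> scanAll() {
        return new ArrayList<>(rows.keySet());
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("user#001", "info:name", "Ann");
        table.put("user#001", "info:email", "ann@example.com");
        table.put("user#002", "info:name", "Bob"); // no email column: a sparse row
        table.put("video#001", "meta:title", "Intro");

        System.out.println(table.get("user#002"));
        System.out.println(table.scan("user#", "user$")); // only the "user#" rows
        System.out.println(table.scanAll());
    }
}
```

Note that there is no lookup by column value in this sketch: finding all rows whose name is "Bob" would require a full scan, which mirrors the point that only the row key is indexed.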

Refer HadoopExam.com Recorded Training Module : 18

Question : Which of the following statements about HBase is wrong?

1. Tables can have thousands of columns
2. It is not mandatory to have data in all the columns
3. It does not support multi-row transactions
4. All of the above
5. None of the above

Correct Answer : 5

Apache HBase : HBase is the Hadoop database
- A NoSQL datastore
- Can store massive amounts of data
  - Gigabytes, terabytes, and even petabytes of data in a table
- Scales to provide very high write throughput
  - Hundreds of thousands of inserts per second
- Copes well with sparse data
  - Tables can have many thousands of columns
  - Even if most columns are empty for any given row
- Has a very constrained access model
  - Insert a row, retrieve a row, do a full or partial table scan
  - Only one column (the row key) is indexed
- Does not support multi-row transactions

Refer HadoopExam.com Recorded Training Module : 18

Related Questions

Question :

If X and Y are two MapReduce jobs and their dependency is set as below:

x.addDependingJob(y)

What does this mean?

1. X will not start until Y has finished
2. Y will not start until X has finished
4. All of the above

Question :

Which option or switch of the "hadoop fs" command shows detailed help?

1. '-show'
2. '-help'
4. Any of the above

Question :

Which of the following methods of the JobControl object can be used to track the execution state of jobs?

1. allFinished()
2. getFailedJobs()
4. All of the above

Question :

Which class is used for preprocessing and postprocessing in a MapReduce job?

1. ChainMapper
2. ChainReducer
4. Both 1 and 2

Question :

Is data joining (like an RDBMS join) possible in Hadoop MapReduce?

1. Yes
2. No

Question :

Which method of the FileSystem object is used for reading a file in HDFS?

1. open()
2. access()
4. None of the above