
MapR (HPE) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : Workflows expressed in Oozie can contain:


1. Iterative repetition of MapReduce jobs until a desired answer or state is reached.
2. Sequences of MapReduce and Pig jobs. These are limited to linear sequences of actions with exception handlers but no forks.
3. [option not shown in the source]
4. Sequences of MapReduce and Pig jobs. These sequences can be combined with other actions including forks, decision points, and path joins.


Correct Answer : 4

Apache Oozie Control Nodes :

- A decision control node lets Oozie choose the workflow execution path based on some criteria, similar to a switch-case statement.
- fork and join control nodes split one execution path into multiple execution paths which run concurrently.
- fork splits the execution path; join waits for all concurrent execution paths to complete before proceeding.
- fork and join are used in pairs.
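
To make the control nodes concrete, here is a minimal sketch of a workflow definition using fork, join, and decision nodes. The action names, HDFS paths, and the runExtraStep property are hypothetical illustrations; only the control-node elements themselves come from the Oozie workflow XML schema.

<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="fork-node"/>

    <!-- fork: split one execution path into two concurrent paths -->
    <fork name="fork-node">
        <path start="step-a"/>
        <path start="step-b"/>
    </fork>

    <!-- two trivial fs actions stand in for MapReduce or Pig actions -->
    <action name="step-a">
        <fs><mkdir path="${nameNode}/tmp/demo/a"/></fs>
        <ok to="join-node"/>
        <error to="fail"/>
    </action>
    <action name="step-b">
        <fs><mkdir path="${nameNode}/tmp/demo/b"/></fs>
        <ok to="join-node"/>
        <error to="fail"/>
    </action>

    <!-- join: wait for both concurrent paths before proceeding -->
    <join name="join-node" to="decision-node"/>

    <!-- decision: switch-case style choice of the next path -->
    <decision name="decision-node">
        <switch>
            <case to="step-c">${wf:conf('runExtraStep') eq 'true'}</case>
            <default to="end"/>
        </switch>
    </decision>

    <action name="step-c">
        <fs><mkdir path="${nameNode}/tmp/demo/c"/></fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail"><message>Workflow failed</message></kill>
    <end name="end"/>
</workflow-app>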





Question : You have an employee who is a Data Analyst and is very comfortable with SQL.
He would like to run ad-hoc analysis on data in your HDFS cluster.
Which of the following is a data warehousing software built on top of
Apache Hadoop that defines a simple SQL-like query language well-suited for this kind of user?
A. Pig  B. Hue  C. Hive  D. Sqoop  E. Oozie

1. A
2. B
3. C
4. D
5. E


Correct Answer : 3 (C, Hive)

Apache Hive :

- Hive is an abstraction on top of MapReduce.
- It allows users to query data in the Hadoop cluster without knowing Java or MapReduce.
- Uses the HiveQL language, which is very similar to SQL.
- The Hive interpreter runs on a client machine; it turns HiveQL queries into MapReduce jobs and submits those jobs to the cluster.
- Note: this does not turn the cluster into a relational database server! It is still simply running MapReduce jobs, created by the Hive interpreter.
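
To make this concrete, here is a small HiveQL sketch of the kind of ad-hoc, SQL-like analysis such a user could run; the table name, columns, and HDFS location are hypothetical.

-- Hypothetical external table over files already stored in HDFS
CREATE EXTERNAL TABLE page_views (
    user_id  STRING,
    url      STRING,
    view_ts  TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/page_views';

-- Ad-hoc query; the Hive interpreter turns this into MapReduce job(s)
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10;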

Refer to HadoopExam.com Recorded Training Modules 12 and 13.





Question : You need to import a portion of a relational database every day as files to HDFS,
and generate Java classes to interact with your imported data. Which of the following tools should you use to accomplish this?
A. Pig  B. Hue  C. Hive  D. Flume  E. Sqoop  F. Oozie  G. fuse-dfs

1. A,B
2. B,C
3. [option not shown in the source]
4. F,G


Correct Answer : not shown in the source. Sqoop (E) is the tool that both imports relational data into HDFS and generates Java classes to interact with the imported records.


Apache Sqoop :

- Sqoop provides a method to import data from tables in a relational database into HDFS.
- It does this very efficiently via a map-only MapReduce job.
- It can also go the other way: populate database tables from files in HDFS.
- As part of an import, Sqoop generates a Java class for interacting with the imported records.
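
A minimal sketch of such a daily import follows; the JDBC URL, database, table, and credentials are hypothetical. sqoop import both copies the rows to HDFS and generates a Java class for the imported records.

# Hypothetical connection details; --where limits the import to one day's rows
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username dbuser \
  --password-file /user/dbuser/.dbpass \
  --table orders \
  --where "order_date = '2015-06-01'" \
  --target-dir /data/orders/2015-06-01 \
  --outdir /tmp/sqoop-src   # directory for the generated Java class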

Refer to HadoopExam.com Recorded Training Modules 12, 13, and 19.




Related Questions


Question : You can compress a sequence file at:

1. Record Level

2. Block Level

3. [option not shown in the source]

4. 1,2
5. 1,2,3
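
For context, the compression level is chosen when the SequenceFile writer is created. A minimal Java sketch, assuming an arbitrary output path and key/value types, that selects block-level compression:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SeqFileWriteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // CompressionType is NONE, RECORD (per-record), or BLOCK (per-block)
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(new Path("/tmp/demo.seq")),
                SequenceFile.Writer.keyClass(IntWritable.class),
                SequenceFile.Writer.valueClass(Text.class),
                SequenceFile.Writer.compression(SequenceFile.CompressionType.BLOCK))) {
            writer.append(new IntWritable(1), new Text("first record"));
        }
    }
}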


Question : Sync markers in a sequence file are
A. boundaries for records
B. boundaries for blocks
C. boundaries for keys
D. boundaries for input split

1. boundaries for records

2. boundaries for blocks

3. boundaries for keys

4. boundaries for input split


Question : Which statement is true regarding the Distributed Cache?

1. Once a file is submitted to the distributed cache, both Mapper tasks and Reducer tasks can use it.

2. You can submit a maximum of two files in the Distributed Cache.

3. [option not shown in the source]

4. You can use Distributed Cache files in the Mapper only.
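
For reference, a short Java sketch using the modern MapReduce API for the distributed cache; the job name and file path are hypothetical. Files added this way are localized for both map and reduce tasks.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CacheDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cache-demo");
        // Ship an HDFS file to every task; readable in both Mapper and Reducer
        job.addCacheFile(new URI("/user/demo/lookup.txt"));
        // ... set mapper/reducer classes and input/output paths, then submit ...
    }
}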



Question : Select the correct statement(s) regarding the Oozie workflow

1. It is a client-server workflow engine for Hadoop ecosystem components

2. It is a Directed Acyclic Graph

3. [option not shown in the source]

4. 1,2

5. 1,2,3


Question : In an Oozie workflow, we can have:

1. A control flow node as the start of the workflow

2. A control flow node as the end of the workflow

3. [option not shown in the source]

4. 1,2

5. 1,2,3


Question : How many ways are there by which a job can complete?

1. 0

2. 1

3. 2

4. 3