Question : Place the following steps in order of execution for the MapR Direct Shuffle workflow in YARN.
A. After that, the Node Manager on each node launches containers using information about the node's local volume from the LocalVolumeAuxiliaryService.
B. The Application Master service initializes the application by calling initializeApplication() on the LocalVolumeAuxiliaryService.
C. The Application Master service requests task containers from the Resource Manager.
D. Then the Resource Manager sends the Application Master information that the Application Master uses to request containers from the Node Manager.
Correct Answer : B, C, D, A
Explanation: The MapR Direct Shuffle workflow in YARN proceeds as follows:
1. The Application Master service initializes the application by calling initializeApplication() on the LocalVolumeAuxiliaryService.
2. The Application Master service requests task containers from the Resource Manager.
3. The Resource Manager sends the Application Master information that the Application Master uses to request containers from the Node Manager.
4. The Node Manager on each node launches containers using information about the node's local volume from the LocalVolumeAuxiliaryService.
5. Data from map tasks is saved in the Application Master for later use in Task Completion events, which are requested by reduce tasks.
6. As map tasks complete, map outputs and map-side spills are written to the local volumes on the map task nodes, generating Task Completion events.
7. Reduce tasks fetch Task Completion events from the Application Master. The Task Completion events include information on the location of map output data, enabling reduce tasks to copy data from the map output locations.
8. Reduce tasks read the map output information; spills and interim merges are written to local volumes on the reduce task nodes.
9. Finally, the Application Master calls stopApplication() on the LocalVolumeAuxiliaryService to clean up data on the local volume.
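Because the shuffle data lands on MapR-FS local volumes rather than on raw local disk, one way to inspect those volumes on a running cluster is a sketch like the following (the grep pattern is an assumption; local volume names vary by node and release):
maprcli volume list -columns volumename,mountdir | grep local
# node-local shuffle volumes typically appear with names such as mapr.<hostname>.local.mapred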
Question : You have Hadoop ecosystem components that use MapReduce under the hood. How can you define which version of MapReduce to use, either classic or YARN?
1. Set the default_mode parameter in the /opt/mapr/conf/hadoop_version configuration file of each component.
2. While submitting the job, use the maprcli command and set the mode argument.
Correct Answer :
Explanation: You can set the MapReduce mode with an environment variable on a client or cluster node. The MapReduce mode set in the environment variable overrides the default_mode set on the client node and on the cluster. Using this method, you can open multiple terminals and set each shell to use a different mode; therefore, you can run both MapReduce v1 and MapReduce v2 jobs from the same client machine at the same time. For information on how to set the environment variable for an ecosystem service, see Managing the MapReduce Mode for Ecosystem Components. To set the MapReduce mode with an environment variable, open a terminal on the client node and enter one of the following commands in the shell:
export MAPR_MAPREDUCE_MODE=yarn
export MAPR_MAPREDUCE_MODE=classic
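For example, a session that submits the same example job in each mode might look like the following sketch (the example jar paths and the input/output directories are illustrative, not authoritative):
export MAPR_MAPREDUCE_MODE=yarn      # this shell now submits MRv2 applications
hadoop jar /opt/mapr/hadoop/hadoop-*/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /user/mapr/in /user/mapr/out-yarn
export MAPR_MAPREDUCE_MODE=classic   # the same shell now submits MRv1 jobs instead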
You can use the maprcli command line or the MapR Control System (MCS) to change the MapReduce mode for the entire cluster. The maprcli command line and the MCS use Central Configuration to update all the nodes in the cluster with the MapReduce mode. To set the cluster MapReduce mode using maprcli, run the following command:
maprcli cluster mapreduce set -mode <classic|yarn>
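A minimal sketch of setting and then verifying the cluster-wide mode (the get subcommand is assumed to be available, as on MapR 4.x):
maprcli cluster mapreduce set -mode yarn
maprcli cluster mapreduce get      # prints the current cluster-wide MapReduce mode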
You can edit the hadoop_version file on a MapR client node to configure the default MapReduce mode for all jobs and applications that you submit from the client node. The mode that you set on the client overrides the MapReduce mode of the cluster. For information about how this impacts ecosystem clients, see Managing the MapReduce Mode for Ecosystem Components. To set the MapReduce mode on the client node, open the hadoop_version file in /opt/mapr/conf/ and edit the default_mode parameter. The following values are valid for the default_mode parameter:
default_mode=classic specifies that the client submits MapReduce v1 jobs to the cluster.
default_mode=yarn specifies that the client submits MapReduce v2 applications to the cluster.
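As a sketch, a hadoop_version file on a client configured for YARN might look like this; default_mode is the parameter described above, while the other keys and the version numbers are illustrative assumptions:
$ cat /opt/mapr/conf/hadoop_version
classic_version=0.20.2
yarn_version=2.4.1
default_mode=yarn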
You can set the MapReduce mode in the command line to override the MapReduce mode set in an environment variable, and the default mode of the client node and the cluster. Using this method, you can run MapReduce v1 and MapReduce v2 jobs one after the other from the same shell.
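On MapR 4.x clients this per-command override is exposed through mode-specific launcher scripts; the names below (hadoop1 for classic, hadoop2 for YARN), the jar, and the driver class are assumptions to verify against your client version:
hadoop2 jar my-app.jar com.example.MyDriver /in /out   # submit this one job as a YARN (MRv2) application
hadoop1 jar my-app.jar com.example.MyDriver /in /out   # submit this one job as a classic MRv1 job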
To set the MapReduce mode for ecosystem components that connect directly to the cluster: Open a terminal on the client node. Enter one of the following commands in the shell:
export MAPR_MAPREDUCE_MODE=yarn
export MAPR_MAPREDUCE_MODE=classic
Then launch the ecosystem client and submit the job or application.
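For instance, to run a Hive query whose underlying MapReduce work should run on YARN (the table name is hypothetical):
export MAPR_MAPREDUCE_MODE=yarn
hive -e "SELECT COUNT(*) FROM web_logs;"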
Question : In a YARN cluster, which of the following components can you use to monitor your job?
Correct Answer :
Explanation: The history server REST APIs allow the user to get status on finished applications. YARN has a single MapReduce JobHistory Server, which typically runs on a master node. As indicated by its name, the function of the MapReduce JobHistory Server is to store and serve a history of the MapReduce jobs that were run on the cluster.
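For example, the JobHistory Server REST API exposes finished jobs under /ws/v1/history; a quick check with a placeholder hostname:
curl http://historyserver.example.com:19888/ws/v1/history/mapreduce/jobs
# returns a JSON list of finished MapReduce jobs (job id, state, user, start and finish times)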
You need only one JobHistory Server. It can run on any node you like, including a dedicated node of its own, but it traditionally runs on the same node as the ResourceManager. The one history server is declared in mapred-site.xml:
mapreduce.jobhistory.address: MapReduce JobHistory Server host:port. Default port is 10020.
mapreduce.jobhistory.webapp.address: MapReduce JobHistory Server Web UI host:port. Default port is 19888.
mapreduce.jobhistory.intermediate-done-dir: Directory where history files are written by MapReduce jobs (in HDFS). Default is /mr-history/tmp.
mapreduce.jobhistory.done-dir: Directory where history files are managed by the MR JobHistory Server (in HDFS). Default is /mr-history/done.
You access the history via the JobHistory Server REST API; you do not access the internal history files directly. For casual browsing, the history is also available in the ResourceManager web UI.
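As an illustration, the two address properties above could be declared in mapred-site.xml as in this sketch (the entries go inside the <configuration> element; the hostname is a placeholder):
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>historyserver.example.com:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>historyserver.example.com:19888</value>
</property>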