
Cloudera Hadoop Administrator Certification Questions and Answers (Dumps and Practice Questions)



Question : Map the following configuration properties in yarn-site.xml and mapred-site.xml to their value calculations:

1. yarn.nodemanager.resource.memory-mb
2. yarn.scheduler.minimum-allocation-mb
3. mapreduce.reduce.memory.mb

A. RAM-per-container
B. containers * RAM-per-container
C. 2 * RAM-per-container

1. 1-A,2-B,3-C
2. 1-C,2-B,3-A
3. 1-B,2-A,3-C
4. 1-B, 2-C, 3-A

Correct Answer : 3
Configuration File      Configuration Setting                   Value Calculation
yarn-site.xml           yarn.nodemanager.resource.memory-mb     = containers * RAM-per-container
yarn-site.xml           yarn.scheduler.minimum-allocation-mb    = RAM-per-container
yarn-site.xml           yarn.scheduler.maximum-allocation-mb    = containers * RAM-per-container
mapred-site.xml         mapreduce.map.memory.mb                 = RAM-per-container
mapred-site.xml         mapreduce.reduce.memory.mb              = 2 * RAM-per-container
mapred-site.xml         mapreduce.map.java.opts                 = 0.8 * RAM-per-container
mapred-site.xml         mapreduce.reduce.java.opts              = 0.8 * 2 * RAM-per-container
yarn-site.xml (check)   yarn.app.mapreduce.am.resource.mb       = 2 * RAM-per-container
yarn-site.xml (check)   yarn.app.mapreduce.am.command-opts      = 0.8 * 2 * RAM-per-container
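The mapping above can be sketched as a small calculation. This is a minimal illustrative sketch: the function name and the sample inputs (20 containers, 2048 MB each) are assumptions, not values from the question.

```python
# Derive the YARN/MapReduce memory settings from two inputs:
# containers per node and RAM per container (MB). Heap settings
# follow the 0.8 * container-memory rule of thumb from the table.
def memory_settings(containers, ram_per_container_mb):
    return {
        # yarn-site.xml
        "yarn.nodemanager.resource.memory-mb": containers * ram_per_container_mb,
        "yarn.scheduler.minimum-allocation-mb": ram_per_container_mb,
        "yarn.scheduler.maximum-allocation-mb": containers * ram_per_container_mb,
        # mapred-site.xml
        "mapreduce.map.memory.mb": ram_per_container_mb,
        "mapreduce.reduce.memory.mb": 2 * ram_per_container_mb,
        "mapreduce.map.java.opts": "-Xmx%dm" % int(0.8 * ram_per_container_mb),
        "mapreduce.reduce.java.opts": "-Xmx%dm" % int(0.8 * 2 * ram_per_container_mb),
        "yarn.app.mapreduce.am.resource.mb": 2 * ram_per_container_mb,
        "yarn.app.mapreduce.am.command-opts": "-Xmx%dm" % int(0.8 * 2 * ram_per_container_mb),
    }

# Example: 20 containers of 2 GB each on a node.
settings = memory_settings(containers=20, ram_per_container_mb=2048)
print(settings["yarn.nodemanager.resource.memory-mb"])  # 40960
```

Note how every value in the table is a function of only the two inputs; that is why the exam asks you to map properties to formulas rather than to fixed numbers.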





Question : If cluster nodes have 12 CPU cores, 48 GB RAM, and 12 disks, and HBase is also installed, what should the Reserved Memory be?

1. 6 GB reserved for system memory + (if HBase) 8 GB for HBase
2. 4 GB reserved for system memory + (if HBase) 8 GB for HBase
3. 2 GB reserved for system memory + (if HBase) 8 GB for HBase
4. 12 GB reserved for system memory + (if HBase) 8 GB for HBase

Correct Answer : 1


Explanation: Cluster nodes have 12 CPU cores, 48 GB RAM, and 12 disks.

Reserved Memory = 6 GB reserved for system memory + (if HBase) 8 GB for HBase

Min container size = 2 GB

If there is no HBase:

# of containers = min(2 * CORES, 1.8 * DISKS, (Total RAM - Reserved) / MIN_CONTAINER_SIZE)
                = min(2*12, 1.8*12, (48-6)/2) = min(24, 21.6, 21) = 21

RAM-per-container = max(MIN_CONTAINER_SIZE, (Total RAM - Reserved) / containers)
                  = max(2, (48-6)/21) = max(2, 2) = 2 GB

Configuration                           Value Calculation
yarn.nodemanager.resource.memory-mb     = 21 * 2 = 42*1024 MB
yarn.scheduler.minimum-allocation-mb    = 2*1024 MB
yarn.scheduler.maximum-allocation-mb    = 21 * 2 = 42*1024 MB
mapreduce.map.memory.mb                 = 2*1024 MB
mapreduce.reduce.memory.mb              = 2 * 2 = 4*1024 MB
mapreduce.map.java.opts                 = 0.8 * 2 = 1.6*1024 MB
mapreduce.reduce.java.opts              = 0.8 * 2 * 2 = 3.2*1024 MB
yarn.app.mapreduce.am.resource.mb       = 2 * 2 = 4*1024 MB
yarn.app.mapreduce.am.command-opts      = 0.8 * 2 * 2 = 3.2*1024 MB
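The container arithmetic above can be sketched directly. This is a minimal sketch of the no-HBase case; the variable names are illustrative, the numbers are the ones from the explanation (12 cores, 48 GB RAM, 12 disks, 6 GB reserved, 2 GB minimum container).

```python
# Sizing formula for the example node: the container count is the most
# constrained of the CPU, disk, and memory bounds.
cores, ram_gb, disks = 12, 48, 12
reserved_gb, min_container_gb = 6, 2

available_gb = ram_gb - reserved_gb                      # 42 GB usable by YARN
containers = int(min(2 * cores,                          # CPU bound: 24
                     1.8 * disks,                        # disk bound: 21.6
                     available_gb / min_container_gb))   # memory bound: 21
ram_per_container_gb = max(min_container_gb, available_gb // containers)

print(containers, ram_per_container_gb)  # 21 2
```

Here the memory bound (21) is the tightest, so the node runs 21 containers of 2 GB each, matching the 42*1024 MB figure for yarn.nodemanager.resource.memory-mb.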







Question : MapReduce runs on top of YARN and utilizes YARN Containers to schedule and execute its Map and Reduce tasks.
When configuring MapReduce resource utilization on YARN, which of the following aspects should you consider?


1. The physical RAM limit for each Map and Reduce task
2. The JVM heap size limit for each task.
3. The amount of virtual memory each task will receive.
4. 1 and 3
5. All 1,2 and 3


Correct Answer : 5
MapReduce runs on top of YARN and utilizes YARN Containers to schedule and execute its Map and Reduce tasks. When configuring MapReduce resource utilization on YARN, there are three aspects to consider:

The physical RAM limit for each Map and Reduce task.

The JVM heap size limit for each task.

The amount of virtual memory each task will receive.

You can define a maximum amount of memory for each Map and Reduce task. Since each Map and Reduce task will run in a separate Container, these maximum memory settings should be equal to or greater than the YARN minimum Container allocation.
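The constraint in the last paragraph can be checked mechanically. A minimal sketch, assuming illustrative names and sample values (4 GB map container, 3 GB heap, 2 GB YARN minimum); the ~80% heap ratio is the rule of thumb used elsewhere in this document.

```python
# Validate one task type's memory settings against the two rules:
# (1) container memory >= YARN minimum allocation, and
# (2) the JVM heap must fit inside the container.
def check_task_memory(task_memory_mb, heap_mb, yarn_min_allocation_mb):
    assert task_memory_mb >= yarn_min_allocation_mb, "container below YARN minimum"
    assert heap_mb < task_memory_mb, "JVM heap must fit inside the container"
    return True

# Example: a 4 GB map container with a 3 GB heap on a cluster
# whose yarn.scheduler.minimum-allocation-mb is 2048.
print(check_task_memory(task_memory_mb=4096, heap_mb=3072, yarn_min_allocation_mb=2048))  # True
```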




Related Questions


Question : Each machine in our cluster has 48 GB of RAM. Some of this RAM should be reserved for Operating System usage. On each node,
we will assign 40 GB RAM for YARN to use and keep 8 GB for the Operating System. The following property sets the maximum memory YARN can utilize on the node:
yarn.nodemanager.resource.memory-mb --> 40960
YARN takes the resource management capabilities that were in MapReduce and packages them so they can be used by new engines.
This also streamlines MapReduce to do what it does best, process data. With YARN, you can now run multiple applications in Hadoop,
all sharing a common resource management.

Consider an example physical cluster of slave nodes, each with 48 GB RAM, 12 disks, and 2 hex-core CPUs (12 total cores).
In yarn-site.xml: yarn.nodemanager.resource.memory-mb --> 40960

The next step is to provide YARN guidance on how to break up the total resources available into Containers. You do this by specifying the minimum unit of RAM to allocate for a Container. We want to allow for a maximum of 20 Containers, and thus need (40 GB total RAM) / (20 # of Containers) = 2 GB minimum per container:

In yarn-site.xml yarn.scheduler.minimum-allocation-mb --> 2048
YARN will allocate Containers with RAM amounts greater than or equal to yarn.scheduler.minimum-allocation-mb.

MapReduce 2 runs on top of YARN and utilizes YARN Containers to schedule and execute its map and reduce tasks.
When configuring MapReduce 2 resource utilization on YARN, there are three aspects to consider:
* The physical RAM limit for each Map and Reduce task
* The JVM heap size limit for each task
* The amount of virtual memory each task will get

You can define how much maximum memory each Map and Reduce task will take. Since each Map and each Reduce will run in a separate Container, these maximum memory settings should be at least equal to or more than the YARN minimum Container allocation. For our example cluster, we have the minimum RAM for a Container (yarn.scheduler.minimum-allocation-mb) = 2 GB. We will thus assign 4 GB for Map task Containers, and 8 GB for Reduce tasks Containers.

In mapred-site.xml: mapreduce.map.memory.mb --> 4096, mapreduce.reduce.memory.mb --> 8192

Each Container will run JVMs for the Map and Reduce tasks. The JVM heap size should be set lower than the Map and Reduce memory defined above, so that they are within the bounds of the Container memory allocated by YARN.
In mapred-site.xml: mapreduce.map.java.opts --> -Xmx3072m, mapreduce.reduce.java.opts --> -Xmx6144m

In the above scenario:
1. YARN will be able to allocate on each node up to 10 mappers or 3 reducers or a permutation within that.
2. YARN will be able to allocate on each node up to 8 mappers or 5 reducers or a permutation within that.
3. YARN will be able to allocate on each node up to 10 mappers or 5 reducers or a permutation within that.
4. With YARN and MapReduce 2, you will have pre-configured static slots for Map and Reduce tasks
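The arithmetic behind these options follows from the settings above (40 GB of node memory for YARN, 4 GB map containers, 8 GB reduce containers); a minimal sketch with illustrative variable names:

```python
# Per-node concurrency limits: how many 4 GB map containers or
# 8 GB reduce containers fit in the 40 GB given to YARN.
yarn_node_memory_gb = 40
map_container_gb, reduce_container_gb = 4, 8

max_mappers = yarn_node_memory_gb // map_container_gb      # 10
max_reducers = yarn_node_memory_gb // reduce_container_gb  # 5
print(max_mappers, max_reducers)  # 10 5
```

Unlike MRv1's fixed map and reduce slots, YARN lets these limits trade off against each other, so any mix of mappers and reducers whose containers sum to at most 40 GB can run concurrently.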


Question : Which basic configuration parameters must you set to migrate your cluster from MapReduce 1 (MRv1) to MapReduce 2 (MRv2)?



1. A,B,C
2. B,C,D
3. Access Mostly Uused Products by 50000+ Subscribers
4. B,D,E
5. D,E,F


Question : Which is the default scheduler in YARN?
1. YARN doesn't configure a default scheduler, you must first assign an appropriate scheduler class in yarn-site.xml
2. Capacity Scheduler
3. Fair Scheduler
4. FIFO Scheduler


Question : Select the correct option which you will use to kill an already running MapReduce job in MRv2.

1. ssh to the node running the ApplicationMaster, find the process ID with grep, and kill the session.
2. yarn application -kill "my_submitted_job_application_id"
3. Access Mostly Uused Products by 50000+ Subscribers
4. hadoop datanode -rollback


Question : Which of the following is a correct command to submit a YARN job, assuming your code is deployed in hadoopexam.jar?
1. java jar hadoopexam.jar [mainClass] args...
2. yarn jar hadoopexam.jar [mainClass] args...
3. Access Mostly Uused Products by 50000+ Subscribers
4. yarn jar hadoopexam.jar args...


Question : Which of the following commands can be used to list all the jobs or applications running in the ResourceManager?
1. yarn application -list
2. yarn application -listAll
3. Access Mostly Uused Products by 50000+ Subscribers
4. yarn application -allJobs