Question : Select the recommended approach for setting memory parameters
A. The mapred.child.ulimit parameter should be more than twice the heap size.
B. The io.sort.mb parameter must be less than the heap size.
C. It is better to use an environment variable to set the JVM heap size instead of job-specific parameters.
D. A,B
E. A,C
1. The mapred.child.ulimit parameter should be more than twice the heap size.
2. The io.sort.mb parameter must be less than the heap size.
Correct Answer : D
Explanation: 1. mapred.child.java.opts and mapred.job.map.memory.mb control different aspects of memory usage. mapred.child.java.opts only sets the maximum heap size a child JVM can use; mapred.job.map.memory.mb is the maximum virtual memory allowed for a Hadoop task subprocess, and it can be larger than mapred.child.java.opts because it must also cover memory outside the heap (stack, permgen, etc.).
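As a minimal sketch (using the old mapred.* property names quoted above; the values are purely illustrative, not recommendations), the two limits are typically set together so that the JVM heap stays below the task's virtual-memory ceiling:

import org.apache.hadoop.conf.Configuration;

public class HeapVsVirtualMemory {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Maximum heap of each child JVM (the space managed by the garbage collector).
        conf.set("mapred.child.java.opts", "-Xmx512m");

        // Maximum virtual memory of the whole task subprocess, in MB.
        // Kept well above the heap so stack, permgen, etc. also fit.
        conf.setInt("mapred.job.map.memory.mb", 1024);

        System.out.println("heap = " + conf.get("mapred.child.java.opts"));
        System.out.println("vmem = " + conf.getInt("mapred.job.map.memory.mb", -1) + " MB");
    }
}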
2. Sometimes setting mapred.child.java.opts alone is not enough, for two reasons: a. this property only controls the heap size of each JVM, so it is not flexible (e.g. a mapper task may want more memory while the reducer needs only a little); b. this property does not account for new processes spawned by the original task, which are not constrained in their total memory, and such processes can drive up the memory usage of the whole task process tree.
Hadoop gives us two choices for addressing these disadvantages. One is setting mapred.child.ulimit. This property is a strict upper bound that prevents a single JVM process from leaking memory and affecting other running processes. However, it is also inflexible and does not take the spawned processes into account.
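A short sketch of the ulimit option, again with illustrative numbers only (mapred.child.ulimit is expressed in kilobytes), keeping the limit at more than twice the heap as the question recommends:

import org.apache.hadoop.conf.Configuration;

public class UlimitSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // 512 MB heap for each child JVM.
        conf.set("mapred.child.java.opts", "-Xmx512m");

        // Hard virtual-memory cap for the launched child, in KB.
        // 1280 MB here is more than twice the 512 MB heap, leaving room
        // for stack, native buffers, and so on.
        conf.setLong("mapred.child.ulimit", 1310720L); // 1280 MB in KB
    }
}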
The other choice is setting mapred.cluster.map.memory.mb (the size, in terms of virtual memory, of a single map slot in the Map-Reduce framework, used by the scheduler; a job can ask for multiple slots for a single map task via mapred.job.map.memory.mb, up to the limit specified by mapred.cluster.max.map.memory.mb) together with mapred.job.map.memory.mb (the size, in terms of virtual memory, of a single map task for the job; if a map task uses more memory than this value, the task is terminated).
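To illustrate the slot accounting described above (the numbers are assumptions for the example, not defaults), a map task asking for more virtual memory than one slot provides is simply charged enough whole slots to cover the request:

public class SlotMath {
    public static void main(String[] args) {
        int clusterSlotMb = 1024;   // mapred.cluster.map.memory.mb (cluster-level slot size)
        int jobMapMb      = 2560;   // mapred.job.map.memory.mb (per-map-task request)

        // The scheduler charges enough whole slots to cover the request.
        int slotsPerMapTask = (int) Math.ceil((double) jobMapMb / clusterSlotMb);
        System.out.println("Each map task occupies " + slotsPerMapTask + " slot(s)"); // prints 3
    }
}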
-Xmx specifies the maximum heap space of the allocated JVM. This is the space reserved for object allocation that is managed by the garbage collector. On the other hand, mapred.job.map.memory.mb specifies the maximum virtual memory allowed for a Hadoop task subprocess. If you exceed the max heap size, the JVM throws an OutOfMemoryError.
The JVM may use more memory than the max heap size because it also needs space to store class metadata (permgen space) and the stack. If the process uses more virtual memory than mapred.job.map.memory.mb, it is killed by Hadoop.
So one does not take precedence over the other (they measure different aspects of memory usage): -Xmx is a parameter to the JVM, while mapred.job.map.memory.mb is a hard upper bound on the virtual memory a task attempt can use, enforced by Hadoop.
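The difference can be seen with a small stand-alone demo (not Hadoop-specific; the -Xmx value is just an example): run it with a small heap, e.g. java -Xmx64m HeapExhaustionDemo, and the JVM itself fails with OutOfMemoryError, whereas exceeding mapred.job.map.memory.mb would instead get the task attempt killed from outside the JVM.

import java.util.ArrayList;
import java.util.List;

public class HeapExhaustionDemo {
    public static void main(String[] args) {
        List<byte[]> hog = new ArrayList<>();
        while (true) {
            // Allocate 1 MB per iteration until the -Xmx heap limit is hit,
            // at which point the JVM throws java.lang.OutOfMemoryError.
            hog.add(new byte[1024 * 1024]);
        }
    }
}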
Question : Which of the following can help to improve the performance of a MapReduce job?