
MapR (HP) Hadoop Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : Which is a true statement regarding the Capacity Scheduler?
1. Queues can be configured with their weighted access

2. Hierarchical Queue can be configured

3. [option text not shown in source]

4. 1,3
5. 1,2,3

Correct Answer : [not shown in source]
Explanation: The Capacity Scheduler shares some of the principles of the Fair Scheduler but has distinct differences, too. First, capacity scheduling was designed for large clusters, which may have multiple, independent consumers and target applications. For this reason, capacity scheduling provides greater control as well as the ability to provide a minimum capacity guarantee and share excess capacity among users. The Capacity Scheduler was developed by Yahoo!.

In capacity scheduling, instead of pools, several queues are created, each with a configurable number of map and reduce slots. Each queue is also assigned a
guaranteed capacity (where the overall capacity of the cluster is the sum of each queue's capacity).
Queues are monitored; if a queue is not consuming its allocated capacity, this excess capacity can be temporarily allocated to other queues. Given that queues
can represent a person or larger organization, any available capacity is redistributed for use by other users.

Another difference from fair scheduling is the ability to prioritize jobs within a queue. Generally, jobs with a higher priority have access to resources sooner than lower-priority jobs.

Another difference is the presence of strict access controls on queues (given that queues are tied to a person or organization). These access controls are
defined on a per-queue basis. They restrict the ability to submit jobs to queues and the ability to view and modify jobs in queues.
You configure the capacity scheduler within multiple Hadoop configuration files. The queues are defined within hadoop-site.xml, and the queue configurations
are set in capacity-scheduler.xml. You can configure ACLs within mapred-queue-acls.xml. Individual queue properties include capacity percentage (where the
capacity of all queues in the cluster is less than or equal to 100), the maximum capacity (limit for a queue's use of excess capacity), and whether the queue
supports priorities. Most importantly, these queue properties can be manipulated at run time, allowing them to change and avoid disruptions in cluster use.
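The configuration just described might look like the following minimal sketch for the classic (MRv1) Capacity Scheduler. The queue names `default` and `research` and the specific percentages are hypothetical examples, not values from the questions:

```xml
<!-- In mapred-site.xml (hadoop-site.xml in older releases): declare the queues -->
<property>
  <name>mapred.queue.names</name>
  <value>default,research</value>
</property>

<!-- In capacity-scheduler.xml: per-queue settings -->
<property>
  <name>mapred.capacity-scheduler.queue.default.capacity</name>
  <value>70</value>   <!-- guaranteed 70% of cluster slots -->
</property>
<property>
  <name>mapred.capacity-scheduler.queue.research.capacity</name>
  <value>30</value>   <!-- capacities across all queues must sum to at most 100 -->
</property>
<property>
  <name>mapred.capacity-scheduler.queue.research.maximum-capacity</name>
  <value>50</value>   <!-- cap on this queue's use of excess capacity -->
</property>
<property>
  <name>mapred.capacity-scheduler.queue.research.supports-priority</name>
  <value>true</value> <!-- allow job priorities within this queue -->
</property>
```

ACLs restricting who may submit to or administer each queue would go in mapred-queue-acls.xml, as the explanation notes.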





Question : Select the correct statement regarding the Fair Scheduler
1. The Fair Scheduler gives each user an equal share; by default there is one pool per user.

2. When a slot is free, the most starved job will get the free slot

3. [option text not shown in source]

4. 1,2

5. 1,2,3

Correct Answer : [not shown in source]
Explanation: The core idea behind the fair share scheduler is to assign resources to jobs such that, on average over time, each job gets an equal share of the available resources. The result is that jobs that require less time are able to access the CPU and finish intermixed with the execution of jobs that require more time to execute. This behavior allows for some interactivity among Hadoop jobs and permits greater responsiveness of the Hadoop cluster to the variety of job types submitted.

The Hadoop implementation creates a set of pools into which jobs are placed for selection by the scheduler. Each pool can be assigned a set of shares to
balance resources across jobs in pools (more shares equals greater resources from which jobs are executed). By default, all pools have equal shares, but
configuration is possible to provide more or fewer shares depending upon the job type. The number of jobs active at one time can also be constrained, if
desired, to minimize congestion and allow work to finish in a timely manner.
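The pool configuration described above is expressed in the Fair Scheduler's allocation file (fair-scheduler.xml in MRv1). The following is a minimal sketch; the pool name `research` and the numbers are hypothetical:

```xml
<?xml version="1.0"?>
<allocations>
  <pool name="research">
    <minMaps>10</minMaps>            <!-- guaranteed minimum map slots -->
    <minReduces>5</minReduces>       <!-- guaranteed minimum reduce slots -->
    <maxRunningJobs>3</maxRunningJobs> <!-- cap concurrent jobs to limit congestion -->
    <weight>2.0</weight>             <!-- twice the default share of cluster resources -->
  </pool>
  <userMaxJobsDefault>5</userMaxJobsDefault> <!-- default per-user running-job limit -->
</allocations>
```

Pools not listed here fall back to equal default shares, matching the "all pools have equal shares" behavior the explanation describes.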

To ensure fairness, each user is assigned to a pool. In this way, if one user submits many jobs, he or she can receive the same share of cluster resources as
all other users (independent of the work they have submitted). Regardless of the shares assigned to pools, if the system is not loaded, jobs receive the shares
that would otherwise go unused (split among the available jobs).

The scheduler implementation keeps track of the compute time for each job in the system. Periodically, the scheduler inspects jobs to compute the difference
between the compute time the job received and the time it should have received in an ideal scheduler. The result determines the deficit for the task. The job
of the scheduler is then to ensure that the task with the highest deficit is scheduled next.
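The deficit-based selection described above can be sketched in a few lines of Python. This is a toy illustration of the idea, not Hadoop's actual implementation; the job names and times are made up:

```python
# Toy sketch of deficit-based fair scheduling: a job's deficit is its ideal
# fair share of compute time minus the time it has actually received, and
# the scheduler picks the job with the largest deficit next.

class Job:
    def __init__(self, name):
        self.name = name
        self.received = 0.0   # compute time this job has actually received

def pick_next(jobs, elapsed):
    """Return the most starved job after `elapsed` units of cluster time."""
    fair_share = elapsed / len(jobs)          # ideal equal share per job
    return max(jobs, key=lambda j: fair_share - j.received)

jobs = [Job("etl"), Job("report"), Job("adhoc")]
jobs[0].received = 50.0    # "etl" has already run far more than its share
jobs[1].received = 10.0
jobs[2].received = 30.0

# After 90 units of cluster time the fair share is 30 per job, so "report"
# (deficit 30 - 10 = 20) is the most starved and is scheduled next.
print(pick_next(jobs, 90.0).name)
```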





Question : In the Capacity Scheduler, jobs are submitted to queues. What ordering is maintained inside a queue by default?
1. LIFO

2. FIFO

3. [option text not shown in source]

4. [option text not shown in source]

Correct Answer : 2
Explanation: Jobs within a queue are processed in FIFO order.


Related Questions


Question : Which statement is true about storing files in HDFS?
1. Files are split into blocks
2. All the blocks of a file must remain on the same machine
3. [option text not shown in source]
4. All of the above
5. 1 and 3 are correct


Question : Select the correct statement?
1. Block size is usually 64 MB or 128 MB
2. Blocks are replicated across multiple machines
3. [option text not shown in source]
4. All of the above
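The first two answer choices above touch on how HDFS actually stores a file: it is split into fixed-size blocks, and each block is replicated. A minimal sketch of the arithmetic (the 1 GB file size is a hypothetical example; 128 MB and a replication factor of 3 are common defaults):

```python
# Illustrative arithmetic for HDFS storage: a file is split into fixed-size
# blocks, and every block is replicated across DataNodes.
import math

file_size_mb = 1024   # hypothetical 1 GB file (assumed for illustration)
block_size_mb = 128   # common HDFS block size (64 MB in older releases)
replication = 3       # HDFS default replication factor

num_blocks = math.ceil(file_size_mb / block_size_mb)  # blocks the file splits into
raw_storage_mb = file_size_mb * replication           # raw space consumed cluster-wide

print(num_blocks)      # -> 8
print(raw_storage_mb)  # -> 3072
```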


Question : Which is the master node for tracking the file blocks in HDFS?
1. JobTracker
2. DataNode
3. [option text not shown in source]
4. DataMasterNode


Question : Select the correct options
1. The NameNode stores the metadata for files
2. DataNodes hold the actual data blocks
3. [option text not shown in source]
4. All of the above
5. 1 and 2 are correct


Question : Select the correct statement for the NameNode?
1. The NameNode daemon must be running at all times
2. NameNode holds all its metadata in RAM for fast access.
3. [option text not shown in source]
4. 1,2 and 3 are correct
5. 1 and 2 are correct




Question : If the NameNode stops, the cluster becomes inaccessible?
1. True
2. False