Cloudera Hadoop Administrator Certification Certification Questions and Answer (Dumps and Practice Questions)

Question : In HadoopExam Inc's Geneva Datacenter you have a Hadoop Clucter with NameNode as HadoopExam and Secondary Namenode as HadoopExam all other
remaining nodes are data nodes. Select the appropriate way to determine the available HDFS space in your Geneva datacenter cluster.
1. Using command "hdfs dfsadmin -report"
2. Using command hdfs duadmin -available
3. Using command hadoop fs -du /
4. From the web ui of NameNode http://HadoopExam1:50070/

1. 1,2
2. 2,3
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1,4
5. 1,3

Correct Answer : Get Lastest Questions and Answer :

Explanation: -refreshNodes Re-read the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode and those that should be decommissioned or recommissioned.
-finalizeUpgrade Finalize upgrade of HDFS. Datanodes delete their previous version working directories, followed by Namenode doing the same. This completes the upgrade process.
-upgradeProgress status | details | force Request current distributed upgrade status, a detailed status or force the upgrade to proceed.
-metasave filename Save Namenode's primary data structures to (filename> in the directory specified by hadoop.log.dir property. (filename> will contain one line for each of the following
1. Datanodes heart beating with Namenode 2. Blocks waiting to be replicated 3. Blocks currrently being replicated 4. Blocks waiting to be deleted
-setQuota (quota> (dirname>...(dirname> Set the quota (quota> for each directory (dirname>. The directory quota is a long integer that puts a hard limit on the number of names in the directory tree.
Best effort for the directory, with faults reported if 1. N is not a positive integer, or 2. user is not an administrator, or 3. the directory does not exist or is a file, or 4. the directory would immediately exceed the new quota.
-clrQuota (dirname>...(dirname> Clear the quota for each directory (dirname>.
Best effort for the directory. with fault reported if
1. the directory does not exist or is a file, or 2. user is not an administrator.
It does not fault if the directory has no quota.
-help [cmd] Displays help for the given command or all commands if none is specified. Both the NameNode's Web UI and the hdfs dfsadmin -report command display the amount of space remaining in HDFS. There is no -SpaceQuota option to hdfs dfsadmin. hadoop fs -du / will display the number of bytes used in each directory under the root directory, but will not display the amount of space still available. dfsadmin : Runs a HDFS dfsadmin client. With the following Options -report Reports basic filesystem information and statistics.
-safemode enter | leave | get | wait Safe mode maintenance command. Safe mode is a Namenode state in which it
1. does not accept changes to the name space (read-only)
2. does not replicate or delete blocks. Safe mode is entered automatically at Namenode startup, and leaves safe mode automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well.
-refreshNodes Re-read the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode and those that should be decommissioned or recommissioned.
-finalizeUpgrade Finalize upgrade of HDFS. Datanodes delete their previous version working directories, followed by Namenode doing the same. This completes the upgrade process. -upgradeProgress status | details | force Request current distributed upgrade status, a detailed status or force the upgrade to proceed.
-metasave filename Save Namenode's primary data structures to (filename> in the directory specified by hadoop.log.dir property. (filename> will contain one line for each of the following
1. Datanodes heart beating with Namenode 2. Blocks waiting to be replicated 3. Blocks currrently being replicated 4. Blocks waiting to be deleted
-setQuota (quota> (dirname>...(dirname> Set the quota (quota> for each directory (dirname>. The directory quota is a long integer that puts a hard limit on the number of names in the directory tree. Best effort for the directory, with faults reported if 1. N is not a positive integer, or 2. user is not an administrator, or 3. the directory does not exist or is a file, or 4. the directory would immediately exceed the new quota. -clrQuota (dirname>...(dirname> Clear the quota for each directory (dirname>. Best effort for the directory. with fault reported if 1. the directory does not exist or is a file, or 2. user is not an administrator. It does not fault if the directory has no quota. -help [cmd] Displays help for the given command or all commands if none is specified.

Question :Select the correct statement which applies to container

1. a container is a collection of physical resources such as RAM, CPU cores, and disks on a single node.
2. There can be only one container on a single node
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1 and 2
5. 1 and 3

Correct Answer : Get Lastest Questions and Answer :

Explanation: At the fundamental level, a container is a collection of physical resources such as RAM, CPU cores, and disks on a single node. There can be multiple
containers on a single node (or a single large one). Every node in the system is considered to be composed of multiple containers of minimum size of
memory (e.g., 512 MB or 1 GB) and CPU. The ApplicationMaster can request any container so as to occupy a multiple of the minimum size.
A container thus represents a resource (memory, CPU) on a single node in a given cluster. A container is supervised by the NodeManager and
scheduled by the ResourceManager.
Each application starts out as an ApplicationMaster, which is itself a container (often referred to as container 0). Once started, the ApplicationMaster must
negotiate with the ResourceManager for more containers. Container requests (and releases) can take place in a dynamic fashion at run time. For instance,
a MapReduce job may request a certain amount of mapper containers; as they finish their tasks, it may release them and request more reducer containers
to be started.

Question : Select the correct statement which applies to Node Manager

1. On start-up, the NodeManager registers with the ResourceManager
2. Its primary goal is to manage only the containers (On the node) assigned to it by the ResourceManager
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1 and 2
5. 1 and 3

Correct Answer : Get Lastest Questions and Answer :

Explanation: The NodeManager is YARN's per-node worker agent, taking care of the individual compute nodes in a Hadoop cluster. Its duties include keeping up-todate
with the ResourceManager, overseeing application containers life-cycle management, monitoring resource usage (memory, CPU) of individual
containers, tracking node health, log management, and auxiliary services that may be exploited by different YARN applications.
On start-up, the NodeManager registers with the ResourceManager; it then sends heartbeats with its status and waits for instructions. Its primary goal is
to manage application containers assigned to it by the ResourceManager.
YARN containers are described by a container launch context (CLC). This record includes a map of environment variables, dependencies stored in
remotely accessible storage, security tokens, payloads for NodeManager services, and the command necessary to create the process. After validating the
authenticity of the container lease, the NodeManager configures the environment for the container, including initializing its monitoring subsystem with the
resource constraints specified application. The NodeManager also kills containers as directed by the ResourceManager.

Related Questions

Question : My container is being killed by the Node Manager, why ?

1. This is likely due to high memory usage exceeding your requested container memory size.
2. you have exceeded physical memory limits
3. you have exceeded virtual memory
4. 1 and 2
5. 1,2 and 3

Question :
I have written a Hadoop MapReduce job, which uses the native libarary in the Job. So which is the best way to include the native libraries.

1. Setting -Djava.library.path on the command line while launching a container
2. use LD_LIBRARY_PATH
3. Setting -Dnative.library.path on the command line while launching a container
4. By Adding the Jar's in the Hadoop Job Jar

Question : Select the correct flow, for submitting the YARN application

1. ApplicationMaster needs to register itself with the ResourceManager
2. The client communicates with the ResourceManager using the 'ClientRMProtocol' to first acquire a new 'ApplicationId'
3. Client submits an 'Application' to the YARN Resource Manager
4. The YARN ResourceManager will then launch the ApplicationMaster (as specified) on an allocated container
5. ApplicationMaster has to signal the ResourceManager of its completion
6. ApplicationMaster communicates with the NodeManager using ContainerManager
7. ApplicationMaster can then request for and receive container

1. 7,2,1,4,3,6,5
2. 2,3,4,1,7,5,6
3. 3,2,4,1,7,6,5
4. 3,4,2,7,1,6,5

Question : Map the fillowing

1. ClientRMProtocol
2. AMRMProtocol
3. ContainerManager

A. The protocol used by the ApplicationMaster to talk to the NodeManager to start/stop containers and get status updates on the containers if needed.
B. The protocol for a client that wishes to communicate with the ResourceManager to launch a new application (i.e. the ApplicationMaster),
check on the status of the application or kill the application. For example, a job-client (a job launching program from the gateway) would use this protocol.
C. The protocol used by the ApplicationMaster to register/unregister itself to/from the ResourceManager as well as to request for resources from the Scheduler to complete its tasks.

1. 1-C, 2-A, 3-B
2. 1-A, 2-B, 3-C
3. 1-B, 2-C, 3-A
4. 1-A, 2-C, 3-B

Question : The ApplicationReport received from the ResourceManager consists of the :

1. General application information: ApplicationId, queue to which the application was submitted,
user who submitted the application and the start time for the application.
2. ApplicationMaster details: the host on which the ApplicationMaster is running, the rpc port (if any)
on which it is listening for requests from clients and a token that the client needs to communicate with the ApplicationMaster.
3. Application tracking information: If the application supports some form of progress tracking,
it can set a tracking url which is available via ApplicationReport#getTrackingUrl that a client can look at to monitor progress.
4. ApplicationStatus: The state of the application as seen by the ResourceManager is available via ApplicationReport#getYarnApplicationState.
If the YarnApplicationState is set to FINISHED, the client should refer to ApplicationReport#getFinalApplicationStatus to check for the actual
success/failure of the application task itself. In case of failures, ApplicationReport#getDiagnostics may be useful to shed some more light on the the failure.
5. All of the above

Question :

Select the correct statement for the YARN

1. The ApplicationMaster is the actual owner of the job. It will be launched by the ResourceManager
and via the client will be provided all the necessary information and resources about the job that it has been tasked with to oversee and complete.
2. As the ApplicationMaster is launched within a container that may (likely will) be sharing a physical
host with other containers, given the multi-tenancy nature, amongst other issues, it cannot make any assumptions of things like
pre-configured ports that it can listen on.
3. When the ApplicationMaster starts up, several parameters are made available to it via the environment.
These include the ContainerId for the ApplicationMaster container, the application submission time and details about
the NodeManager host running the Application Master. Ref ApplicationConstants for parameter names
4. 1 and 2
5. 1,2 and 3