Premium

Cloudera Hadoop Administrator Certification Questions and Answers (Dumps and Practice Questions)



Question : While upgrading CDH with YARN, you have to install (or update) and start ZooKeeper. Why do you have to do that?
1. For High Availability of NameNode
2. For High Availability of JobTracker
3. For High Availability of Resource Manager
4. 1 and 2 only
5. All 1,2 and 3

Correct Answer : 4


Explanation: Install and deploy ZooKeeper.
Cloudera recommends that you install (or update) and start a ZooKeeper cluster before proceeding. This is a requirement if you are deploying high availability (HA) for the NameNode or JobTracker.
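For reference, a minimal sketch of what such a ZooKeeper cluster's configuration might look like: a three-node ensemble in zoo.cfg, where the hostnames are placeholders (not from the question) and the values shown are the common defaults.

```
# zoo.cfg -- minimal three-node ensemble (hostnames are placeholders)
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=10
syncLimit=5
# Each server also needs a myid file in dataDir containing its number (1, 2, or 3)
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```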







Question :

Which of the following steps need to be followed for upgrading the HDFS metadata for an HA deployment?

1. Run the following command on the active NameNode only, and make sure the JournalNodes have been upgraded to CDH 5 and are up and running before you run the command. $ sudo service hadoop-hdfs-namenode -upgrade
2. If Kerberos is not enabled: $ sudo -u hdfs hdfs namenode -bootstrapStandby
$ sudo service hadoop-hdfs-namenode start

3. Start up the DataNodes: On each DataNode: $ sudo service hadoop-hdfs-datanode start

4. Wait for NameNode to exit safe mode, and then start the Secondary NameNode.
a. To check that the NameNode has exited safe mode, look for messages in the log file, or the NameNode's web interface, that say "...no longer in safe mode."
b. To start the Secondary NameNode (if used), enter the following command on the Secondary NameNode host: $ sudo service hadoop-hdfs-secondarynamenode start
Select the correct combination of steps:
1. 1,2,3
2. 2,3,4
3. 1,3,4
4. 1,2,4



Correct Answer : 1
Explanation: Upgrade the HDFS Metadata.
Note: What you do in this step differs depending on whether you are upgrading an HDFS HA deployment using Quorum-based storage, or a non-HA deployment using a Secondary NameNode. (If you have an HDFS HA deployment using NFS storage, do not proceed; you cannot upgrade that configuration to CDH 5. Unconfigure your NFS shared storage configuration before you attempt to upgrade.)
o For an HA deployment, do sub-steps 1, 2, and 3.
o For a non-HA deployment, do sub-steps 1, 3, and 4.
1. To upgrade the HDFS metadata, run the following command on the NameNode. If HA is enabled, do this on the active NameNode only, and make sure the JournalNodes have been upgraded to CDH 5 and are up and running before you run the command.
$ sudo service hadoop-hdfs-namenode -upgrade
Important: In an HDFS HA deployment, it is critically important that you do this on only one NameNode.
You can watch the progress of the upgrade by running: $ sudo tail -f /var/log/hadoop-hdfs/hadoop-hdfs-namenode-<hostname>.log
Look for a line that confirms the upgrade is complete, such as: /var/lib/hadoop-hdfs/cache/hadoop/dfs/<name> is complete
Note: The NameNode upgrade process can take a while depending on how many files you have.
2. Do this step only in an HA configuration. Otherwise skip to starting up the DataNodes.
Wait for NameNode to exit safe mode, and then re-start the standby NameNode.
o If Kerberos is enabled: $ kinit -kt /path/to/hdfs.keytab hdfs/<fully.qualified.domain.name@YOUR-REALM.COM> && hdfs namenode -bootstrapStandby
$ sudo service hadoop-hdfs-namenode start
o If Kerberos is not enabled: $ sudo -u hdfs hdfs namenode -bootstrapStandby
$ sudo service hadoop-hdfs-namenode start
For more information about the haadmin -failover command, see Administering an HDFS High Availability Cluster.
3. Start up the DataNodes: On each DataNode: $ sudo service hadoop-hdfs-datanode start
4. Do this step only in a non-HA configuration. Otherwise skip to starting YARN or MRv1.
Wait for NameNode to exit safe mode, and then start the Secondary NameNode.
a. To check that the NameNode has exited safe mode, look for messages in the log file, or the NameNode's web interface, that say "...no longer in safe mode."
b. To start the Secondary NameNode (if used), enter the following command on the Secondary NameNode host: $ sudo service hadoop-hdfs-secondarynamenode start
c. To complete the cluster upgrade, follow the remaining steps below.
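The "wait for the NameNode to exit safe mode" part of steps 2 and 4 can be scripted. A small sketch, assuming the output format of `hdfs dfsadmin -safemode get` ("Safe mode is ON" / "Safe mode is OFF"); the helper function name is made up for illustration:

```shell
# Hypothetical helper: returns success once the given `hdfs dfsadmin
# -safemode get` output shows the NameNode has left safe mode.
namenode_left_safemode() {
  case "$1" in
    *"Safe mode is OFF"*) return 0 ;;   # out of safe mode
    *) return 1 ;;                      # still in safe mode (or unknown output)
  esac
}

# On a real cluster you would poll, e.g.:
#   until namenode_left_safemode "$(sudo -u hdfs hdfs dfsadmin -safemode get)"; do sleep 5; done
```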






Question : You have correctly configured the YARN cluster; now you have to start the YARN cluster properly with the following steps.

1. On the ResourceManager system:
$ sudo service hadoop-yarn-resourcemanager start

2. On each NodeManager system (typically the same ones where DataNode service runs):
$ sudo service hadoop-yarn-nodemanager start

3. On the MapReduce JobHistory Server system:
$ sudo service hadoop-mapreduce-historyserver start
Select the correct order of the above steps

1. 1,2,3
2. 2,3,1
3. 3,2,1
4. In any random order you can start above services



Correct Answer : 1


Explanation: To start YARN, start the ResourceManager and NodeManager services:
Make sure you always start ResourceManager before starting NodeManager services.
On the ResourceManager system:
$ sudo service hadoop-yarn-resourcemanager start
On each NodeManager system (typically the same ones where DataNode service runs):
$ sudo service hadoop-yarn-nodemanager start

To start the MapReduce JobHistory Server
On the MapReduce JobHistory Server system:
$ sudo service hadoop-mapreduce-historyserver start
For each user who will be submitting MapReduce jobs using MapReduce v2 (YARN), or running Pig, Hive, or Sqoop in a YARN installation, make sure that the HADOOP_MAPRED_HOME environment variable is set correctly as follows:
$ export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
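The start order above can be sketched as a single loop. `echo` stands in for the actual `sudo service "$svc" start`, and the service names are the init-script names from the explanation:

```shell
# Start order matters: ResourceManager first, then the NodeManagers, then
# the JobHistory Server. `echo` is a stand-in for: sudo service "$svc" start
order=""
for svc in hadoop-yarn-resourcemanager \
           hadoop-yarn-nodemanager \
           hadoop-mapreduce-historyserver; do
  echo "would start: $svc"
  order="$order $svc"
done
```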




Related Questions


Question : All of the files required for running a particular YARN application will be put at this path
for the duration of the application run. Which of the following properties in yarn-site.xml will be used to configure this path?

1. yarn.nodemanager.log-dirs

2. yarn.nodemanager.local-dirs

3. yarn.nodemanager.remote-app-log-dir

4. yarn.nodemanager.dirs
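For context, this kind of property is set in yarn-site.xml in the usual Hadoop property form. The snippet below is only a shape sketch; the paths shown are examples, not defaults:

```xml
<!-- yarn-site.xml: where the NodeManager keeps localized application files
     for the duration of an application run (example paths, not defaults) -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/1/yarn/local,/data/2/yarn/local</value>
</property>
```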





Question : Which of the following components in MRv2 maintains the history of the job?

1. MapReduce Server
2. MapReduce JobHistory Server
3. Application Master
4. 2 and 3
5. 1 , 2 and 3


Question : YARN requires a staging directory for temporary files created by running jobs. By default it creates /tmp/hadoop-yarn/staging,
but users cannot run jobs. What could be the reason?

1. The directory path is not correct
2. The staging directory is full
3. The directory has restrictive permissions
4. None of the above
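A local illustration of how restrictive permissions look on a directory. The real staging directory lives in HDFS, where the fix would be along the lines of `sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging` (verify the exact mode against your CDH documentation):

```shell
# Local illustration only: a directory that only its owner can write to
# rejects other users' jobs, which is what breaks the YARN staging dir.
d=$(mktemp -d)
chmod 700 "$d"    # restrictive: only the owner may write
perms=$(stat -c %a "$d" 2>/dev/null || stat -f %Lp "$d")
rmdir "$d"
```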



Question : In MRv2, Map or Reduce tasks run in a container. Which of the following components is responsible for launching that container?
1. JobHistoryServer
2. NodeManager
3. Application Master
4. Resource Manager


Question : Which of the following are the required properties to run the YARN architecture?
1. yarn-site.xml: yarn.resourcemanager.hostname = your.hostname.com

2. yarn-site.xml: yarn.nodemanager.aux-services = mapreduce_shuffle

3. mapred-site.xml: mapreduce.framework.name = yarn

4. All 1,2 and 3
5. No configuration is needed for CDH 5; by default it will run in YARN mode
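Restored to their usual XML form, the properties named in options 1-3 would look like the following (the hostname is the placeholder from the option, not a real value):

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>your.hostname.com</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```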


Question : In MRv1, each node was configured with a fixed number of map slots and a fixed number of reduce slots.
Under YARN, there is no distinction between resources available for maps and resources available for reduces; all resources are available for both.



1. True
2. False