Question : YARN requires a staging directory for temporary files created by running jobs. By default it creates /tmp/hadoop-yarn/staging, but users cannot run jobs. What could be the reason?
1. The directory path is not correct 2. The staging directory is full 3. The directory has restrictive permissions 4. None of the above
Correct Answer : 3
Explanation: YARN requires a staging directory for temporary files created by running jobs. By default it creates /tmp/hadoop-yarn/staging with restrictive permissions that may prevent your users from running jobs.
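To make "restrictive permissions" concrete, here is a minimal Python sketch of the owner/group/other check applied to a mode string of the kind printed by `hdfs dfs -ls`. The function name `can_create_in`, the users, the groups, and the two mode strings are hypothetical illustrations, not the actual defaults of any cluster, and the superuser bypass is deliberately ignored.

```python
from typing import Iterable


def can_create_in(mode: str, owner: str, group: str,
                  user: str, user_groups: Iterable[str]) -> bool:
    """Return True if `user` may create entries in a directory.

    `mode` is a 10-character string such as 'drwxrwx---'.
    Simplified POSIX-style model: pick exactly one triad (owner, group,
    or other) and require both write and execute on the directory.
    """
    perms = mode[1:]  # drop the leading file-type character
    if user == owner:
        triad = perms[0:3]
    elif group in set(user_groups):
        triad = perms[3:6]
    else:
        triad = perms[6:9]
    # 't' in the last position means execute plus sticky bit.
    return triad[1] == "w" and triad[2] in ("x", "t")


# Hypothetical staging directory owned by a service account:
# an ordinary user outside the group cannot create job files in it.
print(can_create_in("drwxrwx---", "yarn", "hadoop", "alice", ["analysts"]))
# The same check against a world-writable directory with the sticky bit.
print(can_create_in("drwxrwxrwt", "hdfs", "supergroup", "alice", ["analysts"]))
```

The first call models the failure described in the question: the path exists and has free space, yet the submitting user falls into the "other" triad, which grants nothing.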
Question : In MRv2, Map and Reduce tasks run in a container. Which of the following components is responsible for launching that container? 1. JobHistoryServer 2. NodeManager 3. Application Master 4. Resource Manager
Correct Answer : 2
Explanation: The MapReduce-specific capabilities of the JobTracker have moved into the MapReduce Application Master, one of which is started to manage each MapReduce job and terminated when the job completes. The JobTracker's function of serving information about completed jobs has been moved to the JobHistoryServer. The TaskTracker has been replaced with the NodeManager, a YARN service that manages resources and deployment on a node. NodeManager is responsible for launching containers, each of which can house a map or reduce task.
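The mapping in this explanation can be written down as a small lookup table. This is only a restatement of the sentences above in Python; the names `MR1_TO_MR2` and `replacements_for` are hypothetical, and the table covers only the responsibilities the explanation mentions.

```python
# (MR1 daemon, responsibility) -> MR2/YARN component, as described above.
MR1_TO_MR2 = {
    ("JobTracker", "manage a running MapReduce job"): "MapReduce Application Master",
    ("JobTracker", "serve information about completed jobs"): "JobHistoryServer",
    ("TaskTracker", "manage resources and launch containers on a node"): "NodeManager",
}


def replacements_for(mr1_daemon: str) -> list:
    """List the MR2 components that took over work from an MR1 daemon."""
    found = {new for (old, _), new in MR1_TO_MR2.items() if old == mr1_daemon}
    return sorted(found)


print(replacements_for("TaskTracker"))
print(replacements_for("JobTracker"))
```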
Question : In MR1, each node was configured with a fixed number of map slots and a fixed number of reduce slots. Under YARN, there is no distinction between resources available for maps and resources available for reduces; all resources are available for both.
1. True 2. False
Correct Answer : 1
Explanation: One of the larger changes in MR2 is the way that resources are managed. In MR1, each node was configured with a fixed number of map slots and a fixed number of reduce slots. Under YARN, there is no distinction between resources available for maps and resources available for reduces; all resources are available for both. Second, the notion of slots has been discarded, and resources are now configured in terms of amounts of memory (in megabytes) and CPU (in "virtual cores"). Resource configuration is an inherently difficult topic, and the added flexibility that YARN provides in this regard also comes with added complexity. Cloudera Manager will pick sensible values automatically, but if you are setting up your cluster manually or just interested in the details
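The shift from slots to memory and virtual cores can be illustrated with a rough capacity calculation. The sketch below is a simplification under stated assumptions: the function name `containers_that_fit` and all numbers are hypothetical, and it ignores scheduler details such as rounding requests up to a minimum allocation.

```python
def containers_that_fit(node_memory_mb: int, node_vcores: int,
                        request_memory_mb: int, request_vcores: int) -> int:
    """Rough upper bound on identical containers one node can host.

    The node-side numbers would come from settings such as
    yarn.nodemanager.resource.memory-mb and
    yarn.nodemanager.resource.cpu-vcores. The same pool serves map and
    reduce containers alike; there are no separate slot types.
    """
    if request_memory_mb <= 0 or request_vcores <= 0:
        raise ValueError("requests must be positive")
    by_memory = node_memory_mb // request_memory_mb
    by_cpu = node_vcores // request_vcores
    # Whichever resource runs out first limits the node.
    return min(by_memory, by_cpu)


# Hypothetical node: 8192 MB and 8 vcores offered to YARN.
print(containers_that_fit(8192, 8, 1024, 1))  # small containers
print(containers_that_fit(8192, 8, 3072, 1))  # larger, memory-bound containers
```

With these illustrative numbers the first call is limited equally by memory and CPU, while the second is limited by memory alone, which is the kind of trade-off fixed map and reduce slots could not express.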
1. The physical RAM limit for each Map and Reduce task 2. The JVM heap size limit for each task. 3. The amount of virtual memory each task will receive. 4. 1 and 3 5. All 1,2 and 3
1. Two active NameNodes and two Standby NameNodes 2. One active NameNode and one Standby NameNode 3. Two active NameNodes and one Standby NameNode 4. Unlimited. HDFS High Availability (HA) is designed to overcome limitations on the number of NameNodes you can deploy
Question : Table schemas in Hive are: 1. Stored as metadata on the NameNode 2. Stored along with the data in HDFS 3. Stored in the Metadata 4. Stored in ZooKeeper 5. Stored in Hive Metastore