Exp: The ApplicationMaster is the process that coordinates an application's execution in the cluster. Each application has its own unique ApplicationMaster, which is tasked with negotiating resources (containers) from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks. In the YARN design, MapReduce is just one application framework; this design permits building and deploying distributed applications using other frameworks. For example, YARN ships with a Distributed-Shell application that allows a shell command or script to be run on multiple nodes in the YARN cluster.
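As a hedged illustration (the jar location and the <version> placeholder depend on the installation), the Distributed-Shell application can be launched from the command line to run a simple command in two containers; its own ApplicationMaster then negotiates those containers from the ResourceManager:
yarn jar $HADOOP_YARN_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-<version>.jar \
    org.apache.hadoop.yarn.applications.distributedshell.Client \
    --jar $HADOOP_YARN_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-<version>.jar \
    --shell_command hostname \
    --num_containers 2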
Once the ApplicationMaster is started (as a container), it will periodically send heartbeats to the ResourceManager to affirm its health and to update the record of its resource demands. After building a model of its requirements, the ApplicationMaster encodes its preferences and constraints in a heartbeat message to the ResourceManager. In response to subsequent heartbeats, the ApplicationMaster receives a lease on containers bound to an allocation of resources at a particular node in the cluster. Depending on the containers it receives from the ResourceManager, the ApplicationMaster may update its execution plan to accommodate any excess or shortage of resources. Container allocation and deallocation can take place dynamically as the application progresses.
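For illustration, once an application is running, the containers its ApplicationMaster currently holds can be inspected with the YARN CLI (these subcommands are available in recent Hadoop 2 releases; the application and attempt IDs below are placeholders):
yarn application -list
yarn applicationattempt -list application_1460000000000_0001
yarn container -list appattempt_1460000000000_0001_000001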
Question: Select the correct statement for HDFS in Hadoop.
1. NameNode federation significantly improves the scalability and performance of HDFS by introducing the ability to deploy multiple NameNodes for a single cluster.
2. HDFS introduces built-in high availability for the NameNode via a new feature called the Quorum Journal Manager (QJM). QJM-based HA features an active NameNode and a standby NameNode.
3. …
4. 1 and 3
5. 1, 2 and 3
Correct Answer: … Exp: Hadoop 2 offers significant improvements beyond YARN, namely improvements in HDFS (the Hadoop Distributed File System) that can influence infrastructure decisions. Whether to use NameNode federation and whether to use NameNode HA (high availability) are two important decisions that most organizations must make. NameNode federation significantly improves the scalability and performance of HDFS by introducing the ability to deploy multiple NameNodes for a single cluster. In addition, HDFS introduces built-in high availability for the NameNode via a new feature called the Quorum Journal Manager (QJM). QJM-based HA features an active NameNode and a standby NameNode. The standby NameNode can become active either through a manual process or automatically. Automatic failover works in coordination with ZooKeeper: Hadoop 2 HDFS introduces the ZKFailoverController, which uses ZooKeeper's election functionality to determine the active NameNode.
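As a brief, hedged illustration (nn1 and nn2 stand for whatever NameNode IDs are configured in hdfs-site.xml), the HA state can be checked and changed with the hdfs haadmin utility, and the ZooKeeper state used by the ZKFailoverController is initialized with hdfs zkfc:
# Check which NameNode is currently active and which is standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# Manually fail over from nn1 to nn2
hdfs haadmin -failover nn1 nn2
# Initialize the znode that the ZKFailoverController uses for automatic failover
hdfs zkfc -formatZK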
Question: In the Acmeshell Inc. Hadoop cluster you have slave DataNodes and a single active NameNode configured. Now you suddenly have to change some of the configuration for all 100 slave nodes. Select the correct statement in such a scenario.
1. Change the slaves configuration on your NameNode.
2. Restart all 100 DataNode daemons.
Explanation: To change the configuration of a DataNode daemon, you must modify the configuration file on the machine on which that daemon is running and then restart the daemon. So to change the configuration of all of the DataNodes, after updating the configuration files you must restart all 100 DataNode daemons. You do not need to restart the NameNode, since its configuration has not changed. You can stop the DataNodes and TaskTrackers from the NameNode using the scripts in Hadoop's bin directory:
./hadoop-daemons.sh stop tasktracker
./hadoop-daemons.sh stop datanode
This script reads the slaves file in Hadoop's conf directory to find the DataNodes, and the same applies to the TaskTrackers. The same slaves file is consulted when starting them again:
./hadoop-daemons.sh start tasktracker
./hadoop-daemons.sh start datanode
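A rough sketch of scripting the whole change from the master node, assuming passwordless ssh to the slaves and that the paths and environment variables below (which are placeholders) are the same on every host:
# Copy the updated config to every host in the slaves file and restart its DataNode
for host in $(cat "$HADOOP_CONF_DIR/slaves"); do
  scp "$HADOOP_CONF_DIR/hdfs-site.xml" "$host:$HADOOP_CONF_DIR/"
  ssh "$host" "$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode; $HADOOP_HOME/bin/hadoop-daemon.sh start datanode"
done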
You need to do something like this:
bin/stop-all.sh (or stop-dfs.sh and stop-yarn.sh in the 2.x series)
rm -Rf /app/tmp/hadoop-your-username/*
bin/hadoop namenode -format (or bin/hdfs namenode -format in the 2.x series)
1. YARN takes into account all of the available compute resources on each machine in the cluster.
2. Based on the available resources, YARN negotiates resource requests from applications (such as MapReduce) running in the cluster.
3. YARN then provides processing capacity to each application by allocating Containers.
4. 1 and 3
5. 1, 2 and 3
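For illustration, assuming a running Hadoop 2 cluster, the per-node resources that YARN takes into account can be inspected with the YARN CLI (the node ID below is a placeholder):
# List the NodeManagers known to the ResourceManager
yarn node -list
# Show the memory and vcore capacity and current usage of a single node
yarn node -status slave01.example.com:45454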