
IBM Certified Data Architect - Big Data Certification Questions and Answers (Dumps and Practice Questions)



Question : YARN requires a staging directory for temporary files created by running jobs. By default it creates /tmp/hadoop-yarn/staging.
If users cannot run jobs, what could be the reason?

1. Directory path is not correct
2. staging directory is full
3. Directory has restrictive permissions
4. None of the above


Correct Answer : 3

Explanation: YARN requires a staging directory for temporary files created by running jobs. By default it creates /tmp/hadoop-yarn/staging with restrictive permissions that may
prevent your users from running jobs.
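The permission check involved can be sketched in a few lines of Python. This is an illustrative sketch only (the function name and the 1777 example mode are assumptions, not part of YARN itself): a staging directory must typically be world-writable with the sticky bit set, like /tmp, for arbitrary users' jobs to use it.

```python
import stat

def staging_dir_usable(mode: int) -> bool:
    """Illustrative check: can any user write into a directory with
    this mode, as a shared staging directory typically requires
    (e.g. mode 1777, like /tmp)?"""
    return bool(mode & stat.S_IWOTH) and bool(mode & stat.S_IXOTH)

# A directory created as 0o700 (owner-only) blocks other users' jobs,
# while 0o1777 (world-writable, sticky bit) does not.
restrictive = staging_dir_usable(0o700)
permissive = staging_dir_usable(0o1777)
```

In practice an administrator would relax the HDFS directory's permissions (for example with `hdfs dfs -chmod`) rather than check them in code; the sketch only shows why restrictive permissions are the failure mode the question points at.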





Question : In MRv2, Map and Reduce tasks run in containers. Which of the following components is responsible for launching that container?
1. JobHistoryServer
2. NodeManager
3. Application Master
4. Resource Manager

Correct Answer : 2

Explanation: The MapReduce-specific capabilities of the JobTracker have moved into the MapReduce Application Master, one of which is started to manage each MapReduce job and terminated
when the job completes. The JobTracker's function of serving information about completed jobs has been moved to the JobHistoryServer. The TaskTracker has been replaced with the
NodeManager, a YARN service that manages resources and deployment on a node. NodeManager is responsible for launching containers, each of which can house a map or reduce task.
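The daemon mapping described in the explanation can be summarized as a simple lookup. This is purely illustrative (the dictionary and function are not part of any Hadoop API); it just restates which MR2/YARN component took over each JobTracker/TaskTracker duty.

```python
# Illustrative summary of the MR1 -> MR2/YARN component mapping
# described above; not a Hadoop API.
MR1_TO_YARN = {
    "JobTracker (per-job management)": "MapReduce Application Master",
    "JobTracker (completed-job info)": "JobHistoryServer",
    "TaskTracker": "NodeManager",
}

def successor(mr1_role: str) -> str:
    """Return the MR2/YARN component that replaced an MR1 role."""
    return MR1_TO_YARN[mr1_role]
```

The key point for the question: the NodeManager (the TaskTracker's successor) launches containers, each of which can house a map or reduce task.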





Question : In MR1, each node was configured with a fixed number of map slots and a fixed number of reduce slots.
Under YARN, there is no distinction between resources available for maps and resources available for reduces; all resources are available for both.

1. True
2. False


Correct Answer : 1
Explanation: One of the larger changes in MR2 is the way that resources are managed. In MR1, each node was configured with a fixed number of map slots and a fixed number of reduce
slots. Under YARN, there is no distinction between resources available for maps and resources available for reduces; all resources are available for both. Second, the notion of
slots has been discarded, and resources are now configured in terms of amounts of memory (in megabytes) and CPU (in "virtual cores"). Resource configuration is an inherently
difficult topic, and the added flexibility that YARN provides in this regard also comes with added complexity. Cloudera Manager will pick sensible values automatically, but the
details are worth understanding if you are setting up your cluster manually.




Related Questions


Question : In the Hadoop . framework, if HBase is also running on the same node, for which the available RAM is GB, what is the ideal configuration
for "Reserved System Memory"?

1. 1GB
2. 2GB
3. 3GB
4. No need to reserve


Question : MapReduce runs on top of YARN and utilizes YARN containers to schedule and execute its Map and Reduce tasks.
When configuring MapReduce resource utilization on YARN, which of the following aspects should you consider?


1. The physical RAM limit for each Map and Reduce task
2. The JVM heap size limit for each task.
3. The amount of virtual memory each task will receive.
4. 1 and 3
5. All 1,2 and 3
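The three aspects listed above are related, and the relationship can be sketched numerically. Two assumptions in this sketch: the heap fraction of 0.8 is common sizing guidance (leave headroom for non-heap JVM memory), not a fixed rule, and the ratio of 2.1 mirrors the default of YARN's yarn.nodemanager.vmem-pmem-ratio setting.

```python
def task_memory_settings(container_mb: int,
                         heap_fraction: float = 0.8,
                         vmem_pmem_ratio: float = 2.1) -> dict:
    """Illustrative sketch of the three memory aspects for one task:
    1. the physical RAM limit (the container size itself),
    2. the JVM heap limit (commonly ~80% of the container), and
    3. the virtual memory allowance (pmem times the vmem-pmem ratio,
       2.1 being YARN's default yarn.nodemanager.vmem-pmem-ratio).
    The helper name and the 0.8 factor are assumptions, not Hadoop APIs.
    """
    return {
        "physical_mb": container_mb,
        "jvm_heap_mb": int(container_mb * heap_fraction),
        "virtual_mb": int(container_mb * vmem_pmem_ratio),
    }

settings = task_memory_settings(2048)
```

All three limits interact: a heap sized at or above the container limit invites physical-memory kills, and exceeding the virtual allowance gets the container killed too, which is why all of 1, 2, and 3 matter when configuring MapReduce on YARN.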



Question : Assuming you're not running HDFS Federation, what is the maximum number of NameNode daemons you
should run on your cluster in order to avoid a split-brain scenario with your NameNode when running HDFS
High Availability (HA) using Quorum-based storage?


1. Two active NameNodes and two Standby NameNodes
2. One active NameNode and one Standby NameNode
3. Two active NameNodes and one Standby NameNode
4. Unlimited. HDFS High Availability (HA) is designed to overcome limitations on the number of NameNodes you can deploy



Question : When running with N JournalNodes, the system can tolerate at most _____ failures and continue to function normally.
1. N/2
2. (N - 1) / 2
3. (N + 1) / 2
4. (N - 2) / 2


Question : Table schemas in Hive are:
1. Stored as metadata on the NameNode
2. Stored along with the data in HDFS
3. Stored in the Metadata
4. Stored in ZooKeeper
5. Stored in Hive Metastore


Question : __________ are responsible for local monitoring of resource availability, fault reporting, and container life-cycle management (e.g., starting and killing jobs).


1. NodeManagers
2. Application Manager
3. Application Master
4. Resource Manager