Explanation: The YARN ResourceManager (RM) tracks the resources in a cluster and schedules applications (for example, MapReduce jobs). Before CDH 5, the RM was a single point of failure in a YARN cluster. The RM high availability (HA) feature removes this single point of failure by adding redundancy in the form of an Active/Standby RM pair. Furthermore, upon failover from the Active RM to the Standby, applications can resume from their last checkpointed state; for example, completed map tasks in a MapReduce job are not re-run on a subsequent attempt. This allows events such as the following to be handled without any significant performance effect on running applications: unplanned events such as machine crashes, and planned maintenance events such as software or hardware upgrades on the machine running the ResourceManager. RM HA requires the ZooKeeper and HDFS services to be running.
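The checkpoint-and-resume behaviour described above can be illustrated with a small toy model. This is only a sketch: it does not use any YARN API, and the `StateStore` and `ResourceManager` classes, the `crash_after` parameter, and the job and task names are all hypothetical. It only shows why map tasks completed before a failover do not need to run again.

```python
# Toy model of Active/Standby failover with a shared state store.
# Nothing here is a YARN API; every name is hypothetical.


class StateStore:
    """Stands in for the shared store that both ResourceManagers can reach."""

    def __init__(self):
        self.completed = {}  # job id -> set of finished map task ids

    def checkpoint(self, job_id, task_id):
        self.completed.setdefault(job_id, set()).add(task_id)


class ResourceManager:
    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.executed = []  # task ids this RM actually scheduled

    def run_job(self, job_id, task_ids, crash_after=None):
        """Run tasks in order, skipping any that are already checkpointed.

        Returns True if the job finished, False if this RM "crashed" first.
        """
        done = self.store.completed.setdefault(job_id, set())
        for task_id in task_ids:
            if task_id in done:
                continue  # finished before the failover, so it is not re-run
            if crash_after is not None and len(self.executed) >= crash_after:
                return False  # simulated machine crash
            self.executed.append(task_id)
            self.store.checkpoint(job_id, task_id)
        return True


store = StateStore()
tasks = ["map-0", "map-1", "map-2", "map-3"]

active = ResourceManager("rm1", store)
finished_on_active = active.run_job("job-1", tasks, crash_after=2)

standby = ResourceManager("rm2", store)  # takes over after the crash
finished_on_standby = standby.run_job("job-1", tasks)

print(active.executed)   # ['map-0', 'map-1']
print(standby.executed)  # ['map-2', 'map-3']
```

In this sketch the second RM picks up only the two tasks that were not checkpointed, which mirrors the "not re-run on a subsequent attempt" behaviour in the explanation.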
Question : A company has to design a new data system. They will need to support several OLTP applications. Every three days a batch job will run to load specific data into a set of 10 large tables (with historical data) where OLAP analytics will be performed. Performance for both OLTP and OLAP queries is important. Which of the following designs would you suggest to the company?
1. Use a NoSQL data store such as MongoDB or Cloudant on the cloud to provide needed scalability
2. Use DB2 Data Partition Feature (DPF), partitioning all tables into different partitions
4. Use DB2 with BLU Acceleration, use columnar store for the 10 tables where Analytics will be run
Explanation: BLU Acceleration technology combines ease of use with storage savings and performance acceleration for analytic workloads. It relies on in-memory, CPU, and I/O optimizations.
The BLU Acceleration feature is intended for analytic or data mart workloads. Such workloads typically involve regular reporting queries as well as ad hoc business intelligence queries that cannot be tuned in advance. If your workload is primarily transaction processing, you may want to consider using row-organized tables. For mixed workloads, shadow tables provide the best of both worlds by maintaining a column-organized copy of a row-organized table. Analytic queries against the row-organized table are transparently routed to the shadow table, thereby leveraging the advantages of BLU Acceleration without any change to the application. The following table identifies some of the workload characteristics that are optimal for column-organized tables and others that are well-suited for row-organized tables.
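The difference between row-organized and column-organized storage, and the idea of keeping a column-organized copy for analytics, can be sketched in a few lines. This is a toy illustration only, not DB2 code: the sample data, the `to_columns` helper, and the two query functions are hypothetical and say nothing about how DB2 implements shadow tables.

```python
# Toy illustration of row-organized vs column-organized storage.
# Not DB2 code; the table contents and all names are made up.

# Row-organized table: each record is stored together (good for OLTP lookups).
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 45.5},
    {"order_id": 4, "region": "APAC", "amount": 200.0},
]


def to_columns(table):
    """Build a column-organized copy (a stand-in for a shadow table)."""
    columns = {}
    for row in table:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


shadow = to_columns(rows)


def lookup_order(order_id):
    """OLTP-style point lookup: served from the row-organized table."""
    for row in rows:
        if row["order_id"] == order_id:
            return row
    return None


def total_amount_by_region():
    """Analytic-style aggregate: reads only the two columns it needs."""
    totals = {}
    for region, amount in zip(shadow["region"], shadow["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


print(lookup_order(3))
print(total_amount_by_region())
```

The aggregate never touches `order_id`, which is the basic reason a columnar layout suits analytic queries, while the point lookup returns a whole record in one step from the row layout.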
Question : Which data format stores all of the data in a binary format, making the files more compact, and even adds markers to help MapReduce jobs determine where to split large files for more efficient processing?
Explanation: Avro is a storage format for data. It stores the data definition (schema) together with the data, which allows Avro files to be read and interpreted by many different programs. It stores all of the data in a binary format, making the files more compact, and adds markers that help MapReduce jobs find where to split large files for more efficient processing.
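The two ideas in this explanation, a schema stored with the data and markers between blocks, can be sketched with the standard library alone. This is a simplified toy container, not the real Avro file format or any Avro library: the `write_container` and `read_container` functions and the length-prefixed layout are assumptions made for illustration, and the blocks hold JSON rather than Avro's binary encoding. The one detail borrowed from Avro is the 16-byte sync marker that follows the header and every block.

```python
import json
import os

SYNC_SIZE = 16  # real Avro files also use a 16-byte sync marker


def write_container(schema, records, records_per_block=2):
    """Serialize records as: header (schema) + marker, then block + marker, ..."""
    sync = os.urandom(SYNC_SIZE)  # random marker chosen once per file
    header = json.dumps(schema).encode("utf-8")
    out = bytearray()
    out += len(header).to_bytes(4, "big") + header + sync
    for start in range(0, len(records), records_per_block):
        chunk = records[start:start + records_per_block]
        block = json.dumps(chunk).encode("utf-8")
        out += len(block).to_bytes(4, "big") + block + sync
    return bytes(out)


def read_container(data):
    """Return (schema, list of blocks), checking the marker after each block."""
    header_len = int.from_bytes(data[:4], "big")
    schema = json.loads(data[4:4 + header_len])
    pos = 4 + header_len
    sync = data[pos:pos + SYNC_SIZE]
    pos += SYNC_SIZE
    blocks = []
    while pos < len(data):
        block_len = int.from_bytes(data[pos:pos + 4], "big")
        pos += 4
        blocks.append(json.loads(data[pos:pos + block_len]))
        pos += block_len
        if data[pos:pos + SYNC_SIZE] != sync:
            raise ValueError("sync marker mismatch")
        pos += SYNC_SIZE
    return schema, blocks


schema = {"type": "record", "name": "User", "fields": [{"name": "id", "type": "int"}]}
records = [{"id": i} for i in range(5)]

data = write_container(schema, records)
read_schema, blocks = read_container(data)
print(len(blocks))  # 3
```

Because every block ends with the same marker, a reader that starts in the middle of a large file can scan forward to the next marker and begin at a clean block boundary, which is what makes such files splittable across parallel tasks.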
1. Purchase specific products from multiple Independent Software Vendors (ISV) for your requirements in order to take advantage of vendor-specific features
2. Develop your own platform of software components to allow for maximum customization