
IBM Certified Data Architect - Big Data Certification Questions and Answers (Dumps and Practice Questions)



Question : On startup, the NameNode enters a special state called Safemode. Replication of
data blocks does not occur when the NameNode is in the Safemode state. The
NameNode receives Heartbeat and Blockreport messages from the DataNodes.
Which of the following contains the list of data blocks that a DataNode is hosting?


1. Rack Awareness

2. Snapshot

3.

4. Blockreport


Correct Answer : 4
Explanation: The DataNode stores HDFS data in files in its local file system. The DataNode has no knowledge about HDFS files. It stores each block of HDFS data in a
separate file in its local file system. The DataNode does not create all files in the same directory. Instead, it uses a heuristic to determine the optimal number of files per
directory and creates subdirectories appropriately. It is not optimal to create all local files in the same directory because the local file system might not be able to
efficiently support a huge number of files in a single directory. When a DataNode starts up, it scans through its local file system, generates a list of all HDFS data blocks that
correspond to each of these local files and sends this report to the NameNode: this is the Blockreport.

A DataNode identifies block replicas in its possession to the NameNode by sending a block report. A block report contains the block id, the generation stamp and the length for
each block replica the server hosts. The first block report is sent immediately after the DataNode registers. Subsequent block reports are sent every hour and provide the NameNode with an up-to-date view of where block replicas are located on the cluster.
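
Conceptually, a block report is just a list of (block id, generation stamp, length) entries, one per replica hosted by the DataNode. The Java sketch below is purely illustrative - it does not use the real Hadoop DataNode classes, and all names and values in it are hypothetical - but it shows the shape of the information a block report carries.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a stand-in for the data a DataNode sends in a block report.
// These are NOT the actual Hadoop classes; names and values are made up.
public class BlockReportSketch {

    // One entry per block replica hosted by the DataNode.
    static class BlockReportEntry {
        final long blockId;
        final long generationStamp;
        final long numBytes;

        BlockReportEntry(long blockId, long generationStamp, long numBytes) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
            this.numBytes = numBytes;
        }
    }

    public static void main(String[] args) {
        // Pretend these entries were produced by scanning the DataNode's local block files.
        List<BlockReportEntry> report = new ArrayList<>();
        report.add(new BlockReportEntry(1073741825L, 1001L, 134217728L)); // a full 128 MB block
        report.add(new BlockReportEntry(1073741826L, 1001L, 52428800L));  // a partial last block

        // The real DataNode sends this list to the NameNode over RPC; here we just print it.
        for (BlockReportEntry e : report) {
            System.out.printf("blk_%d gen=%d len=%d%n", e.blockId, e.generationStamp, e.numBytes);
        }
    }
}
```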





Question : Which of the following are security products?

A. BigInsights
B. Guardium
C. OSSEC
D. Shiro

1. A,B,C
2. B,C,D
3.
4. A,B,C,D

Correct Answer : 2
Explanation: IBM Security Guardium is a comprehensive data security platform that provides a full range of capabilities from discovery and classification of
sensitive data to vulnerability assessment to data and file activity monitoring to masking, encryption, blocking, alerting and quarantining to protect sensitive data.

OSSEC is a platform to monitor and control your systems. It mixes together all the aspects of HIDS (host-based intrusion detection), log monitoring, and Security Incident Management (SIM)/Security Information and Event Management (SIEM) in a simple, powerful, open source solution.

Apache Shiro (pronounced shee-roh, the Japanese word for "castle") is a powerful and easy-to-use Java security framework that performs authentication, authorization, cryptography, and session management, and can be used to secure any application - from command-line and mobile applications to the largest web and enterprise applications.
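
To make the Shiro description concrete, here is a minimal authentication and authorization sketch in Java in the style of Shiro's quickstart. The shiro.ini file, the user name and password, and the role name are assumptions invented for the example, not details from the question.

```java
import org.apache.shiro.SecurityUtils;
import org.apache.shiro.authc.AuthenticationException;
import org.apache.shiro.authc.UsernamePasswordToken;
import org.apache.shiro.config.IniSecurityManagerFactory;
import org.apache.shiro.mgt.SecurityManager;
import org.apache.shiro.subject.Subject;
import org.apache.shiro.util.Factory;

public class ShiroQuickstartSketch {
    public static void main(String[] args) {
        // Build a SecurityManager from a shiro.ini on the classpath; the ini file
        // (assumed here) would define users and roles, e.g. "admin = secret, adminRole".
        Factory<SecurityManager> factory = new IniSecurityManagerFactory("classpath:shiro.ini");
        SecurityUtils.setSecurityManager(factory.getInstance());

        // The Subject represents the currently executing user.
        Subject currentUser = SecurityUtils.getSubject();

        if (!currentUser.isAuthenticated()) {
            UsernamePasswordToken token = new UsernamePasswordToken("admin", "secret");
            try {
                currentUser.login(token);   // authentication
            } catch (AuthenticationException ae) {
                System.out.println("Login failed: " + ae.getMessage());
                return;
            }
        }

        // Authorization: check a role before doing privileged work.
        System.out.println("Has adminRole? " + currentUser.hasRole("adminRole"));

        // End the session.
        currentUser.logout();
    }
}
```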





Question : You have a need for Storm real-time processing and you realize that your Storm
processing is detrimental to the timely execution of your MapReduce batch jobs.
Which of the following would be your best course of action?
1. Implement a Storm-YARN integration to facilitate the management of elastic workloads

2. Implement the Oozie 2.0 framework optimized for elastic workload management

3.

4. Implement Apache ACE 2.0 for Storm

Correct Answer :
Explanation: Apache Oozie is an open source project based on Java technology that simplifies the process of creating workflows and managing coordination among jobs.
In principle, Oozie offers the ability to combine multiple jobs sequentially into one logical unit of work. One advantage of the Oozie framework is that it is fully integrated
with the Apache Hadoop stack and supports Hadoop jobs for Apache MapReduce, Pig, Hive, and Sqoop. In addition, it can be used to schedule jobs specific to a system, such as Java
programs. Therefore, using Oozie, Hadoop administrators are able to build complex data transformations that can combine the processing of different individual tasks and even
sub-workflows. This ability allows for greater control over complex jobs and makes it easier to repeat those jobs at predetermined periods.
In practice, there are different types of Oozie jobs:
Oozie Workflow jobs: Represented as directed acyclic graphs (DAGs) to specify a sequence of actions to be executed.
Oozie Coordinator jobs: Represent Oozie workflow jobs triggered by time and data availability.
Oozie Bundle: Facilitates packaging multiple coordinator and workflow jobs, and makes it easier to manage the life cycle of those jobs.
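
To ground the Oozie description, the sketch below shows how a workflow job could be submitted programmatically with the Oozie Java client API. The server URL, HDFS paths, and property values are placeholders chosen for illustration; the application path is assumed to contain a workflow.xml definition in HDFS.

```java
import java.util.Properties;

import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.OozieClientException;

public class SubmitWorkflowSketch {
    public static void main(String[] args) throws OozieClientException {
        // Point the client at the Oozie server (URL is a placeholder).
        OozieClient client = new OozieClient("http://oozie-host:11000/oozie");

        // Job properties; OozieClient.APP_PATH must point at an HDFS directory
        // containing the workflow.xml for this application.
        Properties conf = client.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/user/demo/wf-app");
        conf.setProperty("nameNode", "hdfs://namenode:8020");
        conf.setProperty("jobTracker", "resourcemanager:8032");

        // Submit and start the workflow, then report its current status.
        String jobId = client.run(conf);
        System.out.println("Workflow job submitted: " + jobId);
        System.out.println("Status: " + client.getJobInfo(jobId).getStatus());
    }
}
```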



Related Questions


Question : You want to create a Big Data solution using open source products, with the following requirements:

- Text search over existing data
- Infrastructure monitoring

Which of the following components can be used?

A. HBase
B. Lucene
C. Nagios
D. Oozie
E. Spark
1. A,B
2. B,C
3.
4. D,E
5. A,E


Question : A large global enterprise customer has a Big Data environment set up on Hadoop.
After a year in operation they are now looking to extend access to multiple
functions that will need different views into different aspects/portions of the data.
As you consider these requirements, which of the following statements is TRUE
and also applies to the scenario?
1. Hadoop does not support multi-tenancy but can easily scale to support this by replicating data to new clusters with commodity hardware.

2. Hadoop can support multi-tenancy but only if YARN is used, so if not already used, the customer will need to upgrade to a YARN-supported version.

3.

4. Hadoop can support multi-tenancy by using a distributed file system for storage, allowing all nodes to access the data.



Question : What term applies to the data elements in InfoSphere Streams?
1. Tuples

2. Operators

3.

4. Composite operators


Question : The NameNode uses a file in its _______ to store the EditLog.
1. Any HDFS Block
2. metastore
3.
4. local hdfs block



Question : Select the correct option
1. When a file is deleted by a user or an application, it is immediately removed from HDFS
2. When a file is deleted by a user or an application, it is not immediately removed from HDFS. Instead, HDFS first renames it to a file in the /trash directory.
3.
4. 1,2
5. 2,3


Question : You have data already stored in HDFS and are considering using HBase. Which additional feature does HBase provide to HDFS?

1. Random writes
2. Fault tolerance
3.
4. Batch processing
5. 2,3