IBM Certified Data Architect - Big Data Certification Questions and Answers (Dumps and Practice Questions)

Question : The Annotation Query Language (AQL) is the easiest and most flexible tool to pull structured output from which of the following?

1. Hive data structures

2. Unstructured text

4. JDBC connected relational data marts

Correct Answer : 2 (Unstructured text)
Explanation: Annotation Query Language (AQL) is a language for building extractors that extract structured information from unstructured or semistructured text. AQL is
the primary method of creating new extractors in the InfoSphere BigInsights Text Analytics component.
The syntax of AQL is similar to that of Structured Query Language (SQL), but with several important differences:
AQL is case-sensitive.
AQL allows, but does not require, regular expressions to be written in Perl syntax: for example, /regex/ instead of 'regex'.
AQL does not currently support advanced SQL features like correlated subqueries and recursive queries.
AQL has a new statement type, extract, that is not present in SQL.
AQL does not allow keywords as view, column, or function names.
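
To make "structured output from unstructured text" concrete, here is a minimal Python sketch. It is not AQL and does not use the BigInsights Text Analytics runtime; it only imitates the idea behind a regex-based extractor with Python's re module. The sample text, the pattern, and the PhoneMatch name are assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class PhoneMatch(NamedTuple):
    """One structured row pulled out of free text (hypothetical schema)."""
    number: str
    begin: int  # offset of the first character of the match
    end: int    # offset one past the last character of the match


# Assumed pattern: US-style numbers such as 555-123-4567.
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def extract_phones(text: str) -> List[PhoneMatch]:
    """Return every phone-number span found in the text as a structured row."""
    return [PhoneMatch(m.group(0), m.start(), m.end())
            for m in PHONE_PATTERN.finditer(text)]


sample = "Call Ann at 555-123-4567 or the front desk at 555-987-6543."
rows = extract_phones(sample)
for row in rows:
    print(row)
```

Each row carries the matched text plus its character offsets, which is roughly the shape of output an extractor produces: a table of spans over the original document.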

Question : You have been storing data in IBM's NoSQL solution, IBM Cloudant, and you want to pre-create some functions as views so that they can be used later
to fetch data, e.g. the average sale price for a product ID. Which language will you use to write views for Cloudant?

1. Go

2. Java

4. Python

5. Scala

Correct Answer : JavaScript
Explanation: Views (MapReduce) for Cloudant

Views are used to obtain data stored within a database. Views are written using JavaScript functions.

Views are mechanisms for working with document content in databases. A view can selectively filter documents. It can speed up searching for content. It can be used to 'pre-process'
the results before they are returned to the client.

Views are simply JavaScript functions, defined within the views field of a design document. When you use a view, or more accurately when you perform a query using your view, the
system applies the JavaScript function to each and every document in the database. Views can be complex. You might choose to define a collection of JavaScript functions to create
the overall view required.
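
As a hedged illustration of the average-sale-price case from the question, the sketch below builds a design document whose views field holds a JavaScript map function (stored as a string, since design documents are JSON) together with the built-in _stats reducer. The document fields productId and salePrice, the design document name _design/sales, and the view name price_by_product are all assumptions. The Python helper at the end only emulates locally what the view would compute; it does not call Cloudant.

```python
import json
from collections import defaultdict

# JavaScript map function, kept as a string because design documents are JSON.
# Field names productId and salePrice are assumed for this example.
MAP_FN = """function (doc) {
  if (doc.productId && typeof doc.salePrice === "number") {
    emit(doc.productId, doc.salePrice);
  }
}"""

design_doc = {
    "_id": "_design/sales",            # hypothetical design document name
    "views": {
        "price_by_product": {          # hypothetical view name
            "map": MAP_FN,
            "reduce": "_stats",        # built-in reducer: sum, count, min, max, sumsqr
        }
    },
}


def emulate_view(docs):
    """Local stand-in for the grouped view: average price per product."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for doc in docs:
        if "productId" in doc and isinstance(doc.get("salePrice"), (int, float)):
            totals[doc["productId"]][0] += doc["salePrice"]
            totals[doc["productId"]][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


sample_docs = [
    {"productId": "p1", "salePrice": 10.0},
    {"productId": "p1", "salePrice": 20.0},
    {"productId": "p2", "salePrice": 5.0},
    {"note": "no price, skipped by the map function"},
]

print(json.dumps(design_doc, indent=2))
print(emulate_view(sample_docs))
```

Against a real database, the design document would be stored once and the view queried with group=true; each result row then carries the sum and count for one product, and the average is the sum divided by the count.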

Question : Is Cloudant a graph database?

1. True
2. False

Correct Answer : 2 (False)
Explanation: IBM Cloudant is a managed NoSQL JSON database service built to ensure that the flow of data between an application and its database remains uninterrupted
and highly performant. Developers are then free to build more, grow more and sleep more.

IBM Graph is an easy-to-use, fully managed graph database service for storing, querying, and visualizing data points, their connections, and properties. IBM Graph is based on the
Apache TinkerPop stack for building high-performance graph applications. This means that the service provides you with a set of simplified HTTP APIs, an Apache TinkerPop v3
compatible API, and the full Apache TinkerPop v3 query language. The service gives you flexibility and capabilities, based on a familiar environment. Using the Bluemix dashboard,
you can bind IBM Graph to your applications easily.
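
The difference between the two services is mainly the data model. The sketch below is an illustration in plain Python, not client code for either service: a Cloudant-style record is one self-contained JSON document, while a graph store keeps vertices and edges separately and answers questions by following edges. All names and values are invented for the example.

```python
import json

# Document model: one self-contained JSON document per record.
order_doc = {
    "_id": "order-1",
    "customer": "alice",
    "items": [{"product": "p1", "qty": 2}],
}

# Property-graph model: vertices and edges are stored separately.
vertices = {
    "alice": {"label": "customer"},
    "bob": {"label": "customer"},
    "p1": {"label": "product"},
}
edges = [
    ("alice", "bought", "p1"),
    ("bob", "bought", "p1"),
    ("alice", "knows", "bob"),
]


def neighbors(vertex, edge_label):
    """Follow outgoing edges with the given label from one vertex."""
    return [dst for src, label, dst in edges
            if src == vertex and label == edge_label]


def buyers_of(product):
    """Follow incoming 'bought' edges to find who purchased a product."""
    return sorted(src for src, label, dst in edges
                  if dst == product and label == "bought")


print(json.dumps(order_doc))
print(neighbors("alice", "knows"))
print(buyers_of("p1"))
```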

Related Questions

Question : You have a file-based data source to which data is continuously added, and you need to import this data into HDFS in Hadoop. Which of the following tools helps you
implement this?

1. Sqoop

2. Pig

4. Flume

5. BigSQL

Question : Which of the following features is supported by IBM GPFS?

1. There is a requirement where logical isolation and physical isolation need to be supported

2. There is a requirement for separate clusters for analytics and databases

4. There is a need to run the NameNode in a separate high availability environment

Question : You are creating a Hadoop-based solution and need to consider the archival size of the cluster. Which of the following do you need to consider when deciding the cluster
archival size?

1. Replication factor

2. Number of nodes required

4. Number of batches

Question : In a Hadoop YARN-based cluster, which of the following needs to be configured for high availability?

1. JobTracker

2. TaskTracker

4. DataNode

Question : A company has to design a new data system. They will need to support several
OLTP applications. Every three days a batch job will run to load specific data into
a set of 10 large tables (with historical data) where OLAP analytics will be
performed. Performance for both OLTP and OLAP queries is important. Which of
the following designs would you suggest to the company?

1. Use a NoSQL data store such as MongoDB or Cloudant on the cloud to provide needed scalability

2. Use DB2 Data Partition Feature (DPF), partitioning all tables into different partitions

4. Use DB2 with BLU Acceleration, use columnar store for the 10 tables where Analytics will be run

Question : Which data format stores all of the data in a binary format, making the files more
compact, and also adds markers to help MapReduce jobs determine where to break large files for more efficient processing?

1. Parquet

2. Avro

4. Sequence File

5. Map File