Premium

IBM Certified Data Architect - Big Data Certification Questions and Answers (Dumps and Practice Questions)



Question : Which one of the following statements is true about Big SQL?
1. Big SQL doesn't need any secondary indices to access HBase tables

2. Big SQL processes queries locally either on disk or in memory

3. Access Mostly Uused Products by 50000+ Subscribers

4. Executing Big SQL queries through MapReduce framework would always be a better choice

Correct Answer : Get Latest Questions and Answer :
Explanation: The Big SQL query capabilities, which follow the specifications of the SQL:2011 language standard, include significant levels of SQL PL compatibility,
including stored procedures, SQL-bodied functions, and a rich library of scalar, table, and online analytical processing (OLAP) functions. Big SQL provides SQL compatibility
across database platforms and maintains interoperability with established BigInsights tools. The following SQL features are included in Big SQL:

The only limit to the size of the queries, groups, and sorting is the disk capacity of the cluster. Big SQL uses in-memory caching and can spill large data sets to the local disk
at each node that is processing a query.
You can use subqueries anywhere that an expression can be used. Subqueries can be correlated or uncorrelated.
You can use table expressions, such as common table expressions, built-in or user-defined table functions, VALUES expressions, and lateral joins.
You can perform all valid SQL standard join operations, group sets, and union operations.
You can perform all of the standard OLAP specifications, for windowing and analytic functions.
Big SQL supports scalar functions, table functions, and procedures.
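As an illustrative sketch of the subquery and OLAP windowing features listed above (the table and column names here are hypothetical, not from the question):

```sql
-- Hypothetical SALES table; an uncorrelated subquery combined with a
-- windowed (OLAP) aggregate, both features listed above.
SELECT region,
       amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY sale_date) AS running_total
FROM sales
WHERE amount > (SELECT AVG(amount) FROM sales);
```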

Big SQL enables you to create secondary indices for HBase using a CREATE INDEX statement. As you might imagine, these indices can improve the runtime performance of queries that
filter on indexed columns. HBase indices can be based on a single or composite key, and using Big SQL to insert data or load data from a file into an HBase table will automatically
update its indices. However, in BigInsights 2.1, loading data from a remote relational database into an HBase table will not automatically update its secondary indices. Instead, an
administrator needs to drop and re-create the necessary indices.
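A minimal sketch of the secondary-index DDL described above; the table and column names are hypothetical, and the surrounding HBase table-creation syntax varies by BigInsights release:

```sql
-- Hypothetical Big SQL HBase table REVIEWS with a PRODUCT column.
-- INSERTs and file-based LOADs through Big SQL maintain this index
-- automatically; in BigInsights 2.1, loads from a remote relational
-- database do not, so the index must be dropped and re-created.
CREATE INDEX idx_product ON TABLE reviews (product);
```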





Question : You are working with a product-based company. They are launching a new data visualization product, and before launching it they run advertising
campaigns on Twitter and Facebook. Based on the response on Twitter and Facebook, you want to decide whether or not to continue a particular campaign. Which of the
following should be selected to meet these requirements?
1. IBM Cloudant

2. Apache Spark

3. Access Mostly Uused Products by 50000+ Subscribers

4. Unica (IBM Campaign)

5. IBM Analytics Engine

Correct Answer : Get Latest Questions and Answer :
Explanation: IBM Campaign helps marketers design, execute, measure and optimize outbound marketing campaigns. This sophisticated omnichannel campaign management
solution allows marketers to perform deep segmentation over multiple data sources to deliver tailored messages to huge volumes of contacts.





Question : Big data is often defined as the ability to derive new insights from data that has
scaled up along three axes known as the three Vs. Which of the following is the
fourth V? (Hint: It has something to do with uncertainty.)
1. volume

2. variety

3. Access Mostly Uused Products by 50000+ Subscribers

4. veracity

Correct Answer : Get Latest Questions and Answer :
Explanation:
Volume : Scale of Data
Velocity : Speed of Data
Veracity : Certainty of Data
Variety : Diversity of Data

Volume-based value: The more comprehensive your 360-degree view of customers and the more historical data you have on them, the more insight you can extract from it all and, all
things considered, the better decisions you can make in the process of acquiring, retaining, growing and managing those customer relationships.

Velocity-based value: The more rapidly you can ingest customer data into your big-data platform, and the more quickly users can pose questions against that data (via
queries, reports, dashboards, etc.) within a given time period, the more likely you are to make the right decision at the right time to achieve your customer relationship
management objectives.

Variety-based value: The more varied customer data you have (from the CRM system, social media, call-center logs, etc.), the more nuanced a portrait you have of customer
profiles, desires, and so on, and hence the better-informed decisions you can make in engaging with them.

Veracity-based value: The more consolidated, conformed, cleansed, consistent, and current the data you have on customers, the more likely you are to make the right decisions based on
the most accurate data.



Related Questions


Question : Which tool is best suited to import a portion of a relational database every day as files into HDFS,
and generate Java classes to interact with that imported data?




1. Oozie
2. Hue
3. Access Mostly Uused Products by 50000+ Subscribers
4. Sqoop
5. Pig or Hive
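For context, a daily import like the one this question describes is typically expressed as a sqoop import invocation; the connection string, credentials, and paths below are hypothetical:

```shell
# Import one day's worth of orders from a (hypothetical) MySQL database
# into a dated HDFS directory.
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username etl_user \
  --table orders \
  --where "order_date = CURRENT_DATE()" \
  --target-dir /data/orders/$(date +%F)
# Sqoop also generates an orders.java class for interacting with the
# imported records; --outdir controls where the generated source is written.
```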


Question : Workflows expressed in Oozie can contain:
1. Sequences of MapReduce jobs only; no Pig or Hive tasks or jobs. These MapReduce sequences can be combined with forks and path joins.
2. Iterative repetition of MapReduce jobs until a desired answer or state is reached.

3. Access Mostly Uused Products by 50000+ Subscribers
4. Sequences of MapReduce and Pig. These sequences can be combined with other actions including forks, decision points, and path joins.
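A minimal sketch of such a workflow, with a MapReduce action followed by a Pig action; the names, script, and properties below are hypothetical placeholders:

```xml
<!-- Sketch of an Oozie workflow: MapReduce action, then a Pig action,
     with an error path. Forks and decision nodes can be added similarly. -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="mr-step"/>
    <action name="mr-step">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
        </map-reduce>
        <ok to="pig-step"/>
        <error to="fail"/>
    </action>
    <action name="pig-step">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>clean.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail"><message>Workflow failed</message></kill>
    <end name="end"/>
</workflow-app>
```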


Question : What is Pig?
1. Pig is a subset of the Hadoop API for data processing
2. Pig is a part of the Apache Hadoop project that provides scripting language interface for data processing
3. Access Mostly Uused Products by 50000+ Subscribers
4. None of Above



Question : Which statement most accurately describes the relationship between MapReduce and Pig?
1. Pig programs rely on MapReduce but are extensible, allowing developers to do special-purpose processing not provided by MapReduce.
2. Pig provides no additional capabilities to MapReduce. Pig programs are executed as MapReduce jobs via the Pig interpreter.
3. Access Mostly Uused Products by 50000+ Subscribers
4. Pig provides the additional capability of allowing you to control the flow of multiple MapReduce jobs.
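As a sketch of how Pig's scripting interface relates to MapReduce, the short Pig Latin script below (file path and schema are hypothetical) is compiled by the Pig interpreter into one or more MapReduce jobs:

```pig
-- Hypothetical access log: per-user byte totals.
-- Each statement is declarative; Pig plans and runs the MapReduce jobs.
logs    = LOAD '/data/access_log' AS (user:chararray, bytes:long);
by_user = GROUP logs BY user;
totals  = FOREACH by_user GENERATE group AS user, SUM(logs.bytes) AS total;
STORE totals INTO '/data/bytes_per_user';
```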




Question : You are working at Acmeshell Inc as an architect. You have designed the complete big data software architecture for this company, and you asked a junior team member to
create an Architecture Overview Document for it. As a well-qualified architect, you know what to include and what not to include in this document, and you found that the
following things are included in the Architecture Overview Document. Which of these do you think are correct?

A. Architectural Goals
B. Key Concepts
C. Architectural Overview Diagram
D. Component Model
E. Logical Model
F. Operational Model
1. A,B,C
2. C,D,E
3. Access Mostly Uused Products by 50000+ Subscribers
4. A,B,E
5. A,C,F


Question : You have been working on IBM Cloud and are quite comfortable with the services and support IBM provides for its cloud solution. However, sometimes a
service fails; you ask IBM support to fix it, and they do so perfectly. You now need to report to your seniors on the SLA for service failures in the cloud. Which of the
following can you find relatively easily and include in the SLA?

A. Root cause for service interruptions
B. Turn-Around-Time (TAT)
C. Mean Time To Recover (MTTR)
D. First Call Resolution (FCR)
E. Abandonment Rate
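As an illustration of one of these metrics, Mean Time To Recover (MTTR) is commonly computed as the average outage duration over a reporting period; the durations below are hypothetical:

```python
# Illustrative only: MTTR as the mean of recorded outage durations.
downtimes_minutes = [12, 45, 8, 30]  # hypothetical service outages
mttr = sum(downtimes_minutes) / len(downtimes_minutes)
print(mttr)  # 23.75
```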