
IBM Certified Data Architect - Big Data Certification Questions and Answers (Dumps and Practice Questions)



Question : Unless specified otherwise, Big R automatically assumes all data to be strings.
1. True
2. False

Correct Answer : 1 (True)
Explanation: Unless specified otherwise, Big R automatically assumes all data to be strings.





Question : Enable the use of R as a query language for big data: Big R hides many of the complexities pertaining to the underlying Hadoop / MapReduce framework.
1. True
2. False

Correct Answer : 1 (True)
Explanation: Big R provides the following capabilities:

Enable the use of R as a query language for big data: Big R hides many of the complexities pertaining to the underlying Hadoop / MapReduce framework. Using classes such as
bigr.frame, bigr.vector and bigr.list, a user is presented with an API that is heavily inspired by R's foundational API on data.frames, vectors and lists.

Enable the pushdown of R functions such that they run right on the data: Via mechanisms such as groupApply, rowApply and tableApply, user-written functions composed in R can be
shipped to the cluster. BigInsights transparently parallelizes execution of these functions and provides consolidated results back to the user. Almost any R code, including most
packages available on open-source repositories such as CRAN (Comprehensive R Archive Network), can be run using this mechanism.





Question : You are working as a Big Data Analyst, and you need to integrate IBM SPSS Modeler to use big data as a source for predictive modeling. Which of the following will
help you do this?
1. IBM InfoSphere

2. Big R

3. Access Mostly Uused Products by 50000+ Subscribers

4. IBM Big Integration Engine

5. IBM SPSS Analytic Server

Correct Answer : 5
Explanation: IBM SPSS Analytic Server enables IBM SPSS Modeler to use big data as a source for predictive modeling. Together they can provide an integrated predictive
analytics platform using data from Hadoop distributions and Spark applications. Move analytics to the data to optimize performance. Access data from Hadoop and combine it with
RDBMS data to expand data access. Apply real-time processing and machine learning to conduct deeper analysis, accelerate results, and reduce coding and simplify algorithm
development. The combination also provides defined interfaces that simplify big data analysis for both analysts and business users.



Related Questions


Question : Select the correct statement which applies to the "Fair Scheduler"

1. Fair Scheduler allows assigning guaranteed minimum shares to queues
2. If a queue does not need its full guaranteed share, the excess will not be split between other running apps
3. It is also possible to limit the number of running apps per user and per queue
4. 1 and 3
5. 1, 2 and 3


Question : YARN then provides processing capacity to each application by allocating Containers.
A Container is the basic unit of processing capacity in YARN, and is an encapsulation of which resource elements?
1. CPU
2. Memory
3. CPU and Memory
4. Each Data Node of Hadoop Cluster


Question : A developer has submitted a long-running MapReduce job with the wrong data sets.
You want to kill the running MapReduce job so that a new job with the correct data sets can be started.
What method can be used to terminate the submitted MapReduce job?
1. Open a remote terminal to the node running the ApplicationMaster and kill the JVM.

2. yarn application -kill "application_id"
3. Use CTRL-C from the terminal where the MapReduce job was started.
4. hadoop datanode -rollback
5. rmadmin -refreshQueues
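
For reference, killing a YARN application from the command line can be sketched as below. Since no live cluster is assumed here, the sketch only assembles and prints the command, and the application ID is a hypothetical placeholder.

```shell
# A running job is identified by its application ID, which is visible via:
#   yarn application -list
# The ID below is a hypothetical placeholder for illustration only.
APP_ID="application_1500000000000_0001"

# Assemble the kill command; on a real cluster you would execute it directly.
KILL_CMD="yarn application -kill $APP_ID"
echo "$KILL_CMD"
```

Killing via the ResourceManager this way is cleaner than killing the ApplicationMaster JVM by hand, since YARN cleans up all of the application's containers.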



Question : Which of the following is a correct command to submit a YARN job, assuming your code is deployed in hadoopexam.jar
1. java jar hadoopexam.jar [mainClass] args...
2. yarn jar hadoopexam.jar [mainClass] args...
3. yarn hadoopexam.jar [mainClass] args...
4. yarn jar hadoopexam.jar args...
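
As a sketch of the `yarn jar <jar> [mainClass] [args...]` submission syntax: the main class and arguments below are hypothetical, and the command is only assembled and printed here, since a live cluster is needed to actually run it.

```shell
# yarn jar submits the job packaged in the given jar to the cluster.
JAR="hadoopexam.jar"
MAIN_CLASS="com.hadoopexam.MainJob"   # hypothetical fully qualified class name
SUBMIT_CMD="yarn jar $JAR $MAIN_CLASS input/ output/"
echo "$SUBMIT_CMD"
```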


Question : Which of the following commands can be used to list all the jobs or applications running in the ResourceManager
1. yarn application -list
2. yarn application -listAll
3. Access Mostly Uused Products by 50000+ Subscribers
4. yarn application -allJobs
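
A sketch of the listing command: by default `yarn application -list` shows only applications in the SUBMITTED, ACCEPTED and RUNNING states, and `-appStates` widens the filter. The commands are printed rather than executed, since no cluster is assumed.

```shell
# List running/pending applications (the default states filter).
LIST_CMD="yarn application -list"
# List applications in every state, including finished and killed ones.
LIST_ALL_CMD="yarn application -list -appStates ALL"
echo "$LIST_CMD"
echo "$LIST_ALL_CMD"
```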


Question : Select the correct command/commands which can be used to dump the container logs

1. yarn logs -applicationId ApplicationId
2. yarn logs -appOwner AppOwner
3. Access Mostly Uused Products by 50000+ Subscribers
4. yarn logs -nodeAddress NodeAddress

1. 1,2,3
2. 2,3,4
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1,2,4
5. All 1,2,3,4
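
A sketch of the most common log-dump invocation: `yarn logs -applicationId` fetches the aggregated container logs for one application. The application ID below is a hypothetical placeholder, and the command is only assembled and printed here, since aggregated logs exist only on a real cluster.

```shell
# Dump all aggregated container logs for one finished/running application.
APP_ID="application_1500000000000_0001"   # hypothetical placeholder ID
LOGS_CMD="yarn logs -applicationId $APP_ID"
echo "$LOGS_CMD"
```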