Dell EMC Data Science and BigData Certification Questions and Answers

Question : Which characteristic applies mainly to Data Science as opposed to Business Intelligence?

1. Data dashboards
2. Focus on structured data
4. Advanced analytical methods

Correct Answer : 4

Explanation: Data Science differs from traditional Business Intelligence in some key areas. For example, data science:

uses predictive and prescriptive analytics to predict what might happen, using probabilities and confidence levels, rather than only reporting tools that describe what did happen.
Note: with historical data, there is a strong need for the data to be 100% accurate. If the financial results for the past quarter are wrong, people may go to jail. However, predicted performance for the next quarter is usually expressed in probabilities and confidence levels (e.g., "There is a 95% confidence that our revenues next quarter will come in between $200M and $212M").
deals with and mitigates the uncertainty in the data. It uses several analytic and visualization techniques to understand where uncertainty may lie in the data, and then uses data transformation techniques to massage the data into a workable form: not perfect, but perfection is not necessary when dealing with probabilities rather than absolutes.
creates as-needed data transformations (versus the traditional ETL process) to put the data into a format in which it can be combined with other data sources in search of insights about customers, products, and operations.

Question : Which word or phrase completes the statement?
Theater actor is to "Artistic and Expressive" as Data Scientist is to ________________

1. Introverted and Technical
2. Logical and Steadfast
4. Communicative and Collaborative

Correct Answer : 4
Exp: Data scientists are generally thought of as having five main sets of skills and behavioral characteristics.
Quantitative skill: such as mathematics or statistics.
Technical aptitude: namely, software engineering, machine learning, and programming skills.
Skeptical mind-set and critical thinking: It is important that data scientists can examine their work critically rather than in a one-sided way.
Curious and creative: Data scientists are passionate about data and finding creative ways to solve problems and portray information.
Communicative and collaborative: Data scientists must be able to articulate the business value in a clear way and collaboratively work with other groups, including project sponsors and key stakeholders.

Question : Which process in text analysis can be used to reduce dimensionality?

1. Parsing
2. Stemming
4. Sorting

Correct Answer : 2
Exp: Stemming is the term used in linguistic morphology and information retrieval for the process of reducing inflected (or sometimes derived) words to their word stem, base, or root form, generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root. Because many variant word forms collapse into a single stem, stemming shrinks the vocabulary and therefore the number of dimensions (distinct terms) in a text representation. Algorithms for stemming have been studied in computer science since the 1960s. Many search engines treat words with the same stem as synonyms as a kind of query expansion, a process called conflation.

Stemming programs are commonly referred to as stemming algorithms or stemmers. A stemmer for English, for example, should identify the string "cats" (and possibly "catlike", "catty", etc.) as based on the root "cat", and "stemmer", "stemming", "stemmed" as based on "stem". A stemming algorithm reduces the words "fishing", "fished", and "fisher" to the root word, "fish". On the other hand, "argue", "argued", "argues", "arguing", and "argus" reduce to the stem "argu" (illustrating the case where the stem is not itself a word or root), but "argument" and "arguments" reduce to the stem "argument".

A stemming algorithm is a process of linguistic normalisation, in which the variant forms of a word are reduced to a common form, for example,

connection
connections
connective ---> connect
connected
connecting

It is important to appreciate that we use stemming with the intention of improving the performance of IR systems. It is not an exercise in etymology or grammar. In fact, from an etymological or grammatical viewpoint, a stemming algorithm is liable to make many mistakes. In addition, stemming algorithms (at least the ones presented here) are applicable to the written, not the spoken, form of the language.

For some of the world's languages, Chinese for example, the concept of stemming is not applicable, but it is certainly meaningful for the many languages of the Indo-European group. In these languages words tend to be
constant at the front, and to vary at the end:

       -ion
       -ions
connect-ive
       -ed
       -ing

The variable part is the 'ending', or 'suffix'. Taking these endings off is called 'suffix stripping' or 'stemming', and the residual part is called the stem.
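
The suffix stripping described above can be sketched in a few lines of Python. This is a toy illustration only, not the Porter algorithm or any library's stemmer: the suffix list, the function name toy_stem, and the minimum stem length are all assumptions made for this example. It shows how five variant forms collapse into one term, which is the dimensionality reduction the question refers to.

```python
# Toy suffix stripper (illustrative only; NOT the Porter algorithm).
# The suffix list and the minimum stem length are assumptions for this sketch.
SUFFIXES = ("ions", "ion", "ive", "ing", "ed", "s")  # longer suffixes first


def toy_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


words = ["connection", "connections", "connective", "connected", "connecting"]
stems = [toy_stem(w) for w in words]

vocabulary_before = set(words)  # distinct terms before stemming
vocabulary_after = set(stems)   # distinct terms after stemming

print(sorted(vocabulary_after))
print(len(vocabulary_before), "->", len(vocabulary_after))
```

With this word list the vocabulary shrinks from five distinct terms to one. A real stemmer applies many more rules and conditions than this sketch does.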

Related Questions

Question : Which of the following is true with regard to the Apriori algorithm?
A. The algorithm starts with the combination of all distinct items to find the frequent itemsets, and in each subsequent iteration it removes one item from those frequent itemsets.
B. The algorithm starts with single distinct items to find the frequent itemsets, and in each subsequent iteration it adds one item to find larger frequent itemsets.
C. If an itemset is frequent, then its subsets are also frequent.
D. If an itemset is frequent, that does not guarantee that its subsets are also frequent.

1. A,B
2. B,C
3. C,D
4. A,D
5. B,D
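
For readers who want to see the level-wise search in action, here is a minimal sketch of Apriori in Python. It is a simplified teaching version under stated assumptions, not a reference implementation: the function name apriori, the sample transactions, and the 0.5 support threshold are all hypothetical. It starts from single items, grows candidates by one item per pass, and prunes any candidate that has an infrequent subset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all frequent itemsets (simplified sketch)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Pass 1: frequent itemsets of size 1.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Grow candidates by one item by joining frequent (k-1)-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must already be frequent.
        level = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


# Hypothetical sample data, chosen only to illustrate the pruning step.
sample = [
    {"milk", "bread"},
    {"milk", "eggs"},
    {"milk", "bread", "eggs"},
    {"bread"},
]
frequent = apriori(sample, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this sample, {bread, eggs} falls below the threshold, so the three-item candidate is pruned without being counted.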

Question : Given an Association Rule X->Y, which of the following represents the Confidence?

1. Support for {X}/Support for{X,Y}

2. Support for {X,Y}/Support for{X}

4. Support for {Y}/Support for{X}
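
As a small illustration of how confidence is computed from raw transactions, the sketch below uses the standard definition, Confidence(X->Y) = Support({X,Y}) / Support({X}). The helper names support and confidence and the five sample transactions are assumptions made for this example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(x, y, transactions):
    """Confidence of the rule X -> Y: Support({X,Y}) / Support({X})."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


# Hypothetical transactions, for illustration only.
transactions = [
    {"X", "Y"},
    {"X", "Y"},
    {"X"},
    {"Y"},
    {"Y"},
]

conf_x_to_y = confidence({"X"}, {"Y"}, transactions)  # 2 of the 3 X-transactions
conf_y_to_x = confidence({"Y"}, {"X"}, transactions)  # 2 of the 4 Y-transactions
print(conf_x_to_y, conf_y_to_x)
```

The two directions give different values, which shows that confidence is not symmetric in X and Y.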

Question : In the Apriori algorithm, which statements are true with regard to the Confidence of the Association Rule {X->Y}?
A. It considers the antecedent {X}
B. It considers the consequent {Y}
C. It considers the co-occurrence of {X,Y}
D. It does not consider the consequent {Y}
E. Confidence cannot tell whether a rule reflects a true implication of the relationship or is purely coincidental.

1. A,B,C
2. B,C,D
4. A,C,D,E
5. A,B,C,D,E

Question : Which of the below is a correct formula for Lift in Association Rule?

1. A
2. B
4. D
5. E
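
The formula options for this question did not survive in the text, so the sketch below uses the standard textbook definition, Lift(X->Y) = Support(X and Y) / (Support(X) * Support(Y)), written in terms of raw counts. The function names and the counts are hypothetical, picked to show a rule with high confidence whose lift is exactly 1, meaning X and Y co-occur no more often than independent items would.

```python
def lift(count_xy, count_x, count_y, total):
    """Lift of X -> Y from raw counts.

    Equivalent to Support(X and Y) / (Support(X) * Support(Y)).
    """
    return (count_xy * total) / (count_x * count_y)


def confidence(count_xy, count_x):
    """Confidence of X -> Y from raw counts."""
    return count_xy / count_x


# Hypothetical counts: Y is very common, so X -> Y looks strong by confidence.
total = 10     # all transactions
count_x = 5    # transactions containing X
count_y = 8    # transactions containing Y
count_xy = 4   # transactions containing both X and Y

rule_confidence = confidence(count_xy, count_x)
rule_lift = lift(count_xy, count_x, count_y, total)
print(rule_confidence, rule_lift)
```

A lift above 1 suggests X and Y appear together more often than chance would predict; a lift near 1 suggests the rule may be coincidental even when its confidence is high.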

Question : Suppose you have transactions in which itemsets appear as follows:
{M,E} appears 300 times, {M} appears 500 times, {E} appears 400 times, {B} appears 400 times, and {M,B} appears 400 times.
What can you conclude from this?

1. You can say {M,B} has a stronger association than {M,E}

2. You can say {M,E} has a stronger association than {M,B}

4. You can say M,E are independent
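
The counts in this question can be checked with a short calculation. The sketch below computes confidence and lift for M->E and M->B. One loud assumption: the question does not state the total number of transactions, so total = 1000 is a hypothetical value. Because {E} and {B} have the same count, the ordering of the two lifts does not depend on that choice.

```python
# Counts taken from the question.
count_m, count_e, count_b = 500, 400, 400
count_me, count_mb = 300, 400

# ASSUMPTION: the total is not given in the question; 1000 is hypothetical.
total = 1000


def confidence(count_xy, count_x):
    """Confidence of X -> Y from raw counts."""
    return count_xy / count_x


def lift(count_xy, count_x, count_y, n):
    """Lift of X -> Y from raw counts and the total number of transactions n."""
    return (count_xy * n) / (count_x * count_y)


conf_me = confidence(count_me, count_m)
conf_mb = confidence(count_mb, count_m)
lift_me = lift(count_me, count_m, count_e, total)
lift_mb = lift(count_mb, count_m, count_b, total)

print("M->E: confidence", conf_me, "lift", lift_me)
print("M->B: confidence", conf_mb, "lift", lift_mb)
```

Under both measures the M,B pairing scores higher than the M,E pairing for these counts.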

Question : How is leverage defined in the context of the Apriori algorithm?

1. Support(X and Y) * Support (X) *Support(Y)

2. Support(X and Y) * Support (X) /Support(Y)

4. Support (X) /Support(Y)

5. Support(X U Y) - (Support(X) * Support(Y))
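
As a sketch of the standard textbook definition, leverage is the difference between the observed joint support and the joint support expected if X and Y were independent: Leverage(X->Y) = Support(X and Y) - Support(X) * Support(Y). The function name and the support values below are hypothetical and chosen only for illustration.

```python
def leverage(support_xy, support_x, support_y):
    """Observed joint support minus the joint support expected under independence."""
    return support_xy - support_x * support_y


# Hypothetical supports (fractions of all transactions).
independent_case = leverage(support_xy=0.4, support_x=0.5, support_y=0.8)
associated_case = leverage(support_xy=0.4, support_x=0.5, support_y=0.4)

print(independent_case, associated_case)
```

A leverage of 0 means the items co-occur exactly as often as independence would predict; larger positive values indicate a stronger association.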