
Dell EMC Data Science and BigData Certification Questions and Answers



Question : Which of the following is the correct formula for Lift in association rules?



1. A
2. B
4. D
5. E

Explanation: Lift measures how many times more often X and Y occur together than would be expected if they were statistically independent of each other. It indicates whether X and Y are genuinely
related rather than coincidentally occurring together. If Lift = 1, X and Y are independent. If Lift is greater than 1, X and Y occur together more often than expected, and larger values suggest a stronger association between X and Y.
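
For reference, the explanation above matches the usual textbook definition of lift, written out here as a sketch. The notation Support(X ∪ Y), meaning the fraction of transactions that contain every item of both X and Y, is an assumption of this sketch and does not come from the original answer options.

```latex
\mathrm{Lift}(X \Rightarrow Y) = \frac{\mathrm{Support}(X \cup Y)}{\mathrm{Support}(X) \times \mathrm{Support}(Y)}
```

With this form, independence gives Support(X ∪ Y) = Support(X) × Support(Y), so the ratio is 1.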




Question : Suppose you have transactions in which the itemsets appear as below
{M,E} appears 300 times, {M} appears 500 times, {E} appears 400 times, {B} appears 400 times and {M,B} appears 400 times.
What can you conclude from that?


1. You can say {M,B} has a stronger association than {M,E}

2. You can say {M,E} has a stronger association than {M,B}


4. You can say M,E are independent


Explanation: To check which association is stronger, we can use the lift formula as below
Lift {X->Y} = Support for {X,Y} / (Support for {X} * Support for {Y})
The support values used here (.3, .4, .5) correspond to a total of 1000 transactions. Based on that
Lift{M,E} = .3/(.5*.4) = 1.5
Lift{M,B} = .4/(.5*.4) = 2
Hence, we can say that the association between {M,B} is stronger than the association between {M,E}
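
The arithmetic above can be checked with a short Python sketch. The total of 1000 transactions is an assumption inferred from the .3, .5 and .4 support values in the explanation, and the function name `lift` and the dictionary keys are hypothetical names chosen for illustration.

```python
def lift(support_xy, support_x, support_y):
    """Lift of X -> Y computed from fractional supports."""
    return support_xy / (support_x * support_y)

# Assumed total: the question text does not state it, but supports of
# .3, .5 and .4 for counts of 300, 500 and 400 imply 1000 transactions.
total = 1000
counts = {"M": 500, "E": 400, "B": 400, "ME": 300, "MB": 400}

# Convert raw counts to fractional supports.
support = {itemset: n / total for itemset, n in counts.items()}

lift_me = lift(support["ME"], support["M"], support["E"])
lift_mb = lift(support["MB"], support["M"], support["B"])

print("Lift{M,E} =", round(lift_me, 2))
print("Lift{M,B} =", round(lift_mb, 2))
```

Both values are above 1, and the larger one belongs to {M,B}, which agrees with the conclusion in the explanation.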





Question : How is leverage defined in the context of the Apriori algorithm?


1. Support(X and Y) * Support (X) *Support(Y)

2. Support(X and Y) * Support (X) /Support(Y)


4. Support (X) /Support(Y)

5. Support(X U Y) - ( Support (X) *Support(Y))


Explanation: Leverage measures the difference between the probability of X and Y appearing together in the dataset and the probability that would be expected if X and Y were statistically independent of
each other.
If leverage is 0, X and Y are statistically independent of each other. If leverage is not zero, X and Y have some kind of relationship. A larger value indicates a stronger relationship.
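
As a minimal sketch of the difference described above, leverage can be computed as the joint support minus the product of the individual supports. The function name `leverage` and the support values are hypothetical and chosen only for illustration.

```python
def leverage(support_xy, support_x, support_y):
    """Joint support minus the support expected under independence."""
    return support_xy - support_x * support_y

# Illustrative (made-up) supports: X in 50% and Y in 40% of transactions.
support_x, support_y = 0.5, 0.4

# If X and Y co-occur exactly as often as independence predicts (0.5 * 0.4),
# leverage is zero.
lev_independent = leverage(0.2, support_x, support_y)

# If they co-occur more often than that, leverage is positive.
lev_related = leverage(0.3, support_x, support_y)

print(round(lev_independent, 4), round(lev_related, 4))
```

Unlike lift, which is a ratio, leverage is an absolute difference, so its neutral value is 0 rather than 1.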

Related Questions


Question : In which lifecycle stage are test and training data sets created?


1. Model planning
2. Discovery
4. Data preparation



Question : Under which circumstance do you need to implement N-fold cross-validation after creating a regression model?

1. The data is unformatted.
2. There is not enough data to create a test set.
4. There are categorical variables in the model.




Question : Your company has different sales teams. Each team's sales manager has developed incentive
offers to increase the size of each sales transaction. Any sales manager whose incentive program
can be shown to increase the size of the average sales transaction will receive a bonus.
Data are available for the number and average sale amount for transactions offering one of the
incentives as well as transactions offering no incentive.
The VP of Sales has asked you to determine analytically if any of the incentive programs has
resulted in a demonstrable increase in the average sale amount. Which analytical technique would
be appropriate in this situation?



1. One-way ANOVA
2. Multi-way ANOVA
4. Wilcoxon Rank Sum Test



Question : In data visualization, what is used to focus the audience on a key part of a chart?
1. Detailed text
2. Emphasis colors
4. A data table




Question : Which word or phrase completes the statement? Data-ink ratio is to data visualization as
__________ .


1. Confusion matrix is to classifier
2. Data scientist is to big data
4. K-means is to Naive Bayes



Question : Consider a database with transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
You decide to run the association rules algorithm where minimum support is 50%. Which rule has
a confidence of at least 50%?

1. {soda} => {milk}
2. {milk} => {soda}
4. {cheese} => {bread}
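
The support and confidence of the rules listed above can be checked with a short Python sketch over the four transactions from the question. The helper names `support` and `confidence` are hypothetical, and only the rules visible in the options are evaluated.

```python
# The four transactions from the question.
transactions = [
    {"cheese", "bread", "milk"},
    {"soda", "bread", "milk"},
    {"cheese", "bread"},
    {"cheese", "soda", "juice"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(lhs, rhs):
    """Support of the combined itemset divided by support of the left side."""
    return support(lhs | rhs) / support(lhs)

# Rules listed in the options, as (left-hand item, right-hand item).
rules = [("soda", "milk"), ("milk", "soda"), ("cheese", "bread")]

results = {}
for lhs, rhs in rules:
    rule_support = support({lhs, rhs})
    rule_confidence = confidence({lhs}, {rhs})
    results[(lhs, rhs)] = (rule_support, rule_confidence)
    print(f"{{{lhs}}} => {{{rhs}}}: support={rule_support:.2f}, "
          f"confidence={rule_confidence:.2f}")
```

Note that a rule has to pass the 50% minimum support threshold on its combined itemset before its confidence is considered at all.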