Dell EMC Data Science Associate Certification Questions and Answers (Dumps and Practice Questions)



Question : Refer to the exhibit.
In the exhibit, the x-axis represents the derived probability of a borrower defaulting on a loan. Also
in the exhibit, the pink represents borrowers that are known to have not defaulted on their loan,
and the blue represents borrowers that are known to have defaulted on their loan.
Which analytical method could produce the probabilities needed to build this exhibit?

1. Linear Regression
2. Logistic Regression
4. Association Rules
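
Of the methods listed, logistic regression is the one that outputs a probability between 0 and 1 for each borrower, which is what the x-axis of the exhibit needs. The sketch below is only an illustration of that idea: the single feature (a debt-to-income ratio), the eight data points, and the plain gradient-descent fit are all assumptions made up for this example, not data from the exhibit.

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(default) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical borrowers: debt-to-income ratio and whether they defaulted (1) or not (0).
ratios = [0.10, 0.20, 0.25, 0.30, 0.45, 0.50, 0.60, 0.70]
defaulted = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(ratios, defaulted)
probs = [sigmoid(w * x + b) for x in ratios]

for x, y, p in zip(ratios, defaulted, probs):
    print(f"ratio={x:.2f} defaulted={y} p(default)={p:.3f}")
```

Plotting the density of `probs` separately for the two known outcomes would give a chart of the kind the question describes.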




Question : Refer to the exhibit.
You have created a density plot of purchase amounts from a retail website as shown.
What should you do next?
1. Recreate the plot using the barplot() function
2. Use the rug() function to add elements to the plot
4. Reduce the sample size of the purchase amount data used to create the plot




Question : Refer to the exhibit.
You are building a decision tree. In this exhibit, four variables are listed with their respective values
of info-gain.
Based on this information, on which attribute would you expect the next split to be in the decision
tree?


1. Credit Score
2. Age
4. Gender
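
A decision tree normally splits next on the attribute with the highest info-gain, so the answer depends on the values shown in the exhibit. As a rough sketch of how those values are computed, the code below measures the entropy reduction for each attribute. The six rows and the attribute names `credit_score` and `gender` are hypothetical and are not taken from the exhibit.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical training rows (not from the exhibit).
rows = [
    {"credit_score": "low",  "gender": "M", "default": "yes"},
    {"credit_score": "low",  "gender": "F", "default": "yes"},
    {"credit_score": "low",  "gender": "M", "default": "yes"},
    {"credit_score": "high", "gender": "F", "default": "no"},
    {"credit_score": "high", "gender": "M", "default": "no"},
    {"credit_score": "high", "gender": "F", "default": "no"},
]

gains = {a: info_gain(rows, a, "default") for a in ("credit_score", "gender")}
best = max(gains, key=gains.get)  # attribute chosen for the next split
print(gains, "-> split on", best)
```

In this made-up data the credit score separates the classes perfectly, so it has the larger gain and would be chosen for the split.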




Related Questions


Question : In data visualization, what is used to focus the audience on a key part of a chart?
1. Detailed text
2. Emphasis colors
4. A data table




Question : Which word or phrase completes the statement? Data-ink ratio is to data visualization as
__________ .


1. Confusion matrix is to classifier
2. Data scientist is to big data
4. K-means is to Naive Bayes



Question : Consider a database with transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
You decide to run the association rules algorithm with a minimum support of 50%. Which rule has
a confidence of at least 50%?

1. {soda} => {milk}
2. {milk} => {soda}
4. {cheese} => {bread}
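
The support and confidence of the listed rules can be checked directly against the four transactions. This is a minimal sketch; the helper names `support` and `confidence` are chosen for this example. Support is the fraction of transactions containing every item in the rule, and confidence is that support divided by the support of the left-hand side.

```python
transactions = [
    {"cheese", "bread", "milk"},
    {"soda", "bread", "milk"},
    {"cheese", "bread"},
    {"cheese", "soda", "juice"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Support of the whole rule divided by support of its left-hand side."""
    return support(lhs | rhs) / support(lhs)

rules = [
    ({"soda"}, {"milk"}),
    ({"milk"}, {"soda"}),
    ({"cheese"}, {"bread"}),
]

for lhs, rhs in rules:
    s = support(lhs | rhs)
    c = confidence(lhs, rhs)
    print(f"{sorted(lhs)} => {sorted(rhs)}: support={s:.2f}, confidence={c:.2f}")
```

Among the three rules listed, only {cheese} => {bread} reaches the 50% minimum support, and its confidence is 2/3. The two soda/milk rules have 50% confidence but only 25% support.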



Question : You are using the Apriori algorithm to determine the likelihood that a person who owns a home
has a good credit score. You have determined that the confidence for the rules used in the
algorithm is > 75%. You calculate lift = 1.011 for the rule, "People with good credit are
homeowners". What can you determine from the lift calculation?


1. Support for the association is low
2. Leverage of the rules is low
4. The rule is true
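
Lift compares how often the two sides of a rule occur together with how often they would co-occur if they were independent, and leverage is the difference between those same two quantities. A lift this close to 1 therefore forces the leverage to be close to 0. The sketch below illustrates this with assumed supports of 60% for good credit and 80% for home ownership; those two numbers are hypothetical, and only the lift of 1.011 comes from the question.

```python
def rule_metrics(supp_lhs, supp_rhs, supp_both):
    """Return (confidence, lift, leverage) for the rule lhs => rhs."""
    confidence = supp_both / supp_lhs
    lift = supp_both / (supp_lhs * supp_rhs)
    leverage = supp_both - supp_lhs * supp_rhs
    return confidence, lift, leverage

# Hypothetical supports, chosen so that the lift matches the question.
supp_good_credit = 0.60
supp_homeowner = 0.80
supp_both = 1.011 * supp_good_credit * supp_homeowner

confidence, lift, leverage = rule_metrics(supp_good_credit, supp_homeowner, supp_both)
print(f"confidence={confidence:.4f} lift={lift:.3f} leverage={leverage:.5f}")
```

Because leverage equals (lift - 1) times the product of the two individual supports, and that product cannot exceed 1, a lift of 1.011 caps the leverage at 0.011 whatever the actual supports are.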




Question : Consider a database with transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
The minimum support is 25%. Which rule has a confidence equal to 50%?

1. {bread} => {milk}
2. {bread, milk} => {cheese}
4. {bread} => {cheese}
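
The confidence of each listed rule can be worked out by hand from the four transactions, using conf(X => Y) = supp(X and Y together) / supp(X). The supports involved are supp({bread}) = 3/4, supp({bread, milk}) = 2/4, supp({bread, cheese}) = 2/4, and supp({bread, milk, cheese}) = 1/4, so every rule meets the 25% minimum support.

```latex
\begin{align*}
\mathrm{conf}(\{\mathrm{bread}\} \Rightarrow \{\mathrm{milk}\})
  &= \frac{\mathrm{supp}(\{\mathrm{bread}, \mathrm{milk}\})}{\mathrm{supp}(\{\mathrm{bread}\})}
   = \frac{2/4}{3/4} = \frac{2}{3} \\
\mathrm{conf}(\{\mathrm{bread}, \mathrm{milk}\} \Rightarrow \{\mathrm{cheese}\})
  &= \frac{\mathrm{supp}(\{\mathrm{bread}, \mathrm{milk}, \mathrm{cheese}\})}{\mathrm{supp}(\{\mathrm{bread}, \mathrm{milk}\})}
   = \frac{1/4}{2/4} = \frac{1}{2} \\
\mathrm{conf}(\{\mathrm{bread}\} \Rightarrow \{\mathrm{cheese}\})
  &= \frac{\mathrm{supp}(\{\mathrm{bread}, \mathrm{cheese}\})}{\mathrm{supp}(\{\mathrm{bread}\})}
   = \frac{2/4}{3/4} = \frac{2}{3}
\end{align*}
```

Of the listed rules, only {bread, milk} => {cheese} has a confidence of exactly 50%.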



Question : Under which circumstance do you need to implement N-fold cross-validation after creating a regression model?

1. The data is unformatted.
2. There is not enough data to create a test set.
4. There are categorical variables in the model.
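
N-fold cross-validation is mainly useful when the dataset is too small to set aside a separate test set: each observation is held out exactly once, and the model is refit on the remaining folds. The following is a minimal sketch with a one-variable least-squares line. The ten data points and the helper names `k_fold_indices` and `fit_line` are assumptions for illustration, and the folds are left unshuffled to keep the example deterministic.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists; every index lands in exactly one test fold."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

# Hypothetical small dataset, roughly y = 2x.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]

errors = []
for train, test in k_fold_indices(len(xs), 5):
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    mse = sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test) / len(test)
    errors.append(mse)

cv_mse = sum(errors) / len(errors)  # average held-out error across the folds
print(f"per-fold MSE: {errors}")
print(f"cross-validated MSE: {cv_mse:.4f}")
```

The averaged held-out error gives an estimate of out-of-sample performance without permanently giving up any of the scarce data.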