
Dell EMC Data Science and BigData Certification Questions and Answers



Question : You are using k-means clustering to discover groupings within a data set. You plot within-sum-of-squares
(wss) of multiple cluster sizes. Based on the exhibit, how many clusters should you use in
your analysis?
1. 2
2. 3
4. 8
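
The wss curve referred to in the question is normally read with the "elbow" heuristic: pick the cluster count after which adding another cluster barely lowers wss. The sketch below is a minimal illustration on synthetic data, not a reconstruction of the exhibit. The data (three well-separated blobs), the farthest-first initialisation, and the helper names (kmeans_wss, elbow) are assumptions made for this example, and the elbow is located with a simple second-difference rule rather than by eye.

```python
import random

def sq_dist(p, q):
    # Squared Euclidean distance between two points.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(cluster):
    # Coordinate-wise mean of a non-empty list of points.
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

def farthest_first(points, k):
    # Deterministic initialisation (an assumption of this sketch):
    # start from the first point, then keep adding the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_wss(points, k, max_iter=100):
    # Run Lloyd's algorithm and return the within-cluster sum of squares.
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        updated = [centroid(c) if c else centers[j] for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def elbow(wss):
    # Pick the k where the improvement in wss falls off the most
    # (largest second difference). Endpoints cannot be elbows.
    ks = sorted(wss)
    return max(ks[1:-1], key=lambda k: (wss[k - 1] - wss[k]) - (wss[k] - wss[k + 1]))

# Synthetic data: three Gaussian blobs, 30 points each (hypothetical).
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
          for cx, cy in blob_centers for _ in range(30)]

wss = {k: kmeans_wss(points, k) for k in range(1, 7)}
for k in sorted(wss):
    print(k, round(wss[k], 1))
best_k = elbow(wss)
print("elbow at k =", best_k)
```

On real data the curve is usually less clear-cut than on these blobs, so the second-difference rule is best treated as a starting point for inspecting the plot, not as a replacement for it.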


Question : Certain individuals are more susceptible to autism if they have particular combinations of genes
expressed in their DNA. Given a sample of DNA from persons who have autism and a sample of
DNA from persons who do not have autism, determine the best technique for predicting whether
or not a given individual is susceptible to developing autism?
1. Naive Bayes
2. Survival analysis
4. Sequence alignment
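
The scenario is a two-class prediction problem with labelled examples (DNA from persons with and without autism). Of the options shown, Naive Bayes is the one that is a supervised classifier; survival analysis models time to an event, and sequence alignment compares sequences. The sketch below shows what a Bernoulli Naive Bayes classifier looks like on binary "gene expressed / not expressed" features. The feature matrix, the labels "affected" and "unaffected", and the function names are all hypothetical, and the model assumes the features are conditionally independent given the class.

```python
import math

def train_bernoulli_nb(X, y, alpha=1.0):
    # X: list of 0/1 feature lists, y: list of class labels.
    # Returns {label: (log prior, [P(feature_j = 1 | label), ...])}.
    n_features = len(X[0])
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = len(rows) / len(X)
        # Laplace smoothing (alpha) avoids zero probabilities.
        probs = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_features)]
        model[label] = (math.log(prior), probs)
    return model

def predict(model, x):
    # Return the label with the highest log posterior for feature vector x.
    def log_posterior(label):
        log_prior, probs = model[label]
        return log_prior + sum(math.log(p if xi else 1.0 - p)
                               for xi, p in zip(x, probs))
    return max(model, key=log_posterior)

# Hypothetical training data: each row marks whether three genes are expressed.
X = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [1, 1, 0],
     [0, 0, 1], [0, 1, 0], [0, 0, 0], [0, 0, 1]]
y = ["affected"] * 4 + ["unaffected"] * 4

model = train_bernoulli_nb(X, y)
print(predict(model, [1, 1, 0]))
print(predict(model, [0, 0, 1]))
```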


Question : You are working with a logistic regression model to predict the probability that a user will click on
an ad. Your model has hundreds of features, and you're not sure if all of those features are
helping your prediction. Which regularization technique should you use to prune features that
aren't contributing to the model?
1. Convex
2. Uniform
4. L1
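
L1 (lasso) regularization is the technique known for driving the weights of unhelpful features to exactly zero, which is what "pruning" features amounts to. The sketch below illustrates that effect with logistic regression fitted by proximal gradient descent (a gradient step followed by soft-thresholding). The synthetic data, the penalty strength lam, the learning rate, and the function names are assumptions for this example. The model has no intercept, and only the first two of five features influence the label.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward zero by t,
    # and return exactly 0.0 when |v| <= t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.1, lr=0.5, iters=300):
    # Minimise mean logistic loss + lam * sum(|w_j|) by proximal gradient.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        # Gradient step on the loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w

# Synthetic data: 5 features, but the label depends only on the first two.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(300)]
y = [1 if x[0] - x[1] > 0 else 0 for x in X]

weights = fit_l1_logistic(X, y)
print([round(w, 3) for w in weights])
```

Features whose weight ends at exactly 0.0 can be dropped from the model. A larger lam prunes more aggressively, so in practice it is usually chosen by cross-validation.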


Related Questions


Question : As part of the HadoopExam consultancy team, you have been asked by a hotel to create
a GUI application in which you will add and edit customer information for all of the hotel's sales and bookings. You do not want to
spend money on an enterprise RDBMS, so you decide to use a simple file as storage and consider a CSV file. Is HDFS a good choice for
storing this information in such a file?
1. No, because HDFS is optimized for read-once, streaming access for relatively large files.
2. No, because HDFS is optimized for write-once, streaming access for relatively large files.
4. Yes, because HDFS is optimized for write-once, streaming access for relatively large files.


Question : All HadoopExam website subscribers' information is stored in a MySQL database.
Which tool is best suited to import a portion of the subscribers' information every day as files into HDFS,
and to generate Java classes to interact with that imported data?
1. Hive
2. Pig
4. Flume


Question : You are using K-means clustering to classify customer behavior for a large retailer. You need to
determine the optimum number of customer groups. You plot the within-sum-of-squares (wss)
data as shown in the exhibit. How many customer groups should you specify?
1. 2
2. 3
4. 8

