Cloudera Databricks Data Science Certification Questions and Answers (Dumps and Practice Questions)

Question : In which of the following scenarios can we use a naive Bayes classifier?

1. Classify whether a given person is a male or a female based on the measured features. The features include height, weight, and foot size.
2. To classify whether an email is spam or not spam
3. To identify whether a fruit is an orange or not based on features like diameter, color and shape
4. All 1,2 and 3
5. None of the above

Correct Answer : 4

Explanation: Naive Bayes classifiers have worked quite well in many real-world situations, most famously document classification and spam filtering. They require only a small amount of training data to estimate the necessary parameters.
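To make the first scenario concrete, here is a minimal sketch of a Gaussian naive Bayes classifier written with the Python standard library only. The helper names (`fit_gaussian_nb`, `predict`) and the eight training rows are made up for illustration; the Gaussian likelihood is one common choice for continuous features, not the only one.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        stats = []
        for column in zip(*members):
            mean = sum(column) / n
            # Sample variance; the small floor avoids division by zero.
            var = sum((x - mean) ** 2 for x in column) / (n - 1)
            stats.append((mean, max(var, 1e-9)))
        model[label] = (n / len(rows), stats)
    return model

def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            # Log Gaussian density; the "naive" step treats features as independent.
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up toy measurements: height (ft), weight (lb), foot size (in).
rows = [
    (6.00, 180, 12), (5.92, 190, 11), (5.58, 170, 12), (5.92, 165, 10),
    (5.00, 100, 6), (5.50, 150, 8), (5.42, 130, 7), (5.75, 150, 9),
]
labels = ["male"] * 4 + ["female"] * 4

model = fit_gaussian_nb(rows, labels)
prediction = predict(model, (6.1, 185, 12))
print(prediction)
```

Only a mean and a variance per feature and class are estimated, which is why a small training set can be enough.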

Question : Select the choice where regression algorithms are not the best fit

1. Dimensions of an object
2. Weight of a person
3. Temperature in the atmosphere
4. Employee status

Correct Answer : 4

Explanation: Regression algorithms are usually employed when the data points are inherently numerical variables (such as the dimensions of an object, the weight of a person, or the temperature in the atmosphere) but, unlike Bayesian algorithms, they're not very good for categorical data (such as employee status or credit score description).

Question : Logistic regression does not work well in the case of binary classification

1. True
2. False

Correct Answer : 2

Explanation: In logistic regression, the model (the logistic function) takes values between 0 and 1, which can be interpreted as the probability of class membership, so it works well for binary classification.
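The behaviour described in the explanation can be sketched in a few lines of Python. The `classify` helper, the weights, the bias and the 0.5 threshold are assumptions chosen for illustration; nothing here is fitted to data.

```python
import math

def logistic(z):
    """Map a real-valued score z to a value between 0 and 1."""
    # Numerically stable form: avoids overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def classify(features, weights, bias, threshold=0.5):
    """Return (probability, label) for a linear score passed through the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = logistic(z)
    return p, int(p >= threshold)

# Hypothetical weights and bias, chosen only for illustration (not fitted).
weights, bias = [0.8, -0.4], -0.2
probability, label = classify([2.0, 1.0], weights, bias)
print(probability, label)
```

The probability is turned into one of two classes by comparing it with a threshold, which is what makes the model a natural fit for binary classification.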

Related Questions

Question : In which of the following scenarios can you use regression to predict values?

1. Samsung can use it for mobile sales forecast
2. Mobile companies can use it to forecast manufacturing defects
3. Probability of a celebrity divorce
4. Only 1 and 2
5. All 1, 2 and 3

Question : RMSE is a good measure of accuracy, but only to compare forecasting errors of different models for a ______, as it is scale-dependent

1. Between Variables
2. Particular Variable
3. Among all the variables
4. All of the above are correct
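The scale dependence mentioned in the question can be seen with a small sketch. The `rmse` helper and the weight values are made up for illustration; the same forecasts are simply expressed in two different units.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up weights in kilograms and one model's forecasts for them.
actual_kg = [70.0, 82.0, 65.0]
forecast_kg = [72.0, 80.0, 66.0]

# The same values expressed in grams.
actual_g = [v * 1000 for v in actual_kg]
forecast_g = [v * 1000 for v in forecast_kg]

rmse_kg = rmse(actual_kg, forecast_kg)
rmse_g = rmse(actual_g, forecast_g)
print(rmse_kg, rmse_g)
```

The forecasts are equally good in both cases, yet the RMSE in grams is 1000 times the RMSE in kilograms, so the number is only meaningful relative to the scale of the quantity being forecast.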

Question : You are creating a classification process whose inputs are the income, education and
current debt of a customer. What could be a possible output of this process?

1. Probability of the customer defaulting on loan repayment
2. Percentage measure of the customer's loan repayment capability
3. Percentage indicating whether the customer should be given a loan or not
4. The output might be a risk class, such as "good", "acceptable", "average", or "unacceptable".
5. All of the above

Question : Suppose you have the following two cases for movie ratings
1. You recommend a four-star movie to a user, but he really doesn't like it and would rate it two stars
2. You recommend a three-star movie, but the user loves it and would rate it five stars
Which statement correctly applies?

1. In both cases, the contribution to the RMSE is the same
2. In both cases, the contribution to the RMSE is different
3. In both cases, the contribution to the RMSE could vary
4. None of the above
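The arithmetic behind the two cases can be checked with a short sketch. It assumes the recommended star value is treated as the prediction and the user's own rating as the observed value; the `squared_error` helper is a name introduced here for illustration.

```python
def squared_error(predicted, actual):
    """Squared difference that a single rating adds to the sum inside the RMSE."""
    return (predicted - actual) ** 2

# Case 1: recommended at four stars, the user would rate it two stars.
case_1 = squared_error(predicted=4, actual=2)

# Case 2: recommended at three stars, the user would rate it five stars.
case_2 = squared_error(predicted=3, actual=5)

print(case_1, case_2)
```

Because the difference is squared, the direction of the error is lost: overestimating by two stars and underestimating by two stars add the same amount to the sum.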

Question : The root-mean-square deviation (RMSD) or root-mean-square error (RMSE) is a frequently used measure of the
differences between values predicted by a model or an estimator and the values actually observed.
RMSD is a useful metric for evaluating which types of models?

1. Logistic regression
2. Naive Bayes classifier
3. Linear regression
4. All of the above

Question : Select the correct statement which applies to logistic regression

1. Computationally inexpensive, easy to implement, knowledge representation easy to interpret
2. May have low accuracy
3. Works with numeric values
4. Only 1 and 3 are correct
5. All 1,2 and 3 are correct