
Cloudera Databricks Data Science Certification Questions and Answers (Dumps and Practice Questions)



Question : Which is an example of supervised learning?
1. PCA
2. k-means clustering
3. SVD
4. EM
5. SVM


Correct Answer : 5


Explanation: An SVM (support vector machine) is trained on labeled examples, which makes it a supervised method. PCA, k-means clustering, SVD, and EM all operate on unlabeled data and are unsupervised techniques.

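To make the distinction concrete, here is a minimal sketch of a linear SVM trained by the Pegasos-style sub-gradient method, in plain Python with hypothetical toy data. The point is not the particular optimizer but that every training example carries a label y, which is exactly what supervised learning means (and what PCA, k-means, SVD, and EM do not require).

```python
# Minimal Pegasos-style linear SVM (an illustrative sketch, not a production SVM).
# Supervised learning: every training point x comes with a label y in {-1, +1}.

def train_linear_svm(points, labels, lam=0.01, epochs=500):
    """Sub-gradient training of (w, b) for hinge loss with L2 regularization."""
    w, b, t = [0.0, 0.0], 0.0, 0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            t += 1
            eta = 1.0 / (lam * t)                 # decaying learning rate
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:                        # point violates the margin
                w = [wi - eta * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += eta * y
            else:                                 # only shrink w (regularization)
                w = [wi * (1 - eta * lam) for wi in w]
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

# Labeled toy data: the labels are what make this *supervised*.
X = [(2.0, 2.0), (3.0, 1.0), (-2.0, -2.0), (-1.0, -3.0)]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
print(predict(w, b, (10.0, 10.0)), predict(w, b, (-10.0, -10.0)))
```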

Question : Which of the following are point estimation methods?

1. MAP
2. MLE
3. MMSE
4. 1 and 2 only
5. 1,2 and 3


Correct Answer : 5

Explanation: MAP, MLE, and MMSE are all point estimation methods. Common point estimators include:
the minimum-variance mean-unbiased estimator (MVUE), which minimizes the risk (expected loss) of the squared-error loss function;
the best linear unbiased estimator (BLUE);
the minimum mean squared error (MMSE) estimator;
the median-unbiased estimator, which minimizes the risk of the absolute-error loss function;
maximum likelihood (ML);
and the method of moments and generalized method of moments.
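As a concrete illustration (my own example, not part of the question bank): for a coin with unknown heads probability, observed as k heads in n flips with a Beta(a, b) prior, all three methods named in the answer reduce to closed-form single numbers, which is what makes them point estimates.

```python
def bernoulli_point_estimates(k, n, a=2.0, b=2.0):
    """Three point estimates of a coin's heads probability from k heads in n flips.
    a and b are the parameters of a Beta(a, b) prior (used by MAP and MMSE)."""
    mle = k / n                               # maximum likelihood: ignores the prior
    map_est = (k + a - 1) / (n + a + b - 2)   # MAP: mode of the Beta posterior
    mmse = (k + a) / (n + a + b)              # MMSE: posterior mean, minimizes squared error
    return mle, map_est, mmse

print(bernoulli_point_estimates(7, 10))  # (0.7, 0.666..., 0.6428...)
```

Note how the Bayesian estimates (MAP, MMSE) are pulled toward the prior mean of 0.5, while MLE uses the data alone.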






Question : In statistics, maximum-likelihood estimation (MLE) is a method of estimating the parameters of
a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation
provides estimates for the model's parameters. The normalizing constant is usually ignored in MLE because ___________

1. The normalizing constant is always just .01 to .09 less than 1
2. The normalizing constant is always just .01 to .09 higher than 1
3. The normalizing constant can never be zero
4. There is no impact of the normalizing constant on the maximizing value


Correct Answer : 4
Explanation: In statistics, maximum-likelihood estimation (MLE) is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters. The method of maximum likelihood corresponds to many well-known estimation methods in statistics. For example, one may be interested in the heights of adult female penguins, but be unable to measure the height of every single penguin in a population due to cost or time constraints. Assuming that the heights are normally (Gaussian) distributed with some unknown mean and variance, the mean and variance can be estimated with MLE while only knowing the heights of some sample of the overall population. MLE would accomplish this by taking the mean and variance as parameters and finding particular parametric values that make the observed results the most probable (given the model).
In general, for a fixed set of data and underlying statistical model, the method of maximum likelihood selects the set of values of the model parameters that maximizes the likelihood function. Intuitively, this maximizes the "agreement" of the selected model with the observed data, and for discrete random variables it indeed maximizes the probability of the observed data under the resulting distribution. Maximum-likelihood estimation gives a unified approach to estimation, which is well-defined in the case of the normal distribution and many other problems. However, in some complicated problems, difficulties do occur: in such problems, maximum-likelihood estimators are unsuitable or do not exist.
In probability theory, a normalizing constant is a constant by which an everywhere non-negative function must be multiplied so the area under its graph is 1, e.g., to make it a probability density function or a probability mass function. Note that if the probability density function is a function of various parameters, so too will be its normalizing constant. The parametrised normalizing constant for the Boltzmann distribution plays a central role in statistical mechanics. In that context, the normalizing constant is called the partition function. A normalizing constant is positive, and multiplying or dividing a series of values by a positive number does not affect which of them is the largest. Maximum likelihood estimation is concerned only with finding a maximum value, so normalizing constants can be ignored.
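The last point can be checked numerically. The sketch below (my own illustration) evaluates a Gaussian log-likelihood over a grid of candidate means, with and without the constant -(n/2)·log(2π) term; the maximizing mean is identical either way, which is why MLE can ignore normalizing constants.

```python
import math

def gaussian_log_likelihood(data, mu, sigma2, include_constant=True):
    """Log-likelihood of data under a Normal(mu, sigma2) model."""
    n = len(data)
    ll = -n / 2 * math.log(sigma2) - sum((x - mu) ** 2 for x in data) / (2 * sigma2)
    if include_constant:
        ll -= n / 2 * math.log(2 * math.pi)  # the normalizing-constant term
    return ll

data = [4.8, 5.1, 5.3, 4.9, 5.4]
grid = [i / 100 for i in range(400, 601)]  # candidate means 4.00 .. 6.00
best_with = max(grid, key=lambda m: gaussian_log_likelihood(data, m, 1.0, True))
best_without = max(grid, key=lambda m: gaussian_log_likelihood(data, m, 1.0, False))
print(best_with, best_without)  # identical; both equal the sample mean 5.1
```

Subtracting a constant shifts every candidate's score by the same amount, so the argmax cannot change.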


Related Questions


Question : Select the statement which applies correctly to Naive Bayes

1. Works with a small amount of data
2. Sensitive to how the input data is prepared
3. Works with nominal values
4. All of the above




Question :

Select the correct statement which applies to Bayes' rule

1. Bayesian probability and Bayes' rule gives us a way to estimate unknown probabilities from known values.
2. You can reduce the need for a lot of data by assuming conditional independence among the features in your data.
3. Bayes' theorem finds the actual probability of an event from the results of your tests.
4. Only 1 and 2
5. All 1,2 and 3 are correct
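The two Naive Bayes questions above can be made concrete with a small sketch (hypothetical data, plain Python): a categorical Naive Bayes classifier that handles nominal feature values, works from a small training set, and uses the conditional-independence assumption plus Laplace smoothing.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Fit a categorical Naive Bayes model (counts only; smoothing at predict time)."""
    class_counts = Counter(labels)
    # feat_counts[c][i][v] = how often feature i took nominal value v in class c
    feat_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)                      # distinct values seen per feature
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            feat_counts[c][i][v] += 1
            values[i].add(v)
    return class_counts, feat_counts, values, len(labels)

def predict_nb(model, row):
    class_counts, feat_counts, values, n = model
    best, best_score = None, -math.inf
    for c, cc in class_counts.items():
        score = math.log(cc / n)                   # log prior
        for i, v in enumerate(row):                # independence: sum of logs
            num = feat_counts[c][i][v] + 1         # Laplace (add-one) smoothing
            den = cc + len(values[i])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best

# Nominal features: (weather, traffic) -> whether Bob cycles or drives to work
rows = [("sunny", "light"), ("sunny", "heavy"), ("rainy", "light"), ("rainy", "heavy")]
labels = ["cycle", "cycle", "drive", "drive"]
model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "light")))  # cycle
```

Four training rows are enough here precisely because assuming conditional independence means each feature's distribution is estimated separately, which sharply reduces the amount of data needed.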



Question : Which of the following techniques can be used in the design of recommender systems?
1. Naive Bayes classifier
2. Power iteration
3. Collaborative filtering
4. 1 and 3
5. 2 and 3
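Collaborative filtering, the core recommender-system technique among the options, can be sketched in a few lines. This is my own illustrative example (made-up ratings, user-based filtering with mean-centered Pearson-style similarity), not a reference implementation.

```python
import math

def centered(r):
    """Mean-center a user's ratings (removes per-user rating bias)."""
    mu = sum(r.values()) / len(r)
    return {i: v - mu for i, v in r.items()}, mu

def similarity(ru, rv):
    """Pearson-style similarity on mean-centered co-rated items."""
    cu, _ = centered(ru)
    cv, _ = centered(rv)
    common = [i for i in cu if i in cv]
    if not common:
        return 0.0
    dot = sum(cu[i] * cv[i] for i in common)
    nu = math.sqrt(sum(cu[i] ** 2 for i in common))
    nv = math.sqrt(sum(cv[i] ** 2 for i in common))
    return dot / (nu * nv) if nu and nv else 0.0

def predict_rating(ratings, user, item):
    """User-based collaborative filtering: similarity-weighted deviations."""
    _, user_mean = centered(ratings[user])
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        s = similarity(ratings[user], r)
        _, other_mean = centered(r)
        num += s * (r[item] - other_mean)
        den += abs(s)
    return user_mean + num / den if den else user_mean

ratings = {
    "alice": {"m1": 5, "m2": 4},
    "bob":   {"m1": 5, "m2": 4, "m3": 5},   # tastes like alice
    "carol": {"m1": 1, "m2": 2, "m3": 1},   # opposite tastes
}
print(round(predict_rating(ratings, "alice", "m3"), 2))  # 4.83
```

Because bob rates like alice and carol rates opposite to her, the prediction for alice on m3 lands well above her average: agreement and disagreement both carry information.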


Question : You are working on a problem where you have to predict whether a claim is valid or not.
You find that most of the invalid claims have spelling errors as well as corrections in the manually
filled claim forms, compared to the honest claims. Which of the following techniques is suitable for finding
out whether the claim is valid or not?
1. Naive Bayes
2. Logistic Regression
3. Random Decision Forests
4. Any one of the above




Question : Bayes' Theorem allows you to look at an event that has already happened and make an
educated guess about the chain of events that may have led up to that event.
1. True
2. False



Question :

Scenario: Suppose that Bob can decide to go to work by one of three modes of transportation,
car, bus, or commuter train. Because of high traffic, if he decides to go by car, there is a 50%
chance he will be late. If he goes by bus, which has special reserved lanes but is sometimes overcrowded,
the probability of being late is only 20%. The commuter train is almost never late, with a probability of
only 1%, but is more expensive than the bus.

Question : Suppose that Bob is late one day, and his boss wishes to estimate the probability that he
drove to work that day by car. Since he does not know which mode of transportation Bob usually uses,
he gives a prior probability of 1/3 to each of the three possibilities. Which of the following methods
will the boss use to estimate the probability that Bob drove to work?

1. Naive Bayes
2. Linear regression
3. Random decision forests
4. None of the above
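The scenario itself can be worked through with Bayes' rule, the foundation of the Naive Bayes answer. The computation below is my own worked example using the numbers given in the scenario:

```python
# Bayes' rule: P(car | late) = P(late | car) * P(car) / P(late)
priors = {"car": 1 / 3, "bus": 1 / 3, "train": 1 / 3}   # boss's uniform prior
p_late = {"car": 0.50, "bus": 0.20, "train": 0.01}      # from the scenario

p_late_total = sum(priors[m] * p_late[m] for m in priors)  # total probability of lateness
posterior = {m: priors[m] * p_late[m] / p_late_total for m in priors}
print(round(posterior["car"], 4))  # 0.7042 -- the car is the most likely explanation
```

Even with an uninformative 1/3 prior on each mode, observing "late" shifts most of the probability onto the car, because being late is far more likely under that hypothesis.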