Cloudera Databricks Data Science Certification Questions and Answers (Dumps and Practice Questions)

Question : Suppose there are three events E1, E2 and E3. Which formula must always be equal to P(E1|E2,E3)?

1. P(E1,E2,E3)P(E1) / P(E2,E3)
2. P(E1,E2,E3) / P(E2,E3)
4. P(E1,E2|E3)P(E3)
5. P(E1,E2,E3)P(E2)P(E3)

Correct Answer : 2

Explanation: This is an application of conditional probability. By definition P(E1,E2) = P(E1|E2)P(E2), so P(E1|E2) = P(E1,E2) / P(E2). Treating the joint event (E2,E3) as the conditioning event gives
P(E1|E2,E3) = P(E1,E2,E3) / P(E2,E3)
For two events A and B, P(A|B) is read as "the probability of A given B"; it is sometimes written P_B(A).
When both A and B are categorical variables, a conditional probability table is typically used to represent the conditional probability.
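
The identity can be checked numerically. The sketch below assumes a made-up record of 20 trials (the `trials` list is purely illustrative and not part of the question). It computes the conditional probability by restricting attention to the trials where E2 and E3 both occurred, then compares it with the ratio P(E1,E2,E3) / P(E2,E3).

```python
from fractions import Fraction

# Hypothetical record of 20 trials; each tuple says whether E1, E2, E3 occurred.
trials = (
    [(True, True, True)] * 3
    + [(False, True, True)] * 5
    + [(True, False, True)] * 2
    + [(True, True, False)] * 4
    + [(False, False, False)] * 6
)

def prob(predicate):
    """Relative frequency of the trials that satisfy predicate."""
    hits = sum(1 for t in trials if predicate(t))
    return Fraction(hits, len(trials))

# Joint and marginal probabilities over the whole sample.
p_e1_e2_e3 = prob(lambda t: t[0] and t[1] and t[2])
p_e2_e3 = prob(lambda t: t[1] and t[2])

# Conditional probability by restriction: keep only trials where E2 and E3 occurred.
given = [t for t in trials if t[1] and t[2]]
p_e1_given_e2_e3 = Fraction(sum(1 for t in given if t[0]), len(given))

print(p_e1_given_e2_e3)       # restricted frequency
print(p_e1_e2_e3 / p_e2_e3)   # ratio form from the explanation
```

Both lines print the same fraction, as the definition requires; exact `Fraction` arithmetic avoids any floating-point rounding in the comparison.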

Question : Which of the following is an example of Gaussian distribution application?

1. If the average man is 175 cm tall with a variance of 6 cm, what is the probability that a man found at random will be 183 cm tall?
2. If the average man is 175 cm tall with a variance of 6 cm and the average woman is 168 cm tall with a variance of 3 cm, what is the probability that the average man will be shorter than the average woman?
3. ... order to ensure that 99% of all cans have a weight of at least 250 grams?
4. 1 and 2 only
5. Both 1 and 2


Explanation: The normal distribution, also known as the Gaussian distribution, is one of the most widely used distributions. It assumes that observations are closely clustered around the mean, μ, and that their density decays quickly farther away from the mean.
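
The clustering around the mean can be made concrete with Python's standard library. This is a minimal sketch that assumes a standard normal distribution (mean 0, standard deviation 1) and uses `statistics.NormalDist`, available from Python 3.8; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

# Probability mass within k standard deviations of the mean.
mass_within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, mass in mass_within.items():
    print(f"within {k} sd of the mean: {mass:.4f}")

# The density itself falls off quickly away from the mean.
densities = [z.pdf(x) for x in (0, 1, 2, 3)]
print([round(d, 4) for d in densities])
```

Roughly 68%, 95% and 99.7% of the mass lies within one, two and three standard deviations of the mean, which is the quick decay the explanation describes.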

Question : It was found that the mean length of parts produced by a lathe was 20.05 mm with a standard deviation of 0.02 mm. Find the probability that a part selected at random would have a length between 20.03 mm and 20.08 mm.

1. 0.11
2. 0.33
4. 0.77

Correct Answer : 4

Explanation: Let X = length of part, with mean 20.05 mm and standard deviation 0.02 mm.
20.03 is (20.03 - 20.05) / 0.02 = -1, i.e. 1 standard deviation below the mean;
20.08 is (20.08 - 20.05) / 0.02 = 1.5 standard deviations above the mean.
P(20.03 < X < 20.08)
= P(-1 < Z < 1.5)
= 0.3413 + 0.4332
= 0.7745
So the probability is 0.7745, which rounds to 0.77.
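
The table lookup can be cross-checked in code. The sketch below assumes the mean of 20.05 mm and standard deviation of 0.02 mm implied by the worked solution, and uses `statistics.NormalDist` from the Python standard library (Python 3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

# Values implied by the worked solution: mean 20.05 mm, standard deviation 0.02 mm.
length = NormalDist(mu=20.05, sigma=0.02)

# Standardized bounds, matching the Z values used in the solution.
z_low = (20.03 - length.mean) / length.stdev
z_high = (20.08 - length.mean) / length.stdev

# P(20.03 < X < 20.08) as a difference of cumulative probabilities.
p = length.cdf(20.08) - length.cdf(20.03)

print(round(z_low, 2), round(z_high, 2))
print(round(p, 4))
```

The difference of the two cumulative probabilities agrees with the sum 0.3413 + 0.4332 read from a standard normal table.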

Related Questions

Question : Classification and regression are examples of ___________.

1. supervised learning
2. unsupervised learning
3. Clustering
4. Density estimation

Question : Reducing data from many features to a small number, so that it can be properly visualized in two or three dimensions, is done in _______

1. supervised learning
2. unsupervised learning
3. k-Nearest Neighbors
4. Support vector machines

Question : If you are trying to predict or forecast a discrete target value, which is the correct option?

1. Supervised Learning regression algorithms
2. Supervised Learning classification algorithms
3. Unsupervised learning
4. Density estimation algorithm

Question : Select the option that lists all the correct statements below

1. If you're trying to predict or forecast a target value, then you need to look into supervised learning.
2. If you've chosen supervised learning, with discrete target value like Yes/No, 1/2/3, A/B/C, or Red/Yellow/Black, then look into classification.
3. If the target value can take on a number of values, say any value from 0.00 to 100.00, or -999 to 999, or -∞ to +∞, then you need to look into unsupervised learning
4. If you're not trying to predict a target value, then you need to look into unsupervised learning
5. Are you trying to fit your data into some discrete groups? If so and that's all you need, you should look into clustering.

1. 1,2,3,4,5
2. 2,3,4,5
3. 1,2,4,5
4. 1,2,3,5
5. 2,3,4,5
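
The decision questions in the statements above can be sketched as a small helper. This is only an illustration: the function name and parameters are hypothetical. It assumes the standard convention that a discrete target is handled by classification and a continuous target by regression. As a further assumption not stated above, it falls back to density estimation when no target is predicted and no discrete groups are wanted.

```python
def suggest_approach(predict_target: bool,
                     target_is_discrete: bool = False,
                     want_discrete_groups: bool = False) -> str:
    """Map the decision questions to a family of algorithms (illustrative only)."""
    if predict_target:
        # Predicting a target value means supervised learning;
        # the type of the target decides the sub-family.
        return "classification" if target_is_discrete else "regression"
    # No target value to predict means unsupervised learning.
    # Assumption: density estimation is the fallback when groups are not the goal.
    return "clustering" if want_discrete_groups else "density estimation"


print(suggest_approach(True, target_is_discrete=True))     # e.g. Yes/No target
print(suggest_approach(True, target_is_discrete=False))    # e.g. 0.00 to 100.00 target
print(suggest_approach(False, want_discrete_groups=True))  # grouping only
```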

Question : Select the correct sequence of steps for developing a machine learning application
A. Analyze the input data
B. Prepare the input data
C. Collect data
D. Train the algorithm
E. Test the algorithm
F. Use It

1. A,B,C,D,E,F
2. C,B,A,D,E,F
3. C,A,B,D,E,F
4. C,B,A,D,E,F
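
The six steps can be sketched end to end on toy data. The example below assumes one common ordering (collect, prepare, analyze, train, test, use), a hard-coded list of hypothetical "height,label" records in place of a real data source, and a deliberately simple threshold classifier. All names are illustrative.

```python
# Hypothetical toy records standing in for a real data source.
RAW = ["150,short", "155,short", "160,short", "180,tall", "185,tall", "190,tall",
       "152,short", "188,tall"]

def collect():
    """Collect data: here, just return the hard-coded records."""
    return list(RAW)

def prepare(rows):
    """Prepare the input: parse text into (float, label) pairs."""
    parsed = []
    for row in rows:
        value, label = row.split(",")
        parsed.append((float(value), label))
    return parsed

def analyze(data):
    """Analyze the input: basic sanity statistics."""
    values = [v for v, _ in data]
    return {"min": min(values), "max": max(values), "n": len(values)}

def train(data):
    """Train: place a threshold halfway between the two class means."""
    short = [v for v, lab in data if lab == "short"]
    tall = [v for v, lab in data if lab == "tall"]
    return (sum(short) / len(short) + sum(tall) / len(tall)) / 2

def predict(threshold, value):
    """Classify a single measurement against the learned threshold."""
    return "tall" if value >= threshold else "short"

def evaluate(threshold, data):
    """Test: fraction of held-out examples classified correctly."""
    correct = sum(1 for v, lab in data if predict(threshold, v) == lab)
    return correct / len(data)

data = prepare(collect())
summary = analyze(data)
train_set, test_set = data[:6], data[6:]   # hold out the last two records
threshold = train(train_set)
accuracy = evaluate(threshold, test_set)
print(summary, threshold, accuracy)
print(predict(threshold, 175.0))           # use it on a new measurement
```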

Question : Select the correct statements that apply to K-Nearest Neighbors

1. No assumptions about the data
2. Computationally expensive
3. Requires less memory
4. Works with numeric values

1. 1,2,3,4
2. 2,3,4
3. 1,3,4
4. 1,2,4
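
A minimal k-nearest-neighbors classifier makes these properties easier to judge. The sketch below is an illustration under stated assumptions, not a reference implementation: it uses Euclidean distance via `math.dist` (Python 3.8+), majority voting, and a hypothetical four-point training set.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Every prediction measures the distance to every stored training point,
    # so the cost of a single query grows with the size of the training set.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical numeric training data; the whole set must stay in memory.
points = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]

print(knn_predict(points, labels, (0.1, 0.2)))
print(knn_predict(points, labels, (0.9, 1.2)))
```

There is no training step that fits a model of the data: the function keeps all training points and computes a numeric distance to each of them at prediction time.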