
Dell EMC Data Science and BigData Certification Questions and Answers



Question : You have been given two populations, HEPop and HEPop, and you need to run a hypothesis test on this data to determine whether they are equal. However, you cannot assume that the data is normally distributed. Which
of the following tests would help?


1. Use Welch t-test

2. Use Student t-test

3. Use Teacher t-test

4. Use Wilcoxon rank sum test


Correct Answer : 4
Explanation: Tests for comparing two populations are either parametric or non-parametric. A parametric test, such as the Student or Welch t-test, makes assumptions about the population distributions from
which you take the samples. If the data cannot be assumed to follow a normal distribution, and cannot be transformed to follow one, then a non-parametric test can be used.
The Wilcoxon rank-sum test is a non-parametric hypothesis test that checks whether two populations are identically distributed. Because the Wilcoxon test does not assume a particular population distribution, it is generally
considered more robust than the t-test. In other words, there are fewer assumptions to violate.
If you can assume that the data is normally distributed, then you can use the Student or Welch t-test. "Teacher t-test" is not a real test.
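
To make the rank-sum idea concrete, here is a minimal sketch in Python using only the standard library. It is not taken from the exam material: the function names `average_ranks` and `rank_sum_test` and the two samples are made up for illustration, and the p-value uses the large-sample normal approximation without tie or continuity correction, so it should be treated as approximate.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (z, p_value). No tie or continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                           # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2               # expected rank sum under H0
    sd_w = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5  # std. deviation under H0
    z = (w - mean_w) / sd_w
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical skewed samples, each with one large outlier.
sample_a = [1.1, 1.3, 1.2, 9.8, 1.4, 1.0, 1.6, 1.5]
sample_b = [2.9, 3.4, 3.1, 2.7, 14.2, 3.3, 3.0, 3.6]
z, p = rank_sum_test(sample_a, sample_b)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Because the test works on ranks rather than raw values, the outliers 9.8 and 14.2 only count as "the two largest observations", which is why the test does not depend on normality.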





Question : You are conducting a hypothesis test and the null hypothesis is true, but you have rejected it. What type of error is this?


1. Type-I Error

2. Type-II Error

3. Type-III Error

4. Type-IV Error

5. There is no error


Correct Answer : 1
Explanation: The question states that the null hypothesis is true but was rejected anyway, so an error has occurred. Option 5 is therefore not correct.
The two standard error types are:
Type-I Error: You reject the null hypothesis even though it is true. Its probability is denoted by alpha.
Type-II Error: You fail to reject the null hypothesis even though it is false. Its probability is denoted by beta.

Hence option 1 is correct. In practice you calculate or fix the probability of committing each type of error. If the significance level alpha is 5% (that is, 0.05), then there is a 5% chance
that you will reject the null hypothesis even though it is true.
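
The meaning of alpha can be checked with a small simulation. The sketch below is illustrative only: the function name `type_one_error_rate` and its parameters are assumptions, and it uses a two-sided z-test with a known standard deviation of 1 rather than any specific test from the question. Every sample is drawn from a population where the null hypothesis is true, so each rejection is a Type-I error.

```python
import random
from statistics import NormalDist


def type_one_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis.

    Samples come from N(0, 1), so H0: mean = 0 holds by construction.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # known sigma = 1, so SE = 1 / sqrt(n)
        if abs(z) > z_crit:
            rejections += 1         # H0 is true, so this is a Type-I error
    return rejections / trials


rate = type_one_error_rate()
print(f"Estimated Type-I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen alpha, up to simulation noise.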





Question : You are conducting a hypothesis test for two populations, HEPop and HEPop. Which of the following statements are correct with regard to power and sample size?
A. The power of a test is the probability of correctly rejecting the null hypothesis
B. The power of a test is the probability of correctly accepting the null hypothesis
C. It is represented as (1-Probability of Type two Error)
D. Power of a test improves when the sample size increases.

1. A,B
2. A,C,D
3. A,B,C
4. B,C,D
5. A,B,C,D

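Correct Answer : 2
Explanation: In hypothesis testing, the power of a test is the probability of correctly rejecting the null hypothesis (statement A). It is denoted by 1-Beta, where Beta is the probability of a Type-II error (statement C). As
your sample size increases, the power also increases (statement D). Power can therefore be used to determine the required sample size.
The power of a hypothesis test also depends on the true difference between the population means.
Statement B is incorrect because power concerns rejecting a false null hypothesis, not accepting a true one. That rules out every option containing B, including option 5.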
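
As a rough illustration of how power relates to sample size and to the true difference, the sketch below computes the power of a two-sided, two-sample z-test analytically. This is a simplification chosen for clarity, not the method from the exam material: it assumes both populations share a known standard deviation `sigma` and the same sample size `n_per_group`, and the function name `two_sample_z_power` is hypothetical.

```python
from statistics import NormalDist


def two_sample_z_power(diff, sigma, n_per_group, alpha=0.05):
    """Power (1 - beta) of a two-sided two-sample z-test.

    diff        : true difference between the population means
    sigma       : common, known standard deviation (an assumption)
    n_per_group : sample size drawn from each population
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    std_error = sigma * (2 / n_per_group) ** 0.5
    shift = diff / std_error  # true difference in standard-error units
    # Probability that the test statistic falls in either rejection region.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


for n in (10, 30, 100):
    power = two_sample_z_power(diff=0.5, sigma=1.0, n_per_group=n)
    print(f"n per group = {n:3d}, power = {power:.3f}")
```

With the difference held fixed, power rises as `n_per_group` grows; with `diff=0` the function returns alpha, because every rejection is then a Type-I error.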



Related Questions


Question : Your colleague, who is new to Hadoop, approaches you with a question. They want to know how
best to access their data. This colleague has previously worked extensively with SQL and
databases.
Which query interface would you recommend?


1. Flume
2. Pig
4. HBase


Question : In linear regression, what indicates that an estimated coefficient is significantly different from zero?

1. R-squared near 1
2. R-squared near 0
4. A small p-value





Question : Which graphical representation shows the distribution and multiple summary statistics of a
continuous variable for each value of a corresponding discrete variable?

1. box and whisker plot
2. dotplot
4. binplot



Question : Assume that you have a data frame in R. Which function would you use to display descriptive
statistics about this variable?

1. levels
2. attributes
4. summary




Question : What is the mandatory clause that must be included when using window functions?
1. OVER
2. RANK
4. RANK BY




Question : What is the purpose of the process step "parsing" in text analysis?
1. computes the TF-IDF values for all keywords and indices
2. executes the clustering and classification to organize the contents
4. imposes a structure on the unstructured/semi-structured text for downstream analysis