Question : Students subscribe to training materials from an educational portal and then appear for the final exam. The portal provides three means of preparing for the exam: 1. Prepare using books 2. Prepare using recorded video trainings 3. Prepare using sample practice questions and study notes. You divide 90 students into three groups as below. Group-1: uses only books for exam preparation. Group-2: uses only recorded video trainings for exam preparation. Group-3: uses only practice questions for exam preparation. Which of the following hypothesis tests can you use in this scenario to compare their exam scores and find which exam preparation technique is more effective?
1. You will be using Student's t-test
2. You will be using Welch's t-test
3. You will be using the Wilcoxon rank-sum test
4. You will be using ANOVA
5. You would be applying three Student's t-tests, by creating three pairs
Correct Answer : 4 Explanation: Student's t-test, Welch's t-test and the Wilcoxon rank-sum test each compare only two groups. Hence, we can discard options 1, 2 and 3 as a single test for three groups. Option 5 is possible in principle: you can apply multiple t-tests by creating pairs, for example Group1 with Group2, Group2 with Group3 and Group1 with Group3, so a total of 3 t-tests can be applied. However, multiple t-tests do not work well on several populations for two reasons: 1. As the number of groups increases, the number of pairwise t-tests also increases. 2. As the number of t-tests increases, the probability of committing at least one Type I error also increases. Both issues are taken care of by ANOVA (Analysis of Variance), so Option-4 is the correct answer. ANOVA is a generalization of the hypothesis test for the difference of two population means. ANOVA tests whether any of the population means differs from the other population means. In ANOVA the null and alternate hypotheses are as follows. Null Hypothesis: all the population means are equal (μ1 = μ2 = μ3 = ... = μn). Alternate Hypothesis: at least one pair of population means is not equal (μi ≠ μj for some i, j). As with the t-test, each population is assumed to be normally distributed with the same variance.
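The two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exam scores are invented, the helper names `familywise_error` and `one_way_anova_f` are hypothetical, and the family-wise error formula treats the three pairwise tests as independent, which is an approximation because the pairs share groups. The sketch computes only the F statistic; a p-value would additionally need the F distribution (for example via `scipy.stats.f_oneway`).

```python
from itertools import combinations
from statistics import mean


def familywise_error(alpha, n_tests):
    """Chance of at least one Type I error across n_tests tests.

    Assumes the tests are independent (an approximation here).
    """
    return 1 - (1 - alpha) ** n_tests


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical exam scores, invented purely for illustration.
books = [62, 70, 68, 75, 66]
videos = [71, 78, 74, 80, 77]
practice = [80, 85, 79, 88, 83]

n_pairs = len(list(combinations(["books", "videos", "practice"], 2)))
print("pairwise t-tests needed:", n_pairs)
print("approx. family-wise error at alpha=0.05:",
      round(familywise_error(0.05, n_pairs), 4))

f_stat, df_b, df_w = one_way_anova_f([books, videos, practice])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With three groups there are three pairs, so the chance of at least one false positive at a 5% level rises to roughly 14%, while ANOVA answers the question with a single test.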
Question : Which of the following is true about clustering? A. It is a supervised learning technique B. It is an unsupervised learning technique C. It can be used to find hidden structure within unlabeled data D. Dividing employees into three groups based on their salary is an example of clustering
1. A,B 2. B,C 3. B,C,D 4. A,B,D 5. A,B,C,D
Correct Answer : 3 Explanation: Clustering is an unsupervised machine learning technique that groups data based on similarity in characteristics, without any pre-defined labels. It can help you find the hidden structure in unlabeled data. Unsupervised means that you do not apply any labels to the data in advance. For example, in a large company you can create three groups of all the employees based on their salary. Clustering is an exploratory data analysis technique, and you do not make any predictions with it. Major applications of clustering are in marketing, economics, and various branches of science.
Question : You are working in a data analytics company as a data scientist. You have been given a data set of various types of pizzas available across various premium food centers in a country. The data is given as numeric values such as calories, size, and sales per day. You need to group all the pizzas with similar properties. Which of the following techniques would you use for that?
1. Association Rules
2. Naive Bayes Classifier
3. K-means Clustering
4. Linear Regression
5. Grouping
Correct Answer : 3 Explanation: Using K-means clustering you can create groups of objects based on their properties, where K is the number of groups. For each group you determine the center of the group and then find how far each object's characteristics are from that center. An object becomes part of the group whose center it is nearest to. Suppose we have 100 objects and we need to determine 4 groups. Hence, here K=4. We determine 4 center values and, based on those center values, we determine the distance of each object from each center.
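The procedure described above can be sketched as follows. This is a minimal illustration, not a production implementation: the function name `kmeans`, the pizza values, and the starting centers are all hypothetical, and the starting centers are passed in by hand to keep the run deterministic (real implementations usually pick them randomly). The sketch also skips feature scaling, so the feature with the largest numeric range (calories here) dominates the distance.

```python
import math


def kmeans(points, initial_centers, max_iterations=100):
    """Group points around K centers; K is len(initial_centers).

    Returns (centers, labels), where labels[i] is the index of the
    center nearest to points[i].
    """
    centers = [tuple(c) for c in initial_centers]
    k = len(centers)
    labels = None

    for _ in range(max_iterations):
        # Assignment step: each object joins the nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no object changed group, so the result is stable
        labels = new_labels

        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a group is empty
                centers[c] = tuple(
                    sum(values) / len(members) for values in zip(*members)
                )

    return centers, labels


# Hypothetical pizzas: (calories, size in inches, sales per day).
pizzas = [
    (250, 8, 40), (260, 8, 45), (240, 9, 38),
    (600, 14, 15), (620, 14, 12), (580, 13, 18),
]

# K=2 here; the two starting centers are chosen by hand for illustration.
centers, labels = kmeans(pizzas, initial_centers=[pizzas[0], pizzas[3]])
for pizza, label in zip(pizzas, labels):
    print(pizza, "-> group", label)
```

For the 100-object, K=4 case from the explanation, the same function would be called with 100 points and four starting centers.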
1. Formula A and Formula B are about equally effective at promoting weight gain. 2. Formula A and Formula B are both effective at promoting weight gain. 4. Either Formula A or Formula B is effective at promoting weight gain.