
Dell EMC Data Science and BigData Certification Questions and Answers



Question : Refer to exhibit

You are asked to write a report on how specific variables impact your client's sales using a data
set provided to you by the client. The data includes 15 variables that the client views as directly
related to sales, and you are restricted to these variables only.
After a preliminary analysis of the data, the following findings were made:
1. Multicollinearity is not an issue among the variables
2. Only three variables (A, B, and C) have significant correlation with sales
You build a linear regression model on the dependent variable of sales with the independent
variables of A, B, and C. The results of the regression are seen in the exhibit.
You cannot request additional data. What is a way that you could try to increase the R² of the
model without artificially inflating it?

1. Create clusters based on the data and use them as model inputs
2. Force all 15 variables into the model as independent variables
4. Break variables A, B, and C into their own univariate models


Correct Answer :

Explanation: In statistics, linear regression is an approach for modeling the relationship between a scalar dependent variable y and one or more explanatory (independent) variables, denoted X. The case of one explanatory variable is called simple linear regression; with more than one, the process is called multiple linear regression. (This term should be distinguished from multivariate linear regression, where multiple correlated dependent variables are predicted rather than a single scalar variable.)
In linear regression, data are modeled using linear predictor functions, and the unknown model parameters are estimated from the data. Such models are called linear models. Most commonly, linear regression refers to a model in which the conditional mean of y given the value of X is an affine function of X. Less commonly, it refers to a model in which the median, or some other quantile, of the conditional distribution of y given X is expressed as a linear function of X. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of y given X rather than on the joint probability distribution of y and X, which is the domain of multivariate analysis.
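
To make the role of R² in the question concrete, here is a minimal sketch in plain Python of a simple (one-variable) least-squares fit and its R². The data values and the names fit_simple_ols, r_squared, adjusted_r_squared, spend, and sales are hypothetical and do not come from the exhibit. Plain R² never decreases when more predictors are added to a least-squares model, which is why the sketch also computes adjusted R², a common way to check whether a gain is more than artificial inflation.

```python
def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(ys, predictions):
    """Share of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """R-squared penalized for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical data: one predictor (spend) and the dependent variable (sales).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(spend, sales)
predicted = [intercept + slope * x for x in spend]
r2 = r_squared(sales, predicted)
adj_r2 = adjusted_r_squared(r2, n_obs=len(sales), n_predictors=1)
print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```

With more than one predictor the same R² and adjusted R² formulas apply; only the fitting step changes.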






Question : You have two tables of customers in your database. Customers in cust_table_ were sent an email
promotion last year, and customers in cust_table_2 received a newsletter last year.
Each customer can be entered only once per table. You want to create a table that includes all
customers, and any of the communications they received last year. Which type of join would you
use for this table?


1. Full outer join
2. Inner join
4. Cross join


Correct Answer : 1 (Full outer join)

Explanation: The FULL OUTER JOIN keyword returns all rows from the left table (table1) and from the right table (table2).

The FULL OUTER JOIN keyword combines the result of both LEFT and RIGHT joins.

SQL FULL OUTER JOIN Syntax
SELECT column_name(s)
FROM table1
FULL OUTER JOIN table2
ON table1.column_name=table2.column_name;
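
The same semantics can be sketched in plain Python to show what the result looks like for the two customer tables. This is only an illustration under stated assumptions: the function full_outer_join, the customer IDs, and the variable names email_promo and newsletter are hypothetical. Each table is modeled as a dict keyed by customer ID, which matches the rule that a customer appears at most once per table.

```python
def full_outer_join(left, right):
    """Full outer join of two dicts on their keys.

    Returns one (key, left_value, right_value) row per key found in
    either input; a side with no matching key is filled with None,
    mirroring the NULLs a SQL FULL OUTER JOIN would produce.
    """
    rows = []
    for key in sorted(set(left) | set(right)):
        rows.append((key, left.get(key), right.get(key)))
    return rows


# Hypothetical customers: 101 got only the email promotion,
# 103 got only the newsletter, 102 received both.
email_promo = {101: "email promotion", 102: "email promotion"}
newsletter = {102: "newsletter", 103: "newsletter"}

joined = full_outer_join(email_promo, newsletter)
for row in joined:
    print(row)
```

An inner join would keep only customer 102, and a cross join would pair every row with every other row, so neither yields one row per customer with all of that customer's communications.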






Question : In which lifecycle stage are initial hypotheses formed?


1. Model planning
2. Discovery
4. Data preparation


Correct Answer : 2 (Discovery)

Explanation: Phase 1 - Discovery: In Phase 1, the team learns the business domain, including
relevant history such as whether the organization or business unit has attempted
similar projects in the past from which they can learn. The team assesses the
resources available to support the project in terms of people, technology, time, and
data. Important activities in this phase include framing the business problem as an
analytics challenge that can be addressed in subsequent phases and formulating initial
hypotheses (IHs) to test and begin learning the data.




Related Questions


Question : Logistic regression is a model used for prediction of the probability of occurrence of an event.
It makes use of several variables that may be ___________
1. Numerical
2. Categorical
4. Neither 1 nor 2 is correct




Question : Select the correct statements regarding naive Bayes classification

1. It requires only a small amount of training data to estimate the parameters
2. The variables can be assumed to be independent
3. Access Mostly Uused Products by 50000+ Subscribers
4. For each class, the entire covariance matrix needs to be determined

1. 1,2,3
2. 2,3,4
4. 2,3,4



Question : Spam filtering of emails is an example of
1. Supervised learning
2. Unsupervised learning
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1 and 3 are correct
5. 2 and 3 are correct


Question : In which of the following scenarios can we use naive Bayes for classification?
1. Classify whether a given person is a male or a female based on the measured features. The features include height, weight, and foot size.
2. To classify whether an email is spam or not spam
3. Access Mostly Uused Products by 50000+ Subscribers
4. All of 1, 2, and 3
5. None of the above



Question : Select the choice where regression algorithms are not the best fit
1. When the dimension of the object is given
2. When the weight of the person is given
4. Employee status





Question : Logistic regression does not work well for binary classification
1. True
2. False