
Dell EMC Data Science Associate Certification Questions and Answers (Dumps and Practice Questions)



Question : Which word or phrase completes the statement? Unix is to bash as Hadoop is to:


1. NameNode
2. Sqoop
3. HDFS
4. Flume
5. Pig


Correct Answer : 5

Explanation: Bash is the shell through which users issue commands to Unix; Pig plays a
comparable role for Hadoop. Apache Pig consists of a data flow language, Pig Latin, and an
environment to execute the Pig code. The main benefit of using Pig is to utilize the power
of MapReduce in a distributed system, while simplifying the tasks of developing and
executing a MapReduce job. In most cases, it is transparent to the user that a MapReduce
job is running in the background when Pig commands are executed. This abstraction layer on
top of Hadoop simplifies the development of code against data in HDFS and makes MapReduce
more accessible to a larger audience.

With Apache Hadoop and Pig already installed, the basics of using Pig include entering
the Pig execution environment by typing pig at the command prompt and then entering a
sequence of Pig instruction lines at the grunt prompt.
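
To make the abstraction concrete, here is a minimal pure-Python sketch of the map, shuffle, and reduce steps that a simple group-and-count would involve. It is not Pig or Hadoop code. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are assumptions made up for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, mimicking the shuffle step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Chain the three phases into one job."""
    return reduce_phase(shuffle(map_phase(lines)))


# Hypothetical input lines, standing in for a file stored in HDFS.
sample_lines = ["pig runs on hadoop", "hadoop stores data in hdfs"]
counts = word_count(sample_lines)
print(counts["hadoop"])
```

A tool such as Pig lets the user express the same grouping and counting in a few high-level statements instead of writing each phase by hand.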





Question : A call center for a large electronics company handles an average of , support calls a day.
The head of the call center would like to optimize the staffing of the call center during the rollout of
a new product due to recent customer complaints of long wait times. You have been asked to
create a model to optimize call center costs and customer wait times.
The goals for this project include:
1. Relative to the release of a product, how does the call volume change over time?
2. How to best optimize staffing based on the call volume for the newly released product, relative
to old products.
3. Historically, what time of day does the call center need to be most heavily staffed?
4. Determine the frequency of calls by both product type and customer language.
Which goals are suitable to be completed with MapReduce?


1. Goal 2 and 4
2. Goal 1 and 3
3. Goals 1, 2, 3, 4
4. Goals 2, 3, 4
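
As an illustration of the kind of computation involved, goal 4 is a grouped count, which has the same shape as a map step followed by a reduce step. The sketch below is plain Python, not Hadoop code. The call records, the two-way partitioning, and the function names are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical call records: (product_type, customer_language).
calls = [
    ("phone", "en"), ("tablet", "es"), ("phone", "en"),
    ("phone", "es"), ("tablet", "es"), ("laptop", "en"),
]


def count_partition(records):
    """Map side: count calls per (product, language) key within one partition."""
    return Counter(records)


def merge_counts(left, right):
    """Reduce side: merge two partial counts by adding them key by key."""
    return left + right


# Split the records into two partitions, as if they were stored on two nodes.
partitions = [calls[:3], calls[3:]]
partial_counts = [count_partition(part) for part in partitions]
frequency = reduce(merge_counts, partial_counts, Counter())

print(frequency[("phone", "en")])
```

Because each partition is counted independently and the partial results are simply added, the work can be spread across many machines.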







Question : Consider the example of an analysis for fraud detection on credit card usage. You will need to
ensure higher-risk transactions that may indicate fraudulent credit card activity are retained in your
data for analysis, and not dropped as outliers during pre-processing. What will be your approach
for loading data into the analytical sandbox for this analysis?


1. ETL
2. ELT
3. EDW
4. OLTP


Correct Answer : 2

Explanation: Phase 2 (data preparation) requires the presence of an analytic sandbox,
in which the team can work with data and perform analytics for the duration of the
project. The team needs to execute extract, load, and transform (ELT) or extract,
transform, and load (ETL) to get data into the sandbox. ELT and ETL are
sometimes abbreviated together as ETLT. Data should be transformed in the ETLT process so
the team can work with it and analyze it. Because ELT loads the raw data before
transforming it, unusual records such as high-risk transactions remain available in the
sandbox rather than being filtered out beforehand. In this phase, the team also needs to
familiarize itself with the data thoroughly and take steps to condition the data.
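
The difference between the two loading orders can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real pipeline. The transaction amounts, the 1.5 standard deviation cleaning rule, and the function names etl_load and elt_load are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical transaction amounts; the last value is an unusually large charge.
raw_transactions = [20.0, 35.5, 18.25, 42.0, 27.8, 5000.0]


def etl_load(transactions, z_cutoff=1.5):
    """ETL-style: apply a cleaning rule before loading.

    Values more than z_cutoff standard deviations from the mean are dropped,
    so they never reach the sandbox.
    """
    mu, sigma = mean(transactions), stdev(transactions)
    return [t for t in transactions if abs(t - mu) <= z_cutoff * sigma]


def elt_load(transactions):
    """ELT-style: load everything as-is; transformation happens later, in the sandbox."""
    return list(transactions)


sandbox_etl = etl_load(raw_transactions)
sandbox_elt = elt_load(raw_transactions)
print(5000.0 in sandbox_etl, 5000.0 in sandbox_elt)
```

In this toy data the large charge is exactly the record a fraud analyst would want to inspect, and only the ELT-style load keeps it.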




Related Questions


Question : In linear regression modeling, which action can be taken to improve the linearity of the relationship
between the dependent and independent variables?

1. Apply a transformation to a variable
2. Use a different statistical package
4. Change the units of measurement on the independent variable
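
A small numeric sketch shows how a variable transformation can change linearity. It assumes synthetic data in which y grows exponentially with x. The data, the log transform, and the helper pearson_r are illustrative choices, not part of the question.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, a measure of linear association."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Synthetic data: y = 2 * exp(0.5 * x), which is curved in x.
xs = list(range(1, 11))
ys = [2.0 * math.exp(0.5 * x) for x in xs]

r_raw = pearson_r(xs, ys)                          # x against y
r_log = pearson_r(xs, [math.log(y) for y in ys])   # x against log(y)
print(round(r_raw, 3), round(r_log, 3))
```

Since log(y) = log(2) + 0.5x for this data, the transformed relationship is a straight line, and the correlation with log(y) is higher than with y itself.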




Question : Data visualization is used in the final presentation of an analytics project. For what else is this
technique commonly used?

1. ETLT
2. Descriptive statistics
4. Model selection




Question : You have been assigned to do a study of the daily revenue effect of a pricing model of online
transactions. All the data currently available to you has been loaded into your analytics database;
revenue data, pricing data, and online transaction data. You find that all the data comes in
different levels of granularity. The transaction data has timestamps (day, hour, minutes, seconds),
pricing is stored at the daily level, and revenue data is only reported monthly. What is your next
step?


1. Interpolate a daily model for revenue from the monthly revenue data.
2. Aggregate all data to the monthly level in order to create a monthly revenue model.
4. Disregard revenue as a driver in the pricing model, and create a daily model based on pricing
and transactions only.
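
To make the granularity mismatch concrete, the sketch below rolls second-level transaction records up to the daily and monthly levels. The records, the timestamp format, and the helper roll_up are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transactions: (timestamp, amount).
transactions = [
    ("2023-03-01 09:15:02", 20.0),
    ("2023-03-01 17:40:55", 5.0),
    ("2023-03-02 08:05:10", 12.5),
    ("2023-04-01 11:00:00", 30.0),
]


def roll_up(records, period_format):
    """Sum amounts per period; period_format is a strftime pattern naming the period."""
    totals = defaultdict(float)
    for stamp, amount in records:
        parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        totals[parsed.strftime(period_format)] += amount
    return dict(totals)


daily = roll_up(transactions, "%Y-%m-%d")
monthly = roll_up(transactions, "%Y-%m")
print(daily["2023-03-01"], monthly["2023-03"])
```

Fine-grained data can always be aggregated to a coarser level this way. Going in the other direction, from monthly totals to daily values, requires assumptions about how the total is distributed.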




Question : Which SQL OLAP extension provides all possible grouping combinations?

1. ROLLUP
2. UNION ALL
4. CROSS JOIN
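
In standard SQL, GROUP BY CUBE produces subtotals for every combination of the grouping columns, whereas ROLLUP produces only the hierarchical, left-to-right prefixes. The Python sketch below enumerates the grouping sets each one covers. It does not run any SQL, and the column names are hypothetical.

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Every subset of the columns: the grouping sets covered by CUBE."""
    return [
        combo
        for size in range(len(columns), -1, -1)
        for combo in combinations(columns, size)
    ]


def rollup_grouping_sets(columns):
    """Only the left-to-right prefixes: the grouping sets covered by ROLLUP."""
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]


columns = ["region", "product", "year"]  # hypothetical grouping columns
cube_sets = cube_grouping_sets(columns)
rollup_sets = rollup_grouping_sets(columns)
print(len(cube_sets), len(rollup_sets))
```

For n grouping columns, CUBE covers 2^n grouping sets and ROLLUP covers n + 1, including the empty set that stands for the grand total.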




Question : What is the primary bottleneck in text classification?

1. The ability to parse unstructured text data.
2. The availability of tagged training data.
4. The fact that text corpora are dynamic.





Question : Which characteristic applies only to Business Intelligence as opposed to Data Science?


1. Uses only structured data
2. Supports solving "what if" scenarios
4. Uses predictive modeling techniques