
Dell EMC Data Science Associate Certification Questions and Answers (Dumps and Practice Questions)



Question : What is an appropriate data visualization to use in a presentation for a project sponsor?

 :
1. Box and Whisker plot
2. Pie chart
3. Access Mostly Uused Products by 50000+ Subscribers
4. Density plot

Correct Answer : Get Lastest Questions and Answer :

Explanation: Project Sponsor: Responsible for the genesis of the project. Provides the impetus and requirements for the project and defines the core business problem. Generally provides the funding and gauges the degree of value from the final outputs of the working team. This person sets the priorities for the project and clarifies the desired outputs. Because the presentation is often circulated within an organization, it is critical to articulate the results properly and position the findings in a way that is appropriate for the audience. Presentation for project sponsors: This contains high-level takeaways for executive-level stakeholders, with a few key messages to aid their decision-making process. Focus on clean, easy visuals for the presenter to explain and for the viewer to grasp.







Question : In a Student's t-test, what is the meaning of the p-value?

 :
1. it is the "power" of the Student's t-test
2. it is the mean of the distribution for the null hypothesis
3. Access Mostly Uused Products by 50000+ Subscribers
4. it is the area under the appropriate tails of the Student's distribution


Correct Answer : 4
Explanation: The P value is used all over statistics, from t-tests to regression analysis. Everyone knows that you use P values to determine statistical significance in a hypothesis test. In fact, P values often determine what studies get published and what projects get funding.

Despite being so important, the P value is a slippery concept that people often interpret incorrectly. How do you interpret P values?

In this post, I'll help you to understand P values in a more intuitive way and to avoid a very common misinterpretation that can cost you money and credibility.
P values evaluate how well the sample data support the devil's advocate argument that the null hypothesis is true. It measures how compatible your data are with the null hypothesis. How likely is the effect observed in your sample data if the null hypothesis is true?

High P values: your data are likely with a true null.
Low P values: your data are unlikely with a true null.
A low P value suggests that your sample provides enough evidence that you can reject the null hypothesis for the entire population. In technical terms, a P value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis.

For example, suppose that a vaccine study produced a P value of 0.04. This P value indicates that if the vaccine had no effect, you'd obtain the observed difference or more in 4% of studies due to random sampling error.

P values address only one question: how likely are your data, assuming a true null hypothesis? They do not measure support for the alternative hypothesis. This limitation leads us into the next section to cover a very common misinterpretation of P values.

P Values Are NOT the Probability of Making a Mistake

Incorrect interpretations of P values are very common. The most common mistake is to interpret a P value as the probability of making a mistake by rejecting a true null hypothesis (a Type I error).

There are several reasons why P values can't be the error rate. First, P values are calculated based on the assumptions that the null is true for the population and that the difference in the sample is caused entirely by random chance. Consequently, P values can't tell you the probability that the null is true or false because it is 100% true from the perspective of the calculations. Second, while a low P value indicates that your data are unlikely assuming a true null, it can't evaluate which of two competing cases is more likely:
The null is true but your sample was unusual.
The null is false.
Determining which case is more likely requires subject area knowledge and replicate studies.
Let's go back to the vaccine study and compare the correct and incorrect way to interpret the P value of 0.04:
Correct: Assuming that the vaccine had no effect, you'd obtain the observed difference or more in 4% of studies due to random sampling error.
Incorrect: If you reject the null hypothesis, there's a 4% chance that you're making a mistake.
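
As a rough illustration of the "area under the tails" reading of the P value, the sketch below integrates the Student's t density numerically and returns the two-sided tail area beyond an observed t statistic. The function names and the choice of Simpson's rule are assumptions made for this example; in practice a statistics library would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work in log space so large df values do not overflow the gamma function.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=20_000):
    """Area in both tails beyond |t_stat| (steps must be even for Simpson's rule)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3      # area between 0 and |t_stat|
    return 1.0 - 2.0 * central_half   # what is left lies in the two tails


if __name__ == "__main__":
    # Hypothetical test result: t = 2.2 with 20 degrees of freedom.
    print(round(two_sided_p_value(2.2, 20), 4))
```

The returned value is the probability of a t statistic at least this extreme when the null hypothesis is true, which is the quantity the explanation above describes; it is not the probability that rejecting the null is a mistake.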




Question : In addition to less data movement and the ability to use larger datasets in calculations, what is a
benefit of analytical calculations in a database?

 :
1. improved connections between disparate data sources
2. more efficient handling of categorical values
3. Access Mostly Uused Products by 50000+ Subscribers
4. full use of data aggregation functionality



Correct Answer : Get Lastest Questions and Answer :

Explanation: Online Analytical Processing (OLAP) databases facilitate business-intelligence queries. OLAP is a database technology that has been optimized for querying and reporting, instead of processing transactions. The source data for OLAP is Online Transactional Processing (OLTP) databases that are commonly stored in data warehouses. OLAP data is derived from this historical data, and aggregated into structures that permit sophisticated analysis. OLAP data is also organized hierarchically and stored in cubes instead of tables. It is a sophisticated technology that uses multidimensional structures to provide rapid access to data for analysis. This organization makes it easy for a PivotTable report or PivotChart report to display high-level summaries, such as sales totals across an entire country or region, and also display the details for sites where sales are particularly strong or weak.

OLAP databases are designed to speed up the retrieval of data. Because the OLAP server, rather than Microsoft Office Excel, computes the summarized values, less data needs to be sent to Excel when you create or change a report. This approach enables you to work with much larger amounts of source data than you could if the data were organized in a traditional database, where Excel retrieves all of the individual records and then calculates the summarized values.

OLAP databases contain two basic types of data: measures, which are numeric data, the quantities and averages that you use to make informed business decisions, and dimensions, which are the categories that you use to organize these measures. OLAP databases help organize data by many levels of detail, using the same categories that you are familiar with to analyze the data.

The following sections describe each of these components in more detail:
Cube: A data structure that aggregates the measures by the levels and hierarchies of each of the dimensions that you want to analyze. Cubes combine several dimensions, such as time, geography, and product lines, with summarized data, such as sales or inventory figures. Cubes are not "cubes" in the strictly mathematical sense because they do not necessarily have equal sides. However, they are an apt metaphor for a complex concept.
Measure: A set of values in a cube that are based on a column in the cube's fact table and that are usually numeric values. Measures are the central values in the cube that are preprocessed, aggregated, and analyzed. Common examples include sales, profits, revenues, and costs.
Member: An item in a hierarchy representing one or more occurrences of data. A member can be either unique or nonunique. For example, 2007 and 2008 represent unique members in the year level of a time dimension, whereas January represents nonunique members in the month level because there can be more than one January in the time dimension if it contains data for more than one year.
Calculated member: A member of a dimension whose value is calculated at run time by using an expression. Calculated member values may be derived from other members' values. For example, a calculated member, Profit, can be determined by subtracting the value of the member, Costs, from the value of the member, Sales.
Dimension: A set of one or more organized hierarchies of levels in a cube that a user understands and uses as the base for data analysis. For example, a geography dimension might include levels for Country/Region, State/Province, and City. Or, a time dimension might include a hierarchy with levels for year, quarter, month, and day. In a PivotTable report or PivotChart report, each hierarchy becomes a set of fields that you can expand and collapse to reveal lower or higher levels.
Hierarchy: A logical tree structure that organizes the members of a dimension such that each member has one parent member and zero or more child members. A child is a member in the next lower level in a hierarchy that is directly related to the current member. For example, in a Time hierarchy containing the levels Quarter, Month, and Day, January is a child of Qtr1. A parent is a member in the next higher level in a hierarchy that is directly related to the current member. The parent value is usually a consolidation of the values of all of its children. For example, in a Time hierarchy that contains the levels Quarter, Month, and Day, Qtr1 is the parent of January.
Level: Within a hierarchy, data can be organized into lower and higher levels of detail, such as Year, Quarter, Month, and Day levels in a Time hierarchy.
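
The cube vocabulary above can be made concrete with a small sketch. The fact rows, the `roll_up` helper, and all figures below are hypothetical and are not taken from any OLAP product; the example only shows measures (sales, costs) being aggregated along dimension levels (year, quarter), plus a calculated member (profit) derived at query time.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, sales, costs)
FACTS = [
    (2007, "Qtr1", "North", 120.0, 80.0),
    (2007, "Qtr1", "South", 90.0, 70.0),
    (2007, "Qtr2", "North", 150.0, 95.0),
    (2008, "Qtr1", "North", 130.0, 85.0),
]


def roll_up(facts, key):
    """Aggregate the sales and costs measures by the dimension levels chosen by key()."""
    totals = defaultdict(lambda: {"sales": 0.0, "costs": 0.0})
    for year, quarter, region, sales, costs in facts:
        cell = totals[key(year, quarter, region)]
        cell["sales"] += sales
        cell["costs"] += costs
    # Calculated member: Profit is derived from Sales and Costs, not stored.
    return {k: {**v, "profit": v["sales"] - v["costs"]} for k, v in totals.items()}


# Higher level of the time hierarchy (Year) and a lower one (Year, Quarter).
by_year = roll_up(FACTS, lambda y, q, r: y)
by_year_quarter = roll_up(FACTS, lambda y, q, r: (y, q))

if __name__ == "__main__":
    print(by_year)
    print(by_year_quarter)
```

Here each yearly total is a consolidation of its quarterly children, which mirrors the parent and child relationship described for hierarchies.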




Related Questions


Question :

Digit recognition is an example of ___________

 :
1. Classification
2. Clustering
3. Access Mostly Uused Products by 50000+ Subscribers
4. None of the above




Question : Clustering is a type of unsupervised learning with the following goals


 :
1. Maximize a utility function
2. Find similarities in the training data
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1 and 2
5. 2 and 3



Question : Of all the smokers in a particular district, % prefer brand A and % prefer brand B.
Of those smokers who prefer brand A, 30% are females, and of those who prefer brand B, 40% are female.
What is the probability that a randomly selected smoker prefers brand A, given that the person selected is a female?

Which of the following is the best way to solve this problem?
 :
1. Bayes' Theorem
2. Poisson Distribution
3. Access Mostly Uused Products by 50000+ Subscribers
4. None of the above
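
A conditional probability of this kind can be sketched with Bayes' theorem as follows. The brand shares are missing from the question text, so the values 0.7 and 0.3 used below are purely HYPOTHETICAL placeholders; only the 30% and 40% female proportions come from the question. The function name is also an assumption made for this example.

```python
def posterior_brand_a(p_a, p_b, p_female_given_a, p_female_given_b):
    """P(prefers A | female) via Bayes' theorem for two exclusive brands."""
    joint_a = p_female_given_a * p_a   # P(female and prefers A)
    joint_b = p_female_given_b * p_b   # P(female and prefers B)
    return joint_a / (joint_a + joint_b)


# HYPOTHETICAL brand shares (not given in the source): 70% prefer A, 30% prefer B.
example = posterior_brand_a(p_a=0.7, p_b=0.3,
                            p_female_given_a=0.30, p_female_given_b=0.40)

if __name__ == "__main__":
    print(round(example, 4))
```

The denominator is the total probability that a randomly selected smoker is female, so the result changes with whatever brand shares the real question specifies.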




Question :
In a certain hospital there are two surgeons. Surgeon A operates on 100 patients, and 95 survive.
Surgeon B operates on 80 patients and 72 survive.
We are considering having surgery performed in this hospital and living through the operation is something
that is important. We want to choose the better of the two surgeons.
We did some further research into the data and found that originally the hospital had considered
two different types of surgeries, but then lumped all of the data together to report on each of its surgeons.
Not all surgeries are equal; some were considered high-risk emergency surgeries, while others were
of a more routine nature that had been scheduled in advance.

Of the 100 patients that surgeon A treated, 50 were high risk, of which three died. The other 50
were considered routine, and of these 2 died.
Now we look more carefully at the data for surgeon B and find that of 80 patients, 40 were high risk,
of which seven died. The other 40 were routine and only one died.
Now select the statement that is true about the above scenario.


 :
1. If your surgery is to be a routine one, then surgeon B is actually the better surgeon
2. If your surgery is to be a routine one, then surgeon A is actually the better surgeon
3. Access Mostly Uused Products by 50000+ Subscribers
4. Data is not sufficient
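
The arithmetic behind this scenario can be checked with a short sketch. The counts are the ones stated in the question; the data layout and function names are assumptions made for this example.

```python
# (patients, deaths) per surgeon and surgery type, as stated in the question
DATA = {
    "A": {"high_risk": (50, 3), "routine": (50, 2)},
    "B": {"high_risk": (40, 7), "routine": (40, 1)},
}


def survival_rate(patients, deaths):
    """Fraction of patients who survived."""
    return (patients - deaths) / patients


def overall_rate(groups):
    """Survival rate after lumping all surgery types together."""
    patients = sum(p for p, _ in groups.values())
    deaths = sum(d for _, d in groups.values())
    return survival_rate(patients, deaths)


if __name__ == "__main__":
    for surgeon, groups in DATA.items():
        print(surgeon, "overall", overall_rate(groups))
        for kind, (patients, deaths) in groups.items():
            print(surgeon, kind, survival_rate(patients, deaths))
```

Lumped together, surgeon A has the higher survival rate (95% against 90%), yet for routine surgeries alone surgeon B's rate is higher (97.5% against 96%), which is why the type of surgery matters when comparing the two.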


Question :

You are a doctor in charge of a large hospital, and you have to decide which treatment should be used for a particular disease.
You have the following data from last month: there were 390 patients with the disease. Treatment A was given to 160 patients of
whom 100 were men and 60 were women; 20 of the men and 40 of the women recovered. Treatment B was given to 230 patients of
whom 210 were men and 20 were women; 50 of the men and 15 of the women recovered. Which treatment would you recommend
we use for people with the disease in future?
 :
1. Treatment A, which seemed better in the overall data, was worse for both men and women when considered separately.
2. Treatment B, which seemed better in the overall data, was worse for both men and women when considered separately.
3. Access Mostly Uused Products by 50000+ Subscribers
4. We can safely give everyone treatment B
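
The comparison in this question can be worked through with a small sketch that contrasts the pooled recovery rate with the rate inside each group. The counts come from the question text; the data layout and helper names are assumptions made for this example.

```python
# (recovered, treated) per treatment and sex, as stated in the question
TRIAL = {
    "A": {"men": (20, 100), "women": (40, 60)},
    "B": {"men": (50, 210), "women": (15, 20)},
}


def rate(recovered, treated):
    """Recovery rate for one group."""
    return recovered / treated


def pooled(groups):
    """Recovery rate when men and women are combined."""
    recovered = sum(r for r, _ in groups.values())
    treated = sum(n for _, n in groups.values())
    return rate(recovered, treated)


def better_treatment_by_group(trial):
    """Treatment with the higher recovery rate within each group."""
    return {
        group: max(trial, key=lambda t: rate(*trial[t][group]))
        for group in ("men", "women")
    }


better_overall = max(TRIAL, key=lambda t: pooled(TRIAL[t]))
better_by_group = better_treatment_by_group(TRIAL)

if __name__ == "__main__":
    print("pooled:", {t: round(pooled(g), 3) for t, g in TRIAL.items()})
    print("better overall:", better_overall)
    print("better within each group:", better_by_group)
```

The pooled rates favour treatment A (37.5% against about 28.3%), while both the men's and the women's rates favour treatment B. The reversal arises because treatment A was given mostly to the group with the higher recovery rate.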




Question :

Select the correct statement for AUC which is a commonly used evaluation method in measuring the accuracy and quality of a recommender system
 :
1. It is a commonly used evaluation method for binary choice problems
2. It involves classifying an instance as either positive or negative
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1 and 2 only
5. All 1,2 and 3
Ans :4
Exp : AUC is a commonly used evaluation method for binary choice problems, which involve classifying an instance as either positive or negative. Its main advantages over other evaluation methods, such as the simpler misclassification error, are:
1. It's insensitive to unbalanced datasets (datasets that have more installeds than not-installeds or vice versa).
2. For other evaluation methods, a user has to choose a cut-off point above which the target variable is part of the positive class (e.g. a logistic regression model returns any real number between 0 and 1 - the modeler might decide that predictions greater than 0.5 mean a positive class prediction while a prediction of less than 0.5 mean a negative class prediction). AUC evaluates entries at all cut-off points, giving better insight into how well the classifier is able to separate the two classes.
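
To illustrate why AUC needs no cut-off point, here is a minimal sketch that computes it as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. This pairwise formulation is one standard way to compute AUC; the function name and the sample scores are assumptions made for this example.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical classifier scores for four instances.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Because only the ordering of the scores matters, no threshold such as 0.5 appears anywhere in the calculation.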





Question : You have created a recommender system for the QuickTechie.com website that recommends software professionals
based on parameters such as technologies, location, and companies. You now suspect that the model is not
giving proper recommendations, because Rahul, who works on Hadoop in Mumbai, and John from France, who works on a UI application created in Flash,
are recommended as similar professionals, which is not correct. Select the option that would help measure the accuracy and quality of the recommender system you created for QuickTechie.com.


 :
1. Cluster Density
2. Support Vector Count
3. Access Mostly Uused Products by 50000+ Subscribers
4. Sum of Absolute Errors

Ans : 3
Exp : AUC is a commonly used evaluation method for binary choice problems, which involve classifying an instance as either positive or negative. Its main advantages over other evaluation methods, such as the simpler misclassification error, are:
1. It's insensitive to unbalanced datasets (datasets that have more installeds than not-installeds or vice versa).
2. For other evaluation methods, a user has to choose a cut-off point above which the target variable is part of the positive class (e.g. a logistic regression model returns any real number between 0 and 1 - the modeler might decide that predictions greater than 0.5 mean a positive class prediction while a prediction of less than 0.5 mean a negative class prediction). AUC evaluates entries at all cut-off points, giving better insight into how well the classifier is able to separate the two classes.

The MAE measures the average magnitude of the errors in a set of forecasts, without considering their direction. It measures accuracy for continuous variables. The equation is given in the library references. Expressed in words, the MAE is the average over the verification sample of the absolute values of the differences between forecast and the corresponding observation. The MAE is a linear score which means that all the individual differences are weighted equally in the average.

The sum of absolute errors is a valid metric, but doesn't give any useful sense of how the recommender system is performing.
Support vector count and cluster density do not apply to recommender systems.
MAE and AUC are both valid and useful metrics for measuring recommender systems.
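
The difference between the MAE and the plain sum of absolute errors can be sketched as follows. The ratings are hypothetical and the function names are assumptions made for this example.

```python
def sum_of_absolute_errors(forecasts, observations):
    """Total absolute error; grows with the number of predictions."""
    if len(forecasts) != len(observations):
        raise ValueError("inputs must have the same length")
    return sum(abs(f - o) for f, o in zip(forecasts, observations))


def mean_absolute_error(forecasts, observations):
    """Average absolute error; every difference is weighted equally."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    return sum_of_absolute_errors(forecasts, observations) / len(forecasts)


# Hypothetical predicted and actual ratings for four recommendations.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]

if __name__ == "__main__":
    print(sum_of_absolute_errors(predicted, actual))
    print(mean_absolute_error(predicted, actual))
```

Dividing by the sample size is what makes the MAE comparable across test sets of different sizes, which the raw sum is not.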








Question :

Scatter plots provide the following information about the relationship between two variables
1. Strength
2. Shape - linear, curved, etc.
3. Direction - positive or negative
4. Presence of outliers


 :
1. 1,2,3
2. 1,3,4
3. Access Mostly Uused Products by 50000+ Subscribers
4. 2,3,4
5. All 1,2,3,4

Ans :5
Exp : Scatter plots show the relationship between two variables by displaying data points on a two-dimensional graph. The variable that might be considered an explanatory variable is plotted on the x axis, and the response variable is plotted on the y axis.
Scatter plots are especially useful when there are a large number of data points. They provide the following information about the relationship between two variables
Strength
Shape - linear, curved, etc.
Direction - positive or negative
Presence of outliers
A correlation between the variables results in the clustering of data points along a line. The following is an example of a scatter plot suggestive of a positive linear relationship.
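
Two of the properties listed above, strength and direction, can also be summarized numerically with the Pearson correlation coefficient. The sketch below is a plain implementation of the textbook formula; the sample points are hypothetical. Shape and outliers still have to be judged from the plot itself.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: sign gives direction, magnitude gives linear strength."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical explanatory (x) and response (y) values.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(round(pearson_r(x, y), 3))
```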




Question : You are given a data set that contains information about tv advertisement placed between and of Zee News Channel
(covering the whole of Asia), with the following details:
advertisement duration, cost rate per minute of advertisement, country of the advertiser, city of the advertiser,
country in which the advertisement is to be shown, city in which the advertisement is to be shown, month, total advertisements,
days (of the month) on which the advertisement was shown, total hours for which the advertisement was shown, and total minutes for which the advertisement was shown.
From the data set you can determine the frequencies of all the advertisements shown in Asia. For example, between 1990 and 2014,
500 advertisements were placed from China to be shown in India, while 2000 advertisements were placed from Russia to be shown in Japan.
Now you want to draw a picture that shows the relationship between ad duration and cost per minute. Which technique would be best?

 :
1. Scatter plot
2. Tree map
3. Access Mostly Uused Products by 50000+ Subscribers
4. Box plot
5. Bar chart

Ans : 1
Exp : A scatter plot, scatterplot, or scattergraph is a type of mathematical diagram using Cartesian coordinates to display values for two variables for a set of data. The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. This kind of plot is also called a scatter chart, scattergram, scatter diagram, or scatter graph.
A heat map is a two-dimensional representation of data in which values are represented by colors. A simple heat map provides an immediate visual summary of information. More elaborate heat maps allow the viewer to understand complex data sets. Another type of heat map, which is often used in business, is sometimes referred to as a tree map. This type of heat map uses rectangles to represent components of a data set. The largest rectangle represents the dominant logical division of data and smaller rectangles illustrate other sub-divisions within the data set. The color and size of the rectangles on this type of heat map can correspond to two different values, allowing the viewer to perceive two variables at once. Tree maps are often used for budget proposals, stock market analysis, risk management, project portfolio analysis, market share analysis, website design and network management.
In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points.
To visualize correlations between two variables, a scatter plot is typically the best choice. By plotting the data on a scatter plot, you can easily see any trends in the correlation, such as a linear relationship, a log normal relationship, or a polynomial relationship. A heat map uses three dimensions and so would be a poor choice for this purpose. Box plots, bar charts, and tree maps do not provide the kind of uniform spatial mapping of the data onto the graph that is required to see trends.




Question :

Which of the following provide the kind of uniform spatial mapping of the data onto the graph that is required to see trends?


 :
1. Box plots
2. Bar charts
3. Access Mostly Uused Products by 50000+ Subscribers
4. All 1,2 and 3
5. None of 1,2 and 3

Ans : 5
Exp : Box Plots:
In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points.
Box plots display differences between populations without making any assumptions of the underlying statistical distribution: they are non-parametric. The spacings between the different parts of the box help indicate the degree of dispersion (spread) and skewness in the data, and identify outliers. In addition to the points themselves, they allow one to visually estimate various L-estimators, notably the interquartile range, midhinge, range, mid-range, and trimean. Boxplots can be drawn either horizontally or vertically.
A heat map is a two-dimensional representation of data in which values are represented by colors. A simple heat map provides an immediate visual summary of information. More elaborate heat maps allow the viewer to understand complex data sets.
In the United States, many people are familiar with heat maps from viewing television news programs. During a presidential election, for instance, a geographic heat map with the colors red and blue will quickly inform the viewer which states each candidate has won.
Another type of heat map, which is often used in business, is sometimes referred to as a tree map. This type of heat map uses rectangles to represent components of a data set. The largest rectangle represents the dominant logical division of data and smaller rectangles illustrate other sub-divisions within the data set. The color and size of the rectangles on this type of heat map can correspond to two different values, allowing the viewer to perceive two variables at once. Tree maps are often used for budget proposals, stock market analysis, risk management, project portfolio analysis, market share analysis, website design and network management.




Question : You are given a data set that contains information about tv advertisement placed between and of Zee News Channel
(covering the whole of Asia), with the following details:
advertisement duration, cost rate per minute of advertisement, country of the advertiser, city of the advertiser,
country in which the advertisement is to be shown, city in which the advertisement is to be shown, month, total advertisements,
days (of the month) on which the advertisement was shown, total hours for which the advertisement was shown, and total minutes for which the advertisement was shown.
From the data set you can determine the frequencies of all the advertisements shown in Asia. For example, between 1990 and 2014,
500 advertisements were placed from China to be shown in India, while 2000 advertisements were placed from Russia to be shown in Japan.
Now you want to draw a picture that shows which countries placed the most advertisements in each other country.
Select the correct option.
 :
1. Heat map
2. Tree map
3. Access Mostly Uused Products by 50000+ Subscribers
4. Bar chart
5. Scatter plot

Ans : 1
Exp : A scatter plot, scatterplot, or scattergraph is a type of mathematical diagram using Cartesian coordinates to display values for two variables for a set of data. The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. This kind of plot is also called a scatter chart, scattergram, scatter diagram, or scatter graph.
A heat map is a two-dimensional representation of data in which values are represented by colors. A simple heat map provides an immediate visual summary of information. More elaborate heat maps allow the viewer to understand complex data sets.
Another type of heat map, which is often used in business, is sometimes referred to as a tree map. This type of heat map uses rectangles to represent components of a data set. The largest rectangle represents the dominant logical division of data and smaller rectangles illustrate other sub-divisions within the data set. The color and size of the rectangles on this type of heat map can correspond to two different values, allowing the viewer to perceive two variables at once. Tree maps are often used for budget proposals, stock market analysis, risk management, project portfolio analysis, market share analysis, website design and network management. In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points.
To visualize correlations between two variables, a scatter plot is typically the best choice. By plotting the data on a scatter plot, you can easily see any trends in the correlation, such as a linear relationship, a log normal relationship, or a polynomial relationship. A heat map uses three dimensions and so would be a poor choice for this purpose. Box plots, bar charts, and tree maps do not provide the kind of uniform spatial mapping of the data onto the graph that is required to see trends.
In order to effectively visualize the advertisement source and destination frequencies, you'll need a plot that gives at least three dimensions: the source, destination, and frequency. A heat map provides exactly that. Scatter plots, box plots, tree maps, and bar charts provide at most two dimensions. In theory, you could use a three-dimensional variant of one of the two-dimensional graphs, but three-dimensional graphs are never a good idea. Three-dimensional graphs can only be shown in two dimensions in print and hence cause visual distortions to the data. They can also hide some data points, and they make it very difficult to compare data points from different parts of the graph.
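
The three dimensions a heat map needs (source, destination, frequency) can be prepared as a simple matrix before any plotting. The sketch below uses only the two example counts from the question; the function name and data layout are assumptions made for this example, and the drawing step itself is left to whichever plotting library is available.

```python
from collections import Counter


def frequency_matrix(pair_counts):
    """Arrange (source, destination) counts as rows = sources, columns = destinations."""
    sources = sorted({s for s, _ in pair_counts})
    destinations = sorted({d for _, d in pair_counts})
    matrix = [
        [pair_counts.get((s, d), 0) for d in destinations]
        for s in sources
    ]
    return sources, destinations, matrix


# Counts taken from the example in the question text.
counts = Counter({("China", "India"): 500, ("Russia", "Japan"): 2000})
sources, destinations, matrix = frequency_matrix(counts)

if __name__ == "__main__":
    print(destinations)
    for source, row in zip(sources, matrix):
        print(source, row)
```

Each cell of the matrix would be colored by its count, so the row and column positions carry the source and destination while the color carries the frequency.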




Question :

Which of the following graphs are best presented in two dimensions?

1. Scatter plots
2. Box plots
3. Tree maps
4. Bar charts

 :
1. 1,2,3
2. 2,3,4
3. Access Mostly Uused Products by 50000+ Subscribers
4. 1,2,4
5. All 1,2,3 and 4

Ans : 5
Exp : Scatter plots, box plots, tree maps, and bar charts provide at most two dimensions. In theory, you could use a three-dimensional variant of one of the two-dimensional graphs, but three-dimensional graphs are never a good idea. Three-dimensional graphs can only be shown in two dimensions in print and hence cause visual distortions to the data. They can also hide some data points, and they make it very difficult to compare data points from different parts of the graph.



Question : You are given a data set that contains information about tv advertisement placed between and of Zee News Channel
(covering the whole of Asia), with the following details:
advertisement duration, cost rate per minute of advertisement, country of the advertiser, city of the advertiser,
country in which the advertisement is to be shown, city in which the advertisement is to be shown, month, total advertisements,
days (of the month) on which the advertisement was shown, total hours for which the advertisement was shown, and total minutes for which the advertisement was shown.
From the data set you can determine the frequencies of all the advertisements shown in Asia. For example, between 1990 and 2014,
500 advertisements were placed from China to be shown in India, while 2000 advertisements were placed from Russia to be shown in Japan.
Now you want to draw a picture that shows the share that every city and country has of the overall ad data. Which technique would be best?
 :
1. Scatter plot
2. Heat map
3. Access Mostly Uused Products by 50000+ Subscribers
4. Tree map
Ans : 4
Exp : To show the share of advertisement originations for every city and state, you'll need a way to show hierarchical information. A tree map is a natural choice, since it's designed for exactly that purpose. You could, however, use a stacked bar chart to present the same information. A heat map has an extra, unneeded dimension, which would make the graph confusing. A scatter plot is for numeric data in both dimensions. A box plot is for groupings of multiple values.
A scatter plot, scatterplot, or scattergraph is a type of mathematical diagram using Cartesian coordinates to display values for two variables for a set of data.
The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. This kind of plot is also called a scatter chart, scattergram, scatter diagram, or scatter graph.
A heat map is a two-dimensional representation of data in which values are represented by colors. A simple heat map provides an immediate visual summary of information. More elaborate heat maps allow the viewer to understand complex data sets.
Another type of heat map, which is often used in business, is sometimes referred to as a tree map. This type of heat map uses rectangles to represent components of a data set. The largest rectangle represents the dominant logical division of data and smaller rectangles illustrate other sub-divisions within the data set. The color and size of the rectangles on this type of heat map can correspond to two different values, allowing the viewer to perceive two variables at once. Tree maps are often used for budget proposals, stock market analysis, risk management, project portfolio analysis, market share analysis, website design and network management.
In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points.
To visualize correlations between two variables, a scatter plot is typically the best choice. By plotting the data on a scatter plot, you can easily see any trends in the correlation, such as a linear relationship, a log normal relationship, or a polynomial relationship. A heat map uses three dimensions and so would be a poor choice for this purpose. Box plots, bar charts, and tree maps do not provide the kind of uniform spatial mapping of the data onto the graph that is required to see trends.
In order to effectively visualize the advertisement source and destination frequencies, you'll need a plot that gives at least three dimensions: the source, destination, and frequency. A heat map provides exactly that. Scatter plots, box plots, tree maps, and bar charts provide at most two dimensions.
In theory, you could use a three-dimensional variant of one of the two-dimensional graphs, but three-dimensional graphs are never a good idea. Three-dimensional graphs can only be shown in two dimensions in print and hence cause visual distortions to the data. They can also hide some data points, and they make it very difficult to compare data points from different parts of the graph.
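The three dimensions a heat map needs (source, destination, frequency) can be sketched as a count matrix. The event pairs and the helper frequency_matrix below are hypothetical, invented only for illustration; each matrix cell is the value a heat map would render as a color.

```python
from collections import Counter

# Hypothetical (source, destination) pairs, one per ad event; made-up data.
events = [
    ("US", "UK"), ("US", "UK"), ("US", "IN"),
    ("UK", "US"), ("IN", "US"), ("IN", "US"), ("IN", "UK"),
]

def frequency_matrix(pairs):
    """Count events per (source, destination) and lay them out as rows by columns."""
    counts = Counter(pairs)
    sources = sorted({s for s, _ in pairs})
    destinations = sorted({d for _, d in pairs})
    # Counter returns 0 for pairs that never occur.
    matrix = [[counts[(s, d)] for d in destinations] for s in sources]
    return sources, destinations, matrix

sources, destinations, matrix = frequency_matrix(events)
print("     " + " ".join(f"{d:>3}" for d in destinations))
for s, row in zip(sources, matrix):
    print(f"{s:>4} " + " ".join(f"{v:>3}" for v in row))
```

Rows are sources, columns are destinations, and the cell value is the frequency, which is the third dimension that the two-dimensional chart types cannot show.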




Question :

Which of the following is a correct use case for scatter plots?


 :
1. Male versus female likelihood of having lung cancer at different ages
2. Technology early adopters' and laggards' purchase patterns of smart phones
4. All of the above
Ans : 4
Exp : Looking to dig a little deeper into some data, but not quite sure how - or if - different
pieces of information relate? Scatter plots are an effective way to give you a sense of
trends, concentrations and outliers that will direct you to where you want to focus your
investigation efforts further.
When to use scatter plots:
o Investigating the relationship between different variables. Examples: Male
versus female likelihood of having lung cancer at different ages, technology early
adopters' and laggards' purchase patterns of smart phones, shipping costs of
different product categories to different regions.
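To make the idea of positioning one variable on each axis concrete, here is a minimal sketch that maps (x, y) pairs onto a coarse text grid. The data points and the helper text_scatter are hypothetical and made up for illustration; a real analysis would normally use a plotting library instead.

```python
# Hypothetical (x, y) observations; the values are made up for illustration.
points = [(30, 1.0), (40, 2.1), (50, 3.9), (60, 6.2), (70, 9.0)]

def text_scatter(pts, width=20, height=8):
    """Render points on a width x height character grid, one '*' per point."""
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in pts:
        # Scale each value to a column/row index; guard against a zero range.
        col = round((x - x_min) / (x_max - x_min) * (width - 1)) if x_max > x_min else 0
        row = round((y - y_min) / (y_max - y_min) * (height - 1)) if y_max > y_min else 0
        # Row 0 is printed first, so flip vertically to put small y at the bottom.
        grid[height - 1 - row][col] = "*"
    return ["".join(line) for line in grid]

plot = text_scatter(points)
print("\n".join(plot))
```

With these sample values the marks rise from the bottom left to the top right, which is the kind of trend a scatter plot is meant to expose.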




Question :

In which of the following cases can we not use Gantt charts?

 :
1. Displaying a project schedule. Examples: illustrating key deliverables, owners, and deadlines.
2. Showing other things in use over time. Examples: duration of a machine's use, availability of players on a team.
4. None of the above
Ans : 4
Exp : Gantt charts excel at illustrating the start and finish dates of the elements of a project. Hitting
deadlines is paramount to a project's success. Seeing what needs to be accomplished -
and by when - is essential to make this happen. This is where a Gantt chart comes in.
While most people associate Gantt charts with project management, they can be used to
understand how other things such as people or machines vary over time. You could
use a Gantt, for example, to do resource planning to see how long it took people to hit
specific milestones, such as a certification level, and how that was distributed over time.
When to use Gantt charts:
o Displaying a project schedule. Examples: illustrating key deliverables, owners,
and deadlines.
o Showing other things in use over time. Examples: duration of a machine's use,
availability of players on a team.
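A Gantt chart is essentially one horizontal bar per item, positioned by its start and finish. The sketch below renders that layout as text. The task names, day numbers, and the helper text_gantt are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical tasks as (name, start_day, end_day), with end_day exclusive.
tasks = [
    ("Design", 0, 3),
    ("Build", 2, 7),
    ("Test", 6, 9),
]

def text_gantt(rows):
    """Return one line per task, with '#' marking the days the task is active."""
    horizon = max(end for _, _, end in rows)
    label_width = max(len(name) for name, _, _ in rows)
    lines = []
    for name, start, end in rows:
        bar = " " * start + "#" * (end - start) + " " * (horizon - end)
        lines.append(f"{name.ljust(label_width)} |{bar}|")
    return lines

chart = text_gantt(tasks)
print("\n".join(chart))
```

The same layout works for non-project data such as machine usage windows: replace the task names with machine names and the days with usage intervals.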



Question :

Which of the following is the best example of where we can use heat maps?


 :
1. Segmentation analysis of target market
2. Product adoption across regions
4. All of the above
5. None of 1,2 and 3
Ans : 4
Exp : Heat maps are a great way to compare data across two categories using color. The
effect is to quickly see where the intersection of the categories is strongest and weakest.
When to use heat maps:
Showing the relationship between two factors. Examples: segmentation analysis of target market, product adoption across regions, sales leads by individual rep.






Question :

Which of the following cannot be presented using TreeMap?

 :
1. Storage usage across computer machines
2. Managing the number and priority of technical support cases
4. None of the above