Question : Select the correct statements which apply to RMSE.
1. It answers the question, "What is the average magnitude of the forecast errors?"
2. It does not indicate the direction of the errors.
3. RMSE is influenced more strongly by large errors than by small errors.
4. RMSE is influenced more strongly by large errors than by small errors.
5. Its range is from 0 to infinity, with 0 being a perfect score.
6. Its range is from 0 to infinity, with infinity being a perfect score.

1. 1, 2, 3, 4
2. 2, 3, 4, 5
3. 3, 4, 5, 6
4. 1, 2, 3, 5
Correct Answer : 4

RMSE is a measure of the "average" error, weighted according to the square of the error. It answers the question, "What is the average magnitude of the forecast errors?", but does not indicate the direction of the errors. Because it is a squared quantity, RMSE is influenced more strongly by large errors than by small errors. Its range is from 0 to infinity, with 0 being a perfect score.
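To make the squared-error weighting concrete, here is a minimal NumPy sketch (my own illustration, not part of the original answer key; the observation and forecast values are invented for demonstration):

```python
import numpy as np

def rmse(forecast, observed):
    """Root mean square error: the square root of the mean squared forecast error."""
    errors = np.asarray(forecast) - np.asarray(observed)
    return np.sqrt(np.mean(errors ** 2))

# Two sets of forecast errors with the same average magnitude (MAE = 2.0):
observed        = np.zeros(4)
forecast_even   = np.array([2.0, 2.0, 2.0, 2.0])  # errors spread evenly
forecast_spiked = np.array([0.0, 0.0, 0.0, 8.0])  # one large error

print(rmse(forecast_even, observed))    # 2.0
print(rmse(forecast_spiked, observed))  # 4.0 -- the single large error dominates
```

Both error sets have the same mean absolute error, but the set containing one large error produces a noticeably higher RMSE, which is exactly the sensitivity to large errors described above.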
Question : Which of the following metrics are useful in measuring the accuracy and quality of a recommender system?
1. Cluster Density
2. Support Vector Count
3. Area Under the ROC Curve (AUC)
4. Sum of Absolute Errors
Correct Answer : 3
AUC is a commonly used evaluation method for binary choice problems, which involve classifying an instance as either positive or negative. Its main advantages over other evaluation methods, such as the simpler misclassification error, are:
1. It is insensitive to unbalanced datasets (datasets that have many more positive cases than negative ones, or vice versa).
2. For other evaluation methods, the user has to choose a cut-off point above which the target variable is assigned to the positive class (e.g. a logistic regression model returns any real number between 0 and 1 - the modeler might decide that predictions greater than 0.5 mean a positive class prediction while predictions of less than 0.5 mean a negative class prediction). AUC evaluates entries at all cut-off points, giving better insight into how well the classifier is able to separate the two classes.
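As a hedged illustration of the cut-off point discussion, the sketch below uses scikit-learn's roc_auc_score and accuracy_score; the labels and scores are invented purely for demonstration:

```python
from sklearn.metrics import accuracy_score, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 0, 1]                     # actual classes
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.9]   # model scores in [0, 1]

# Misclassification-style metrics need a single cut-off (here 0.5) to turn
# scores into hard class predictions before they can be evaluated.
y_pred_at_05 = [1 if s > 0.5 else 0 for s in y_score]
print(accuracy_score(y_true, y_pred_at_05))

# AUC works directly on the scores, effectively sweeping every possible
# cut-off, so no single threshold has to be chosen up front.
print(roc_auc_score(y_true, y_score))
```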
The MAE measures the average magnitude of the errors in a set of forecasts, without considering their direction. It measures accuracy for continuous variables. The equation is given in the library references. Expressed in words, the MAE is the average over the verification sample of the absolute values of the differences between each forecast and the corresponding observation. The MAE is a linear score, which means that all the individual differences are weighted equally in the average.
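Since the text defers the equation to the library references, the standard MAE definition can be written as follows (the symbols f_i for the forecasts, o_i for the observations, and n for the size of the verification sample are my own notation):

```latex
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| f_i - o_i \right|
```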
The sum of absolute errors is a valid metric, but doesn't give any useful sense of how the recommender system is performing. Support vector count and cluster density do not apply to recommender systems. MAE and AUC are both valid and useful metrics for measuring recommender systems.
Question : If you want to understand your data at a glance and see how the data is skewed towards one end, which is the best-fit graph or chart?
1. ROC
2. Lift
3. Gains
4. Box-and-whisker plot
Correct Answer : 4 (Box-and-whisker plot)

Box-and-whisker plots, or boxplots, are an important way to show distributions of data. The name refers to the two parts of the plot: the box, which contains the median of the data along with the 1st and 3rd quartiles (the 25th and 75th percentiles), and the whiskers, which typically represent data within 1.5 times the interquartile range (the difference between the 1st and 3rd quartiles). The whiskers can also be used to show the maximum and minimum points within the data.
When to use box-and-whisker plots:
o Showing the distribution of a set of data. Examples: understanding your data at a glance, seeing how the data is skewed towards one end, identifying outliers in your data.
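As a rough how-to sketch (my own example, assuming NumPy and matplotlib are available; the exponential sample is just an arbitrary stand-in for skewed data), a boxplot showing skew can be drawn like this:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
right_skewed = rng.exponential(scale=2.0, size=200)  # data bunched at the low end

fig, ax = plt.subplots()
# The box spans the 1st to 3rd quartile with the median marked inside;
# whiskers extend to points within 1.5 * IQR, and anything beyond is
# drawn as an individual outlier point.
ax.boxplot(right_skewed)
ax.set_ylabel("value")
ax.set_title("Box-and-whisker plot of a right-skewed sample")
plt.show()
```

The long upper whisker and the outlier points above it make the skew toward large values visible at a glance.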
1. The SCORE statement in the LOGISTIC procedure returns only predicted probabilities, whereas the SCORE procedure returns only predicted logits.
2. The SCORE statement in the LOGISTIC procedure returns only predicted logits, whereas the SCORE procedure returns only predicted probabilities.
3. [option text missing in the source]
4. The SCORE procedure and the SCORE statement in the LOGISTIC procedure produce the same output.