Question : Select the correct statement that applies to logistic regression.
1. Computationally inexpensive, easy to implement, knowledge representation easy to interpret
2. May have low accuracy
3. Works with numeric values
4. Only 1 and 3 are correct
5. All of 1, 2 and 3 are correct
Correct Answer : 5
Explanation: Logistic regression
Pros: Computationally inexpensive, easy to implement, knowledge representation easy to interpret
Cons: Prone to underfitting, may have low accuracy
Works with: Numeric values, nominal values
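The "easy to implement" and "easy to interpret" points can be illustrated with a minimal sketch in Python. Everything here is an assumption made for illustration: the toy data are made up, `fit_logistic` is a hypothetical helper, and plain batch gradient descent stands in for whatever fitting method a real package would use. Only one numeric feature is shown; nominal inputs would first need to be coded as indicator variables.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log-loss.

    Hypothetical helper for illustration only.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals (predicted probability minus observed label) drive both gradients.
        residuals = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up toy data: one numeric feature, binary target (not linearly separable).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)

# Interpretation: exp(w) is the multiplicative change in the odds of an event
# for a one-unit increase in x.
odds_ratio = math.exp(w)
print(f"w={w:.3f}, b={b:.3f}, odds ratio per unit x={odds_ratio:.3f}")
```

The fitted model is just two numbers, which is why the knowledge representation is easy to read. The same simplicity is behind the listed weakness: a single linear boundary can underfit data with a more complicated structure.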
Question : Suppose training data are oversampled in the event group to make the number of events and nonevents roughly equal. A logistic regression is run, and the probabilities are output to a data set NEW and given the variable name PE. A decision rule considered is, "Classify data as an event if probability is greater than 0.5." The data set NEW also contains a variable TG that indicates whether there is an event (1=Event, 0=No event). The following SAS program was used. What does this program calculate?
1. Depth
2. Sensitivity
3. Specificity
4. Positive predictive value
Correct Answer : 2
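Explanation: The sensitivity is the proportion of true positive responders (Response=1) that have a positive test result (Test=1). In this question, that is the proportion of actual events (TG=1) that the decision rule classifies as events (PE greater than 0.5). The specificity is the proportion of true negative responders (Response=0) that have a negative test result (Test=0) = 6/10.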
Refer to the study notes as well.
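The definitions above can be checked with a short Python sketch. This is not the SAS program the question refers to, which is not reproduced here. The variable names `pe` and `tg` mirror PE and TG from the question, the ten scored rows are made up, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(pe, tg, cutoff=0.5):
    """Count TP, FP, TN, FN for the rule 'classify as event if probability > cutoff'."""
    tp = fp = tn = fn = 0
    for prob, actual in zip(pe, tg):
        predicted_event = prob > cutoff
        if predicted_event and actual == 1:
            tp += 1
        elif predicted_event and actual == 0:
            fp += 1
        elif not predicted_event and actual == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Made-up scored data: predicted event probability (PE) and actual outcome (TG).
pe = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2, 0.1, 0.55, 0.35, 0.3]
tg = [1,   1,   1,   1,   0,   0,   0,   0,    0,    0]

tp, fp, tn, fn = confusion_counts(pe, tg)

sensitivity = tp / (tp + fn)  # share of actual events classified as events
specificity = tn / (tn + fp)  # share of actual nonevents classified as nonevents
ppv = tp / (tp + fp)          # share of predicted events that are actual events

print(sensitivity, specificity, ppv)
```

Sensitivity and specificity each divide by the number of rows in one actual class (TG=1 or TG=0), while positive predictive value divides by the number of rows predicted to be events. That difference in denominator is what separates options 2, 3 and 4.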
Question : Refer to the exhibit: The plots represent two models, A and B, being fit to the same two data sets, training and validation. Model A is 90.5% accurate at distinguishing blue from red on the training data and 75.5% accurate at doing the same on validation data. Model B is 83% accurate at distinguishing blue from red on the training data and 78.3% accurate at doing the same on the validation data. Which of the two models should be selected and why?
1. Model A. It is more complex with a higher accuracy than model B on training data.
2. Model A. It performs better on the boundary for the training data.
3. Model B. It is more complex with a higher accuracy than model A on validation data.
4. Model B. It is simpler with a higher accuracy than model A on validation data.
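The accuracies quoted in the question can be compared with a few lines of Python. This sketch only does arithmetic on the four stated figures and applies one common heuristic, preferring the model with the higher validation accuracy. The relative complexity of the two models can only be judged from the exhibit, which is not reproduced here.

```python
# Accuracies (in percent) as stated in the question.
models = {
    "A": {"train": 90.5, "valid": 75.5},
    "B": {"train": 83.0, "valid": 78.3},
}

# Gap between training and validation accuracy; a large gap is a common
# symptom of overfitting to the training data.
gaps = {name: round(acc["train"] - acc["valid"], 1) for name, acc in models.items()}

# Heuristic (an assumption of this sketch): pick the model that does best on validation data.
best = max(models, key=lambda name: models[name]["valid"])

for name in models:
    print(f"Model {name}: validation {models[name]['valid']}%, train-validation gap {gaps[name]} points")
print("Higher validation accuracy:", best)
```

Model A loses 15.0 points between training and validation, while model B loses 4.7 points and still scores higher on the validation data.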
1. More high value customers are found in some regions than others.
2. The difference between average purchases for medium and high value customers depends on the region.
3. Regions with higher average purchases have more high value customers.
4. Regions with higher average purchases have more medium value customers.
1. All groups are significantly different from each other.
2. 2XL is significantly different from all other groups.
3. Only XL and 2XL are not significantly different from each other.
4. No groups are significantly different from each other.