STAT 516 HW 4

Please write your answers neatly and clearly! You MUST make sure your answers to these questions are written in the same order as the questions are listed in the assignment. If this instruction is not followed, your homework WILL NOT be graded.

COMPUTER CALCULATIONS:

(NOTE: For any hypothesis tests, you may use alpha = 0.05.)

1. Look at the data in Table 8.29 on page 409 of the textbook. These data are also given in the SAS code "basketballdata.txt" on the course web page. (Equivalent R code to read in the data is given as "basketballdataR.txt" on the course web page.) Complete a SAS or R program and answer the following questions about the data set:

   (a) Using the model you fit in Problem 2(a) in HW 3, calculate a 90% prediction interval for the number of goals made by a new person who is six feet tall, weighs 147 pounds, and runs an 11.2-second 100-yard dash.

   (b) Repeat part (a), except use a model which includes WEIGHT and HEIGHT but not DASH100. What do you conclude from this interval compared to the one in (a)?

   (c) Consider the model you fit in Problem 2(a) in HW 3. Is there evidence of multicollinearity? Explain your answer with numerical evidence.

2. Look at the data in Table 8.31 on page 411 of the textbook. These data are also given in the SAS code "liverdata.txt" on the course web page. (Equivalent R code to read in the data is given as "liverdataR.txt" on the course web page.) Complete a SAS or R program to answer the following questions about the data set:

   (a) Consider the MLR model described in Problem 3(b) in HW 3. Are there severe outliers in the model? If so, which observations are outliers?

   (b) Consider the MLR model described in Problem 3(b) in HW 3. Are there high-leverage points in the model? If so, which observations are high-leverage points?

   (c) Consider the MLR model described in Problem 3(b) in HW 3. Based on the DFFITS criterion, are there influence points in the model? If so, which observations are influence points?

   (d) Use SAS to perform an automatic variable selection with LOGTIME as the dependent variable and CLOT, PROG, ENZ, and LIV as the POTENTIAL independent variables. Of all possible models, which appears to be "best"? Explain your choice(s) clearly, using evidence from SAS or R and the various variable-selection criteria we discussed in class.

CONCEPT QUESTIONS:

3. Answer Concept Questions 1, 3, 4, and 9 on pages 399-400.