STAT 516 HW 2

Please write your answers neatly and clearly! Also, PLEASE make sure your answers to these questions are written in the same order as the questions are listed in the assignment!!

HAND CALCULATIONS:

1. On page 400 (problem 1), data are given for 8 observations. On the top of page 401, the SAS output for the multiple regression of Y on X1 and X2 is given. Also given are the actual Y values, the predicted Y values, and the residuals for the 8 observations. Do the following by hand, SHOWING WORK. (You may use a hand calculator to help with the calculations.)
   (a) Use the estimated regression line to verify that the listed predicted values for observations 1 and 2 are correct. Also verify that the residuals for observations 1 and 2 are correct.
   (b) Add up the squares of the residuals for all 8 observations and verify that the result matches the SSE given in the ANOVA table.

COMPUTER CALCULATIONS:

(NOTE: For any hypothesis tests, you may use alpha = 0.05.)

2. Look at the data in Table 8.29 on page 409 of the textbook. These data are also given in the SAS code "basketballdata.txt" on the course web page. Complete a SAS program and answer the following questions about the data set:
   (a) Estimate the multiple regression model with GOALMADE as the dependent variable and WEIGHT, HEIGHT, and DASH100 as the independent variables. State the estimated regression equation. Carefully interpret the partial regression coefficient for HEIGHT.
   (b) Using the P-values listed in SAS, carefully state what the t-tests about the regression coefficients tell you about the individual effects of WEIGHT, HEIGHT, and DASH100 on GOALMADE.
   (c) Using a TEST statement in SAS, test whether at least one of WEIGHT and DASH100 is needed in the model, given that HEIGHT is in the model. State the hypotheses and give the P-value of the test.
   (d) Using the model you fit in (a), calculate a 90% prediction interval for the number of goals made for a new person who is six feet tall, weighs 155 pounds, and runs an 11.6-second 100-yard dash.
   (e) Repeat part (d), except use a model which includes WEIGHT and HEIGHT but not DASH100. What do you conclude from this interval compared to the one in (d)?
   (f) Consider the model you fit in (a). Is there evidence of multicollinearity? Explain your answer with numerical evidence.

3. Look at the data in Table 8.31 on page 411 of the textbook. These data are also given in the SAS code "liverdata.txt" on the course web page. Complete a SAS program to answer the following questions about the data set:
   (a) Fit the regression model with the survival time, TIME, as the dependent variable, and CLOT, PROG, ENZ, and LIV as the independent variables. Perform a residual analysis. Comment on any possible model violations.
   (b) Fit the regression model with the log survival time, LOGTIME, as the dependent variable, and CLOT, PROG, ENZ, and LIV as the independent variables. Perform a residual analysis. Are any model violations alleviated? (NOTE: For this residual analysis, in adapting the sample code, change the PRED and RES in the code to PRED2 and RES2 so that they won't have the same names as the values in the residual analysis in part (a).)
   (c) Find the R^2 value for the regression equation from part (b). Interpret its value in the context of this problem.
   (d) Consider the MLR model described in (b). Are there severe outliers in the model? If so, which observations are outliers?
   (e) Consider the MLR model described in (b). Are there high-leverage points in the model? If so, which observations are high-leverage points?
   (f) Consider the MLR model described in (b). Based on the DFFITS criterion, are there influence points in the model? If so, which observations are influence points?
   (g) Use SAS to help with an automatic variable selection with LOGTIME as the dependent variable and CLOT, PROG, ENZ, and LIV as the POTENTIAL independent variables. Of all possible models, which appears to be "best"? Explain your choice(s) clearly, using evidence from SAS and the various variable-selection criteria we discussed in class.

CONCEPT QUESTIONS:

4. Answer Concept Questions 1, 3, 4, and 9 on pages 399-400.
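
NOTE ON THE ARITHMETIC IN PROBLEM 1:

The checks in Problem 1 come down to three steps: plug X1 and X2 into the estimated equation to get a predicted value, subtract it from the observed Y to get the residual, and sum the squared residuals to get SSE. The sketch below walks through those steps in Python (not SAS, only so it is self-contained). Everything in it is an assumption made for illustration: the four data rows and the coefficients b0, b1, b2 are made up, they are not the values on pages 400-401, and the coefficients are not a least-squares fit to the toy rows.

```python
# Hypothetical toy data (NOT the textbook's values): each row is (x1, x2, y).
data = [
    (1.0, 2.0, 9.0),
    (2.0, 1.0, 8.5),
    (3.0, 4.0, 18.0),
    (4.0, 3.0, 16.5),
]

# Hypothetical coefficients for y_hat = b0 + b1*x1 + b2*x2.
# They are chosen for easy arithmetic, not estimated from the rows above.
b0, b1, b2 = 1.0, 2.0, 3.0


def predict(x1, x2):
    """Predicted value from the (hypothetical) estimated regression line."""
    return b0 + b1 * x1 + b2 * x2


# Step 1: predicted value for each observation.
predicted = [predict(x1, x2) for x1, x2, _ in data]

# Step 2: residual = observed y minus predicted y.
residuals = [y - y_hat for (_, _, y), y_hat in zip(data, predicted)]

# Step 3: SSE = sum of squared residuals.
sse = sum(e ** 2 for e in residuals)

for (x1, x2, y), y_hat, e in zip(data, predicted, residuals):
    print(f"y = {y:6.2f}   y_hat = {y_hat:6.2f}   residual = {e:6.2f}")
print(f"SSE = {sse:.4f}")
```

With the textbook's 8 observations and the coefficients from the SAS output in place of the toy values, the same three steps give the quantities to compare against the listed predicted values, residuals, and the SSE in the ANOVA table.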