STAT 516 HW 3

HAND CALCULATIONS:

1. We will analyze the data in Table 6.29 on pg. 312 of the textbook. The dependent variable (Y) is the time needed for the mouse to complete the maze, and the factor is the color of the maze door. The 3 levels of this factor are red, green, and black. Do the following by hand, SHOWING WORK. You may use SAS to check your answers if you want.
(a) Calculate the sum of the times for each door color (the Y-i-dot values) and the overall sum of all the times (the Y-dot-dot value).
(b) Use your answers from part (a) to find the SSB and SSW, and then the MSB and MSW.
(c) Calculate the F* ratio and write the complete ANOVA table for this problem.
(d) Perform the ANOVA F-test by comparing the F* ratio to the appropriate value from the F table (use a significance level of 0.05). Among the different door colors, is there a significant difference in the mean time needed to complete the maze?
(e) Calculate the residual for the first mouse listed for the Red door.

COMPUTER CALCULATIONS:

2. Look at the data in Table 8.29 on page 462 of the textbook. These data are also given in the SAS code labeled "Basket Goals data set" on the course web page.
(a) Estimate the multiple regression model with GOALMADE as the dependent variable and WEIGHT, HEIGHT, and DASH100 as the independent variables. Is there evidence of multicollinearity? Explain your answer with numerical evidence.
(b) Consider the MLR model described in (a). Are there severe outliers in the model? If so, which observations are outliers?
(c) Consider the MLR model described in (a). Are there high-leverage points in the model? If so, which observations are high-leverage points?
(d) Consider the MLR model described in (a). Based on the DFFITS criterion, are there influence points in the model? If so, which observations are influence points?

3. Look at the data in Table 8.31 on page 411 of the textbook. These data are also given in the SAS code "liverdata.txt" on the course web page.
Consider variable selection with LOGTIME as the dependent variable and CLOT, PROG, ENZ, and LIV as the POTENTIAL independent variables. Of all possible models, which appears to be "best" based on the adjusted R-squared criterion? What is its adjusted R-squared? Is this "best" model also satisfactory based on the C(p) criterion? Why or why not?

4. Look at the data in Table 6.35 on page 316 of the textbook. These data are also given in the SAS code at the link "Insecticide Data Set" on the course web page.
(a) Use PROC GLM to test whether the mean number of insect deaths differs across insecticide types. Use alpha = 0.05, and justify your answer by referring either to a test statistic value and table value, or to a P-value.
(b) For the analysis in part (a), use Levene's test (alpha = 0.05) to check the equal-variances assumption. Also, produce a residual plot and a Q-Q plot of the residuals to check for violations of the assumptions. Summarize your findings.
(c) If the residual plot shows any outlier(s), which data value is responsible?
(d) Perform Tukey's multiple comparison procedure (using alpha = 0.05) and summarize which insecticides are significantly different from each other in terms of mean number of deaths.

5. Look at the data in Table 6.40 on page 318 of the textbook. These data are also given in the SAS code at the link "Bank Data Set" on the course web page.
(a) Use PROC GLM to test whether the mean number of days of sick leave differs across bank branches. Use alpha = 0.05, and justify your answer by referring either to a test statistic value and table value, or to a P-value.
(b) In particular, suppose the bank manager suspects that the employees at branches 1 and 3 have a different mean sick-leave count than those at branch 2. Using an ESTIMATE statement, perform a t-test for a difference between the average of the mean numbers of sick days for branches 1 and 3 and the mean number of sick days for branch 2. Use alpha = 0.05.
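To make the idea behind the contrast in Problem 5(b) concrete, here is a minimal Python sketch of the usual t statistic for a linear contrast of group means in a one-way layout. It is only an illustration under stated assumptions: the means, sample sizes, and MSW below are hypothetical values (NOT the Table 6.40 data), the function name `contrast_t` is made up for this note, and the coefficients (1/2, -1, 1/2) are one common way to compare the average of groups 1 and 3 with group 2. The homework itself should still be done with PROC GLM.

```python
import math

def contrast_t(means, sizes, msw, coeffs):
    """Return (estimate, standard error, t statistic) for a contrast.

    estimate = sum(c_i * ybar_i)
    se       = sqrt(MSW * sum(c_i^2 / n_i))
    The t statistic has the error degrees of freedom (N - number of groups).
    """
    estimate = sum(c * m for c, m in zip(coeffs, means))
    se = math.sqrt(msw * sum(c ** 2 / n for c, n in zip(coeffs, sizes)))
    return estimate, se, estimate / se

# Hypothetical inputs for three groups (not the textbook data).
means = [4.0, 10.0, 7.0]    # group sample means
sizes = [3, 3, 3]           # group sample sizes
msw = 4.0                   # mean square within (error) from the ANOVA table
coeffs = [0.5, -1.0, 0.5]   # average of groups 1 and 3 minus group 2

estimate, se, t_stat = contrast_t(means, sizes, msw, coeffs)
print(estimate, se, t_stat)
```

The resulting t statistic would be compared with a t table value using the error degrees of freedom, or judged by its two-sided P-value.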

TRUE-FALSE QUESTIONS:

6. Answer True/False Concept Questions 2, 7, 9, 10, 11, 12 (all parts), 13, and 15 on pages 310-311. If a statement is false, either correct it or explain why it is false.
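
As an optional way to check the arithmetic pattern used in the hand calculations of Problem 1, the following Python sketch works through the standard computational formulas for a one-way ANOVA (group totals, correction factor, SSB, SSW, MSB, MSW, F*, and a residual). The data and group labels are hypothetical and chosen only to give round numbers; they are NOT the Table 6.29 values, and the function name `one_way_anova` is invented for this note.

```python
# Hypothetical data (not from the textbook): one list of responses per group.
groups = {
    "A": [2, 4, 6],
    "B": [5, 7, 9],
    "C": [8, 10, 12],
}

def one_way_anova(groups):
    """Compute one-way ANOVA quantities from group totals."""
    totals = {g: sum(y) for g, y in groups.items()}   # Y-i-dot values
    sizes = {g: len(y) for g, y in groups.items()}    # n_i
    grand_total = sum(totals.values())                # Y-dot-dot
    n_total = sum(sizes.values())                     # N
    n_groups = len(groups)                            # t

    correction = grand_total ** 2 / n_total           # (Y..)^2 / N
    ssb = sum(totals[g] ** 2 / sizes[g] for g in groups) - correction
    tss = sum(v ** 2 for y in groups.values() for v in y) - correction
    ssw = tss - ssb

    msb = ssb / (n_groups - 1)
    msw = ssw / (n_total - n_groups)
    return {
        "totals": totals, "grand_total": grand_total,
        "SSB": ssb, "SSW": ssw, "TSS": tss,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }

result = one_way_anova(groups)

# Residual for the first observation in group "A": observed minus group mean.
residual = groups["A"][0] - sum(groups["A"]) / len(groups["A"])

print(result)
print(residual)
```

With these made-up numbers the group totals are 12, 21, and 30, which gives SSB = 54, SSW = 24, and F* = 27/4 = 6.75. The same steps apply to the real data once the totals from part (a) are in hand.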