STAT 705 Spring 2009
------------------

Homework 1
----------

Do the following problems from the textbook:

8.12, 8.13, 8.14, 8.15, 8.21, 8.22, 8.27, 8.28, 8.34, 8.35(a,b)**

NOTE: Problem 8.15 requires analysis using a computer package such as SAS or R.

NOTE: For 8.15(e), if this plot shows random scatter, then the interaction term is probably not necessary.

**NOTE: For 8.35(b), note the book has an error in equation (6.25). It should be:

    b = (X'X)^{-1} X'Y

The "Copier Maintenance" data set is given on the course web page.

Please write your answers neatly and clearly!

PRACTICE PROBLEM (Nonparametric regression):
-------------------

You don't need to turn this in, but please do it on your own for practice. The relevant example R code is on the STAT 704 web page.

The "Modified Stack Escape" data set given below contains data for 31 observations from an industrial plant. The first variable is a predictor, Acid Concentration. The second variable is the response, Stack Escape, which measures how efficient the plant is (the LOWER the stack escape, the BETTER the efficiency.)

(a) Fit a kernel regression on these data to determine the effect of acid concentration on stack escape. Experiment with different bandwidth values. In your opinion, which bandwidth produces the "best" curve for these data, and why? (Look at a plot of your "best" fitted curve on top of the data.)

(b) Fit a cubic spline regression on these data. Experiment with the number of knots. In your opinion, which selection of knots produces the "best" curve for these data, and why? (Look at a plot of your "best" fitted curve on top of the data.)

(c) Interpret what the regression curves tell you about the relationship between stack escape and acid concentration.

89 42
88 37
90 37
87 28
87 18
87 18
93 19
93 20
87 15
80 14
89 14
88 13
82 11
93 12
89 8
86 7
79 8
80 9
82 15
91 15
84 12
90 25
87 22
91 29
92 17
85 18
81 11
83 15
85 16
89 24
81 9
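
ILLUSTRATION (kernel regression and bandwidth):

For part (a) of the practice problem, the effect of the bandwidth can be sketched as follows. This is only a minimal illustration, not the course's R code (which is on the STAT 704 web page). It assumes a Nadaraya-Watson estimator with a Gaussian kernel, it is written in Python rather than R, and the function names and the bandwidth values 0.5, 2 and 8 are hypothetical choices made for the example.

```python
import math

# (acid concentration, stack escape) pairs copied from the data listing above.
DATA = [
    (89, 42), (88, 37), (90, 37), (87, 28), (87, 18), (87, 18), (93, 19),
    (93, 20), (87, 15), (80, 14), (89, 14), (88, 13), (82, 11), (93, 12),
    (89, 8), (86, 7), (79, 8), (80, 9), (82, 15), (91, 15), (84, 12),
    (90, 25), (87, 22), (91, 29), (92, 17), (85, 18), (81, 11), (83, 15),
    (85, 16), (89, 24), (81, 9),
]

xs = [x for x, _ in DATA]
ys = [y for _, y in DATA]


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nw_estimate(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate at x0: a kernel-weighted mean of the responses.

    Assumption: Gaussian kernel, with `bandwidth` acting as its standard deviation.
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Evaluate the fitted curve on every integer acid concentration in the data range.
grid = list(range(min(xs), max(xs) + 1))


def fitted_curve(bandwidth):
    return [nw_estimate(x0, xs, ys, bandwidth) for x0 in grid]


# Illustrative bandwidths only; the exercise asks you to choose your own.
for h in (0.5, 2.0, 8.0):
    curve = fitted_curve(h)
    print(f"bandwidth={h}:", " ".join(f"{v:.1f}" for v in curve))
```

A small bandwidth makes the curve follow individual observations closely, while a large one flattens it toward the overall mean of stack escape. Judging which value is "best" still requires plotting the fitted curve on top of the data, as the problem asks.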