STAT 704 Fall 2009
------------------
Homework 6
-------------------------------------------------------------

PART 1:

Do the following problems from the textbook:

14.2, 14.4, 14.7, 14.15, 14.33(a), 14.13, 14.19(b), 14.40, 14.46,
11.28, 11.29(a,b,c,e) [see note below about Muscle Mass data],
11.30(a,b,c,d,f)

PART 2:

Problem 1:

The "Masters Golfers" data set on the course web page contains data for 60 golfers from the 2007 Masters Tournament in Augusta, GA. The first variable is the golfer's name, the second variable is his number of birdies for the tournament, the third variable is his driving distance (how far he typically hit the ball), and the last variable is his driving accuracy (how accurately he typically hit the ball). (Note that a lot of birdies is good in golf!)

(a) Fit a Poisson regression model relating number of birdies to driving accuracy. Is driving accuracy a significant predictor in this model at the 0.05 level?

(b) Fit a Poisson regression model relating number of birdies to driving distance. State the estimated regression function. Is driving distance a significant predictor in this model at the 0.05 level?

(c) Interpret the estimate b_1 from part (b) in the context of this problem.

(d) In the model with driving distance as the predictor, note that the last observation (#60) has the largest absolute deviance residual. What is the value of his deviance residual? What does this tell you about this golfer concerning his number of birdies and driving distance?

Bonus: What is particularly notable about golfer #60 in the context of this data set?

NOTE: For 14.7(b), just do a scatter plot of the data and the fitted logistic response function. Don't do the "lowess smooth".

NOTE: For the Poisson regression problem, if you analyze these data in SAS, you should technically put $17. after the NAME variable in the INPUT statement so that SAS will correctly read it in as a character variable. It won't mess up your analysis for this problem if you don't do this, though.

Problem 2:

For the following data on cars, where horsepower is the predictor and mileage is the response, fit and plot nonparametric regressions using kernel methods and spline methods. Try different amounts of smoothing. Comment on your results, and comment on the relationship between horsepower and mileage.

horsepower <- c(49,55,55,70,53,70,55,62,62,80,
                73,92,92,73,66,73,78,92,78,90,
                92,74,95,81,95,92,92,92,90,52,
                103,84,84,102,102,81,90,90,102,102,
                130,95,95,102,95,93,100,100,98,130,
                115,115,115,115,180,160,130,96,115,100,
                100,145,120,140,140,150,165,165,165,165,
                245,280,162,162,140,140,175,322,238,263,
                295,236)

mileage <- c(65.4,56,55.9,49,46.5,46.2,45.4,59.2,53.3,43.4,
             41.1,40.9,40.9,40.4,39.6,39.3,38.9,38.8,38.2,42.2,
             40.9,40.7,40,39.3,38.8,38.4,38.4,38.4,29.5,46.9,
             36.3,36.1,36.1,35.4,35.3,35.1,35.1,35.0,33.2,32.9,
             32.3,32.2,32.2,32.2,32.2,31.5,31.5,31.4,31.4,31.2,
             33.7,32.6,31.3,31.3,30.4,28.9,28,28,28,28,
             28,27.7,25.6,25.3,23.9,23.6,23.6,23.6,23.6,23.6,
             23.5,23.4,23.4,23.1,22.9,22.9,19.5,18.1,17.2,17,
             16.7,13.2)

The "Annual Dues", "Car Purchase", "Masters Golfers", "Mileage Study", "Muscle Mass (Problem 1.27)", and "Patient Satisfaction" data can be found on the course web page.

NOTE: The Problem 1.27 Muscle Mass data is different from the Muscle Mass data you used for piecewise regression in HW 5!

Please write your answers neatly and clearly!