STAT 705
Spring 2009
--------------------

Homework 6
----------

28.13(a), 28.14, 28.15(a,b,c), 28.16(b)
18.25, 18.27, 22.1, 22.6, 22.9(b,c), 22.10(a,b,c,d,f), 22.23,
26.2, 26.4, 26.5(c,d,e,f),
and the extra problems given below:

Extra Problem 1: Four varieties of soybean were planted in each of
three separate regions of a field. The resulting yields were:

                 Variety
            A    B    C    D
Region 1   45   48   43   41
       2   49   45   42   39
       3   38   39   35   36

Use Friedman's test (with a 0.05 significance level) to test whether
the four varieties have the same effect on yield. Give the test
statistic value and P-value of your test.

Extra Problem 2: In the "Muscle Mass" data set, the first column (Y)
is a measure of muscle mass for a sample of women and the second
column (X) is age in years. It is conjectured that the regression of
muscle mass on age follows a two-piece linear relation, with the slope
changing at age 60 years without discontinuity.

(a) State the regression model that applies if the conjecture is
    correct. What are the respective mean response functions when age
    is 60 or less and when age is over 60?
(b) Fit the regression model specified in part (a) and state the
    estimated regression function.
(c) Plot the data with the estimated piecewise regression function
    overlaid on it.
(d) Test whether the piecewise linear regression function is needed;
    use alpha=0.05. State the alternatives, decision rule, and
    conclusion. What is the P-value of the test?
(e) Specify the regression model for the case when the slope changes
    at age 40 and again at age 60, with no discontinuities.

NOTE: Problems 28.14, 28.15, 18.25, 22.9, 22.10, 26.4, 26.5(c,d,e,f)
and the extra problems require analysis using a computer package such
as SAS or R.

NOTE: For 28.14, do the residual plot (vs. fitted values), the normal
Q-Q plot, and the Shapiro-Wilk test.

NOTE: For 18.25, where it says "nonparametric rank F test", do the
Kruskal-Wallis test (they are equivalent).

NOTE: For 22.9(b), do the residual plot (vs. fitted values), the
normal Q-Q plot, and the Shapiro-Wilk test (and state your
conclusions).

NOTE: For 22.10(c), you don't have to fit the full and reduced models
separately if you let SAS or R do the appropriate F-test.

HINT: For 22.10(f), to decide which is "more efficient", compare
(22.18) to the square root of (22.17) on page 930.

HINT: For 22.23, note that you are deriving the LS estimator for ANY
PARTICULAR delta_i. So if you want, you could derive it for delta_1,
and then just generalize your result. In any case, note that in 22.23
you will be solving two equations with two unknowns.

NOTE: For 26.4(b), show the residuals plotted for each machine level,
and state your conclusions.

The "Hardware Sales", "Telephone Communications", "Questionnaire
Color #22.9", "Bottling Plant Production", and "Muscle Mass" data sets
are given on the course web page.

Please write your answers neatly and clearly!
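
For Extra Problem 1, the mechanics of Friedman's test can be sketched in a few lines of Python. This is only an illustrative sketch, not the required SAS or R analysis: the function names `friedman_statistic` and `chi2_sf` are made up for this example, no tie correction is applied (the yields above have no ties within a region), and the P-value uses the large-sample chi-square approximation with k - 1 degrees of freedom, which is rough with only three blocks.

```python
import math

def friedman_statistic(blocks):
    """Return (statistic, rank sums) for rows = blocks, columns = treatments.

    Uses midranks within each block; no tie correction is applied.
    """
    n = len(blocks)       # number of blocks (regions)
    k = len(blocks[0])    # number of treatments (varieties)
    rank_sums = [0.0] * k
    for row in blocks:
        order = sorted(range(k), key=lambda j: row[j])
        i = 0
        while i < k:
            # Extend j over any run of tied values and give them the midrank.
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            midrank = (i + j) / 2 + 1
            for t in range(i, j + 1):
                rank_sums[order[t]] += midrank
            i = j + 1
    stat = (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)
    return stat, rank_sums

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df and x >= 0."""
    half = x / 2.0
    if df % 2 == 0:
        return math.exp(-half) * sum(
            half ** i / math.gamma(i + 1) for i in range(df // 2))
    m = (df - 1) // 2
    tail = sum(half ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail

# Yields from Extra Problem 1: one row per region, columns A, B, C, D.
yields = [
    [45, 48, 43, 41],
    [49, 45, 42, 39],
    [38, 39, 35, 36],
]

stat, rank_sums = friedman_statistic(yields)
df = len(yields[0]) - 1
p_value = chi2_sf(stat, df)
print("rank sums:", rank_sums)
print("Friedman statistic:", round(stat, 4), "df:", df)
print("approximate P-value:", round(p_value, 4))
```

The printed values can be used to check the output of the SAS or R procedure; the decision at the 0.05 level follows from comparing the P-value to 0.05.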
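
For Extra Problem 2, one common way to encode a slope change at age 60 with no discontinuity is to add the regressor max(X - 60, 0) to an ordinary linear regression. The sketch below assumes that parameterization, Y = b0 + b1*X + b2*max(X - 60, 0), and fits it by least squares through the normal equations. The ages and responses are made-up, noise-free numbers used only to show the mechanics; they are NOT the "Muscle Mass" data, and the helper names (`design_row`, `solve`, `fit_ols`) are hypothetical.

```python
def design_row(age, knot=60.0):
    """Regressors for a continuous two-piece line with a slope change at knot."""
    return [1.0, age, max(age - knot, 0.0)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical illustration data (not the course data set): an exact
# two-piece line with slope -1.0 up to age 60 and slope -1.5 afterwards.
ages = [41, 45, 50, 55, 60, 65, 70, 75, 78]
mass = [150.0 - 1.0 * a - 0.5 * max(a - 60, 0) for a in ages]

b0, b1, b2 = fit_ols([design_row(a) for a in ages], mass)
print("age <= 60: mean response = %.3f + %.3f * X" % (b0, b1))
print("age >  60: mean response = %.3f + %.3f * X" % (b0 - 60.0 * b2, b1 + b2))
```

Under this parameterization, b2 is the change in slope at age 60, so a test of whether the second piece is needed amounts to a test of b2 = 0. A second knot at age 40 would be handled the same way, with one more regressor of the form max(X - 40, 0).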