STAT 530, Fall 2018
--------------------

Homework 6
-----------

ALL students should do all of the following problems:

IMPORTANT NOTE: For EACH of these problems, write a couple of sentences explaining in words what substantive conclusions about the data you can draw from the plots and/or analyses.

PROBLEM 1:
---------------

Use Hotelling's T^2 test and the data in the test score data set (scores on math and reading tests given to a sample of girls and a sample of boys) to test for a difference between the mean score vector of the boys and the mean score vector of the girls.

The following R code will read in the data:

testdata <- read.table("http://people.stat.sc.edu/hitchcock/testscoredata.txt", header=TRUE)
attach(testdata)
testdata.noIDs <- testdata[,-1]  # to remove the ID numbers

PROBLEM 2:
---------------

Consider the 'hsb' data set that we have studied in class. Suppose our goal is to compare the mean vectors (where the variables are the scores on: read, write, math, science, socst) among the different levels of 'ses' (high, middle, and low socioeconomic classes).

hsb <- read.table("http://people.stat.sc.edu/hitchcock/hsbdata.txt", header=TRUE)
attach(hsb)
hsb.prob2 <- hsb[,c(5,8,9,10,11,12)]
response.variables <- cbind(read, write, math, science, socst)

###############################################

(a) Conduct the MANOVA F-test using Wilks' Lambda to test for a difference in (read, write, math, science, socst) mean vectors across the three ses classes. Use a 0.05 significance level, and give the P-value of the test.

(b) Check (informal exploratory checks are fine) to see whether the assumptions of your test are met. Do you believe your inference is valid?

(c) Examine the sample mean vectors for each group. Informally comment on the differences among the groups in terms of the specific variables.

PROBLEM 3:
---------------

Use the data from Problem 8.2 in the Everitt textbook, but use it for a multivariate regression.
Treat the "fasting" variables y1, y2, y3 as the response variables and the "sugar" variables x1, x2, x3 as the predictor variables. We wish to predict a person's fasting measurements based on their sugar measurements.

(a) Fit the multivariate regression model; that is, use R to find the least-squares estimate of the Beta matrix of regression coefficients. Based on this matrix, assess the nature of the relationship between the fasting measurements and the sugar measurements.

(b) For a person with sugar measurements x1 = 100, x2 = 105, x3 = 95, find a point prediction of her fasting measurements y1, y2, y3, based on your estimated regression model.

==============================================================================

- The "Blood Glucose" data from Problem 8.2 are given on the course web page. The following R code will read in the data:

bloodg <- read.table("http://people.stat.sc.edu/hitchcock/bloodglucose.txt", header=TRUE)
fast.meas <- bloodg[,1:3]
sugar.meas <- bloodg[,4:6]
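
- As a rough template for the three kinds of analysis in this assignment, the sketch below runs a two-sample Hotelling's T^2 test, a one-way MANOVA with Wilks' Lambda, and a multivariate regression on SIMULATED data. It is only one possible way to set things up, not a prescribed solution. It does not read the course data sets, so its numbers say nothing about the homework answers. All object names in it (group1, ses.sim, scores.sim, sim, and so on) are made up for illustration. T^2 is computed directly from its definition using base R rather than with a package function.

```r
# Sketch on SIMULATED data (not the course data sets).
# All object names below are hypothetical, chosen only for illustration.
set.seed(530)

## 1. Two-sample Hotelling's T^2, computed from its definition
n1 <- 30; n2 <- 35; p <- 2
group1 <- matrix(rnorm(n1 * p, mean = 50, sd = 10), ncol = p)
group2 <- matrix(rnorm(n2 * p, mean = 55, sd = 10), ncol = p)
mean.diff <- colMeans(group1) - colMeans(group2)
# Pooled covariance matrix (assumes a common covariance matrix in both groups)
S.pooled <- ((n1 - 1) * cov(group1) + (n2 - 1) * cov(group2)) / (n1 + n2 - 2)
T2 <- as.numeric(t(mean.diff) %*% solve(S.pooled * (1/n1 + 1/n2)) %*% mean.diff)
# Under H0, this rescaled T^2 follows an F distribution with (p, n1 + n2 - p - 1) df
F.T2 <- T2 * (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p)
pval.T2 <- pf(F.T2, df1 = p, df2 = n1 + n2 - p - 1, lower.tail = FALSE)

## 2. One-way MANOVA with Wilks' Lambda
ses.sim <- factor(rep(c("low", "middle", "high"), each = 20))
scores.sim <- matrix(rnorm(60 * 3, mean = 50, sd = 10), ncol = 3,
                     dimnames = list(NULL, c("s1", "s2", "s3")))
fit.manova <- manova(scores.sim ~ ses.sim)
wilks.stats <- summary(fit.manova, test = "Wilks")$stats
pval.wilks <- wilks.stats["ses.sim", "Pr(>F)"]
# Sample mean vector for each group (one row per group)
group.means <- aggregate(scores.sim, by = list(ses = ses.sim), FUN = mean)

## 3. Multivariate regression
sim <- data.frame(x1 = rnorm(60, 100, 10),
                  x2 = rnorm(60, 100, 10),
                  x3 = rnorm(60, 100, 10))
sim$y1 <- 20 + 0.5 * sim$x1 + rnorm(60, sd = 5)
sim$y2 <- 30 + 0.4 * sim$x2 + rnorm(60, sd = 5)
sim$y3 <- 25 + 0.3 * sim$x3 + rnorm(60, sd = 5)
fit.mlm <- lm(cbind(y1, y2, y3) ~ x1 + x2 + x3, data = sim)
Beta.hat <- coef(fit.mlm)  # 4 x 3 matrix: intercept row plus one row per predictor
# The same estimate from the least-squares formula (X'X)^{-1} X'Y
X <- cbind(1, as.matrix(sim[, c("x1", "x2", "x3")]))
Y <- as.matrix(sim[, c("y1", "y2", "y3")])
Beta.formula <- solve(t(X) %*% X, t(X) %*% Y)
# Point prediction for one new observation (1 x 3 matrix: y1, y2, y3)
new.person <- data.frame(x1 = 100, x2 = 105, x3 = 95)
pred <- predict(fit.mlm, newdata = new.person)
```

To adapt the sketch, replace the simulated objects with the data frames read in by the code given in each problem, and check the column names of those data frames before writing the model formulas.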