STAT 530, Fall 2016
--------------------

Homework 4
-----------

IMPORTANT NOTE: For EACH of these problems, also write several sentences explaining in words what substantive conclusions about the data you can draw from the plots and/or analyses.

ALWAYS MAKE AN ATTEMPT TO INTERPRET THE FACTORS! Sometimes this works better than other times...

NOTE: The "school subjects" correlation matrix, the "pain" correlation matrix and the Foodstuff Contents data set are given on the course web page.

### Problem 1:

Do problem 4.6 in the Everitt textbook. You don't have to "plot the derived loadings"; however, do a factor analysis with a varimax rotation and compare your rotated loadings to the loadings given in the book in Table 4.12.

### Problem 2:

Do 4.7(b,c) in the Everitt textbook. For 4.7(c), just do an orthogonal rotation, not an oblique rotation.

### Problem 3:

Do a factor analysis on the Foodstuff Contents data set. Use a rotation, if appropriate. Discuss your choice of the number of factors. Calculate factor scores for the individual items, plot the factor scores using appropriate plot(s), and discuss your findings.

*The "Contents of Foodstuffs" data set (in Table 3.6) is given on the course web page. Full descriptions of the observation names are on p. 63 of the book.

This R code will read in the data:

    food.full <- read.table("http://www.stat.sc.edu/~hitchcock/foodstuffs.txt", header=TRUE)
    food.labels <- as.character(food.full[,1])
    food.data <- food.full[,-1]

NOTE: For Problem 3, if you use the 'factanal' function to perform the factor analysis on the Foodstuffs data set, it will not allow you to choose 3 or more factors for a data set with only 5 variables. In this case (for the purposes of this HW) it is OK to choose the highest number of factors that the 'factanal' function will allow, even if the chi-square test indicates this number of factors is not quite sufficient.

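The limit on the number of factors mentioned in the note for Problem 3 comes from the degrees of freedom of the factor model: with p variables and k factors, df = ((p - k)^2 - (p + k)) / 2, and 'factanal' stops with an error when this is negative. The short sketch below just evaluates that formula for p = 5. The helper names 'factor_model_df' and 'max_factors' are made up for illustration and are not part of R or of the course code.

```r
# Degrees of freedom of a k-factor model fitted to p variables:
#   df = ((p - k)^2 - (p + k)) / 2
# (hypothetical helper, for illustration only)
factor_model_df <- function(p, k) {
  ((p - k)^2 - (p + k)) / 2
}

# Largest number of factors with non-negative degrees of freedom
# (hypothetical helper, for illustration only)
max_factors <- function(p) {
  k <- seq_len(p)
  max(k[factor_model_df(p, k) >= 0])
}

p <- 5   # the Foodstuffs data set has 5 variables
k <- 1:3

# Table of degrees of freedom for 1, 2 and 3 factors
print(data.frame(factors = k, df = factor_model_df(p, k)))

# Highest number of factors the formula permits for 5 variables
print(max_factors(p))   # 2
```

For 5 variables the degrees of freedom are 5, 1 and -2 for one, two and three factors, so two factors is the most that can be fitted.
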
### Problem 4:

### THIS 4th PROBLEM IS MANDATORY FOR GRADUATE STUDENTS BUT OPTIONAL (EXTRA CREDIT) FOR UNDERGRADS.

Do 4.4 in the Everitt book. Attempt to interpret the factors, if possible, and make plots of the factor scores.

The following R code should create the two data sets (for men and for women) needed to do the problem.

    life.df.full <- read.table("http://www.stat.sc.edu/~hitchcock/lifeex.txt", header=TRUE)
    country.names <- life.df.full[,1]

    life.df.men <- life.df.full[,2:5]
    row.names(life.df.men) <- country.names

    life.df.women <- life.df.full[,6:9]
    row.names(life.df.women) <- country.names