STAT 540 Homework 6

1. Write a SAS program to read in the data file beatles_songlengths.txt given on the course web page. The variables are the name of the album and the lengths, given in minutes and seconds, of each of the 14 songs on the album. (Note that two of these albums have only 13 songs.) Name the variables album, min1, sec1, min2, sec2, and so on. Here is PART of the DATA step that will read in these variables correctly:

    FILENAME webpage URL 'http://people.stat.sc.edu/hitchcock/beatles_songlengths.txt';
    INFILE webpage;
    INPUT album $ 1-38 min1 sec1 min2 sec2 min3 sec3 min4 sec4 min5 sec5
          min6 sec6 min7 sec7 min8 sec8 min9 sec9 min10 sec10
          min11 sec11 min12 sec12 min13 sec13 min14 sec14;

Use shortcut references for the variables to do the following:

Using arrays, create another set of variables songtime1,...,songtime14 giving the song lengths in minutes (including decimal places). (e.g.: 2 min, 30 sec = 2.5 minutes). (HINT: Use three different ARRAY statements, one for the minutes variables, one for the seconds variables, and one for the new songtime variables. The syntax for each ARRAY statement should be basically the same; only the array name and the variables' names will be different.)

Have SAS calculate (and store as a variable) the mean song length for each album.

Finally, drop the original minutes and seconds variables and print the remaining data set using PROC PRINT.

2. Read in the data set nfl_season_data.txt which is on the course web page under the link "NFL players data (very long)" and under Data Files for Homework as "nfl_season_data.txt". Feel free to use the code in the "automatic variables" example to read in the data.

Use PROC MEANS to calculate the 5-number summary (min, Q1, median, Q3, max) of the variable PassYD (ONLY for the players whose position is "qb"). (Hint: You can use the WHERE statement to pick out the appropriate values of "position".) This should be done SEPARATELY for each value of "team".

3. Look at the cake_data_duncan.txt (Duncan Hines brand) and cake_data_betty.txt (Betty Crocker brand) files on the course web page. Read these in as separate data sets (the variables in each file are Flavor and Height). HINT: You can use this INPUT statement as part of each DATA step when creating each of the two data sets:

    INPUT Flavor $ 1-12 +1 Height;

(a) When reading in the Duncan Hines data set, create a variable called Brand that has the value "Duncan" for every observation in the Duncan Hines data set. When reading in the Betty Crocker data set, create a variable called Brand that has the value "Betty" for every observation in the Betty Crocker data set.

(b) Stack these two data sets with a SET statement, and print the new (combined) data set.

(c) Merge the two data sets by flavor, and print the new (combined) data set. Be sure the merged data set indicates which cake brand goes with which height.

(d) Use PROC MEANS to calculate the mean and standard deviation of heights, separately for each flavor, writing the summary statistics to an output data set. The PROC MEANS should be done on the STACKED data set that you created in part (b). Merge these summary statistics with the stacked data set created in part (b) and print the resulting data set. The resulting data set should be ordered or printed by flavor and should include, at least, columns for the brand, the height, the mean height (for each flavor), and the standard deviation of height (for each flavor).

4. (a) Write a SAS macro that will calculate any descriptive statistic that is an option in PROC MEANS, based on the first k observations of any specified numeric variable in an arbitrary SAS data set. The statistic, the value of k, the variable name, and the name of the SAS data set should be macro parameters.

(b) Invoke the macro to calculate the SUM of the heights of the first 4 observations in the original Duncan Hines SAS data set that you created in Problem 3(a).
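As a reminder of the general shape of a SAS macro with keyword parameters (relevant to Problem 4), here is a minimal sketch. The macro name showfirst, its parameters dsn and n, and the use of the sashelp.class sample data set are illustrative assumptions and are not part of the assignment. The sketch only prints rows; it does not compute a statistic.

```sas
/* Hypothetical macro: print the first &n observations of data set &dsn */
%MACRO showfirst(dsn=, n=);
  /* The OBS= data set option limits how many observations are read */
  PROC PRINT DATA=&dsn (OBS=&n);
  RUN;
%MEND showfirst;

/* Invoke the macro on a sample data set that ships with SAS */
%showfirst(dsn=sashelp.class, n=3);
```

Each parameter is referenced inside the macro body with a leading ampersand, and SAS substitutes the supplied value as text before the step runs.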
*****************************************************************************

NOTE: You MUST intersperse comments in your SAS program to explain in detail what your SAS statements are supposed to be doing. Please be generous with your comments, since you will be graded not only on the correctness of the code, but the clarity and amount of comments.

Remember that SAS comments are typed as:

    /* comment */
or
    *comment;

DO NOT put R-style comments (with #) in your SAS code!!!

You can upload your .txt file with your SAS code and comments into Blackboard as usual. As with the other homeworks, to ensure that the grader will be able to grade your submission:

* Please save your document as a plain text (.txt) file before uploading.
* Please ensure that EVERY line in your submitted text file EXCEPT your SAS code is placed in "comments". That way, the grader will be able to copy everything in the file into SAS and have the program run without a problem.
* Please include detailed comments explaining your code. You will be graded on the quality of your comments, as well as the correctness of your code.
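
To illustrate the two comment styles and the rule that everything except SAS code must be inside comments, here is a short sketch of how a submitted file might be laid out. The header text, the data set name example, and the variable x are made up for illustration and have nothing to do with the homework problems.

```sas
/* Hypothetical header: Your Name, STAT 540, Homework 6 */

/* Block-style comment: create a small data set with one numeric variable */
DATA example;
  INPUT x;
  DATALINES;
1
2
3
;
RUN;

* Statement-style comment: print the data set created above;
PROC PRINT DATA=example;
RUN;
```

Note that a statement-style comment ends at the first semicolon, so its text should not contain a semicolon itself; use the /* ... */ form for longer explanations.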