STAT 542 Homework 3:

Problem 2 in the Chapter 4 Exercises (in Section 4.4 of the book).
[NOTE: The data are in the 'babynames' package.]

Problem 3 in the Chapter 4 Exercises (in Section 4.4 of the book).

Problem 6 in the Chapter 4 Exercises (in Section 4.4 of the book).

Problem 11 in the Chapter 4 Exercises (in Section 4.4 of the book). For this problem, see the hint below, and in order to examine the pattern as the problem asks, use some tools from Chapter 3 to do a scatterplot of median violation score against number of violations, for zip codes with at least 50 violations. Include this scatterplot in your answer.
[NOTE: You will have to install and load the 'mdsr' package. The data are in the 'Violations' data frame in the 'mdsr' package.]
[HINT: It may be easiest to do the following:
- Choose the rows corresponding to actual violations using the code: !is.na(violation_code)
- Create a summary statistic data frame with number of violations per zip code, and median violation score per zip code (you can use the argument na.rm=T in the 'median' function just to be sure R is not including any missing score values)
- Pick the rows from the summary statistic data frame corresponding to zip codes with at least 50 violations.]

Problem 13 in the Chapter 4 Exercises (in Section 4.4 of the book).
[NOTE: You will have to install and load the 'Lahman' package. The data are in the 'Teams' data frame in the 'Lahman' package.]

Problem 14 in the Chapter 4 Exercises (in Section 4.4 of the book). For this problem, get the sample sizes for each 'tailnum' value in the data set and arrange them to determine the tailnum (not including a tailnum that is 'missing') that has the most flights. For the plot requested, plot the number of trips per MONTH in 2013 for this plane, not per week. That is, do a line plot that shows this plane's number of trips per month on the y-axis and month number on the x-axis.
(You'll have to pick the rows from the flights data table corresponding to this plane and then group these a certain way to get the values needed for the plot.)
[NOTE: You will have to install and load the 'nycflights13' package. The data are in the 'flights' data frame in the 'nycflights13' package.]

Problem 2 in the Chapter 5 Exercises.
[NOTE: Recall that the 'Master' data table referred to in the problem is now called 'People'.]
[HINT: For part (c), recall that Batting Average is defined as hits divided by at-bats, that is: H/AB]
[HINT: Think about which parts involve career statistics and which part(s) involve seasonal statistics, and thus about which part(s) require grouping and which part(s) don't require grouping.]

Problem 4 in the Chapter 5 Exercises.
[HINT: For part (b), consider the n_distinct function.]

NOTE about format: For this homework, please submit TWO files to Blackboard:
- The first file should be a Word document or PDF with the answers to the questions, in the form of any graphs/plots requested and any written answers or interpretations for those problems that call for writing.
- The second file should be a PLAIN TEXT file (.txt file) with the code that you used to create the plots, etc. for the problems. Any lines in this text file that are NOT code should start with # so that they will be treated as comments and not executed.
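
As a rough illustration of the layout of the plain-text code file, here is a minimal sketch. The data frame 'toy' and the object 'med_by_group' are made up for illustration and are not part of any assigned problem; the point is only that every non-code line begins with # and that na.rm = TRUE makes 'median' skip missing values.

```r
# STAT 542 Homework 3 code file (layout sketch only)
# Every line that is NOT code starts with #, so R ignores it when the file is run.

# Illustration (hypothetical toy data, not from any assigned problem)
toy <- data.frame(group = c("a", "a", "b"),
                  score = c(10, NA, 7))

# Median score per group; na.rm = TRUE drops the missing score before computing
med_by_group <- tapply(toy$score, toy$group, median, na.rm = TRUE)

# Show the per-group medians
print(med_by_group)
```

A file laid out this way can be run from top to bottom without editing, since R treats the problem labels and remarks as comments.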