STAT 509 -- FINAL EXAM REVIEW SHEET

I. Analysis of Variance (ANOVA) with a Completely Randomized Design
   A. Basic Terms
      1. Designed Experiment vs. Observational Experiment
      2. Response Variable
      3. Factors (Quantitative and Qualitative)
      4. Levels of a Factor
      5. Treatments
      6. Experimental Units
      7. Three Principles of Experimental Design
   B. Completely Randomized Design
      1. Hypothesis Test for whether all treatment means are equal
      2. Comparing variance WITHIN groups to variance BETWEEN groups
      3. SST, SSE, MST, MSE (What do these quantities measure?)
      4. The ANOVA F-statistic: F = MST/MSE
         a. What is its distribution if H_0 is true?
         b. How do we use it to test whether all treatment means are equal?
      5. Summarizing the data with an ANOVA table
      6. Assumptions of ANOVA F-test
      7. Rejection region and proper conclusions for ANOVA F-test
      8. Tukey's Multiple Comparisons Procedure

II. Regression Analysis
   A. Observational Studies vs. Designed Experiments
   B. (Straight-Line) Simple Linear Regression Model
      1. Response (Dependent) Variable Y, Predictor (Independent) Variable X
      2. Y-intercept beta_0, Slope beta_1, random error epsilon
      3. Probabilistic Relationship between Y and X
   C. Fitting the Model with least squares
      1. Determining whether a straight-line model is appropriate
      2. Scatterplot
      3. Least squares philosophy (minimizing SS_res)
      4. The least-squares estimates for beta_0 and beta_1
      5. Interpreting estimated slope and estimated Y-intercept
      6. Using least-squares line to predict Y values for a given X
      7. Extrapolation
   D. Model Assumptions
      1. Mean of random error component = 0
      2. Variance of random error component constant for all values of X
      3. Probability distribution of random error component is normal
      4. Values of random error component for any two Y-values are independent
   E. Estimating the error variance sigma^2
      1. MS_res
      2. sqrt(MS_res) (estimate of sigma)
   F. Testing the Usefulness of the Model
      1. How does testing whether the slope is 0 test the usefulness of the model?
      2. Test statistic, rejection region for test of H_0: beta_1 = 0
      3. Confidence Interval for the true slope beta_1
   G. Correlation and Coefficient of Determination
      1. Linear Correlation
         a. What it measures and how it is interpreted
         b. Type of variables for which correlation can be measured
         c. Linear association vs. Curved association
      2. R-squared and its interpretation
   H. Estimation and Prediction with the Regression Model
      1. CI for the mean of Y at a particular X value
      2. Prediction Interval for a new Y at a particular X value
      3. Which of these intervals is wider?

III. More Regression Analysis
   A. Multiple Linear Regression Model
      1. Difference between Simple and Multiple linear regression
      2. Interpreting regression output from R
      3. Interpreting estimated coefficients in Multiple Linear Regression
      4. Using the fitted least-squares equation to predict Y for a given set of X values
   B. Inference in Multiple Regression
      1. F-test of Overall Regression Relationship
         a. Null and Alternative Hypotheses for this test
         b. Getting Test Conclusion based on R output
         c. Appropriate Conclusion in words for this test
      2. t-test about Individual Coefficient
         a. Null and Alternative Hypotheses for this test
         b. Getting Test Conclusion based on R output
         c. Appropriate Conclusion in words for this test
      3. CI for the mean of Y at a particular set of X values (found using R)
      4. Prediction Interval for a new Y at a particular set of X values (found using R)
   C. Checking Model Assumptions in Regression
      1. Plot of Residuals vs. Fitted values
         a. Which assumptions does it check?
         b. Which patterns correspond to which violations?
      2. Normal Q-Q Plot of Residuals
         a. Which assumption does it check?
         b. Which patterns correspond to a violation?
      3. Remedies for Violations of Assumptions
      4. Which transformations may be useful for which violation?
      5. Interpretations/Predictions with transformed model
      6. Transformations to Remedy Violations in ANOVA F-test
      7. Transformations to Remedy Violations in one-sample and two-sample t-tests

IV. Design of Experiments
   A. 2^2 Factorial Design
      1. Factors and Low/High Levels
      2. Distinct Treatments Possible
      3. (1), a, b, ab Notation for Mean Responses
      4. Formulas for Main Effects in 2^2 Factorial Design
      5. Linear Model and Meaning of Estimated Coefficients
      6. Meaning of Interaction between Two Factors
      7. Checking for Interaction with an Interaction Plot
      8. Formula for Interaction Effect in 2^2 Factorial Design
   B. 2^k Full Factorial Design
      1. Linear Model for 2^k Design
      2. Table of Contrasts and formulas for estimated effects
      3. Estimated Coefficients and Interpreting Output from R
      4. When can we check for significant effects using formal P-values?
      5. Graphical check for significant effects when there is no replication
   C. Fractional Factorial Designs
      1. Advantages of Fractional Factorials
      2. Disadvantages of Fractional Factorials
      3. What it means to have Aliased Effects
      4. Alias Structure and Defining Interaction
      5. Interpreting Output from R
      6. Graphical check for significant effects
   D. Disadvantages of OFAAT approach and Shotgun approach to Design
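
Worked example for the ANOVA items (SST, SSE, MST, MSE, and F = MST/MSE). This is a minimal sketch in Python, not the R workflow the sheet refers to, and the three treatment groups are made-up numbers chosen so the arithmetic is easy to check by hand. The function name one_way_anova is hypothetical.

```python
# One-way ANOVA for a completely randomized design, standard library only.
# The three treatment groups below are made-up numbers for illustration.
groups = {
    "treatment_1": [4.0, 5.0, 6.0],
    "treatment_2": [6.0, 7.0, 8.0],
    "treatment_3": [8.0, 9.0, 10.0],
}


def one_way_anova(groups):
    """Return SST, SSE, MST, MSE, F and the (numerator, denominator) df."""
    all_values = [y for ys in groups.values() for y in ys]
    n = len(all_values)  # total number of observations
    k = len(groups)      # number of treatments
    grand_mean = sum(all_values) / n

    sst = 0.0  # variation BETWEEN treatment means
    sse = 0.0  # variation WITHIN treatments
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        sst += len(ys) * (group_mean - grand_mean) ** 2
        sse += sum((y - group_mean) ** 2 for y in ys)

    mst = sst / (k - 1)  # mean square for treatments
    mse = sse / (n - k)  # mean square for error
    return {
        "SST": sst,
        "SSE": sse,
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
        "df": (k - 1, n - k),
    }


result = one_way_anova(groups)
print(result)
```

With these numbers SST = 24, SSE = 6, MST = 12, MSE = 1, so F = 12 on (2, 6) degrees of freedom. If H_0 is true and the ANOVA assumptions hold, F follows an F distribution with those degrees of freedom, so a large F is evidence against equal treatment means.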
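
Worked example for the simple linear regression items (least-squares estimates, MS_res, R-squared, and the t statistic for H_0: beta_1 = 0). Again a hedged Python sketch with made-up data; the function name fit_line and the dictionary keys are hypothetical, and in the course these values would normally be read from R output.

```python
import math

# Made-up (x, y) data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x with basic summary quantities."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = s_xy / s_xx         # estimated slope
    b0 = y_bar - b1 * x_bar  # estimated Y-intercept

    # Residual sum of squares: the quantity least squares minimizes.
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)

    ms_res = ss_res / (n - 2)            # estimate of sigma^2
    se_b1 = math.sqrt(ms_res / s_xx)     # standard error of the slope
    return {
        "b0": b0,
        "b1": b1,
        "SS_res": ss_res,
        "MS_res": ms_res,
        "sigma_hat": math.sqrt(ms_res),  # estimate of sigma
        "R_squared": 1.0 - ss_res / ss_total,
        "t_slope": b1 / se_b1,           # test statistic for H_0: beta_1 = 0
        "df": n - 2,
    }


fit = fit_line(x, y)
print(fit)

# Predicted Y at X = 3.5, which lies inside the observed X range
# (so this is not extrapolation).
print(fit["b0"] + fit["b1"] * 3.5)
```

For this data the fitted line is approximately y-hat = 2.2 + 0.6 x, with SS_res = 2.4, MS_res = 0.8, and R-squared = 0.6. The t statistic is compared to a t distribution with n - 2 = 3 degrees of freedom.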
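
Worked example for the 2^2 factorial items (main effects, interaction effect, and linear-model coefficients). This sketch assumes the usual convention that (1), a, b, ab are the MEAN responses at the four treatments and that factors are coded -1 (low) and +1 (high); the course notes may write the same formulas in an equivalent form. The mean responses are made-up numbers, and the variable name one stands in for (1).

```python
# Mean responses at the four treatments of a 2^2 design (made-up values).
one = 20.0  # (1): A low,  B low
a = 30.0    # a:   A high, B low
b = 25.0    # b:   A low,  B high
ab = 45.0   # ab:  A high, B high

# Main effect = average response at the high level minus average at the low level.
effect_A = (a + ab) / 2 - (one + b) / 2
effect_B = (b + ab) / 2 - (one + a) / 2

# Interaction effect = half the difference between the effect of A
# at high B and the effect of A at low B.
effect_AB = (ab + one) / 2 - (a + b) / 2

effects = {"A": effect_A, "B": effect_B, "AB": effect_AB}

# With -1/+1 coding, each estimated coefficient in the linear model
# y = beta_0 + beta_A*x_A + beta_B*x_B + beta_AB*x_A*x_B
# is half the corresponding effect, and beta_0 is the overall mean.
coefficients = {
    "intercept": (one + a + b + ab) / 4,
    "A": effect_A / 2,
    "B": effect_B / 2,
    "AB": effect_AB / 2,
}


def predict(x_a, x_b):
    """Fitted mean response at coded factor levels x_a, x_b in {-1, +1}."""
    c = coefficients
    return c["intercept"] + c["A"] * x_a + c["B"] * x_b + c["AB"] * x_a * x_b


print(effects)
print(coefficients)
print(predict(+1, +1))  # reproduces the ab mean response
```

Here the effect of A is 15, the effect of B is 10, and the interaction effect is 5. A nonzero interaction effect means the effect of A depends on the level of B, which is what non-parallel lines in an interaction plot indicate.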