STAT 704 -- TEST 2 REVIEW SHEET

I. Miscellaneous Regression-Related Topics
  A. Correlation Models
    1. Key difference between regression model and correlation model
    2. Bivariate normal model
    3. Population correlation coefficient rho
    4. Testing whether rho = 0
    5. Large-sample CI for rho
  B. Cautions about Regression
    1. Predicting values into the future
    2. Extrapolation and its associated dangers
    3. Does linear association between Y and X imply causation?
    4. Concerns with simultaneous multiple predictions/inferences
    5. Effect of measurement error in the X variable(s)

II. Multiple Linear Regression (MLR)
  A. General MLR model with k predictors
    1. Interpretations of regression coefficients in the MLR model
    2. Meaning and examples of General Linear Model
    3. General Linear Model in matrix terms
      a. Y vector
      b. X matrix
      c. beta vector
      d. epsilon vector
    4. Fitting the MLR (estimating the betas)
      a. Vector of estimated coefficients
      b. Vector of fitted values
      c. Vector of residuals
      d. Interpretations of estimated regression coefficients
  B. Analysis of Variance in MLR
    1. SSTO, SSR, SSE
    2. Degrees of freedom for each SS
    3. Overall ANOVA F-test
      a. Null and alternative hypotheses
      b. Test statistic value
    4. Coefficient of Multiple Determination R^2
    5. Adjusted R^2
  C. Inference about Individual Regression Coefficients
    1. CI for an individual beta
    2. Test for whether an individual beta = 0
      a. Tests marginal effect of individual predictor
      b. "In the presence of" other predictors in the model
  D. CI for the mean response, E(Y_h)
  E. Prediction Interval for 'new' response value, Y_h(new)
  F. Checking Model Assumptions through Residual Plots
    1. What are the major regression model assumptions?
    2. What values are plotted on the axes of a residual plot?
    3. Checking for model misspecification
    4. Checking for non-constant error variance
    5. Checking for departures from normality
      a. Graphical methods
      b. Formal tests
  G. Transformations of Variables
    1. Purpose of transformations
      a. Transformations of X variable(s)
      b. Transformations of Y
      c. Transformations of both
    2. Which types of transformations alleviate which violations?
    3. Reverse-transformations back to units of original variable(s)
  H. Extra SS and F-tests
    1. Behavior of SSE as predictors are added to the model
    2. Reduced model vs. Full model
    3. Testing whether some (but not all) predictors can be dropped
      a. Null and alternative hypotheses
      b. Test statistic value

III. Advanced Considerations in Regression
  A. Multicollinearity
    1. What is multicollinearity?
    2. Common problems caused by multicollinearity
    3. Detecting multicollinearity with VIFs
    4. Possible remedies for multicollinearity
  B. Polynomial Regression
    1. Determining whether polynomial regression is needed
    2. Centering predictor variables
    3. Polynomial regression with two predictors
    4. Extrapolation in polynomial regression
  C. Interaction Models
    1. Basic meaning of interaction between two predictors
    2. Interaction plots
    3. F-test for whether interactions are significant
  D. Model Building
    1. Confirmatory vs. Exploratory Observational Studies
    2. Forward Stepwise Regression Method
    3. "All-possible-subsets" approach
    4. Criteria for choosing "best" model
      a. Adjusted R^2
      b. AIC
      c. BIC
      d. C_p criterion
    5. Overall goals in model selection
  E. Model Validation
    1. Data splitting (cross-validation)
      a. Training Set and Validation Set
      b. MSPR
    2. n-fold cross-validation
      a. PRESS statistic and how it is used
  F. Diagnostic Measures
    1. Added-variable (Partial Regression) Plots
    2. Outliers and Influential Cases
      a. (Internally) Studentized Residuals
      b. Leverage Values (Hat diagonal elements)
      c. Cook's Distance
      d. DFFITS
      e. The various rules of thumb
    3. What to do about Outliers/Influential Points
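
Study aid for items II.A.4, II.B, and II.H: the sheet does not name any software, so the following is only a minimal sketch, assuming Python with NumPy. The data are made up for illustration, and the helper names (fit_mlr, anova_summary, extra_ss_f) are hypothetical, not part of the course materials. It fits the MLR in matrix terms, computes the ANOVA quantities, and forms the extra-SS F statistic for dropping one predictor.

```python
import numpy as np

# Made-up toy data (illustrative only): n = 8 cases, k = 2 predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9, 15.2, 15.8])

# Design matrices: a leading column of ones carries the intercept.
X_full = np.column_stack([np.ones_like(x1), x1, x2])   # full model
X_reduced = np.column_stack([np.ones_like(x1), x1])    # x2 dropped


def fit_mlr(X, y):
    """Least-squares fit: returns (b, fitted values, residuals)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ b
    resid = y - fitted
    return b, fitted, resid


def anova_summary(X, y):
    """SSTO, SSR, SSE, overall F, R^2 and adjusted R^2 for a model with intercept."""
    n, p = X.shape              # p = k + 1 parameters, counting the intercept
    _, _, resid = fit_mlr(X, y)
    ssto = float(np.sum((y - y.mean()) ** 2))   # df = n - 1
    sse = float(resid @ resid)                  # df = n - p
    ssr = ssto - sse                            # df = p - 1 = k
    return {
        "ssto": ssto,
        "ssr": ssr,
        "sse": sse,
        "f_overall": (ssr / (p - 1)) / (sse / (n - p)),
        "r2": ssr / ssto,
        "adj_r2": 1.0 - (sse / (n - p)) / (ssto / (n - 1)),
    }


def extra_ss_f(X_full, X_reduced, y):
    """F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]."""
    n = len(y)
    df_f = n - X_full.shape[1]
    df_r = n - X_reduced.shape[1]
    sse_f = anova_summary(X_full, y)["sse"]
    sse_r = anova_summary(X_reduced, y)["sse"]
    return ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)


b, fitted, resid = fit_mlr(X_full, y)
summary = anova_summary(X_full, y)
f_drop_x2 = extra_ss_f(X_full, X_reduced, y)

print("estimated coefficients:", b)
print("ANOVA summary:", summary)
print("extra-SS F for dropping x2:", f_drop_x2)
```

To turn either F statistic into a decision, compare it with an F critical value on the matching degrees of freedom: (k, n - p) for the overall test and (df_R - df_F, df_F) for the extra-SS test.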