/* SAS example to analyze Nested Design */
/* these are the training-school data from class and Chapter 26 */

DATA trschool;
INPUT SCORE SCHOOL INSTRUCTOR OBS;
cards;
25 1 1 1
29 1 1 2
14 1 2 1
11 1 2 2
11 2 1 1
6 2 1 2
22 2 2 1
18 2 2 2
17 3 1 1
20 3 1 2
5 3 2 1
2 3 2 2
;
run;

/* We run PROC GLM with SCHOOL and INSTRUCTOR as the two factors listed */
/* in the CLASS statement. In the MODEL statement, we indicate that the */
/* levels of INSTRUCTOR are nested within the levels of SCHOOL with the */
/* INSTRUCTOR(SCHOOL) syntax for the second factor. */

PROC GLM data = trschool;
CLASS SCHOOL INSTRUCTOR;
MODEL SCORE = SCHOOL INSTRUCTOR(SCHOOL);
OUTPUT OUT = pred p=ybar r=resid;
run;

/* We see that the 3 schools differ in mean score (F* = 11.18, P-value = .0095) */
/* and that instructors within at least one school have different mean scores */
/* (F* = 27.02, P-value = .0007). */

/* ************************************************************************* */

/* To determine in WHICH schools the instructor effects are significant, we */
/* further decompose the SSB(A) into SSB(A_1), SSB(A_2), and SSB(A_3). */
/* The BY statement requires the data to be sorted by SCHOOL, which is */
/* already the case for the data as entered above. */

PROC GLM data = trschool;
BY SCHOOL;
CLASS INSTRUCTOR;
MODEL SCORE = INSTRUCTOR;
run;

/* Note SAS gives (over 3 pages) the following results: */

/* Atlanta school: MSB(A_1)=210.25 => F* = 210.25/7 = 30.0 (we divide this by hand) */
/* Chicago school: MSB(A_2)=132.25 => F* = 132.25/7 = 18.9 (we divide this by hand) */
/* San Francisco school: MSB(A_3)=225.0 => F* = 225/7 = 32.14 (we divide this by hand) */

/* Note that each of these is divided by the original MSE of 7 (with 6 error */
/* degrees of freedom) from the nested model above, not by the within-school */
/* MSE that the BY-group output reports. */

/* Since F(0.95,1,6) = 5.99, then: At the 0.05 level, instructors within the */
/* Atlanta school have different mean scores; at the 0.05 level, instructors */
/* within the Chicago school have different mean scores; and at the 0.05 */
/* level, instructors within the San Francisco school have different mean */
/* scores. By the Bonferroni inequality (3 tests at 0.05 each), the FAMILY */
/* significance level for this SET of tests is at most 0.15. */

/* ************************************************************************** */

/* Some Residual Plots to Check the Standard Model Assumptions: */

/* Residual Plots and Q-Q plots: */

goptions reset=all;
symbol1 v=circle l=32 c = black;

PROC GPLOT data=pred;
PLOT resid*ybar/vref=0;    /* Residuals Plotted vs. Fitted Values */
PLOT resid*SCHOOL/vref=0;  /* Residuals Plotted for each SCHOOL Level */
run;

PROC UNIVARIATE noprint data=pred;
QQPLOT resid / normal;     /* Normal Q-Q Plot of Residuals */
run;
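
/* ************************************************************************** */

/* A minimal sketch (not part of the original analysis) that carries out the */
/* "by hand" divisions for the within-school instructor tests in a DATA step. */
/* Assumptions: the data set name school_tests and the variable names SSB, */
/* DFB, MSE, DFE, FSTAR, FCRIT, PVALUE, and FCRIT_BON are made up for */
/* illustration. The SSB(A_i) values, the pooled MSE of 7, and the 1 and 6 */
/* degrees of freedom are copied from the results quoted earlier in this */
/* file. FCRIT_BON is an optional extra: the Bonferroni critical value that */
/* would hold the FAMILY level at 0.05 rather than 0.15. */

DATA school_tests;
INPUT SCHOOL SSB;
DFB = 1;                                /* instructors per school minus 1 */
MSE = 7;                                /* pooled MSE from the nested model */
DFE = 6;                                /* error df from the nested model */
FSTAR = (SSB / DFB) / MSE;              /* F* = MSB(A_i) / MSE */
FCRIT = FINV(0.95, DFB, DFE);           /* F(0.95; 1, 6) */
PVALUE = 1 - PROBF(FSTAR, DFB, DFE);    /* upper-tail P-value for F* */
FCRIT_BON = FINV(1 - 0.05/3, DFB, DFE); /* hypothetical Bonferroni cutoff */
cards;
1 210.25
2 132.25
3 225.00
;
run;

PROC PRINT data = school_tests;
run;

/* Each FSTAR can be compared with FCRIT to reproduce the conclusions stated */
/* earlier, or with FCRIT_BON if a family level of 0.05 is wanted instead. */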