/* SAS Example of the Single-Factor ANOVA model */

/* We will analyze the Kenton Foods data from the example in class */
/* The response variable is sales and the factor is package design. */
/* The store label is also given in the data set. */

DATA kenton;
INPUT SALES DESIGN STORE;
cards;
11 1 1
17 1 2
16 1 3
14 1 4
15 1 5
12 2 1
10 2 2
15 2 3
19 2 4
11 2 5
23 3 1
20 3 2
18 3 3
17 3 4
27 4 1
33 4 2
22 4 3
26 4 4
28 4 5
;
run;

/* PROC GLM will do a standard analysis of variance */
/* We specify that DESIGN is a (qualitative) factor with a CLASS statement */
/* The MODEL statement specifies that SALES is the response */
/* and DESIGN is the factor. */

PROC GLM DATA = kenton;
CLASS DESIGN;
MODEL SALES = DESIGN;
LSMEANS DESIGN;
run;

/* The ANOVA table is given for us. Pick out SSTR, SSE, SSTO, MSTR, and MSE. */
/* Notice the F* for the test for equality of factor level means is 18.59. */
/* The associated P-value for this test is less than 0.0001. */

/* The LSMEANS statement gives us the sample mean response for each of the */
/* four designs. Compare the output to the table on page 688 of the book. */

/* *********************************************************************************** */

/* Investigating differences among treatment means through plots and inferences */

/* Main effects plot */
/* Note 18.63 is the overall sample mean response (Y-bar-dot-dot) */
/* found on the main PROC GLM output page. */

/* The CL option to the LSMEANS statement produces (here, 95%) confidence */
/* intervals for each population factor level mean. */
/* Note the CI for the mean sales for package design 1: (11.5, 17.7). */

/* PDIFF, used together with CL, gives CIs for the difference between any two */
/* factor level means. */
/* It also gives P-values for the test of equality of any two factor level means. */

PROC GLM DATA = kenton;
CLASS DESIGN;
MODEL SALES = DESIGN;
LSMEANS DESIGN / CL ALPHA = 0.05 PDIFF;
OUTPUT OUT=smpmeans p=YBAR r=resid;
run;

symbol1 i = join v=circle l=32 c = black;

PROC GPLOT data=smpmeans;
PLOT YBAR*DESIGN/vref=18.63;
run;

/* ***************************************************************************** */

/* Contrasts: CIs and Hypothesis Tests */

/* Example: We want a 95% CI for the difference in the mean sales of the */
/* cartoon designs and the mean sales of the non-cartoon designs */

/* The relevant contrast here is: (1/2)mu_1 - (1/2)mu_2 + (1/2)mu_3 - (1/2)mu_4 */

/* The ESTIMATE statement defines the coefficients of the contrast (these must */
/* be in the proper order!) and gives the test statistic and P-value of the test */
/* for whether the contrast equals zero. */
/* The CLPARM option to the MODEL statement tells SAS to give a CI (by default, */
/* a 95% CI) for the contrast. */

PROC GLM DATA = kenton;
CLASS DESIGN;
MODEL SALES = DESIGN / CLPARM;
LSMEANS DESIGN;
ESTIMATE 'CartoonVsNoncartoon' DESIGN 1 -1 1 -1 / divisor=2;
RUN;

/* ***************************************************************************** */

/* Multiple Comparison Procedures */

/* In the MEANS statement, the CLDIFF option gives CIs for all pairwise treatment */
/* mean differences, based on the Tukey procedure. The ALPHA=0.10 ensures that */
/* the family confidence level is 90%. SAS also provides an indication of which */
/* pairs of treatment means are judged to be significantly different, at the */
/* 0.10 family significance level, by the Tukey procedure. */

PROC GLM DATA = kenton;
CLASS DESIGN;
MODEL SALES = DESIGN;
LSMEANS DESIGN;
MEANS DESIGN / TUKEY ALPHA=0.10 CLDIFF; /* Produces Tukey CIs and testing results */
run;

/* We could change TUKEY to SCHEFFE or BON to get the Scheffe or Bonferroni results, */
/* but if we're interested in all pairwise comparisons, these will not be as */
/* efficient as the Tukey procedure. */

/* ***************************************************************************** */

/* Residual Plots to Check ANOVA model assumptions */

/* The following code produces some residual plots. */
/* The residuals are plotted against the fitted values */
/* and the normal Q-Q plot of the residuals is produced. */

/* The MEANS statement with the option HOVTEST=BF produces */
/* the output for the Brown-Forsythe test for unequal variances. */

PROC GLM data = kenton;
CLASS DESIGN;
MODEL SALES = DESIGN;
LSMEANS DESIGN;
MEANS DESIGN / HOVTEST=BF;
OUTPUT OUT=diagnost p=ybar r=resid;
run;

PROC GPLOT data=diagnost;
PLOT resid*ybar/vref=0;
run;

/* DATA= is given explicitly so the Q-Q plot does not depend on which */
/* data set happened to be created last. */
PROC UNIVARIATE data=diagnost noprint;
QQPLOT resid / normal;
run;

/* Note that according to the Brown-Forsythe test (P-value = 0.8659) we would */
/* FAIL TO REJECT the null hypothesis that all variances are equal. */
/* So the equal-variance assumption seems reasonable for these data. */
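
/* ***************************************************************************** */

/* Optional cross-check (a sketch, not part of the original class example) */

/* The CL option to LSMEANS reports a 95% CI for each factor level mean. */
/* The same kind of interval can be rebuilt by hand as */
/*    Ybar_i +/- t(0.975; n_T - r) * sqrt(MSE / n_i), */
/* where MSE = SSE / (n_T - r) pools the within-design sums of squares. */
/* The data set names bydesign and ci_check and the macro variables mse and dfe */
/* are hypothetical names chosen for this illustration only. */
/* Assumes the kenton data set created at the top of this program exists. */

PROC SQL;
   /* Per-design sample size, sample mean, and within-design sum of squares */
   CREATE TABLE bydesign AS
   SELECT DESIGN,
          COUNT(SALES) AS n,
          MEAN(SALES)  AS ybar,
          CSS(SALES)   AS ss_within
   FROM kenton
   GROUP BY DESIGN;
QUIT;

PROC SQL NOPRINT;
   /* Pool across designs: error df = n_T - r, MSE = SSE / (n_T - r). */
   /* Macro variables store rounded text, so expect small rounding differences. */
   SELECT SUM(ss_within) / (SUM(n) - COUNT(*)),
          SUM(n) - COUNT(*)
      INTO :mse, :dfe
   FROM bydesign;
QUIT;

DATA ci_check;
   SET bydesign;
   tcrit = TINV(0.975, &dfe);           /* two-sided 95% t critical value */
   half  = tcrit * SQRT(&mse / n);      /* half-width of the interval */
   lower = ybar - half;
   upper = ybar + half;
RUN;

PROC PRINT DATA = ci_check;
   VAR DESIGN n ybar lower upper;
RUN;

/* The limits for design 1 should agree with the interval noted earlier from */
/* the LSMEANS / CL output, up to rounding. */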