/* Data for the logistic regression example */
/* involving the cities' use of TIF */
/* that we studied in class */

/* Entering the data and naming the variables: */
/* TIF is a binary response variable here. */
/* TIF = 1 if a city used TIF. */
/* TIF = 0 if the city did not use TIF. */

DATA citydata;
INPUT TIF $ income;
CARDS;
0 9.2
0 12.9
0 9.2
1 9.6
0 9.3
1 10.1
0 9.4
1 10.3
0 9.5
1 10.9
0 9.5
1 10.9
0 9.5
1 11.1
0 9.6
1 11.1
0 9.7
1 11.1
0 9.7
1 11.5
0 9.8
1 11.8
0 9.8
1 11.9
0 9.9
1 12.1
0 10.5
1 12.2
0 10.5
1 12.5
0 10.9
1 12.6
0 11
1 12.6
0 11.2
1 12.6
0 11.2
1 12.9
0 11.5
1 12.9
0 11.7
1 12.9
0 11.8
1 12.9
0 12.1
1 13.1
0 12.3
1 13.2
0 12.5
1 13.5
;
run;

/* A simple plot of the data set */

goptions reset=all;
PROC SGPLOT DATA=citydata;
SCATTER y=TIF x=income;
run;

/* PROC LOGISTIC will fit a logistic regression model. */
/* The DESCENDING option is important! */
/* It defines mu as P(Y=1) as we did in class, rather than as P(Y=0). */
/* We create an output data set called NEW that contains the predicted probabilities. */

PROC LOGISTIC DESCENDING DATA=citydata PLOTS=EFFECT;
MODEL TIF = INCOME / LACKFIT;
OUTPUT OUT=NEW P=PRED L=LOWER U=UPPER;
RUN;

/* Output: With the LACKFIT option, SAS provides a Hosmer-Lemeshow test for */
/* "H0: the logistic regression fits well". */
/* With a P-value of 0.4603, we cannot reject this null hypothesis. */
/* We have no reason to doubt the logistic model's fit. */
/* So it's fine to use the logistic regression model. */

/* The estimates beta_0-hat and beta_1-hat are -11.347 and 1.002. */
/* Estimated odds ratio = 2.723, and 95% CI for odds ratio is (1.526, 4.858). */

/* Output: SAS provides a likelihood-ratio test of H0: beta_1 = 0. Since the P-value */
/* is very small ( < .0001), we reject H0, conclude beta_1 is not zero, and conclude */
/* that income has a significant effect on the probability a city uses TIF. */

/* PLOTTING THE ESTIMATED LOGISTIC CURVE */

/* The PLOTS=EFFECT option provides a number of plots, including the plot of the */
/* estimated logistic regression curve. */
/* It also plots lower and upper bounds for a 95% CI for the true probability. */
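
/* CHECKING THE FITTED MODEL BY HAND (illustrative sketch, not part of the class example) */

/* Using the rounded estimates quoted above, the fitted curve is */
/*   P-hat(TIF=1) = exp(b0 + b1*income) / (1 + exp(b0 + b1*income)), */
/* and the estimated odds ratio is exp(b1). */
/* The data set name CHECK, its variable names, and the income value 11 are */
/* assumptions chosen only for illustration. */
/* Because the estimates are rounded, results may differ slightly from the SAS output. */

DATA check;
b0 = -11.347;          /* beta_0-hat, copied from the PROC LOGISTIC output */
b1 = 1.002;            /* beta_1-hat, copied from the PROC LOGISTIC output */
income = 11;           /* illustrative income value */
odds_ratio = EXP(b1);  /* odds ratio for a one-unit increase in income */
eta = b0 + b1*income;  /* estimated log-odds of using TIF at this income */
p_hat = EXP(eta) / (1 + EXP(eta));  /* estimated P(TIF=1) at this income */
run;

PROC PRINT DATA=check;
run;

/* For comparison, the predicted probabilities and 95% limits that PROC LOGISTIC */
/* saved in the output data set NEW can be listed directly. */

PROC PRINT DATA=NEW;
VAR TIF income PRED LOWER UPPER;
run;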