STAT 705

SPRING 2023


Data Analysis II

Instructor: Yen-Yi Ho
Office: LeConte 216A

Class Meetings:

Monday/Wednesday 3:55 PM - 5:10 PM  in LeConte 206

Teaching Assistant: Anderson Bussing (ABUSSING@email.sc.edu)


Dr. Ho's Office Hours: Monday/Wednesday 11 AM - 12 PM or by appointment

Email: hoyen@stat.sc.edu

TA Office Hours: Tuesday 3 PM / Friday 10 AM or by appointment (meet in LeConte or Thomas Cooper)


Textbook:

1.    (KNN) Kutner MH, Nachtsheim CJ, Neter J, and Li W. (2005) Applied Linear Statistical Models. 5th Edition. McGraw-Hill/Irwin.
2.    (RB) Rosner B. (2000) Fundamentals of Biostatistics. 5th Edition. Duxbury.
3.    (HL) Hosmer DW and Lemeshow S. (2013) Applied Logistic Regression. 3rd Edition. Wiley and Sons.
4.    (AG) Agresti A. (2013) Categorical Data Analysis. 3rd Edition. Wiley and Sons.

5.    James G, Witten D, Hastie T, Tibshirani R. (2015) An Introduction to Statistical Learning
6.    John Verzani's SimpleR notes.   https://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf

7.     Casella and Berger. Statistical Inference (2nd edition). Textbook used for STAT 712.

Supplementary Books (NOT required):

8. "Linear Models with R" by Faraway, J.J.
9. "Extending the Linear Model with R" by Faraway, J.J.
10. "An R and S-plus Companion to Applied Regression" by Fox, J.
11. "Statistical Analysis and Data Display" by Heiberger and Holland.
12. "Statistical Research Methods in the Life Sciences" by Rao, P.V.
13.  "Generalized Linear Models" by McCullagh P. and Nelder J.A.
 

Resources:

1. This is Statistics.org

2. R CRAN website

3. Bioconductor

4. What Kind of Statistician Could You Be?

5. Maindonald's note on introductory statistics in R

6. Faraway's note on regression and ANOVA in R

7. Paradis' R for beginners

8. Burns' guide for the unwilling R user

9. Steve Schachterle's R Code

Review Material:

Prerequisites are successful completion of STAT 704 and STAT 712.

Approximate course outline: (Lecture notes will be updated often)

Acknowledgement: The contents of this course (STAT 705) have been developed with contributions from colleagues Drs. David Hitchcock, John Grego, and Timothy Hanson. Some of my slides were shamelessly borrowed from Dr. Timothy Hanson.

Date / Weekly topic / Homework / R code / SAS Code / Reading
Week Jan 9
Syllabus   View Schedule

Lecture 1: Test for Binomial Proportions
1. Binomial Proportions
2. Bayesian analysis of two proportions

Homework: Homework 1, Homework template
R code: BinomPost.R
Reading: RB Ch7, RB Ch10

Week Jan 16

Lecture 2: RR and OR
1. Relative risk, odds ratio

Lecture 3: Delta method for confidence interval

Lecture 4: Contingency Table: Fisher exact test

Lecture 5: Chi-squared test

Homework: Homework 1 Due (1/28), Homework 2
task1.csv
Reading: RB Ch13

Week Jan 23

Lecture 6: Case-Control Method

Lecture 7: Introduction to Logistic Regression
1. Bernoulli distribution
2. Logistic model
3. Interpretation of logistic coefficients
4. Connection to 2x2 table
5. Diagnostics

R code: Logistic.R, Logistic2.R
Reading: HL Ch1

Week Jan 30

Lecture 8: Statistical Inference of Logistic Regression
1. Likelihood function
2. Maximum likelihood estimation by IRLS

Homework: Homework 2 Due (2/18), Homework 3
R code: logisticIRLS.R, MyNewton2.R
Reading: HL Ch2, Ch3

Week Feb 6

Lecture 9: Classification using Logistic Regression
1. ROC curves
2. Cross-validated errors
3. Bootstrapping for error assessment

R code: ROC.R
Reading: HL Ch5

Week Feb 13

Lecture 10: Machine Learning Algorithms for Classification
1. Regression Models
2. K-Nearest Neighbors
3. Naive Bayes
4. Discriminant Analysis

Homework: Homework 3 Due (3/3), Homework 4
R code: case-study-2-or-7.R, ml-smoothing.R
Reading: HL Ch7, AG Ch10

Week Feb 20

Lecture 11: Machine Learning Algorithms for Classification II
1. Tree-Based Approaches
2. Ensemble

Lecture 12: Conditional Logistic Regression
1. Model
2. Conditional Likelihood
3. Application to matched case-control studies

Handwriting1, HandwritingC1, HandwritingC2, Handwriting3

Lecture 13: Multinomial Logistic Regression

Handwriting1

R code: trees.R, Clogit.R, Multinomial.R, Alligator.R

Week Feb 27

Lecture 14: Log-Linear Regression for Count Data
1. Poisson model
2. Log-Linear Regression
3. Interpretation of Coefficients

R code: CountData.R, AcheHunting.R
Reading: AG Ch8

Week March 6

Spring Break: No Class

Week March 13

Midterm Exam (3/22) in class (covers Lecture 1 to Lecture 12)

Lecture 15: Log-Linear Regression for Count Data (Cont.)

Handwriting1, Handwriting2, Handwriting3, Handwriting4, Handwriting5

Midterm: 3/16 at 2:20 PM in Coliseum 3003

Homework: Homework 4 Due (4/3), Homework 5

Week March 20

Lecture 16: Fixed vs. Random effects models

Homework: Homework 5 Due (4/14)
Reading: KNN Ch25

Week March 27

Lecture 17: Non-Linear Regression Models

Reading: KNN Ch27, HL Ch7, AG Ch12

Week April 3

Lecture 18: Models for Correlated Data

Week April 10

Lecture 19: Models for Correlated Data II

Week April 17

Lecture 20: GAM

LocalLikelihood
BikeShare

Week April 24


Final Project Due Friday April 28 @ 5 PM

Final Take Home Exam

Hint