STAT 704

FALL 2022


Data Analysis I

Instructor: Yen-Yi Ho

Office: LeConte 216A

Class Meetings:

Monday/Wednesday 2:20-3:35PM in LeConte 206


Dr. Ho's Office Hours: Monday/Wednesday after class or by appointment (can be arranged virtually or in person)

Email: hoyen@stat.sc.edu


Textbook:

1. Kutner, Nachtsheim, Neter and Li. Applied Linear Statistical Models (5th edition). (This book is out of print but is available online, including in the "international version", which is OK. It is rather expensive new, but it is a very nice book and will also be used for STAT 705 in the spring.)

2. John Verzani's SimpleR notes. https://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf

3. Casella and Berger. Statistical Inference (2nd edition). Textbook used for STAT 712.

Supplementary Books (NOT required):

4. "Linear Models with R" by Faraway, J.J.
5. "Extending the Linear Model with R" by Faraway, J.J.
6. "An R and S-plus Companion to Applied Regression" by Fox, J.
7. "Statistical Analysis and Data Display" by Heiberger and Holland.
8. "Statistical Research Methods in the Life Sciences" by Rao, P.V.

 

Resources:

1. This is Statistics.org

2. R CRAN website

3. Bioconductor

4. What Kind of Statistician Could You Be?

5. Maindonald's note on introductory statistics in R

6. Faraway's note on regression and ANOVA in R

7. Paradis' R for beginners

8. Burns' guide for the unwilling S user

9. Steve Schachterle's R Code

Review Material:

STAT 704 has a co-requisite of STAT 712. If you register for STAT 704 for Fall 2022, please be sure that you are taking STAT 712 in Fall 2022 (or that you have previously taken STAT 712). If you wish to take an applied statistics sequence without taking STAT 712, please consider STAT 700-701 (or STAT J700-J701), which is designed for graduate students from departments other than statistics.

Approximate course outline: (Lecture notes will be updated often)

Acknowledgement: The contents of this course (STAT 704) have been developed with contributions from colleagues Dr. David Hitchcock, Dr. John Grego and Dr. Timothy Hanson. Most of my slides were shamelessly borrowed from Dr. Timothy Hanson.

Date
Weekly topic
Homework
R code
SAS Code
Reading
Week Aug 22
Syllabus  View Schedule

Lecture 1: Probability Review (Appendix A.3, A.4)




Homework 1



John Verzani's SimpleR note

Lab1.R

Homework Template

R Quiz

R Quiz Solution





Week Aug 29



Lecture 1: Probability Review (Appendix A.3, A.4)






Homework 1 Due (Sep 2 at 5PM)

Homework 2





Lab2.R


More Topics in R (R Code)

Bootstrap Note
 


KNN Appendix A.3, A.4



Week Sep 5

Sep 5 Labor Day: No Class

Lecture 2: Introduction to R
 
More Topics in R (handout)

Lecture 3: Two-sample Test


Sep 10 (Saturday): Make-Up Class 1
10AM-11:15AM
LeConte 205

A gentle introduction to the bootstrap

Bias-corrected bootstrap intervals




Homework 2 Due
(Sep 9 at 5PM)

Homework 3

Type I error rate




Lab3.R

KNN Ch1, Ch2

Week Sep 12


Sep 12-14: NO CLASS (Yen-Yi NIH meeting)


Sep 17 (Saturday): Make-Up Class 2
10AM-11:15AM
LeConte 205 














SAS (Summer Temp Data)

Week Sep 19

Lecture 3: Two-sample Test


Lecture 4: Non-parametric Test, Permutation Test


Homework 3 Due (Sep 23 at 5PM)
Homework 4





Lab4.R


SAS (Mice Data)

SAS (Pollution Data)

Week Sep 26





Lecture 5: Power and Sample Size















Lab5.R


Week Oct 03



Lecture 6: Likelihood


Lecture 7: Simple linear regression:
Specification of Model, Least Squares and ML Estimates, Inference for Coefficients










Homework 4 Due (Oct 7 at 5PM)
Homework 5









KNN Ch2

KNN Ch16


Week Oct 10




Lecture 8: Simple linear regression
ANOVA Table,  Coefficient of Determination










Homework 5 Due (Oct 14 at 5PM)

Homework 6



Lab6.R

Toluca Data





KNN Ch6
Week Oct 17


Lecture 9: Multiple Regression Model (Chap 6): Basic Tools for Building Regression Models



Oct 19: Midterm Exam




Midterm Review Note







IGROWUP



KNN Ch5

Week Oct 24



Lecture 10: Matrix Approach to Linear Regression (Chap 5, 6)












Lab7.R







KNN Ch6
Week Oct 31


Lecture 11: Added Variable Approach & Multiple Regression Transformation

Lecture 12: Weighted Least Squares

Paper about R squared in WLS

Paper about Smearing Estimate




Homework 6 Due
(October 28 at 5PM)
Homework 7




Lab8.R


KNN 3.9 & 6.8
KNN 10
Week Nov 07


Lecture 13: Model Checking and Fix-Ups 



Homework 7 Due
(Nov 4 at 5PM)

Homework 8


Lab9.R


Lab10.R


KNN 10
KNN 7.5, 7.6
Week Nov 14



Lecture 14: Model Checking and Fix-Ups II






Homework 8 Due
(Nov 18 at 5PM)


Final Project





KNN 14
Week Nov 21

Lecture 15: Confounding effect


Lecture 16: Multiple Comparisons

JAMA paper about confounding effect


Nov 23 Thanksgiving: No Class








Lab11.R


Lab12.R


KNN 9
Week Nov 28

Model Selection Note

Multivariate Normal Distribution Note

Working with High-Performance Computing


Linux Command
Slurm Script
Slurm link
Unix Beginners


Final Project Due (December 2 at 5PM)



Lab13.R





KNN 14
Dec 9
 
Final Exam Dec 9 (Friday) at 12:30PM