STAT718/BIO703

Fall 2024


Genomic Data Science

Instructor: Yen-Yi Ho

Office: LeConte 216A

Class Meetings:

Monday/Wednesday/Friday 9:40AM - 10:30AM 

Classroom: LeConte 206


Dr. Ho's office hours: Thursday 4:00PM-5:30PM, Friday 2:00PM-3:00PM, or by appointment (LeConte 216A)

Email: hoyen@stat.sc.edu

Teaching Assistant: Kaniz Fatema (KFATEMA@email.sc.edu)

TA office hours: Monday/Wednesday 12:30PM-2PM, Tuesday 11:30AM-1PM, LeConte 111

Textbook:

1. Rafael A. Irizarry. Introduction to Data Science: Data Wrangling and Visualization with R. (Required)

2. Cedric Gondro (2015). Primer to Analysis of Genomic Data Using R. 

3. Wim P. Krijnen. Applied Statistics for Bioinformatics Using R. November 2009.

4. Avril Coghlan. A Little Book of R for Bioinformatics. Release 0.1. Aug 18, 2017.

5. Hahne, Huber, Gentleman, and Falcon (2008). Bioconductor Case Studies.

                

Resources:

1. This is Statistics.org

2. Bioconductor


Announcements:

Approximate course outline: (Lecture notes will be updated often)

Acknowledgement: The contents of this course are based on the book "Introduction to Data Science" by Rafael A. Irizarry, the RNA-seq workflow on the Bioconductor website, and materials from the Harvard Chan Bioinformatics Core.

Date | Weekly topic | Homework | R code | Reading
Week Aug 19
Syllabus

Getting Started with R and R Studio


R Markdown (Chapter 20.2 in Irizarry)

Homework 1


Homework Template


R Markdown cheat sheet





Chapter 1 (Irizarry)

Chapter 20 (Irizarry)


Chapter 2 (Irizarry)
Week Aug 26
Chapter 2: R Basics

Very Basics
Data Types
Vectors
Coercion
Not availables (NA)
Sorting
Vector Arithmetics
Indexing
Basic plots





 




Homework 1 Due
(Aug 30 before class)

Homework 2





R-Basics1.R



Link for Requesting HPC account
(Choose Research Computing Account creation)



Chapter 3 (Irizarry)
Week Sep 02

Sep 02: Labor Day

R Basics 2

  Getting Data Into R
  Getting Results Out of R

Chapter 3: Programming Basics
Conditional Expressions
Defining Functions
Namespaces
For-loops
Vectorized Functions







brainbod.txt
brainbod1.txt
brainbod.csv





R-Basic2.R


ALLcode.R





programming-basics.R

programming-basics_9_13.R



Chapter 4 (Irizarry)

Chapter 5 (Irizarry)
Week Sep 09

 

Chapter 3: Programming Basics
Conditional Expressions
Defining Functions
Namespaces
For-loops
Vectorized Functions


Homework 2 Due
(Sep 13 before class)


Homework 3



AA.txt





RcodeInClass9_9.R



programming_basic_9_18.R



Chapter 7 (Irizarry)

Chapter 8 (Irizarry)
Week Sep 16







Data Visualization Principles

FlowingData

Data Exploration Through Time


Homework 3 Due
(Sep 27 before class)


Homework 4




tidyverse.R


Visualization ggplot2

ggplot2 cheat sheet





Chapter 9 (Irizarry)

Chapter 10 (Irizarry)
Week Sep 23




Introduction to RNAseq Methods and Experimental Design

What is "n" in cell culture experiments?

Confounding Effect






Homework 4 Due
(Oct 4 before class)








raw-Counts.tsv

experiment-design.tsv






Week Sep 30







RNAseq Quality Assessment

RNAseq Alignment













Final Project Instructions

Final Project Proposal Template


Homework 5









RNAseqAlignment.R



RNAseqAlignment2.R








Week Oct 07


Count normalization with DESeq2

DGE QC Analysis




Homework 5 Due
(Oct 16 before class)
Homework 6







DGE_QC.R


DESeq2-Rcode.R



Week Oct 16



Gene-Level DE Analysis


Gene-Level DE Analysis II

Homework 6 Due
(Nov 8 before class)

Homework 7

targets.txt



Reading List for NGS

Chap 3 (Bioconductor)

Chap 6 (Krijnen)
Week Oct 14



Multiple Comparisons



Functional Analysis










FunctionalAnalysis.R
Chap 6 (Gondro)
Chap 6 (Bioconductor)
Week Oct 21





 
Single-Cell RNAseq Data Analysis




Final Project Proposal Due (November 11 before class)

Homework 7 Due
(November 16 at 5PM)



HPC tutorial

Linux Commands

Linux File Transfer



test.R
test.sh
Chap 6 (Gondro)
Week Oct 28



Single-Cell RNAseq Data Analysis II











Week Nov 04



Single-Cell RNAseq Data Analysis III








Download FASTQ file from SRA

RNAseq.fastq
rna_1_1.fq.bz2
rna_2_1.fq.bz2

Bowtie and Samtools



Week Nov 11


Single-Cell RNAseq Data Analysis IV


Final Project Template

Final Project format




test.R
test.sh

Week Nov 18

Student Presentation

 



Week Dec 02

Student Presentation
Final Project Due
Wednesday, December 11, 2024, before 5PM (EST)