STAT718/BIO703

Spring 2026


Genomic Data Science: AI & Bioinformatics

Instructor: Yen-Yi Ho

Office: LeConte 216A

Class Meetings:

Monday/Wednesday 2:20-3:35 PM

Classroom: LeConte 206


Dr. Ho's Office Hours: Thursday 4:00-5:00 PM, Friday 10:00-11:30 AM, or by appointment (LeConte 216A)

Email: hoyen@stat.sc.edu

Course Website: https://people.stat.sc.edu/hoyen/STAT718/STAT718.html

Textbooks:

1. Introduction to Data Science: Data Wrangling and Visualization with R by Rafael A. Irizarry (Required)

2. Python Data Science Handbook by Jake VanderPlas. Available at https://jakevdp.github.io/PythonDataScienceHandbook/

3. Python for Biologists Tutorial. Available at https://www.pythonforbiologists.org/

4. Machine Learning for Biology Tutorial. Available at https://pythonforbiologists.com/

5. Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville. Available at https://github.com/janishar/mit-deep-learning-book-pdf or https://www.deeplearningbook.org

6. Understanding Deep Learning by Simon J.D. Prince (2023, MIT Press). Available at https://udlbook.github.io/udlbook/

Recommended

1. Python Crash Course, 3rd Edition: A Hands-On, Project-Based Introduction to Programming by Eric Matthes.

2. Deep Learning with PyTorch: Build, Train, and Tune Neural Networks Using Python Tools by Eli Stevens, Luca Antiga and Thomas Viehmann.


Announcements:

Approximate course outline: (Lecture notes will be updated often)

Date
Weekly topic
Homework
Code
Reading
Week 1
Jan 12
Syllabus


Lecture 1: Introduction to Genomic Data


Getting Started with Jupyter Notebook and JupyterLab


R Markdown (Chapter 20.2 in Rafa)




Homework Submission Instructions

Homework 1

Homework 1 Notebook



Homework 1 Solution

Google Colab Coupon


Link for Requesting an HPC Account

(Choose Research Computing Account creation)


Week 2
Jan 19

Jan 19: No Class

Python Basic 1

Data Types

NumPy


Presentation Schedule

Paper presented by Macro

Paper presented by John Darden

Paper presented by Kasra

Paper presented by Jianyu

Paper 1 presented by Grant
Paper 2 presented by Grant

Cellpose paper presented by Xiuchuan

CGMega paper presented by Meysam



Python Basic 1
Notebook1

Python Basic 2
Notebook2


Python Data Science Handbook Chap 2 & 3
Week 3
Jan 26

Plots in Python

Python Functions
 
If Statements and Loops

Modules and Packages






Python Basic 3
Notebook3

                                       
Python Basic 4
Notebook4

Python Data Science Handbook Chap 4
Week 4
Feb 2

 
  
Data Manipulation with Pandas

Biopython Tutorial



Homework 1 Due
(Monday Feb 02)

Homework 2

Homework 2 Notebook


Homework 2 Solution


Paper about Chopsticks Gene


df3.csv

adata1_counts.csv.gz
adata2_counts.csv.gz

adata1_cell_metadata.csv
adata2_cell_metadata.csv
gene_metadata.csv

Python Basic 5
Notebook5


myseq.fa
Biopython Case Studies

Biopython Case Studies Notebook


Deep Learning by Prince Chap 1, 2, 3



Week 5
Feb 9

   
    
Lecture 7: Machine Learning Part I (KNN)

Lecture 8: Machine Learning Part II (Linear Classifier)

Lecture 9: Regularization, Optimization and Performance Metrics

The Good, the Bad and the Ugly


 

Homework 3

Homework 3 Notebook


TCGA Pancancer Expression Data

TCGA Pancancer Meta Data



KNN Classifier

Notebook

Linear Classifier

Linear Notebook


Deep Learning by Goodfellow et al. Chap 5

Deep Learning by Prince Chap 4, 5, 6
Week 6
Feb 16
 
  
Lecture 10: Neural Networks & Backpropagation

Lecture 11: Convolutional Neural Networks
  

 



Homework 2 Due
(Monday Feb 16)






Identifying hand-written digits using PyTorch

Notebook

Deep Learning by Goodfellow et al. Chap 6 & 7

Deep Learning by Prince Chap 7, 8, 9
Week 7
Feb 23



Lecture 11: DNA Convolutional Neural Networks and Applications in Regulatory Genomics: DeepBind

DNA methylation: DeepCpG








Genomics_CNN
Genomics_CNN.ipynb







Deep Learning by Goodfellow et al. Chap 8 & 9


Deep Learning by Prince Chap 10
Week 8
March 2


      
 
 

CNN for Gene Coexpression (CNNC) in Single-Cell Data



  

     


Homework 3 Due
(Monday March 02)


Homework 3 Solution

Notebook






Deep Learning by Goodfellow et al. Chap 10 & 11

Deep Learning by Prince Chap 11

Week 9
March 9
Spring Break: No Classes





Week 10
March 16

Single-Cell RNA-seq
  




Homework 4

Homework 4 Notebook






Final Project Proposal Template



Deep Learning by Goodfellow et al. Chap 14

Deep Learning by Prince Chap 12 & 13
Week 11
March 23


Single-Cell RNA-seq with Autoencoder

MMD-ResNet

DCA

Proust






Final Project Instructions




HPC tutorial

Linux Commands

Linux File Transfer






Deep Learning by Prince Chap 14 & 15
Week 12
March 30



From Language Models to Cell Types: Transformers in Genomic Data Science
     
 



Homework 4 Due
(Monday March 30)

Final Project Proposal Due
(Monday March 30)




Deep Learning by Prince Chap 16 & 17
Week 13
April 6


Spatial Transcriptomics

SuperST


Final Project Template

Final Project Template Notebook








Deep Learning by Prince Chap 18 & 19
Week 14
April 13

Drug Discovery

Graph Neural Networks





Deep Learning by Prince Chap 20 & 21

Week 15
April 20

Student Presentations

 



Week 16
April 27

Student Presentations

Final Project Due
(Monday May 4, before 5 PM)