My research focuses on statistical modeling and its applications to the analysis of -omics data, with the goal of improving human health. Because I hold a joint appointment, a second focus of my research is collaborative work, such as serving as co-investigator on grant proposals and contributing to collaborative publications.

My recent methodological research focuses on developing novel approaches for differential co-expression (DC) analysis. DC analysis examines whether the co-expression (correlation) among a set of genes changes across biological conditions. Such a coordinated change in expression suggests possible co-regulation related to the biological condition in question. Our current project aims to develop novel analytical tools to detect DC gene combinations using data generated by single-cell RNA sequencing (scRNAseq), and this work is ongoing. My research team recently published three papers related to this topic, listed below, and this research direction is supported by an R21 project funded by the NIH/National Cancer Institute (NCI).
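As a simple illustration of what "differential co-expression" means, the sketch below compares the correlation of one gene pair between two conditions using the classical Fisher z-transformation test. This is a textbook test applied to simulated Gaussian data. It is not the method developed in the papers listed below, and the function names (`pearson`, `fisher_z_test`, `simulate_pair`) and all simulation settings are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_test(x1, y1, x2, y2):
    """Test whether corr(x1, y1) differs from corr(x2, y2).

    Returns both correlations, the z statistic, and a two-sided p-value
    based on the normal approximation to Fisher's z-transformed correlations.
    """
    r1, r2 = pearson(x1, y1), pearson(x2, y2)
    se = math.sqrt(1.0 / (len(x1) - 3) + 1.0 / (len(x2) - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return r1, r2, z, p


def simulate_pair(n, rho, rng):
    """Simulate n cells for two genes with latent correlation rho (toy data)."""
    gene1, gene2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        gene1.append(z1)
        gene2.append(z2)
    return gene1, gene2


rng = random.Random(2022)
a1, a2 = simulate_pair(200, 0.8, rng)  # condition A: genes co-expressed
b1, b2 = simulate_pair(200, 0.0, rng)  # condition B: genes uncorrelated

r_a, r_b, z_stat, p_value = fisher_z_test(a1, a2, b1, b2)
print(f"r(A) = {r_a:.2f}, r(B) = {r_b:.2f}, z = {z_stat:.2f}, p = {p_value:.3g}")
```

The Fisher test assumes approximately bivariate normal data, which sparse, zero-inflated scRNAseq counts generally do not satisfy. That limitation is one reason count-specific models are of interest for this problem.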

In addition, integrative multi-omics data analysis has become increasingly popular with recent advances in technology. Because each data type usually has a distinct marginal distribution, jointly studying correlation across data types presents a statistical challenge. We introduced a flexible copula-based framework to study correlation changes across different data types.
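To illustrate why a copula is useful when the marginal distributions differ, the sketch below uses a rank-based Gaussian copula: each variable is converted to normal scores through its ranks, and the correlation is then computed on those scores, so the result depends only on the dependence structure and not on the marginals. This is a generic illustration on simulated data, not the framework described above. The helper names and the toy marginals (a log-normal variable and a count-like variable) are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def normal_scores(values):
    """Map values to standard-normal scores via rank / (n + 1)."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in average_ranks(values)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def copula_correlation(x, y):
    """Correlation of the rank-based normal scores of x and y."""
    return pearson(normal_scores(x), normal_scores(y))


def simulate_two_types(n, rho, rng):
    """Toy data: a continuous skewed variable and a count-like variable
    that share a latent Gaussian correlation rho."""
    continuous, counts = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        continuous.append(math.exp(z1))       # log-normal marginal
        counts.append(int(3.0 * math.exp(z2)))  # discretized, with ties
    return continuous, counts


rng = random.Random(7)
xa, ya = simulate_two_types(300, 0.8, rng)  # condition A: dependent
xb, yb = simulate_two_types(300, 0.0, rng)  # condition B: independent

rho_a = copula_correlation(xa, ya)
rho_b = copula_correlation(xb, yb)
print(f"copula correlation: condition A = {rho_a:.2f}, condition B = {rho_b:.2f}")
```

Because the scores are built from ranks, any strictly increasing transformation of either variable leaves the estimate unchanged. Heavy ties in discrete data attenuate it, which a model with explicit count marginals would handle more carefully.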
 

1. Modeling liquid association using gene expression data generated from microarray experiments
    Biometrics paper (2011), software package
2. Flexible bivariate correlated count data regression for bulk RNA-seq data 
    Statistics in Medicine paper (2020) 
3. Modeling dynamic correlation in zero-inflated bivariate count data with applications to single-cell RNA sequencing data.
    Biometrics paper (2021) 
4. Ma Z., Davis S.W., and Ho Y.-Y. (2022). Flexible copula model for integrating correlated multi-omics data from single-cell experiments.
    Biometrics paper (To appear)

5. nPARS: A comprehensive search algorithm for constructing Bayesian networks using large-scale genomic data
    Data: SNP, gene expression, and cytotoxicity outcomes (ldocauc for the area under the log-dose response curve of docetaxel and lfuauc for the area under the log-dose response curve of 5-FU).
    Gene expression and cytotoxicity outcomes are standardized as (x - mean(x)) / sd(x).
    Algorithm: README, R source code.
6. Using gene expression to improve the power of genome-wide association analysis
    R source code, README