Pathway and Network Analysis for Omic Data
In this module, various statistical learning methods for reconstruction and analysis of networks from omics data are discussed, as well as methods of pathway enrichment analysis. Particular attention is paid to omics datasets with a large number of variables, e.g. genes, and a small number of samples, e.g. patients. The techniques discussed will be demonstrated in R. This course assumes familiarity with R or other command-line programming languages.
Networks represent the interactions among components of biological systems. In the context of high-dimensional omics data, relevant networks include gene regulatory networks, protein-protein interaction networks, and metabolic networks. These networks provide a window into biological systems as well as complex diseases, and can be used to understand how biological functions are implemented and how homeostasis is maintained. Pathway-based analyses complement network reconstruction: they leverage biological knowledge available from the literature, gene ontologies, or previous experiments to identify the pathways associated with a disease or another outcome of interest.
Learning Objectives: After attending this module, participants will be able to:
- Evaluate the relative strengths and weaknesses of publicly available knowledge bases for gene set analysis.
- Choose an appropriate null hypothesis in gene-set analysis methods for specific biological questions.
- Estimate (partially) directed and undirected networks from high-dimensional omics data.
- Use publicly available tools to test for over-representation of gene sets/pathways from individual gene association results.
- Perform network-based pathway enrichment analysis using publicly available software tools.
- Perform version control for metadata (e.g. pathway and network data) and analysis (code, hyper-parameters) to ensure reproducibility of results.
Course Dates
- Wed June 12, 1:30 p.m. – 5:00 p.m. EST
- Thu June 13, 8:30 a.m. – 5:00 p.m. EST
- Fri June 14, 8:30 a.m. – 5:00 p.m. EST
Instructors
- Alison Motsinger-Reif
- Ali Shojaie
Suggested Course Pairings
Integrative Genomics Stream
- Module 2: Genetics and Genomics
- Module 10: Epigenetics and Gene Regulation
- Module 14: Gene Expression and Single Cell Genomics
Course Materials
Please email sisg@biosci.gatech.edu for free access.
About the Instructors
Alison Motsinger-Reif is Branch Chief for Computational Biology and Bioinformatics at the NIEHS in Research Triangle Park, North Carolina. Her group uses statistical genetics principles as well as machine learning and neural network approaches to dissect interaction effects in genetic data. She is also lead investigator for the NIEHS Personalized Environment and Genes (PEGS) study. Learn more about Alison’s work here.
Ali Shojaie is Professor of Biostatistics at the University of Washington in Seattle. His research, at the interface of statistical machine learning and network analysis, focuses on inference of high-dimensional networks. He develops methods for data where there are more variables than observations, or where the variables and/or observations are correlated with each other. Access Ali’s software here.