
Artificial Intelligence & Machine Learning
This module explores artificial intelligence (AI) and machine learning (ML) for genetics, genomics, and evolutionary research. The module will begin by introducing and explaining ML methods, such as regression-based LASSO and feature engineering, for sequence variation analysis and phylogenomics applications. Then, AI models, including large language models, will be introduced, and their biological and evolutionary underpinnings explained, demonstrating their utility for variant effect prediction, phylogenetics, and sequence imputation. The module will address key challenges, including interpretability and privacy, in the analysis of sequence data. Attendees will gain the practical skills and conceptual understanding needed to integrate advanced AI/ML techniques into diverse genomics applications, including the diagnosis of pathogenic variants, the mapping of genotypes to phenotypes via comparative genomics, and the reconstruction of evolutionary histories.
Learning Objectives: After attending this module, participants will be able to:
- Use basic and advanced machine learning algorithms for evolutionary genetics and genomics.
- Use deep learning and large language models for biomedical research and precision medicine.
- Appreciate the circumstances in which basic or advanced ML/AI models are appropriate for their own applications.
- Understand and evaluate the outputs of large language models applied to sequence data for tasks such as predicting functional elements, annotating pathogenic variants, and learning gene functions and interactions.
- Deploy AI/ML to analyze and integrate multiple large genomic datasets.
Course Dates
- Mon June 8, 8:30 a.m. – 5:00 p.m. ET
- Tue June 9, 8:30 a.m. – 5:00 p.m. ET
- Wed June 10, 8:30 a.m. – 12:00 p.m. ET
Instructors
- Sudhir Kumar
- Mindy Shi
Suggested Course Pairings
Populations & Evolution and Statistical Methods Streams
- Module IG1: Genetics and Genomics
- Module ST1: Bayesian Statistics
- Module PE1: Population Genetics
- Module ST2: Regression and Regularization
- Module PE2: Statistical Genetics
- Module PE4: Molecular Evolution
Course Materials
Course materials will be available shortly before the class.
Please email sisg@biosci.gatech.edu if you have questions or would like more details.
About the Instructors

Sudhir Kumar is Professor of Biology and Director of the Institute for Genomics and Evolutionary Medicine (iGEM) at Temple University in Philadelphia. Well known for his MEGA software for molecular evolutionary analysis, and for the Time Tree of Life online phylogeny, Sudhir has also developed a variety of new computational methods for scalable, efficient, and practical analysis of big data. These include machine learning algorithms for evolutionary as well as biomedical research. Learn more about Sudhir’s work here.

Xinghua Mindy Shi is an Associate Professor in the Department of Computer and Information Sciences and a core member of the Institute for Genomics and Evolutionary Medicine at Temple University in Philadelphia. Her research lies at the interface of data science and bioinformatics, encompassing such topics as privacy protection, human genome structural variation, and statistical methods for machine learning. Learn more about Mindy’s work here.