SISG HEALTH & EVOLUTION MODULE HE3

Artificial Intelligence & Machine Learning

This new module explores how artificial intelligence (AI) and machine learning (ML) tools are revolutionizing genetics and genomics research. The module begins with early methods, such as LASSO regression and feature engineering, and their applications from disease variation analysis to phylogenomics. It then progresses to deep learning architectures, including CNNs, RNNs, LSTMs, and VAEs, applied to evolutionary and biomedical data and explained in a biologist-friendly format. Next, it introduces transformer models and showcases their power for variant effect prediction, phylogenetics, and sequence imputation. Along the way, participants will learn how to address key challenges such as genomic privacy and interpretability, and how to use and train transformer models for genomic data analysis. Through hands-on sessions, real-world examples, and interactive discussions, attendees will gain the practical skills and conceptual understanding needed to integrate advanced AI/ML techniques into diverse genomics applications, from detecting pathogenic variants to mapping evolutionary histories.

Learning Objectives: After attending this module, participants will be able to: 

  1. Implement basic and advanced machine learning algorithms for evolutionary genetics and genomics.
  2. Understand how large language models can be used in biomedical research and precision medicine.
  3. Appreciate the circumstances under which basic and advanced ML/AI models provide superior alternatives.
  4. Understand and evaluate the output of large language models applied to sequence data for tasks such as predicting functional elements, annotating pathogenic variants, and learning the functions and interactions of genes.
  5. Deploy AI/ML to integrate multiple large genomic datasets.

Course Dates
  • Mon June 9, 8:30 a.m. – 5:00 p.m. ET
  • Tue June 10, 8:30 a.m. – 5:00 p.m. ET
  • Wed June 11, 8:30 a.m. – 12:00 p.m. ET

Suggested Course Pairings

Health & Evolution and Statistical Methods Streams 

  • Module INT1: Genetics and Genomics
  • Module ST1: Forensic Genetics
  • Module ST2: Bayesian Statistics
  • Module HE2: Statistical Genetics
  • Module HE4: Molecular Evolution

Course Materials

Please email sisg@biosci.gatech.edu for free access.

About the Instructors

Sudhir Kumar is Professor of Biology and Director of the Institute for Genomics and Evolutionary Medicine (iGEM) at Temple University in Philadelphia. Well known for his MEGA software for molecular evolutionary analysis and for the Time Tree of Life online phylogeny, Sudhir has also developed a variety of new computational methods for scalable, efficient, and practical analysis of big data. These include machine learning algorithms for evolutionary as well as biomedical research.

Xinghua Mindy Shi is an Associate Professor in the Department of Computer and Information Sciences and a core member of the Institute for Genomics and Evolutionary Medicine at Temple University in Philadelphia. Her research lies at the interface of data science and bioinformatics, encompassing such topics as privacy protection, human genome structural variation, and statistical methods for machine learning.