SISG HEALTH & EVOLUTION MODULE HE3

Artificial Intelligence & Machine Learning

This new module will provide participants with an overview of the potential uses of AI and ML for a wide range of applications in genetics and genomics research that uses big data. A series of ten applications will be illustrated with worked examples showing how large language models are shaping the future of statistical genetics applications.

Topics to be covered in evolutionary analysis include estimation of phylogenies, divergence times, ancestral sequences, and genetic distances. In biomedical genetics, we will consider the detection of pathogenic mutations, characterization of tumor clonality, precision medicine, and protection of genomic privacy.

Learning Objectives: After attending this module, participants will be able to: 

  1. Implement machine learning algorithms for molecular evolutionary analysis.
  2. Understand how large language models are beginning to be used in biomedical research and precision medicine.
  3. Appreciate the circumstances under which foundation models or transformer models provide superior performance.
  4. Fit complex neural network models for genetic analysis, including generative adversarial networks (GANs) and variational autoencoders.
  5. Evaluate the output of large language models applied to DNA sequence data for such applications as predicting gene function and identifying interactions.
  6. Deploy AI/ML for integrating multiple large genomic datasets.

Course Dates
  • Mon June 9, 8:30 a.m. – 5:00 p.m. EST
  • Tue June 10, 8:30 a.m. – 5:00 p.m. EST
  • Wed June 11, 8:30 a.m. – 12:00 p.m. EST

Suggested Course Pairings

Health & Evolution and Statistical Methods Streams 

  • Module INT1: Genetics and Genomics
  • Module ST1: Forensic Genetics
  • Module ST2: Bayesian Statistics
  • Module HE2: Statistical Genetics
  • Module HE4: Molecular Evolution

Course Materials

Please email sisg@biosci.gatech.edu for free access.

About the Instructors

Sudhir Kumar is Professor of Biology and Director of the Institute for Genomics and Evolutionary Medicine (iGEM) at Temple University in Philadelphia. Well known for his MEGA software for molecular evolutionary analysis, and for the Time Tree of Life online phylogeny, Sudhir has also developed a variety of new computational methods for scalable, efficient, and practical analysis of big data. These include machine learning algorithms for evolutionary as well as biomedical research.

Xinghua Mindy Shi is an Associate Professor in the Department of Computer and Information Sciences and a core member of the Institute for Genomics and Evolutionary Medicine at Temple University in Philadelphia. Her research lies at the interface of data science and bioinformatics, encompassing such topics as privacy protection, human genome structural variation, and statistical methods for machine learning.