SISG MODULE 3

Introduction to R and Python

This module introduces the R and Python programming languages for statistical computing, assuming no prior knowledge. It provides a foundation for computation in later modules.

In addition to covering basic data management tasks, such as reading in data and producing summaries through R scripts, we will introduce R’s graphics functions, its powerful package system, and simple methods of looping. In parallel, we will highlight applications in Python and discuss tasks where it provides an advantage.

This module involves extensive hands-on coding. Examples and exercises will use data drawn from biological and medical applications, including infectious diseases and genetics.

Learning Objectives: After attending this module, participants will be able to: 

  1. Use R to compute descriptive statistics and generate graphics.
  2. Read and write data files in both languages.
  3. Perform basic data manipulations (e.g., creating new variables, merging data sets).
  4. Install and load R packages, and use the help system and other resources to facilitate their use.
  5. Perform basic inferential statistical analyses, including regression analysis.
  6. Write and use R and Python functions, and perform basic programming, including creating loops.
  7. Understand the advantages offered by Python for bioinformatic scripting.
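
To give a flavor of the data-handling objectives above (reading and writing data, creating a new variable, merging data sets, and writing a function with a loop), here is a minimal Python sketch. It is not taken from the course materials: the sample data, column names, and helper functions (`read_rows`, `add_bmi`, `merge_on_id`, `write_rows`) are all hypothetical, and it uses only Python's standard library, whereas the module itself may rely on other tools.

```python
import csv
import io

# Hypothetical sample data: two small tables keyed by sample_id.
PHENOTYPES_CSV = """sample_id,height_cm,weight_kg
S1,170,65
S2,160,72
S3,182,80
"""

GENOTYPES_CSV = """sample_id,genotype
S1,AA
S2,AG
S3,GG
"""


def read_rows(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def add_bmi(rows):
    """Create a new variable (BMI) from existing columns, using a loop."""
    for row in rows:
        height_m = float(row["height_cm"]) / 100
        row["bmi"] = round(float(row["weight_kg"]) / height_m ** 2, 1)
    return rows


def merge_on_id(left, right, key="sample_id"):
    """Merge two lists of rows on a shared key (an inner join)."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def write_rows(rows):
    """Write rows back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


merged = merge_on_id(add_bmi(read_rows(PHENOTYPES_CSV)), read_rows(GENOTYPES_CSV))
print(write_rows(merged))
```

In practice the same steps would usually read from and write to files on disk; in-memory text is used here only so the sketch runs without any external data.
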
Course Dates
  • Wed May 29, 1:30 p.m. – 5:00 p.m. ET
  • Thu May 30, 8:30 a.m. – 5:00 p.m. ET
  • Fri May 31, 8:30 a.m. – 5:00 p.m. ET
Instructors
  • Patrick McGrath
  • Sini Nagpal

Learn more about the instructors.

Suggested Course Pairings

Quantitative Genetics Stream 

  • Module 7: Quantitative Genetics
  • Module 15: Advanced Quantitative Genetics
  • Module 17: WGS Analysis Pipeline
  • Module 19: Association Mapping
Course Materials

Visit the Box here

About the Instructors

Patrick McGrath is an Associate Professor in the School of Biological Sciences at Georgia Tech. His research group studies the genetic basis of heritable behavioral variation, using quantitative genetics and machine learning to investigate the evolution of complex behaviors in cichlid fish. Learn more about Patrick’s work here.

Sini Nagpal is a Post-Doctoral Fellow with Greg Gibson’s group in the School of Biological Sciences at Georgia Tech. Her primary interests are in polygenic risk-by-environment interactions, drawing inference from large-scale human genetic biobank cohort studies. Sini is also investigating the decanalization of disease in contemporary society.