Introduction to R and Python
This module introduces both the R and Python statistical programming languages, assuming no prior knowledge. It provides a foundation for computation in later modules.
In addition to basic data management tasks, such as reading in data and producing summaries through R scripts, we will introduce R’s graphics functions, its powerful package system, and simple methods of looping. Alongside the R material, we will highlight parallel applications in Python and discuss the tasks where Python provides an advantage.
This module utilizes extensive hands-on coding. Examples and exercises will use data drawn from biological and medical applications, including infectious diseases and genetics.
Learning Objectives: After attending this module, participants will be able to:
- Use R to perform descriptive statistics including generation of graphics.
- Read and write data files in both languages.
- Perform basic data manipulations (e.g. creating new variables, merging data sets).
- Install and load R packages, and access the help system and other resources to facilitate their use.
- Perform basic inferential statistical analyses including regression analysis.
- Write and use R and Python functions, and perform basic programming including creating loops.
- Understand the advantages offered by Python for bioinformatic scripting.
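To give a flavor of the objectives above, here is a minimal Python sketch that reads a small data set, creates a new variable, merges in a second table, computes a summary, and writes the result back out. It is an illustration only, not taken from the course materials: the sample values, the column names, and the helper names (`bmi`, `load_and_merge`, `to_csv`) are all hypothetical, and it uses only the Python standard library.

```python
import csv
import io
import statistics

# Hypothetical toy data (made up for illustration, not course data).
RAW_CSV = """sample_id,weight_kg,height_m
s1,70.0,1.75
s2,82.5,1.80
s3,55.0,1.60
"""

# Hypothetical second table, keyed by sample_id, to merge in.
GENOTYPES = {"s1": "AA", "s2": "AG", "s3": "GG"}


def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2


def load_and_merge(text, genotypes):
    """Read CSV text, add a derived column, and merge genotype by sample_id."""
    rows = []
    # Loop over the records; each row is a dict keyed by column name.
    for row in csv.DictReader(io.StringIO(text)):
        row["weight_kg"] = float(row["weight_kg"])
        row["height_m"] = float(row["height_m"])
        row["bmi"] = bmi(row["weight_kg"], row["height_m"])  # new variable
        row["genotype"] = genotypes.get(row["sample_id"])    # merge step
        rows.append(row)
    return rows


def to_csv(rows):
    """Write the rows back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = load_and_merge(RAW_CSV, GENOTYPES)
mean_weight = statistics.mean([r["weight_kg"] for r in rows])
csv_text = to_csv(rows)
print(f"mean weight: {mean_weight:.2f} kg")
print(csv_text)
```

In practice a file path would replace the in-memory text, and a data-frame library could replace the explicit loop; the standard-library version is shown only to keep the sketch self-contained.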
Course Dates
- Wed May 28, 1:30 p.m. – 5:00 p.m. EST
- Thu May 29, 8:30 a.m. – 5:00 p.m. EST
- Fri May 30, 8:30 a.m. – 5:00 p.m. EST
Instructors
- Patrick McGrath
- Sini Nagpal
Suggested Course Pairings
Quantitative Genetics Stream
- Module QG1: Quantitative Genetics
- Module QG2: Mixed Models
- Module QG3: Association Mapping
- Module QG4: Whole Genome Sequence Analysis
Course Materials
Please email sisg@biosci.gatech.edu for free access.
About the Instructors
Patrick McGrath is an Associate Professor in the School of Biological Sciences at Georgia Tech. His research group is interested in understanding the genetic basis of heritable behavioral variation, using quantitative genetics and machine learning to study the evolution of complex behaviors in cichlid fish.
Sini Nagpal is a Post-Doctoral Fellow with Greg Gibson’s group in the School of Biological Sciences at Georgia Tech. Her primary interests are in polygenic risk-by-environment interactions, drawing inference from large-scale human genetic biobank cohort studies. Sini is also investigating the decanalization of disease in contemporary society.