BDSY 2025
Mentored Research
During BDSY, students are organized into teams of around 10, each working on a distinct project in biomedical or public health research. Each team is guided by one or more faculty mentors and graduate student assistants who provide support throughout the project. For the 2025 program, topics included causal inference, genetics, and public health modeling.
Causal Inference
Instructors:
Lee Kennedy-Shaffer, PhD and Fan Li, PhD
Graduate Student Instructors:
Xi Fang, PhD and Jiaqi Tong
-
Using the SUPPORT observational dataset, this project focuses on estimating causal effects of right heart catheterization on mortality outcomes. Students will apply methods such as propensity score weighting, outcome regression, and doubly robust approaches to compare estimates of treatment effects. They will explore individualized treatment effects using causal machine learning methods like DR-learner, R-learner, and BART. Sensitivity analyses will be performed to assess the impact of unmeasured confounding on causal conclusions. Students will critically evaluate assumptions behind different causal inference methods. The project offers rigorous training in modern causal inference techniques and their application to real-world health data.
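As a rough illustration of how the three estimator families named above relate, the following minimal sketch compares propensity score weighting, outcome regression, and a doubly robust (AIPW) estimate. It runs on synthetic data with a known treatment effect of 1.0, not on the SUPPORT dataset. The function names (`simulate`, `fit_logistic`, `estimate_ate`), the single confounder, and the linear models are all assumptions made for this example and are not taken from the project materials.

```python
import numpy as np

def simulate(n, seed=0):
    """Synthetic data (assumed setup): confounder x, treatment t, outcome y, true ATE = 1.0."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    propensity = 1.0 / (1.0 + np.exp(-0.5 * x))  # treatment is more likely when x is large
    t = rng.binomial(1, propensity)
    y = 1.0 * t + x + rng.normal(size=n)
    return x, t, y

def fit_logistic(X, t, iters=25):
    """Logistic regression fitted by Newton's method; returns fitted probabilities."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (t - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hessian, gradient)
    return 1.0 / (1.0 + np.exp(-X @ beta))

def estimate_ate(x, t, y):
    """Return naive, weighted, regression-based, and doubly robust ATE estimates."""
    X = np.column_stack([np.ones_like(x), x])
    e = fit_logistic(X, t)  # estimated propensity scores

    # Unadjusted difference in means, biased here because x affects both t and y.
    naive = y[t == 1].mean() - y[t == 0].mean()

    # Propensity score weighting, using normalized (Hajek) weights.
    w1, w0 = t / e, (1 - t) / (1 - e)
    ipw = np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

    # Outcome regression: one linear model per arm, then average the predicted difference.
    b1 = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)[0]
    mu1, mu0 = X @ b1, X @ b0
    outcome_reg = np.mean(mu1 - mu0)

    # Doubly robust (AIPW): regression prediction plus a weighted residual correction.
    aipw = np.mean(mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e))

    return {"naive": naive, "ipw": ipw, "outcome_reg": outcome_reg, "aipw": aipw}

x, t, y = simulate(20_000)
estimates = estimate_ate(x, t, y)
for name, value in estimates.items():
    print(f"{name:12s} {value:.3f}")
```

In this simulated setting the unadjusted difference in means overstates the effect, while the three adjusted estimates should land near the true value. With real observational data the true effect is unknown, which is why the project pairs these estimators with sensitivity analyses for unmeasured confounding.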
-
Genetics
Instructor:
Hongyu Zhao, PhD
Graduate Student Instructors:
Leqi Xu and Jiaqi Hu
-
This project explores the genetic basis of disease comorbidity through integrative analyses of genome-wide, transcriptome-wide, and proteome-wide association studies. Students will learn to identify shared genetic variants across multiple diseases and quantify their impact on disease pathways. Through hands-on analysis, they will gain skills in genetic epidemiology, bioinformatics, and statistical genetics. Computational tools will be used to interpret complex genetic data and uncover biological mechanisms of disease. Students will work in teams to develop reports and presentations based on their findings. This experience prepares students for future careers in biomedical data science and genetics research.
-
Public Health Modeling
Instructors:
Stephanie Perniciaro, PhD, MPH and Shelby Golden, MS
-
This project examines pneumococcal disease dynamics and the phenomenon of serotype replacement following vaccine interventions. Students will analyze global infectious disease surveillance data to characterize changes in pneumococcal serotype distributions. Key statistical methods include time series analysis, hierarchical modeling, and spatial regression, all performed using R. Students will explore how biological, epidemiological, and policy factors influence pneumococcal evolution and vaccine effectiveness. Through data-driven modeling, students will deepen their understanding of public health strategies to prevent infectious diseases. The project bridges biological knowledge with quantitative modeling in the context of global health.
-
Lecture Topics by Week
-
Introduction to Cluster, R and Tidyverse Shelby Golden, MS
Probability Sean McGrath, PhD
Basic Statistics Sean McGrath, PhD
Sources of Bias in Observational Data Fan Li, PhD
Git and GitHub Shelby Golden, MS
Study Design & Estimation Fan Li, PhD
Linear Regression Yuki Ohnishi, PhD
Parameter Estimation/Likelihood Melody Owen, PhD Candidate
AI for Community Health Ruchit Nagar, MD, MPH
Introduction to Python Justin DeMayo
-
Data Mining I Johan Ugander, PhD
Python I Shivam Sharma
Logistic Regression Jingyu Cui, PhD
Data Mining II Johan Ugander, PhD
Empirical Bayes Zhou Fan, PhD
Linear Algebra Melody Owen, PhD Candidate
Generative AI and Foundation Models Arman Cohan, PhD
Python II Shivam Sharma
Prediction for Beginners Sean McGrath, PhD
Cancer Epidemiology Xiaomei Ma, PhD; Leah Ferrucci, PhD, MPH
GenAI in Biomedical Research Hua Xu, PhD
HPC Training Aya Nawano, PhD
Selection Bias and Representativeness Haidong Lu, PhD
Visualizing Data in R with ggplot2 Shelby Golden, MS
Imposter Syndrome Karin Gosselink, PhD
Data Science in Astronomy Priyamvada Natarajan, PhD
-
Variable Selection Shuangge Steven Ma, PhD
RSV and Vaccine Daniel Weinberger, PhD
Introduction to Bayesian Statistics Yiran Wang, PhD
Prediction Leying Guan, PhD
Randomized Trials in Economics A. Mushfiq Mobarak, PhD
Causal Inference Lee Kennedy-Shaffer, PhD
Bayesian Computation Yiran Wang, PhD
Machine Learning Leying Guan, PhD
-
Design and Analysis of Clinical Trials Denise Esserman, PhD
Research at the Intersection of Statistics and Medicine Elizabeth Claus, PhD
Preparing for Graduate School in Biostatistics Elizabeth Claus, PhD
AI and Medicine Rohan Khera, MD, MS
Infectious Disease Modeling Virginia Pitzer, ScD
Collaboration in the Life of a Statistician Denise Esserman, PhD
Data Integration Emma Zang, PhD
Precision Medicine Brian Tom, PhD
Ethics and Questions of AI Consciousness John Pittard, PhD
-
Genetics and Genomics Smita Krishnaswamy, PhD
Climate Modeling and Environmental Health Kai Chen, PhD
AI in Global Health Brian Wahl, PhD, MPH
Data Privacy Hyunghoon Cho, PhD
Large Language Models Shivam Sharma
Poster & Presentation for Biostatistics/Genetics Michael Sweeney
Data Science in Social & Behavioral Sciences Ijeoma Opara, PhD, LMSW, MPH
Scientific Communications: Writing & Presentation Elizabeth Bailey
Working at Meta - FAIR Koustuv Sinha, PhD
Writing Your CV Kelly Shay, MS
-
Biobank Analysis Bhramar Mukherjee, PhD
Graduate Studies in EMD Virginia Pitzer, ScD