BDSY 2026
Key Dates
Move-in Day
Sunday, June 14th
Orientation
Monday, June 15th
Research Symposium
Thursday, July 23rd
Professional Development Day
Friday, July 24th
Move-out Day
Saturday, July 25th
Mentored Research
During BDSY, students are organized into teams of around 10, each working on a distinct project in biomedical or public health research. Each team is guided by one or more faculty mentors and graduate student assistants who provide support throughout the project. For the 2026 program, topics include causal inference, genetics, and electronic health records.
Causal Inference
Instructors:
Fan Li, PhD and Lee Kennedy-Shaffer, PhD
Graduate Student Instructors:
Zihan Zhu, PhD; Hao Wang, MS; and Changjun Li
-
This project uses data from three recently completed pragmatic clinical trials at the Yale School of Medicine (the ELAIA 1, ELAIA 2, and KAT trials), which evaluated electronic health record-based alert interventions for kidney care. It focuses on exploring treatment effect heterogeneity with machine learning and discovering meaningful subgroups. Students will be organized into small working groups, each focusing on one trial. They will apply state-of-the-art causal machine learning methods, such as meta-learners, Bayesian Additive Regression Trees, and Bayesian Causal Forests, to estimate individualized treatment effects and assess which patient subpopulations do or do not benefit from alert-based healthcare interventions. Students will critically evaluate the assumptions behind different subgroup identification approaches and compare findings across methods. They will also collaborate across working groups to understand how trial settings affect the choice of methods and the results. This is an open-ended pursuit designed to encourage innovative thinking, teamwork, and hands-on implementation, simulating real-world team science in clinical research. The project offers rigorous training in modern causal inference techniques for heterogeneous treatment effects and their application to clinical trial data.
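As a rough illustration of the meta-learner idea mentioned above, the sketch below shows a T-learner: fit one outcome model per trial arm, then take the difference of their predictions as the estimated individualized treatment effect. This is not the project's code or data. The data are synthetic, the outcome models are plain one-covariate linear fits, and the names `fit_line` and `t_learner` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def t_learner(x, treated, y):
    """Fit a separate outcome model in each arm and return a function
    that estimates the treatment effect at a given covariate value."""
    arms = {}
    for arm in (0, 1):
        xs = [xi for xi, ti in zip(x, treated) if ti == arm]
        ys = [yi for yi, ti in zip(y, treated) if ti == arm]
        arms[arm] = fit_line(xs, ys)
    (a0, b0), (a1, b1) = arms[0], arms[1]
    # Estimated effect = predicted outcome if treated - predicted if control.
    return lambda x_new: (a1 + b1 * x_new) - (a0 + b0 * x_new)


# Synthetic, noise-free example (an assumption for illustration):
# the true treatment effect is 0.5 + x, so it grows with the covariate.
x = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1 + 2 * xi + ti * (0.5 + xi) for xi, ti in zip(x, treated)]

cate = t_learner(x, treated, y)
print(cate(0.0), cate(2.0))
```

On this noise-free data the estimate recovers the true effect 0.5 + x up to floating-point error. With real trial data, the per-arm models would typically be flexible learners, and the subgroup conclusions depend on the assumptions behind them.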
Genetics
Instructor:
Hongyu Zhao, PhD
Graduate Student Instructors:
Yueqian Jing and Haoran Shao
-
This project explores the genetic basis of disease comorbidity through integrative analyses of genome-wide, transcriptome-wide, and proteome-wide association studies. Students will learn to identify shared genetic variants across multiple diseases and quantify their impact on disease pathways. Through hands-on analysis, they will gain skills in genetic epidemiology, bioinformatics, and statistical genetics. Computational tools will be used to interpret complex genetic data and uncover biological mechanisms of disease. Students will work in teams to develop reports and presentations based on their findings. This experience prepares students for future careers in biomedical data science and genetics research.
Electronic Health Records
Instructors:
Cheng-Han Yang, PhD and Henan Xu, PhD
-
This project uses the MIMIC-IV dataset and focuses on prediction using inpatient electronic health record data, particularly repeated biomarker measurements as outcomes. Students will begin with basic regression models and linear mixed-effects models for repeated biomarker measurements, and then explore more flexible approaches such as penalized regression, random forests, XGBoost, and selected sequence models such as recurrent neural networks (RNNs), including LSTMs. They will learn how to construct features from longitudinal EHR data and evaluate predictive performance using measures such as prediction error, discrimination, and calibration. The project will also examine how to handle missing data in lab measurements, including imputation-based approaches, and how these choices affect model performance and interpretation. Students will critically evaluate the strengths and limitations of statistical and machine learning methods for longitudinal EHR data. The project offers rigorous training in modern EHR data analysis and predictive modeling using real-world inpatient data.
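The three kinds of performance measures named above can be sketched in a few lines. This is a minimal illustration, not the project's evaluation code. It assumes a continuous biomarker for prediction error, and a binary event (for example, a biomarker crossing a clinical threshold, which is an assumption not stated in the description) for discrimination and calibration. The function names and the toy numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Prediction error: root mean squared error for a continuous outcome."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def auc(labels, scores):
    """Discrimination: probability that a randomly chosen event has a
    higher score than a randomly chosen non-event (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(labels, probs):
    """Calibration: mean predicted risk minus observed event rate.
    Values near 0 mean predictions are right on average."""
    return sum(probs) / len(probs) - sum(labels) / len(labels)


# Toy values for illustration only (not MIMIC-IV data).
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(auc(labels, probs))
print(calibration_in_the_large(labels, probs))
```

A model can rank patients well (high AUC) while still over- or under-predicting risk on average, which is why discrimination and calibration are reported separately.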
Lecture Topics by Week
Coming Soon!
-
To be updated.