Modeling Medical Machine Learning
Harvard Griffin GSAS Voices: Yaniv Yacoby, PhD ’23
Throughout its 150th anniversary year, Harvard Griffin GSAS is foregrounding the voices of some of its most remarkable alumni and students as they speak about their work, its impact, and their experiences at the School.
Graduating student Yaniv Yacoby studies machine learning and how it can be applied to fields like healthcare. Yacoby discusses investigating and testing neural networks for potential future use by doctors, his breakthrough approach in solving an issue surrounding complex models, and teaching an introductory class about deconstructing academic culture.
Looking for Safety in Deep Learning
In healthcare, you want to make well-informed decisions. For that reason, clinicians go through rigorous training before they can treat patients. However, these clinicians are still human. They can’t look at all the data collected from every patient ever treated at every hospital and draw insights from that information. So, there’s a lot of excitement about the possibility of using machine learning to analyze this data, provide insights, and help clinicians make better decisions.
You may have heard of deep learning, or neural networks, already. These machine-learning methods are very data-hungry. If you give them enough data, they learn to make very good predictions. The problem with a lot of these models is that if you want to use them in a safety-critical context, you have to make sure they’re trustworthy. That’s what my research is about. I study how to incorporate deep learning and neural networks into statistical models that have transparency and take uncertainty into account.
There are several challenges in developing these types of methods. First is translating our knowledge of the problem into math. For example, let’s say I was given the age, body mass index (a measure of weight relative to height), and a bunch of other information about a patient, and, using machine learning, I wanted to predict their blood pressure after some treatment. How can I meaningfully translate this problem into math in a way that would allow me to infer the relationship between blood pressure and the data I have about the patient? And more importantly, how can I quantify the uncertainty over this relationship: how do I know for which patients the model is trustworthy and for which it’s not?
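To make the idea of "uncertainty over the relationship" concrete, here is a minimal sketch using Bayesian linear regression, a much simpler model than the neural-network methods discussed in this interview and not the method from the dissertation. The data are synthetic, the features are assumed to be standardized, and the function names, prior variance, and noise variance are all illustrative assumptions.

```python
import numpy as np

def fit_posterior(X, y, prior_var=10.0, noise_var=1.0):
    """Posterior over weights for y = X @ w + noise, with a zero-mean Gaussian prior on w."""
    d = X.shape[1]
    precision = np.eye(d) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def predict_with_uncertainty(x, mean, cov, noise_var=1.0):
    """Predictive mean and standard deviation for a single feature vector x."""
    pred_mean = float(x @ mean)
    pred_std = float(np.sqrt(x @ cov @ x + noise_var))
    return pred_mean, pred_std

# Synthetic, standardized data (assumption): columns are intercept, age, BMI.
rng = np.random.default_rng(0)
n = 200
age = rng.normal(size=n)
bmi = rng.normal(size=n)
X = np.column_stack([np.ones(n), age, bmi])
true_w = np.array([0.5, 1.0, -0.3])  # made-up weights for illustration only
y = X @ true_w + rng.normal(scale=1.0, size=n)

mean, cov = fit_posterior(X, y)

# A patient resembling the training data versus one far outside it.
typical_patient = np.array([1.0, 0.0, 0.0])
unusual_patient = np.array([1.0, 4.0, 4.0])
_, std_typical = predict_with_uncertainty(typical_patient, mean, cov)
_, std_unusual = predict_with_uncertainty(unusual_patient, mean, cov)

print("predictive std, typical patient:", std_typical)
print("predictive std, unusual patient:", std_unusual)
```

The predictive standard deviation is larger for the patient who looks unlike anyone in the training data, which is one simple way a model can signal for which patients its prediction should be trusted less.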
The second challenge is computational: quantifying uncertainty requires solving a very difficult problem that even modern computers can’t solve quickly. Because of this, we end up having to make approximations that have poorly understood properties and often have unintended consequences. The goal of my research is to empower individuals to use these complex machine-learning methods by making all assumptions explicit so that when something goes wrong, we can go back and revise our assumptions, and so that the approximate computation doesn’t have unintended consequences.
My dissertation focuses on two concepts. The first is “latent variables.” Latent variables are those that we haven’t observed in the data, but that we still want to incorporate into our model. For example, with blood pressure, the patient’s results could have been affected by their stress that day, but we did not collect the patient’s stress along with their other data. Modeling these variables is important for the accuracy of the model and can help us answer questions about the data.
The second concept is called “non-identifiability.” This is a big word that means that there exist multiple models that all explain your data equally well. Non-identifiability is often not something you explicitly choose to have. It arises naturally in very complex models. And alone, non-identifiability often doesn’t pose a problem: to make good predictions, any model that explains the data well will do. The problem comes when you have models with latent variables and non-identifiability. As my work shows, non-identifiability in complex latent variable models compromises the uncertainty and transparency of the models. This poses a big barrier for use.
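A toy illustration of how a latent variable and non-identifiability can interact, under assumptions chosen purely for simplicity: suppose an observation is y = w * z + noise, where z is an unobserved Gaussian variable with standard deviation s. Only the product w * s affects the distribution of y, so very different (w, s) pairs explain the data equally well. The model, parameter names, and values below are hypothetical and far simpler than the models described here.

```python
import numpy as np

def marginal_log_likelihood(y, w, s, noise_std=0.5):
    """Log-likelihood of y after integrating out the latent z ~ N(0, s^2).

    With y = w * z + noise, the marginal is y ~ N(0, (w * s)^2 + noise_std^2).
    """
    var = (w * s) ** 2 + noise_std ** 2
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - 0.5 * y ** 2 / var))

# Synthetic data generated with w = 2 and s = 1 (illustrative values).
rng = np.random.default_rng(0)
z = rng.normal(loc=0.0, scale=1.0, size=500)  # latent, never observed
y = 2.0 * z + rng.normal(scale=0.5, size=500)

ll_a = marginal_log_likelihood(y, w=2.0, s=1.0)  # strong effect, narrow latent
ll_b = marginal_log_likelihood(y, w=1.0, s=2.0)  # weak effect, wide latent
ll_c = marginal_log_likelihood(y, w=1.0, s=1.0)  # a genuinely different model

print("log-likelihood (w=2, s=1):", ll_a)
print("log-likelihood (w=1, s=2):", ll_b)
print("log-likelihood (w=1, s=1):", ll_c)
```

The first two settings give the same likelihood, so the data cannot tell them apart, even though they tell different stories about how strongly the latent variable drives the outcome. Predictions of y are unaffected, but any interpretation of w or z on its own is not.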
Recently, I developed a new way of inferring the relationship between variables in these complex latent variable models that is completely unaffected by non-identifiability. In preliminary experiments, I found that my approach has several other good properties as well: it is more accurate and faster than existing methods.
Creating Community at Harvard Griffin GSAS
I’ve been fortunate to have several amazing mentors during my time at Harvard: my advisor, Finale Doshi-Velez, as well as Margo Seltzer, David Parkes, and John Girash. These mentors have supported me in research as well as in trying to build a more inclusive and supportive culture at Harvard—one that challenges what we think a PhD should look like.
For the last two years, I have taught the required first-year computer science PhD course, CS290: Seminar on Effective Research Practices & Academic Culture. I created CS290 in collaboration with Girash and Parkes to help students deconstruct academic culture, to understand in what ways it is not serving us as individuals and as researchers, and to empower students to create a culture and community where everyone feels like they belong and can thrive. We talk about a lot of things that people often don’t talk about in PhD programs, like mental health, imposter syndrome, and advising relationships, among many other topics.
I’ve also had the privilege of leading a peer-to-peer support group called InTouch along with my good friend, colleague, and fellow PhD student Larissa Zhou. She is such a thoughtful and caring person who has done so much for the community here, and I feel so lucky to have gotten to work with her.