Shay Neufeld

The multi-armed bandit problem does not refer to a Western-style outlaw with extra limbs. Rather, it is a probability problem where a gambler is confronted by a row of slot machines (the bandits) each with a different reward probability. The gambler must figure out the best way to play this collection of machines in order to maximize his winnings. Should the gambler stick to a machine that is rewarding him 60 percent of the time or should he search for another slot machine that may have a higher reward probability?

“We are constantly searching for the optimal balance between exploring our world to discover something new, or relying on just what we know, to make a decision,” explains Shay Neufeld, a neuroscience PhD student. Mathematicians have investigated this problem for decades, developing algorithms that aim to optimize the trade-off between exploring and exploiting. Neufeld is interested in going beyond finding the right balance and actually identifying the circuits in our brains that implement these decisions.
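One of the simplest textbook strategies for this trade-off is called epsilon-greedy: most of the time the gambler plays the machine that has paid best so far, and a small fraction of the time he tries a machine at random. The sketch below is only an illustration of that idea, not an algorithm taken from Neufeld's research. The function name, the payout probabilities, the number of plays, and the 10 percent exploration rate are all assumptions chosen for the example.

```python
import random


def epsilon_greedy(reward_probs, n_trials=5000, epsilon=0.1, seed=0):
    """Play a row of slot machines whose payout odds never change.

    reward_probs, n_trials, epsilon and seed are illustrative values.
    Returns (estimated payout rates, pulls per machine, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms    # how often each machine was played
    values = [0.0] * n_arms  # running average payout of each machine
    total = 0
    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: try any machine, whatever its track record.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: play the machine with the best average so far.
            arm = max(range(n_arms), key=lambda a: values[a])
        reward = 1 if rng.random() < reward_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the running average for this machine.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return values, counts, total


if __name__ == "__main__":
    # A machine paying 60 percent of the time next to one paying 80 percent.
    values, counts, total = epsilon_greedy([0.6, 0.8])
    print("estimated payout rates:", values)
    print("plays per machine:", counts)
    print("total reward:", total)
```

Raising `epsilon` makes the gambler more curious but wastes more plays on the worse machine; lowering it does the opposite, which is the trade-off in miniature.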

To do this, Neufeld has created a multi-armed bandit task for lab mice, but instead of slot machines and money, the mice have nose ports and water. There are two nose ports a mouse can go to, one on the left and one on the right, and they have different reward probabilities. If the right port delivers water 80 percent of the time and the left port delivers water only 20 percent of the time, the mice quickly figure out that they should exploit the right port rather than explore both in order to maximize the water they receive.

Then, all of a sudden, the probabilities change. “I study how mice figure out that the odds have changed and how quickly they switch over to the other nose port,” Neufeld says. Using a tiny lens implanted in a mouse’s brain, Neufeld images brain activity as the mouse decides whether to go left or right. “I’m trying to correlate the neural data with behavior and find the circuits in the brain that are implementing this explore/exploit trade-off.”
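The structure of such a task can be sketched in a short simulation. This is a toy model under stated assumptions, not Neufeld's experiment or analysis: the simulated "mouse" is a simple learner that tracks a recency-weighted estimate of each port's payout and occasionally explores. The 80/20 odds come from the description above; the trial counts, learning rate `alpha`, exploration rate `epsilon`, starting estimates of 0.5, and the function names are all hypothetical.

```python
import random


def run_reversal_task(n_trials=400, reversal_trial=200, p_high=0.8,
                      p_low=0.2, alpha=0.2, epsilon=0.1, seed=1):
    """Simulate a two-port task whose reward odds swap partway through.

    Every parameter other than the 80/20 odds is an assumption made
    for illustration. Returns the list of ports chosen on each trial.
    """
    rng = random.Random(seed)
    # Start with no preference between the two ports.
    values = {"left": 0.5, "right": 0.5}
    choices = []
    for t in range(n_trials):
        # The right port is the good one before the reversal, the left after.
        good = "right" if t < reversal_trial else "left"
        if rng.random() < epsilon:
            port = rng.choice(["left", "right"])  # explore
        else:
            port = max(values, key=values.get)    # exploit
        p = p_high if port == good else p_low
        reward = 1.0 if rng.random() < p else 0.0
        # A constant learning rate weights recent outcomes more heavily,
        # so old evidence fades and a change in the odds can be noticed.
        values[port] += alpha * (reward - values[port])
        choices.append(port)
    return choices


def fraction(choices, port, start, stop):
    """Share of trials in choices[start:stop] that went to the given port."""
    window = choices[start:stop]
    return window.count(port) / len(window)


if __name__ == "__main__":
    choices = run_reversal_task()
    print("right-port share before reversal:",
          fraction(choices, "right", 100, 200))
    print("left-port share after reversal:",
          fraction(choices, "left", 300, 400))
```

In this toy model, a larger `alpha` makes the learner forget old outcomes faster and so react sooner when the odds flip, at the cost of being more easily misled by a run of bad luck.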

The decision to explore or exploit affects more than our choice of slot machine or nose port. It influences the food we order, the city we live in, and the majors we pursue, and we often make these choices without being aware of this decision process. “Researching this phenomenon has definitely made me question how I make decisions like this. I would encourage people to become more conscious of these decisions and really question ‘is this something I want to do over and over again or should I explore a little bit more,’” Neufeld concludes.

Additional Info
Field of Study
Neuroscience
Harvard Horizons
2017
Harvard Horizons Talk
To Explore or to Exploit? Investigating How the Brain Decides Whether to Try Something New, or Stick with What It Knows