A recently published paper, authored by alumni of the Cox Lab and the MCO program, suggests that making decisions slowly at first can pay off in bigger rewards later through learning. The study combines behavioral experiments in rats with neural network modeling, and its findings appear in the journal eLife.
In the study, the researchers presented rats with three lickports loaded with a water reward. A screen above the lickports displayed images of two different digital sculptures. One sculpture corresponded to receiving a reward for licking the left lickport, and the other sculpture corresponded to receiving a reward from the lickport on the right. The rats were allowed to choose how long to spend looking at the screen before deciding which port to lick.
Previous work on decision making has suggested that guessing quickly would be the best way to maximize reward, but that’s not what the rats did. Initially, the rats were slow to decide, but as the trials progressed, their reaction times grew faster. “At the beginning, it seemed like they were doing everything wrong, but, by the end, they were doing everything right,” says study lead author and MCO program alum Javier Masís, who is currently working as a postdoc at Princeton.
Masís compares the rats’ task to a hypothetical exercise where humans are asked to look at pictures of similar-looking statues and decide which of the two statues each picture depicts. If a respondent identifies the statue correctly, they receive ten cents. Guessing as quickly as possible could lead to a significant amount of winnings, but someone who spends the first ten minutes of the exercise carefully studying the pictures might be able to identify the statues with greater speed and accuracy later in the session. The person who studiously observes might end up earning more money than the person who guesses as quickly as possible for the entire session. Like the hypothetical picture studiers, the rats in the experiment appeared to be going slowly at first and picking up speed as they learned.
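The tradeoff in this hypothetical can be made concrete with a back-of-the-envelope calculation. The numbers below (session length, trial speeds, and accuracies) are illustrative assumptions, not figures from the study; only the ten-cent reward comes from the example above.

```python
def earnings(phases):
    """Total winnings in cents over a session, given a list of phases,
    each described as (duration_seconds, seconds_per_trial, accuracy)."""
    total = 0.0
    for duration, seconds_per_trial, accuracy in phases:
        trials = duration / seconds_per_trial
        total += trials * accuracy * 10  # 10 cents per correct answer
    return total

SESSION = 3600  # hypothetical one-hour session

# Fast guesser: answers every second, at chance, for the whole hour.
guesser = earnings([(SESSION, 1.0, 0.5)])

# Careful studier: spends the first 10 minutes going slowly and learning,
# then answers quickly and accurately for the rest of the session.
studier = earnings([(600, 5.0, 0.6),
                    (SESSION - 600, 1.0, 0.95)])

print(guesser, studier)
```

Under these assumed numbers the studier ends the session well ahead of the guesser, despite earning almost nothing during the initial study period.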
However, when the scientists repeated the experiment with shapes that weren’t visible on the screen, the rats responded with rapid-fire guessing. This result suggests that the rats decide slowly only when there is a visual cue they can learn from.
Decision making has long been an important area of research, but relatively few computational models of decision making address how learning fits into decision tasks. To corroborate the experimental results and improve upon existing models, Masís teamed up with Andrew Saxe, then a postdoc in the Cox Lab and now an associate professor at University College London.
Together, they built a neural network that could decide how long to look at the sculptures before making a choice. Like the rats, this model chose to go slowly at first but gradually became faster and more accurate as time went on. Saxe says that the study is an extension of the Drift Diffusion Model, a widely used decision-making model in psychology and neuroscience in which a decision-maker accumulates noisy evidence until it reaches a certain threshold and then commits to a choice. In the new model, the signal-to-noise ratio and the threshold change over time as learning occurs.
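The core mechanism can be sketched in a few lines. The simulation below is a minimal standard Drift Diffusion Model, not the authors' actual model: it holds the threshold fixed and varies only the drift rate (a stand-in for the signal-to-noise ratio) between an "early in learning" and a "late in learning" setting, to show why a higher signal-to-noise ratio yields both faster and more accurate decisions. All parameter values are illustrative assumptions.

```python
import random

def ddm_trial(drift, noise, threshold, dt=0.01, rng=random):
    """One drift-diffusion trial: accumulate evidence x in small time
    steps until it crosses +threshold (correct choice) or -threshold
    (error). Returns (was_correct, reaction_time)."""
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        # Deterministic drift toward the correct answer, plus Gaussian noise.
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return (x > 0, t)

def summarize(trials):
    """Mean accuracy and mean reaction time over a list of trials."""
    accuracy = sum(correct for correct, _ in trials) / len(trials)
    mean_rt = sum(rt for _, rt in trials) / len(trials)
    return accuracy, mean_rt

rng = random.Random(0)

# Early in learning: weak signal relative to noise (low drift rate).
early = [ddm_trial(drift=0.5, noise=1.0, threshold=1.0, rng=rng)
         for _ in range(2000)]
# Late in learning: the same stimulus now produces a strong signal.
late = [ddm_trial(drift=3.0, noise=1.0, threshold=1.0, rng=rng)
        for _ in range(2000)]

acc_early, rt_early = summarize(early)
acc_late, rt_late = summarize(late)
print(f"early: accuracy={acc_early:.2f}, RT={rt_early:.2f}s")
print(f"late:  accuracy={acc_late:.2f}, RT={rt_late:.2f}s")
```

Running this shows the late-learning condition is both more accurate and faster, mirroring the rats' trajectory; the study's contribution is modeling how the animal itself manages that transition by adjusting its threshold and improving its perceptual signal over trials.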
Saxe says that the study may also point to one reason why AI systems struggle to learn over time: learning simply isn’t built into their decision making. “I think this study opens a door,” he says. “AI systems can still struggle with learning efficiently, and they might be improved if they were designed to not only maximize right answers, but to also manage their own learning.”
Although the result is exciting, Saxe doesn’t see it as unexpected. “Learning slowly at first is almost the default case,” he says. “When you say you’re a guitarist, it’s because you chose to study guitar for a long time.”
“Learning is worth it in the end,” adds Masís. “Usually you have to pay a cost at the beginning of learning, like the time spent doing your homework, but you’ll improve at those skills much more efficiently than otherwise.”