The development of artificial intelligence (AI) has long been compared to training animals, a process of reinforcement and reward. This idea was officially recognised on Wednesday, as two pioneers in the field of reinforcement learning, Andrew Barto and Richard Sutton, were awarded the prestigious A. M. Turing Award—widely regarded as the Nobel Prize of computing.
Barto, 76, and Sutton, 67, have spent decades refining AI’s ability to learn through experience. Their work, which began in the late 1970s, laid the foundation for some of the biggest AI breakthroughs of the past decade. Their approach, building “hedonistic” machines that continuously adapt their behaviour in response to positive reinforcement, underpins much of modern AI.
A legacy of learning machines
Reinforcement learning, the core of their research, is a technique that lets machines learn to make decisions by trial and error: a system tries an action, receives a reward or a penalty, and adjusts its behaviour to earn more reward over time. The method has enabled AI systems to outperform humans at some tasks, most visibly in games, and is also applied in finance and robotics.
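As a rough illustration of trial-and-error learning, and not a reproduction of Barto and Sutton's own work, the Python sketch below has a program choose repeatedly among three actions with hidden payout chances. It mostly picks whichever action currently looks best, occasionally tries another at random, and updates its estimate after every reward. The function name `run_bandit`, the payout probabilities and the exploration rate `epsilon` are all hypothetical choices made for this example.

```python
import random


def run_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial-and-error learning on a multi-armed bandit.

    payout_probs: hypothetical chance that each action yields a reward of 1.
    Returns the learned value estimates and how often each action was tried.
    """
    rng = random.Random(seed)
    n_actions = len(payout_probs)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        # Explore a random action with probability epsilon; otherwise
        # exploit the action that currently looks best.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The simulated environment hands back a reward of 1 or 0.
        reward = 1.0 if rng.random() < payout_probs[action] else 0.0

        # Nudge the estimate toward the observed reward (incremental mean).
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.5, 0.8])
    print("estimated values:", [round(v, 2) for v in estimates])
    print("times tried:", counts)
```

Running the script prints the program's estimate of each action's value and how often it tried each one. With the settings shown, the highest-paying action should end up being chosen far more often than the others, even though the program was never told which action was best.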
One of the most famous applications of their work was in 2016 and 2017, when a Google DeepMind AI system defeated the world’s best human players at Go, a complex board game that had previously been thought too intricate for machines to master. Reinforcement learning has also played a crucial role in the development of AI tools like ChatGPT, financial trading algorithms, and even robotic hands that can solve a Rubik’s Cube.
However, Barto and Sutton’s work was not always widely recognised.
“We were kind of in the wilderness,” Barto told the Associated Press. “Which is why it’s so gratifying to receive this award. In the early days, it was not fashionable.”
Today, reinforcement learning is a core pillar of AI research, but when Barto and Sutton first introduced their ideas, AI was dominated by other approaches. Their research was inspired by psychology and neuroscience, particularly the way in which human neurons respond to rewards and punishment.
One of their earliest breakthroughs came in the 1980s, when they successfully applied reinforcement learning to a simulated task: balancing a pole on a moving cart without letting it fall. The two later co-authored a widely used textbook on reinforcement learning, which has influenced generations of AI researchers.
A prize that shapes the future
The A. M. Turing Award, named after British mathematician and early AI pioneer Alan Turing, carries a $1 million prize sponsored by Google. It recognises individuals whose contributions have had a lasting impact on computing.
Google’s chief scientist, Jeff Dean, praised Barto and Sutton’s contributions, stating:
“The tools they developed remain a central pillar of the AI boom and have rendered major advances, attracted legions of young researchers, and driven billions of dollars in investments.”
Despite their joint achievements, Barto and Sutton do not always see eye to eye on AI’s future.
AI’s risks and rewards
In a joint interview with the Associated Press, the two researchers expressed differing opinions on AI’s potential risks.
Sutton has long dismissed concerns that AI poses an existential threat to humanity, arguing that it is a natural progression for machines to learn from experience.
“The big choice is, do you try to learn from people’s data, or do you try to learn from an agent’s own life and its own experience?” Sutton said.
Barto, on the other hand, is more cautious. He believes that while AI is a powerful tool, there are unintended consequences that must be considered.
“You have to be cognizant of potential unexpected consequences,” Barto said.
His concern contrasts with Sutton’s optimism about post-humanism, the idea that AI will eventually surpass human intelligence and redefine what it means to be intelligent.
Sutton envisions a future where machines evolve beyond human capabilities, stating:
“People are machines. They’re amazing, wonderful machines. But they are not the end product. We can create things that work even better.”
A lasting impact on AI
Regardless of their differing views, Barto and Sutton’s work has transformed the AI landscape. Their contributions continue to shape robotics, gaming, finance, and the advanced AI applications behind today’s most sophisticated systems.
Their hedonistic machines, which thrive on reinforcement and reward, have fundamentally changed the way AI learns, making their research some of the most influential in modern computing.
With the Turing Award now added to their list of accolades, their pioneering work in teaching machines to learn from experience is likely to inspire future generations of AI researchers, and may one day change how intelligence itself is understood.