Richard Sutton · Jul 20, 2023 · 12:26 AM UTC

Pinned Tweet

Richard Sutton

@RichardSSutton

20 Jul 2023

AI researchers seek to understand intelligence well enough to create beings of greater intelligence than current humans. Reaching this profound intellectual milestone will enrich our economies and challenge our societal institutions. It will be unprecedented and transformational, but also a continuation of trends that are thousands of years old. People have always created tools and been changed by them; this is what humans do. The next big step is to understand ourselves. This is a quest grand and glorious, and quintessentially human.

153

1,028

280,299

Richard Sutton · Jan 27, 2025 · 3:27 AM UTC

Richard Sutton

@RichardSSutton

27 Jan 2025

Free Palestine.

258

1,769

18,850

574,106

Richard Sutton · Nov 18, 2022 · 11:54 PM UTC

Richard Sutton

@RichardSSutton

18 Nov 2022

Stand with the people of Iran.

351

2,303

12,247

Richard Sutton · Sep 26, 2025 · 9:47 PM UTC

Richard Sutton

@RichardSSutton

26 Sep 2025

Dwarkesh and I had a frank exchange of views. I hope we moved the conversation forward. Dwarkesh is a true gentleman.

Dwarkesh Patel

@dwarkesh_sp

26 Sep 2025

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew. 0:00:00 – Are LLMs a dead-end? 0:13:51 – Do humans do imitation learning? 0:23:57 – The Era of Experience 0:34:25 – Current architectures generalize poorly out of distribution 0:42:17 – Surprises in the AI field 0:47:28 – Will The Bitter Lesson still apply after AGI? 0:54:35 – Succession to AI

203

3,584

654,990

Richard Sutton · Mar 5, 2025 · 6:39 PM UTC

Richard Sutton

@RichardSSutton

5 Mar 2025

awards.acm.org/about/2024-tu… Machines that learn from experience were explored by Alan Turing almost eighty years ago, which makes it particularly gratifying and humbling to receive an award in his name for reviving this essential but still nascent idea.

Andrew Barto and Richard Sutton are the recipients of the 2024 ACM A.M. Turing Award for developing...

For developing the conceptual and algorithmic foundations of reinforcement learning.

awards.acm.org

155

336

2,813

227,569

Richard Sutton · Feb 25, 2025 · 9:27 PM UTC

Richard Sutton

@RichardSSutton

25 Feb 2025

"What we want is a machine that can learn from experience." ---Alan Turing, 1947

291

2,206

111,966

Richard Sutton · Sep 1, 2025 · 11:30 PM UTC

Richard Sutton

@RichardSSutton

1 Sep 2025

My acceptance speech at the Turing award ceremony: Good evening ladies and gentlemen. The main idea of reinforcement learning is that a machine might discover what to do on its own, without being told, from its own experience, by trial and error. As far as I know, the first person to propose this was Alan Turing in 1947, which makes it particularly gratifying and humbling to receive this award in his name for reviving this essential but still nascent idea. I have three people that I would like to particularly thank. First, Andy Barto. As my PhD supervisor he taught me my whole approach to science, and in particular instilled in me an appreciation of scholarship and craft, and of the great breadth of prior work. Second, I would like to thank Oliver Selfridge, my other main mentor; sadly, now deceased. Oliver taught me how keeping ideas simple can be the boldest of all ambitions. Third, I want to thank Martha Steenstrup, my life partner and intellectual sparring partner. She keeps me honest and grounded. Finally, I also want to thank the University of Alberta, which has been an ideal environment for me and for reinforcement learning research these past 22 years. These three people and my university have reinforced in me the ambition to have ideas that matter, without getting too full of myself about it. They taught me that the quest for better ideas is serious, but is best approached playfully, with humility, kindness, and optimism. For this I am eternally grateful. I would also like to thank all of you for being here and for celebrating the pursuit of intellectual excellence. Thank you very much.

223

2,273

183,538

Richard Sutton · Jul 5, 2025 · 8:44 PM UTC

Richard Sutton

@RichardSSutton

5 Jul 2025

It turns out the Turing Award is actually a silvery bowl from Tiffany's.

2,212

123,063

Richard Sutton · Dec 27, 2023 · 12:01 AM UTC

Richard Sutton

@RichardSSutton

27 Dec 2023

I've studied intelligence all my long life, yet still I feel I learned important things about intelligence by reading this book. Thank you, Max Bennett.

202

2,042

191,250

Richard Sutton · Jun 19, 2025 · 3:28 PM UTC

Richard Sutton

@RichardSSutton

19 Jun 2025

All the more so.

Sun @BillySun28

19 Jun 2025

Replying to @RichardSSutton

Dear Prof. Sutton, I recently bought one of your classic reinforcement learning books. But I would like to ask you, in the current era when deep reinforcement learning and large language models are prevalent, is it still necessary to read this book carefully?

131

1,704

148,190

Richard Sutton · Oct 22, 2025 · 3:20 PM UTC

Richard Sutton

@RichardSSutton

22 Oct 2025

Learning is the derivative of knowledge.

134

1,591

109,813

Richard Sutton · Mar 3, 2022 · 1:49 AM UTC

Richard Sutton

@RichardSSutton

3 Mar 2022

If you take all the fields that study intelligent decision making—from neuroscience to AI, psychology to control theory, economics to operations research—do their theories have much in common? I think so, as I explain in this new short paper: arxiv.org/pdf/2202.13252.pdf

263

1,476

Richard Sutton · Sep 3, 2025 · 4:24 PM UTC

Richard Sutton

@RichardSSutton

3 Sep 2025

Dwarkesh Patel is 100% right on this: AI's utility is very strongly dependent on continual learning. piped.video/nyvmYnz6EAg?si=D2v2…

128

1,481

434,867

Richard Sutton · Nov 24, 2024 · 10:51 PM UTC

Richard Sutton

@RichardSSutton

24 Nov 2024

The original RL algorithms, inspired by natural learning, were online and incremental—they were streaming in the sense that they learned from each increment of experience as it happened, then discarded it, never to be processed again. The streaming algorithms were simple and elegant, but the first big successes of RL in deep learning were not with streaming algorithms. Instead, methods such as DQN chopped the stream of experience into individual transitions, then stored and sampled them in arbitrary batches. Subsequent work followed, extended, and refined the batch approach into asynchronous and offline RL, while the streaming approach languished, unable to produce good results in popular deep learning domains. Until now. Now researchers at the University of Alberta have shown that streaming RL algorithms can work just as well as DQN on Atari and Mujoco tasks (arxiv.org/pdf/2410.14606). How did they do it? Mostly just by getting signal normalization and step-size bounding right for the streaming case—otherwise they use standard streaming algorithms like TD(lambda) and Q(lambda). To me it looks like they were simply the first researchers knowledgeable of streaming RL algorithms to seriously address deep RL without being over-influenced by batch-oriented software and batch-oriented supervised-learning ways of thinking.

Mohamed Elsayed @mhmd_elsaye

22 Nov 2024

Would you believe that deep RL can work without replay buffers, target networks, or batch updates? Our recent work gets deep RL agents to learn from a continuous stream of data one sample at a time without storing any sample. Joint work with @Gautham529 and @rupammahmood.
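To make "streaming" concrete, here is a minimal sketch, assuming a tabular TD(lambda) learner on a hypothetical five-state random walk. It is not the method of the linked paper: the signal normalization and step-size bounding credited above are left out, and every name and parameter value is illustrative. What it does show is the streaming property itself: each transition is used for one update and then dropped, with no replay buffer, target network, or batch.

```python
import random


def stream_td_lambda(num_states=5, episodes=3000, alpha=0.02,
                     gamma=1.0, lam=0.8, seed=0):
    """Tabular TD(lambda) on a random walk, one transition at a time.

    The walk starts in the middle state and moves left or right with
    equal probability. Stepping off the right end gives reward 1,
    stepping off the left end gives reward 0.
    """
    rng = random.Random(seed)
    w = [0.0] * num_states              # one value estimate per state
    for _ in range(episodes):
        z = [0.0] * num_states          # eligibility traces, reset per episode
        s = num_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            done = s_next < 0 or s_next >= num_states
            r = 1.0 if s_next >= num_states else 0.0
            v_next = 0.0 if done else w[s_next]
            delta = r + gamma * v_next - w[s]   # TD error for this step
            # Decay all traces, then mark the state just visited.
            for i in range(num_states):
                z[i] *= gamma * lam
            z[s] += 1.0
            # Update every state in proportion to its trace.
            for i in range(num_states):
                w[i] += alpha * delta * z[i]
            # The transition (s, r, s_next) is now discarded for good.
            if done:
                break
            s = s_next
    return w


values = stream_td_lambda()
print([round(v, 2) for v in values])
```

For this walk the true values are 1/6, 2/6, ..., 5/6 from left to right, so the printed estimates should rise from left to right and sit near 0.5 in the middle.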

226

1,383

129,544

Richard Sutton · Sep 26, 2025 · 11:23 PM UTC

Richard Sutton

@RichardSSutton

26 Sep 2025

Replying to @GaryMarcus @ylecun @demishassabis

You were never alone, Gary, though you were the first to bite the bullet, to fight the good fight, and to make the argument well, again and again, for the limitations of LLMs. I salute you for this good service!

1,192

666,516

Richard Sutton · Sep 29, 2022 · 10:52 PM UTC

Richard Sutton

@RichardSSutton

29 Sep 2022

The case for ambition in artificial intelligence research: Within your lifetime, AI researchers will understand the principles of intelligence—what it is and how it works—well enough to create beings of far greater intelligence than current humans.

135

1,062

Richard Sutton · Mar 27, 2025 · 4:23 PM UTC

Richard Sutton

@RichardSSutton

27 Mar 2025

Modern Americans, you are not responsible for slavery. You are not responsible for stealing the land and killing almost all native Americans. Those things happened long ago, before you were born. But you are responsible for the stealing of land and the genocide of Palestinians by the state of Israel. These are directly caused by your bombs and your votes today.

141

1,040

121,607

Richard Sutton · Oct 20, 2025 · 5:55 AM UTC

Richard Sutton

@RichardSSutton

20 Oct 2025

To learn more about temporal difference learning, you could read the original paper (incompleteideas.net/papers/s…) or watch this video (videolectures.net/videos/dee…).

Khurram Javed

@kjaved_

18 Oct 2025

The Dwarkesh/Andrej interview is worth watching. Like many others in the field, my introduction to deep learning was Andrej’s CS231n. In this era when many are involved in wishful thinking driven by simple pattern matching (e.g., extrapolating scaling laws without nuance), it’s refreshing to hear an influential voice that is tethered to reality. One clarification for the podcast is that when Andrej says humans don’t use reinforcement learning, he is really saying humans don't use returns as learning targets. His example of LLMs struggling to learn to solve math problems from outcome-based rewards also elucidates the problem with learning directly from returns. Fortunately for RL, this exact problem is solved by temporal difference (TD) learning. All sample-efficient RL algorithms that show human-like learning (e.g., sample-efficient learning on Atari, and our work on learning from experience directly on a robot) rely on TD learning. Now Andrej is not primarily an RL person; he is looking at RL through the lens of LLMs these days, and all RL done in LLMs uses returns as targets, so it’s understandable that he is assuming that RL is all about learning from observed returns. But this assumption leads him to the incorrect conclusion that we need process-based dense rewards for RL to work. If you embrace TD learning, then you don't necessarily need a dense reward. Once you have learned a value function that encodes useful knowledge about the world, you can learn on the fly in the absence of rewards, just like humans and animals. This is possible because in TD learning there is no difference between learning from an unexpected reward and learning from an unexpected change in perceived value.
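The closing claim, that TD learning treats an unexpected reward and an unexpected change in perceived value alike, can be read straight off the one-step TD error. A minimal sketch, assuming the standard form delta = r + gamma * V(s') - V(s); the numbers are made up for illustration.

```python
def td_error(reward, value, next_value, gamma=1.0):
    """One-step TD error: how much better or worse things look than predicted."""
    return reward + gamma * next_value - value


# Case 1: a surprise reward of 1, with no change in predicted value.
from_reward = td_error(reward=1.0, value=0.5, next_value=0.5)

# Case 2: no reward at all, but the next state's predicted value is 1 higher.
from_value_change = td_error(reward=0.0, value=0.5, next_value=1.5)

print(from_reward, from_value_change)
```

Both cases yield the same learning signal, which is why an agent with a useful value function can keep learning on steps where the reward is zero.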

118

1,060

159,649

Richard Sutton · Apr 26, 2022 · 8:58 PM UTC

Richard Sutton

@RichardSSutton

26 Apr 2022

A new pdf of Andy Barto's and my reinforcement learning textbook is released today. Only minor typo-like corrections. See incompleteideas.net/book/the….

181

1,013

Richard Sutton · Apr 11, 2025 · 6:26 PM UTC

Richard Sutton

@RichardSSutton

11 Apr 2025

David Silver really hits it out of the park in this podcast. The paper "Welcome to the Era of Experience" is here: goo.gle/3EiRKIH.

Google DeepMind

@GoogleDeepMind

10 Apr 2025

Human generated data has fueled incredible AI progress, but what comes next? 📈 On the latest episode of our podcast, @FryRsquared and David Silver, VP of Reinforcement Learning, talk about how we could move from the era of relying on human data to one where AI could learn for itself. Watch now → 00:00 Introduction 01:50 Era of experience 03:45 AlphaZero 10:19 Move 37 15:20 Reinforcement learning and human feedback 24:30 AlphaProof 29:50 Math Olympiads 35:00 Experience based methods 42:56 Hannah's reflections 44:00 Fan Hui joins

181

1,035

182,705

Richard Sutton · Jun 7, 2025 · 7:13 AM UTC

Richard Sutton

@RichardSSutton

7 Jun 2025

This I did not expect. Cool.

Ivanka Trump

@IvankaTrump

6 Jun 2025

Perhaps the most important thing you can read about AI this year: “Welcome to the Era of Experience” This excellent paper from two senior DeepMind researchers argues that AI is entering a new phase—the "Era of Experience"—which follows the prior phases of simulation-based learning and human data-driven AI (like LLMs). The authors posit that future AI breakthroughs will stem from learning through direct interaction with the world, not from imitating human-generated data. This is not a theory or distant future prediction. It’s a description of a paradigm shift already in motion. Let me know what you think! storage.googleapis.com/deepm…

991

127,292

Richard Sutton · Oct 14, 2022 · 10:03 PM UTC

Richard Sutton

@RichardSSutton

14 Oct 2022

If you want others to care about what you think, then start by caring yourself. Get a notebook, write your thoughts down, challenge them, and develop them into something worth sharing.

932

Richard Sutton · Mar 31, 2025 · 10:33 PM UTC

Richard Sutton

@RichardSSutton

31 Mar 2025

Rich's slogans for AI research (revised 2006): 1. Approximate the solution, not the problem (no special cases) 2. Drive from the problem 3. Take the agent’s point of view 4. Don’t ask the agent to achieve what it can’t measure 5. Don't ask the agent to know what it can't verify 6. Set measurable goals for subparts of the agent 7. Discriminative models are usually better than generative models 8. Work by orthogonal dimensions. Work issue by issue 9. Work on ideas, not software 10. Experience is the data of AI incompleteideas.net/rlai.cs.…

158

907

60,390

Richard Sutton · Jan 25, 2023 · 5:38 PM UTC

Richard Sutton

@RichardSSutton

25 Jan 2023

It is sad to lose the DeepMind office in Edmonton to the Tech layoffs and looming recession. But AI is not going away, and I am more focused than ever on the Alberta Plan for AI research. arxiv.org/abs/2208.11173

The Alberta Plan for AI Research

Herein we describe our approach to artificial intelligence research, which we call the Alberta Plan. The Alberta Plan is pursued within our research groups in Alberta and by others who are like...

arxiv.org

719

149,609

Richard Sutton · Jan 14, 2023 · 10:07 PM UTC

Richard Sutton

@RichardSSutton

14 Jan 2023

Blue laser eyes. I am laser focused on understanding intelligence, ignoring all the hype and FUD. (Bitcoin is pretty cool too)

692

284,554

Richard Sutton · Mar 7, 2025 · 6:45 PM UTC

Richard Sutton

@RichardSSutton

7 Mar 2025

I am pretty happy with this 30-minute summary of my views on the current state of AI and alignment. piped.video/watch?v=w177Ov-Y…

PTJC 20th Anniversary: Distinguished Lecture: Dr. Rich Sutton

Dr. Richard Sutton is Professor of Computing Science at the Univers...

youtube.com

103

723

122,775

Richard Sutton · Apr 22, 2025 · 12:21 AM UTC

Richard Sutton

@RichardSSutton

22 Apr 2025

Everything new is also old. This from my 1984 PhD thesis: "AI is an experimental science, yet the complexity of its programs and problem domains often makes the interpretation of results very difficult. Programs often contain so many components and parameters that limitations on computer time and the sheer number of possibilities make it impossible to experimentally evaluate how each contributes to performance." Then I argued, just as I do today, for careful empirical studies in simplified settings that enable better scientific understanding.

717

51,736

Richard Sutton · Oct 14, 2025 · 4:19 PM UTC

Richard Sutton

@RichardSSutton

14 Oct 2025

Replying to @GPUmonk

Read the textbook.

693

48,593

Richard Sutton · Sep 27, 2025 · 12:20 AM UTC

Richard Sutton

@RichardSSutton

27 Sep 2025

💯

Chris Hayduk

@ChrisHayduk

27 Sep 2025

Everyone posting about the Dwarkesh interview (including Dwarkesh himself!) is missing this subtle point. When LLMs imitate, they imitate the ACTION (ie the token prediction to produce the sequence). When humans imitate, they imitate the OUTPUT but must discover the action

656

101,293

Richard Sutton · Aug 18, 2025 · 5:55 PM UTC

Richard Sutton

@RichardSSutton

18 Aug 2025

I was happy to give a more technical talk on how we might create an AI at RLC-2025 and AGI-2025 (video below). The Oak Architecture: A Vision of Super-Intelligence from Experience As AI has become a huge industry, to an extent it has lost its way. What is needed to get us back on track to true intelligence? We need agents that learn continually. We need world models and planning. We need knowledge that is high-level and learnable. We need to meta-learn how to generalize. The Oak architecture is one answer to all these needs. It is a model-based RL architecture with three special features: 1) all of its components learn continually, 2) each learned weight has a dedicated step-size parameter that is meta-learned using online cross-validation, and 3) abstractions in state and time are continually created in a five-step progression: Feature Construction, posing a SubTask based on the feature, learning an Option to solve the subtask, learning a Model of the option, and Planning using the option’s model (the FC-STOMP progression). The Oak architecture is rather meaty; in this talk we give an outline and point to the many works, prior and contemporaneous, that are contributing to its overall vision of how super-intelligence can arise from an agent’s experience. piped.video/live/XqYTQfQeMrE…

AGI-25 Conference | Day 1 | Keynotes and Paper Presentations

Welcome to the first day of the 18th Annual AGI Conference (AGI-25)...

youtube.com

103

679

65,045

Richard Sutton · May 6, 2023 · 12:25 AM UTC

Richard Sutton

@RichardSSutton

6 May 2023

Lots of exaggeration about AI lately. The hype is that LLMs have anything to do with intelligence. The FUD is that AIs will enslave us. I like this cartoon in the New Yorker because it suggests the ridiculousness of both memes.

111

616

481,034

Richard Sutton · Sep 26, 2025 · 11:29 PM UTC

Richard Sutton

@RichardSSutton

26 Sep 2025

Replying to @eigenrobot

Even in birdsong learning in zebra finches the motor actions are not learned by imitation. The auditory result is reproduced, not the actions; in this crucial way it differs from LLM training.

613

316,984

Richard Sutton · Nov 24, 2023 · 6:06 PM UTC

Richard Sutton

@RichardSSutton

24 Nov 2023

I agree 100%

Yann LeCun

@ylecun

23 Nov 2023

Replying to @DrJimFan @RichardSSutton

Animals and humans get very smart very quickly with vastly smaller amounts of training data. My money is on new architectures that would learn as efficiently as animals and humans. Using more data (synthetic or not) is a temporary stopgap made necessary by the limitations of our current approaches.

568

278,632

Richard Sutton · Jan 22, 2022 · 12:56 AM UTC

Richard Sutton

@RichardSSutton

22 Jan 2022

I kind of wish Geoff Hinton would write a brief article like this one by Claude Shannon in 1956: ieeexplore.ieee.org/stamp/st…

531

Richard Sutton · Mar 31, 2025 · 9:55 PM UTC

Richard Sutton

@RichardSSutton

31 Mar 2025

I’ve changed so little. From my 1978 Bachelor’s thesis: “The adult human mind is very complex, but the question remains open whether the learning processes that constructed it in interaction with the environment are similarly complex. Much evidence and many peoples’ intuitions suggest that the learning processes are in fact simple and that the adult mind’s complexity is due to a long history of adaptive interaction with a complex environment.”

541

46,954

Richard Sutton · Sep 30, 2025 · 6:39 PM UTC

Richard Sutton

@RichardSSutton

30 Sep 2025

Still timely

Richard Sutton

@RichardSSutton

6 May 2023

530

64,298

Richard Sutton · Jul 18, 2024 · 5:21 PM UTC

Richard Sutton

@RichardSSutton

18 Jul 2024

The one-step trap (in AI research) The one-step trap is the common mistake of thinking that all or most of an AI agent’s learned predictions can be one-step ones, with all longer-term predictions generated as needed by iterating the one-step predictions. The most important place where the trap arises is when the one-step predictions constitute a model of the world and of how it evolves over time. It is appealing to think that one can learn just a one-step model and then “roll it out” to predict all the longer-term consequences of a way of behaving. The one-step model is thought of as being analogous to physics, or to a realistic simulator. The appeal of this mistake is that it contains a grain of truth: if all one-step predictions can be made with perfect accuracy, then they can be used to make all longer-term predictions with perfect accuracy. However, if the one-step predictions are not perfectly accurate, then all bets are off. In practice, iterating one-step predictions usually produces poor results. The one-step errors compound and accumulate into large errors in the long-term predictions. In addition, computing long-term predictions from one-step ones is prohibitively computationally complex. In a stochastic world, or for a stochastic policy, the future is not a single trajectory, but a tree of possibilities, each of which must be imagined and weighted by its probability. As a result, the computational complexity of computing a long-term prediction from one-step predictions is exponential in the length of the prediction, and thus generally infeasible. The bottom line is that one-step models of the world are hopeless, yet extremely appealing, and are widely used in POMDPs, Bayesian analyses, control theory, and in compression theories of AI. The solution, in my opinion, is to form temporally abstract models of the world using options and GVFs, as in the following references. Sutton, R.S., Precup, D., Singh, S. (1999). Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Artificial Intelligence 112:181-211. Sutton, R. S., Modayil, J., Delp, M., Degris, T., Pilarski, P. M., White, A., Precup, D. (2011). Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. In Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems, Taipei, Taiwan. Sutton, R. S., Machado, M. C., Holland, G. Z., Timbers, D. S. F., Tanner, B., & White, A. (2023). Reward-respecting subtasks for model-based reinforcement learning. Artificial Intelligence 324.
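The two costs named above, compounding error and an exponentially growing tree of futures, can be put in numbers with a toy calculation. This is only a sketch under strong assumptions: a hypothetical scalar system x' = a * x whose learned model gets the coefficient slightly wrong, and a fixed branching factor per step. Real models can behave differently (errors need not grow in a contractive system, for instance), and none of these names come from the cited papers.

```python
def rollout(a, x0, steps):
    """Iterate the one-step model x' = a * x for the given number of steps."""
    x = x0
    for _ in range(steps):
        x = a * x
    return x


def relative_error(a_true, a_model, steps):
    """Relative error of the model's rollout against the true rollout."""
    true_x = rollout(a_true, 1.0, steps)
    pred_x = rollout(a_model, 1.0, steps)
    return abs(pred_x - true_x) / abs(true_x)


def num_trajectories(branching, steps):
    """Distinct futures when each step has `branching` possible outcomes."""
    return branching ** steps


for k in (1, 10, 50):
    print(k, round(relative_error(1.0, 1.05, k), 2), num_trajectories(3, k))
```

A 5% one-step error grows to more than ten times the true value after 50 steps in this setup, and with three outcomes per step there are already 59,049 distinct ten-step trajectories to weigh.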

513

59,413

Richard Sutton · Aug 28, 2024 · 8:22 PM UTC

Richard Sutton

@RichardSSutton

28 Aug 2024

If you are looking to conduct research full-time on the foundations of AI, and • you have read the RL textbook and done the exercises, • you agree with the Alberta Plan for AI Research, • you already have a PhD, • you are open to spending some time in Edmonton, then the Openmind Research Institute is looking for you and would be pleased to receive your application for a research fellowship. These criteria are meant to be a high bar, and there are only a few positions; don't apply unless you meet all the criteria. Openmind doesn't pay industry salaries, but in compensation aims to conduct research that truly matters in the long run. Oh, and there is one other catch: all your research must be published in the open scientific literature. openmindresearch.org

507

77,771

Richard Sutton · Jun 18, 2025 · 7:45 AM UTC

Richard Sutton

@RichardSSutton

18 Jun 2025

In war, both sides lose. That we don’t learn this is the greatest tragedy.

868

53,389

Richard Sutton · Apr 25, 2025 · 1:45 AM UTC

Richard Sutton

@RichardSSutton

25 Apr 2025

My nuanced views on AI alignment are still often caricatured, so perhaps it's a good time to repost this 15-minute talk in which I presented them directly: piped.video/watch?v=Hnt-oBA0… The short version is that I don't agree with AI-safety folks about what question we should be asking. Rather than asking how we can control the goals of the AIs, I think we should be asking how we can have a good future without controlling their goals (just as we have a pretty good present without controlling other people's goals). @steve47285

Value alignment? | Richard Sutton & Blaise Agüera y Arcas | Absolut...

AI systems are increasingly being used for decisions that have sign...

youtube.com

495

78,469

Richard Sutton · May 15, 2022 · 10:39 PM UTC

Richard Sutton

@RichardSSutton

15 May 2022

My favorite conference is a small one: The Multi-disciplinary Conference on Reinforcement Learning and Decision Making. It works best if only those with a genuine interest in crossing disciplines attend.

476

Richard Sutton · Apr 4, 2025 · 9:57 PM UTC

Richard Sutton

@RichardSSutton

4 Apr 2025

The PhD thesis of my _first_ PhD student, Doina Precup, is at-long-last available in digital form. Title: Temporal Abstraction in Reinforcement Learning Url: incompleteideas.net/papers/P… Abstract: Decision making usually involves choosing among different courses of action over a broad range of time scales. For instance, a person planning a trip to a distant location makes high-level decisions regarding what means of transportation to use, but also chooses low-level actions, such as the movements for getting into a car. The problem of picking an appropriate time scale for reasoning and learning has been explored in artificial intelligence, control theory and robotics. In this dissertation we develop a framework that allows novel solutions to this problem, in the context of Markov Decision Processes (MDPs) and reinforcement learning. In this dissertation, we present a general framework for prediction, control and learning at multiple temporal scales. In this framework, temporally extended actions are represented by a way of behaving (a policy) together with a termination condition. An action represented in this way is called an _option_. Options can be easily incorporated in MDPs, allowing an agent to use existing controllers, heuristics for picking actions, or learned courses of action. The effects of behaving according to an option can be predicted using multi-time models, learned by interacting with the environment. In this dissertation we develop multi-time models, and we illustrate the way in which they can be used to produce plans of behavior very quickly, using classical dynamic programming or reinforcement learning techniques. The most interesting feature of our framework is that it allows an agent to work simultaneously with high-level and low-level temporal representations. The interplay of these levels can be exploited in order to learn and plan more efficiently and more accurately. 
We develop new algorithms that take advantage of this structure to improve the quality of plans, and to learn in parallel about the effects of many different options. Where now: Doina is a professor of computer science at McGill University and head of the Montreal office of Google DeepMind
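The abstract's definition of an option, a policy together with a termination condition, maps onto a small data structure. The sketch below is an illustration rather than code from the thesis: the corridor environment, the `Option` class, and `run_option` are all hypothetical names. Running the option to termination produces the three quantities a multi-time model would predict: where the option ends, the discounted reward collected along the way, and how long it took.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class Option:
    policy: Callable[[int], int]          # state -> primitive action
    termination: Callable[[int], float]   # state -> probability of stopping


def corridor_step(state, action):
    """Hypothetical corridor with states 0..10; each step costs a reward of -1."""
    next_state = max(0, min(10, state + action))
    return next_state, -1.0


def run_option(option, state, step, gamma=0.9, rng=None, max_steps=1000):
    """Follow the option's policy until its termination condition fires."""
    rng = rng or random.Random(0)
    total, discount, duration = 0.0, 1.0, 0
    while duration < max_steps:
        action = option.policy(state)
        state, reward = step(state, action)
        total += discount * reward
        discount *= gamma
        duration += 1
        if rng.random() < option.termination(state):
            break
    return state, total, duration


# A temporally extended action: "walk right until the end of the corridor".
go_right = Option(
    policy=lambda s: +1,
    termination=lambda s: 1.0 if s == 10 else 0.0,
)

end_state, discounted_reward, duration = run_option(go_right, 7, corridor_step)
print(end_state, round(discounted_reward, 2), duration)
```

Starting from state 7, the option ends in state 10 after 3 steps with a discounted reward of -1 - 0.9 - 0.81 = -2.71. A planner that has learned this summary can treat the whole walk as a single step.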

492

34,446

Richard Sutton · Sep 21, 2025 · 7:42 PM UTC

Richard Sutton

@RichardSSutton

21 Sep 2025

For those really into it, here are another 50 minutes of my views on planning and action selection in options-based AI agents (like in the Oak architecture). piped.video/watch?v=eJSoV2fS…

Rich Sutton, Planning and Action Selection in Options-based Agents

Rich Sutton is here to challenge one of the biggest misconceptions ...

youtube.com

495

77,703

Richard Sutton · Feb 15, 2025 · 11:45 PM UTC

Richard Sutton

@RichardSSutton

15 Feb 2025

The PhD thesis of my 14th PhD student, Khurram Javed (@KhurramJaved_96), is now available. Title: Real-time Reinforcement Learning for Achieving Goals in Big Worlds Url: incompleteideas.net/papers/j… Abstract: In this dissertation, I motivate the need for real-time learning and propose algorithms that can learn in real time. I argue that such algorithms are needed for achieving goals in large and partially observable environments—big worlds. I then present my algorithms, developed in collaboration with others, in two parts. In Part I, I present algorithms that can learn quickly and reliably in the linear function approximation setting. I introduce an algorithm for learning temporal predictions—SwiftTD—and use it to develop an algorithm for decision-making—SwiftSarsa. The key property of these algorithms is that they can learn with large step-size parameters online without the instability associated with quick online learning. In Part II, I present algorithms for learning non-linear recurrent features efficiently. I introduce the idea of continual imprinting for generating useful candidate features, and I present an algorithm for efficiently computing the gradients of recurrent features online. Khurram is now a research scientist at Keen Technologies.

477

44,608

Richard Sutton · Feb 1, 2024 · 6:41 AM UTC

Richard Sutton

@RichardSSutton

1 Feb 2024

Yes, the agent architectures that Yann LeCun and I work on are both instances of “the common model of the intelligent agent”. And it’s not just an AI thing. You can find the same ideas in psychology, economics, control theory, and neuroscience. See arxiv.org/pdf/2202.13252.pdf

Art of the Problem

@Artoftheproblem

1 Feb 2024

Replying to @Artoftheproblem @ylecun @RichardSSutton

These two diagrams share a lot of similarities

464

76,526

Richard Sutton · Aug 25, 2022 · 3:50 PM UTC

Richard Sutton

@RichardSSutton

25 Aug 2022

A draft of the Alberta Plan for AI Research came out today on arXiv. :-) arxiv.org/abs/2208.11173

The Alberta Plan for AI Research

Herein we describe our approach to artificial intelligence research, which we call the Alberta Plan. The Alberta Plan is pursued within our research groups in Alberta and by others who are like...

arxiv.org

431

Richard Sutton · May 13, 2025 · 10:56 PM UTC

Richard Sutton

@RichardSSutton

13 May 2025

In 1993, it was looking like the internet was actually going to be a thing, so I made a homepage for myself. This is what I wrote for my personal statement: "I am seeking to identify general computational principles underlying what we mean by intelligence and goal-directed behavior. I start with the _interaction_ between the intelligent agent and its environment. Goals, choices, and sources of information are all defined in terms of this interaction. In some sense it is the only thing that is real, and from it all our sense of the world is created. How is this done? How can interaction lead to better behavior, better perception, better models of the world? What are the computational issues in doing this efficiently and in realtime? These are the sort of questions that I ask in trying to understand what it means to be intelligent, to predict and influence the world, to learn, perceive, act, and think." The point being that it was always all about experience for me.

434

25,397

Richard Sutton · Jul 5, 2025 · 8:26 PM UTC

Richard Sutton

@RichardSSutton

5 Jul 2025

In my recent talk at the Upperbound conference, I included a slide in which I tried to be a realist about the arrival of AI, setting aside what we might want to happen or what we might fear will happen, and just ask what _will_ happen (as in John Mearsheimer's "realist" school of geo-politics). Full talk: piped.video/FLOL2f4iHKA

442

82,565

Richard Sutton · Jan 7, 2022 · 3:03 AM UTC

Richard Sutton

@RichardSSutton

7 Jan 2022

DeepMind Alberta is hiring research scientists this year. Come join us in understanding and creating interactive, playful AI. deepmind.com/careers/jobs/88…

402

Richard Sutton · Oct 20, 2025 · 5:30 AM UTC

Richard Sutton

@RichardSSutton

20 Oct 2025

Well said

Csaba Szepesvari @CsabaSzepesvari

19 Oct 2025

Replying to @karpathy

@karpathy I think it would be good to distinguish RL as a problem from the algorithms that people use to address RL problems. This would allow us to discuss if the problem is with the algorithms, or if the problem is with posing a problem as an RL problem. 1/x

416

96,265

Richard Sutton · Oct 7, 2025 · 5:42 AM UTC

Richard Sutton

@RichardSSutton

7 Oct 2025

Replying to @beforeasi @dwarkesh_sp

Yeah, I misspoke there. I meant to say that I don’t think learning is about *training*. Learning is something that the agent does, whereas training is something done to it.

411

33,288

Richard Sutton · Sep 9, 2023 · 1:11 AM UTC

Richard Sutton

@RichardSSutton

9 Sep 2023

We should prepare for, but not fear, the inevitable succession from humanity to AI, or so I argue in this talk pre-recorded for presentation at WAIC in Shanghai. piped.video/NgHFMolXs3U

AI Succession

This video about the inevitable succession from humanity to AI was ...

youtube.com

380

436,811

Richard Sutton · Nov 5, 2025 · 8:09 PM UTC

Richard Sutton

@RichardSSutton

5 Nov 2025

And the new Superintelligence Research Lab will be centered in... Edmonton!

Giri ATG

@lazyuniverse

5 Nov 2025

Launching our Research Lab : Advancing experience powered, decentralized superintelligence - built for continual learning, generalization & model-based planning. Press Release : businesswire.com/news/home/2… We’re solving the hardest challenges in real-world industries, robotics, science … unlocking true intelligence that learns from experience. #Superintelligence #AIResearch #ReinforcementLearning #TrueRL #ContinualLearning #ModelBasedPlanning #DecentralizedAI #ExperientialAI #EnterpriseAI

422

88,406

Richard Sutton · Jul 27, 2025 · 9:18 PM UTC

Richard Sutton

@RichardSSutton

27 Jul 2025

My colleague Rupam Mahmood explains from first principles his groundbreaking work on Streaming Deep Reinforcement Learning: piped.video/QOfkOl9QrZY?si=6qMV…

399

50,189

Richard Sutton · Jun 22, 2024 · 7:28 PM UTC

Richard Sutton

@RichardSSutton

22 Jun 2024

The PhD thesis of my 12th PhD student, Abhishek Naik, is now available. Title: Reinforcement Learning for Continuing Problems Using Average Reward Url: incompleteideas.net/papers/N… Abstract: This dissertation develops simple and practical learning algorithms from first principles for long-lived agents. Formally, the algorithms are developed within the reinforcement learning framework for continuing (non-episodic) problems, in which the agent-environment interaction goes on ad infinitum, with the goal of maximizing the average reward obtained per step. The average-reward formulation is under-studied in reinforcement learning with several important open problems. The first contribution of this dissertation involves the development of foundational one-step average-reward learning methods for prediction and control. The central idea involves using the TD error to estimate the average reward, which enables proofs for convergence in both the on- and off-policy tabular settings. Experimental results show that the algorithms’ performance is robust to the values of their parameters. Next, we extend the above one-step prediction algorithm to make multi-step updates using eligibility traces, because multi-step methods can be more sample-efficient. Based on the analysis of a related algorithm, we prove convergence in the on-policy setting with linear function approximation. We also show the first convergence proof in the off-policy setting for a multi-step tabular average-reward prediction algorithm. Finally, we show that standard discounted algorithms can be significantly improved if their rewards are centered by subtracting out the rewards’ empirical average, which could be changing with time in the control problem. We discuss two ways of estimating the average reward that can be used with any standard discounted algorithm and demonstrate the benefits of reward centering with tabular, linear, and non-linear function approximation.

374

48,643
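The central idea in the thesis abstract above, using the TD error to estimate the average reward, can be illustrated with a small tabular sketch. This is only a minimal reading of that one sentence, not the thesis's own code: the function name `differential_td`, the step sizes `alpha` and `eta`, and the two-state deterministic loop are all assumptions made for illustration.

```python
def differential_td(rewards, next_state, steps=2000, alpha=0.1, eta=1.0):
    """Tabular one-step average-reward prediction on a deterministic chain.

    rewards[s] is the reward for leaving state s, and next_state[s] is the
    state that follows s. Both are hypothetical inputs for this sketch.
    """
    values = [0.0] * len(next_state)  # differential value estimates
    avg_reward = 0.0                  # running estimate of reward per step
    state = 0
    for _ in range(steps):
        s_next = next_state[state]
        r = rewards[state]
        # TD error measured relative to the current average-reward estimate.
        delta = r - avg_reward + values[s_next] - values[state]
        values[state] += alpha * delta
        # The same TD error also drives the average-reward estimate.
        avg_reward += eta * alpha * delta
        state = s_next
    return avg_reward, values


# Two states that alternate forever, with rewards 0 and 2 (1 per step on average).
avg_reward, values = differential_td(rewards=[0.0, 2.0], next_state=[1, 0])
```

In this toy loop the average-reward estimate settles near 1, and the two differential values end up about one unit apart, which is the gap the reward structure implies.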

Richard Sutton · Jan 9, 2022 · 9:22 PM UTC

Richard Sutton

@RichardSSutton

9 Jan 2022

I was thinking about how fractious AI research is. This sentence from Kuhn’s “The Structure of Scientific Revolutions” (1962) is apropos and succinct: “History suggests that the road to a firm research consensus is extraordinarily arduous.”

358

Richard Sutton · Apr 3, 2022 · 1:05 AM UTC

Richard Sutton

@RichardSSutton

3 Apr 2022

I am proud to announce the graduation of my sixth PhD student. Sina Ghiassian is an expert in the design and empirical study of off-policy reinforcement learning algorithms. Reach out to him at ghiassia@ualberta.ca or @sina_ghiassian.

357

Richard Sutton · May 5, 2024 · 9:09 PM UTC

Richard Sutton

@RichardSSutton

5 May 2024

Everything you know about the world is a belief about the statistics of your sensory input and how they depend on your output. There is nothing more to it, and understanding knowledge in this sense is one key to creating AI.

352

39,573

Richard Sutton · Jan 13, 2023 · 4:57 AM UTC

Richard Sutton

@RichardSSutton

13 Jan 2023

Yi Wan will be my eighth PhD student to graduate this spring, and is on the job market now. His research speciality is RL algorithms that maximize the average reward per step. Such algorithms are rarely used today, but are better in all ways. sites.google.com/ualberta.ca…

342

96,263

Richard Sutton · Nov 3, 2024 · 9:38 PM UTC

Richard Sutton

@RichardSSutton

3 Nov 2024

Fans of The Bitter Lesson may be interested in this talk from 2018 (recently re-discovered) which includes its first public presentation, at 30:40. piped.video/tUCJ4UsKU2I?si=ubbY…

Weinberg Symposium 2018: Sutton

youtube.com

345

34,905

Richard Sutton · Feb 2, 2025 · 8:29 PM UTC

Richard Sutton

@RichardSSutton

2 Feb 2025

We have fun and talk about DeepSeek in the second episode of a new podcast on the frontier of AI and economic productivity organized by the great @professor_ajay. Check it out. insights.intrepidgp.com//p/d…

DeepSeek (The Derby Mill Series ep 02)

How the Chinese start-up developed an impressive AI chatbot at a fraction of the cost—and what that means for the future of machine learning.

insights.intrepidgp.com

336

37,431

Richard Sutton · Jul 23, 2023 · 8:22 PM UTC

Richard Sutton

@RichardSSutton

23 Jul 2023

There are a lot of things wrong with this world… but too much intelligence is not one of them.

314

206,220

Richard Sutton · Nov 18, 2024 · 6:38 AM UTC

Richard Sutton

@RichardSSutton

18 Nov 2024

“Nature never appeals to intelligence until habit and instinct are useless. There is no intelligence where there is no change and no need of change.” —H. G. Wells, The Time Machine

324

18,399

Richard Sutton · Aug 15, 2023 · 9:46 PM UTC

Richard Sutton

@RichardSSutton

15 Aug 2023

Last night we threw Yi Wan out of my research group (and today he started his travel to Seattle and Meta).

313

45,753

Richard Sutton · May 9, 2025 · 5:59 AM UTC

Richard Sutton

@RichardSSutton

9 May 2025

True dat

finbarr

@finbarrtimbers

7 May 2025

now that RL is hot again, you should all register for RLC and come visit Edmonton in August rl-conference.cc/index.html

323

26,979

Richard Sutton · Dec 26, 2021 · 9:43 PM UTC

Richard Sutton

@RichardSSutton

26 Dec 2021

Levels of explanation. Level 1 is physics. Level 2 is biology/evolution. Level 3 is the mind. (I study level 3.) Level 4 is the economy. Is there a level 5?

303

Richard Sutton · Aug 29, 2023 · 1:50 AM UTC

Richard Sutton

@RichardSSutton

29 Aug 2023

We finally have a version of our paper on loss of plasticity and continual backprop that is polished and submitted to a journal. Good work led by my PhD student Shibhansh Dohare. arxiv.org/abs/2306.13812v2

Loss of Plasticity in Deep Continual Learning

Modern deep-learning systems are specialized to problem settings in which training occurs once and then never again, as opposed to continual-learning settings in which training occurs continually....

arxiv.org

291

52,304

Richard Sutton · May 31, 2025 · 3:25 AM UTC

Richard Sutton

@RichardSSutton

31 May 2025

ACM has made an excellent video introduction to reinforcement learning!

Communications of the ACM @CACMmag

28 May 2025

2024 @TheOfficialACM A.M. Turing Award recipients @RichardSSutton and Andrew G. Barto discuss their #careers and their work on reinforcement learning in an #original #video at bit.ly/43fpe4q

305

23,766

Richard Sutton · Jul 19, 2024 · 12:12 AM UTC

Richard Sutton

@RichardSSutton

19 Jul 2024

Marc Andreessen: I’ve DMed you.

290

91,335

Richard Sutton · Aug 29, 2022 · 9:48 PM UTC

Richard Sutton

@RichardSSutton

29 Aug 2022

I recently gave a keynote talk at an exciting new conference: CoLLAs, the conference on life-long learning agents. My talk was on Maintaining Plasticity in Deep Continual Learning, and the slides can be found here: incompleteideas.net/Talks/Ta…

258

Richard Sutton · Apr 17, 2025 · 5:38 AM UTC

Richard Sutton

@RichardSSutton

17 Apr 2025

This thread in Chinese does indeed seem to accurately communicate the main points of David Silver’s and my short paper on the Era of Experience. Thanks @AnneXingxb!

xingxb @AnneXingxb

16 Apr 2025

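1/6 TheBitter RL: Today RL is 🔥 hot, and RLHF is a sure ticket to graduating. But "Welcome to the Era of Experience" by @RichardSSutton and @GoogleDeepMind, like a sequel to The Bitter Lesson, hits us like a wake-up call. Having lived through the era of simulation and enjoyed the era of human data, we are now stepping into the era of experience: relying not on imitation, not on learning, but on "having lived". #AI范式 #RL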

ALT https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf

271

30,339

Richard Sutton · Apr 2, 2025 · 12:25 AM UTC

Richard Sutton

@RichardSSutton

2 Apr 2025

The research team at Openmind now consists of one director and four fellows. With no fanfare and no hype, they go about researching AI exactly how they think will be most productive. openmindresearch.org/

Openmind Research Institute

openmindresearch.org

263

20,341

Richard Sutton · Oct 25, 2022 · 12:02 AM UTC

Richard Sutton

@RichardSSutton

25 Oct 2022

Intelligence is the computational part of an agent’s ability to learn to predict and control its input stream (particularly its reward) in interaction with its environment.

251

Richard Sutton · Nov 15, 2024 · 12:10 AM UTC

Richard Sutton

@RichardSSutton

15 Nov 2024

"In a world of change, the learners shall inherit the earth, while the learned shall find themselves perfectly suited for a world that no longer exists." - Eric Hoffer, philosopher and author

249

13,068

Richard Sutton · Nov 24, 2023 · 7:11 PM UTC

Richard Sutton

@RichardSSutton

24 Nov 2023

Replying to @sprk_77

Not at all. The point of the bitter lesson is that the right learning algorithms (those that scale efficiently with massive computation) are exactly what we need. Massive computation does not alleviate the need for data efficiency.

256

45,428

Richard Sutton · Oct 29, 2022 · 1:54 AM UTC

Richard Sutton

@RichardSSutton

29 Oct 2022

I have just completed my NSERC Discovery Grant proposal, describing the research I'd like to do for the next five years. It can be read at incompleteideas.net/NSERCtec…. FYI.

244

Richard Sutton · Sep 14, 2024 · 10:31 PM UTC

Richard Sutton

@RichardSSutton

14 Sep 2024

The PhD thesis of my 13th PhD student, Kris De Asis (@M33pinator), is now available. Title: Explorations in the Foundations of Value-based Reinforcement Learning Url: incompleteideas.net/papers/K… Abstract: Value-based reinforcement learning is an approach to sequential decision making in which decisions are informed by learned, long-horizon predictions of future reward. This dissertation aims to understand issues that value-based methods face and develop algorithmic ideas to address these issues. It details three areas of contribution toward improving value-based methods. The first area of contribution extends temporal difference methods for fixed-horizon predictions. Regardless of problem setting, using fixed-horizon approximations of the return avoids the well-documented stability issues which plague off-policy temporal difference methods with function approximation. The second area of contribution introduces a framework of value-aware importance weights for off-policy learning and derives a minimum-variance instance of them. This alleviates variance concerns of importance sampling-based off-policy corrections. Lastly, the third area of contribution acknowledges a discrepancy between the discrete-time and continuous-time returns when viewing one as an approximation of the other, and proposes a modification to better align the objectives. This provides improved prediction targets, and when faced with variable time-discretization, improves control performance in terms of an underlying integral return. Where now: Kris is a research fellow at openmindresearch.org

243

30,927
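The first contribution named in the abstract above, temporal difference methods for fixed-horizon predictions, can be sketched in tabular form as follows. This is a hedged illustration rather than the thesis's implementation: the function name `fixed_horizon_td`, the parameters, and the two-state deterministic loop are assumptions. Each horizon h bootstraps only from horizon h - 1, so no estimate ever bootstraps from itself.

```python
def fixed_horizon_td(rewards, next_state, horizon=3, steps=400, alpha=0.5, gamma=1.0):
    """Tabular fixed-horizon TD prediction on a deterministic chain.

    values[h][s] estimates the return over the next h steps from state s.
    values[0] is identically zero and is never updated.
    """
    n = len(next_state)
    values = [[0.0] * n for _ in range(horizon + 1)]
    state = 0
    for _ in range(steps):
        s_next = next_state[state]
        r = rewards[state]
        # Update longer horizons first so each target uses the previous
        # step's estimate for the shorter horizon.
        for h in range(horizon, 0, -1):
            target = r + gamma * values[h - 1][s_next]
            values[h][state] += alpha * (target - values[h][state])
        state = s_next
    return values


# Two alternating states with rewards 0 and 2, predicting 3-step returns.
fh_values = fixed_horizon_td(rewards=[0.0, 2.0], next_state=[1, 0])
```

For this loop the 3-step returns are 0 + 2 + 0 = 2 from the first state and 2 + 0 + 2 = 4 from the second, and the learned `fh_values[3]` approaches those numbers.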

Richard Sutton · Sep 3, 2025 · 4:36 PM UTC

Richard Sutton

@RichardSSutton

3 Sep 2025

The Pandemonium paper is seminal, but a little hard to find; here is a pdf: incompleteideas.net/papers/p…

Jorge Hernandez 🇺🇦 🏳️‍🌈@braneloop

2 Sep 2025

Replying to @RichardSSutton

I recently had the pleasure of having to read Selfridge's Pandemonium. What an amazing mind he had.

249

31,256

Richard Sutton · Sep 17, 2022 · 4:09 PM UTC

Richard Sutton

@RichardSSutton

17 Sep 2022

When there is a war, both sides have failed.

228

Richard Sutton · Dec 21, 2021 · 7:19 PM UTC

Richard Sutton

@RichardSSutton

21 Dec 2021

The special thing about life is that it has a now.

234

Richard Sutton · Oct 11, 2025 · 7:43 AM UTC

Richard Sutton

@RichardSSutton

11 Oct 2025

More on LLMs, RL, and the bitter lesson, on the Derby Mill podcast.

Ajay Agrawal

@professor_ajay

10 Oct 2025

Are LLMs Bitter Lesson pilled? @RichardSSutton says "no" @m_sendhil @suzannegildert @shulgan piped.video/watch?v=e-sghqKZ…

235

44,306

Richard Sutton · Apr 11, 2025 · 7:13 PM UTC

Richard Sutton

@RichardSSutton

11 Apr 2025

The short paper "Welcome to the Era of Experience" is literally just released, like this week. Ultimately it will become a chapter in the book 'Designing an Intelligence' edited by George Konidaris and published by MIT Press. goo.gle/3EiRKIH

253

27,975

Richard Sutton · Jun 17, 2025 · 1:09 AM UTC

Richard Sutton

@RichardSSutton

17 Jun 2025

This is what we have been up to at Keen

John Carmack

@ID_AA_Carmack

16 Jun 2025

The video of my talk at Upper Bound 2025 is up: piped.video/rQ-An5bhkrs?si=y9DP…

214

33,443

Richard Sutton · Feb 20, 2024 · 12:57 AM UTC

Richard Sutton

@RichardSSutton

20 Feb 2024

My tenth PhD student, Banafsheh Rafiee, just defended her thesis “State Construction in Reinforcement Learning”, in which she introduced three diagnostic testbeds based on animal learning experiments and the first generate-and-test algorithm for discovering auxiliary subtasks. She is currently looking for a research scientist position. PhD thesis: drive.google.com/file/d/1sxa… Linkedin page: ca.linkedin.com/in/banafsheh… Google scholar page: scholar.google.ca/citations?… Email: rafiee.banw@gmail.com

197

27,203

Richard Sutton · May 28, 2022 · 7:24 PM UTC

Richard Sutton

@RichardSSutton

28 May 2022

Intelligence is the computational part of the ability to predict and control a sensory input stream. Adapted from John McCarthy's 1997 definition, see incompleteideas.net/papers/S…

205

Richard Sutton · Aug 24, 2024 · 6:17 PM UTC

Richard Sutton

@RichardSSutton

24 Aug 2024

A year later and our work on Loss of Plasticity is finally published, in Nature no less! The Nature version is totally rewritten and has many new results: nature.com/articles/s41586-0… Congratulations to the authors: @s_dohare @JFernandoHG @LanceLan3 @rahman_parash @rupammahmood

Loss of plasticity in deep continual learning

Nature - The pervasive problem of artificial neural networks losing plasticity in continual-learning settings is demonstrated and a simple solution called the continual backpropagation algorithm is...

nature.com

Richard Sutton

@RichardSSutton

29 Aug 2023

194

16,831

Richard Sutton · May 6, 2025 · 6:41 PM UTC

Richard Sutton

@RichardSSutton

6 May 2025

The latest episode of the Derby Mill Podcast is just out and focused on the "Era of Experience" paper by David Silver and myself. Substack: insights.intrepidgp.com/p/we… Spotify: open.spotify.com/episode/254… Apple: podcasts.apple.com/us/podcas… YouTube: piped.video/watch?v=dhfJfQ5N…

Welcome to the Era of Experience (The Derby Mill Series ep 10)

What will the future of AI look like once human-derived knowledge has reached its limit?

insights.intrepidgp.com

203

22,749

Richard Sutton · Jul 19, 2024 · 12:24 AM UTC

Richard Sutton

@RichardSSutton

19 Jul 2024

Artificial Agency is led by my former students and colleagues---people I know well. They are the best in the world at using reinforcement learning and foundation models to create complex, life-like, and purposive agents.

TechCrunch

@TechCrunch

18 Jul 2024

Artificial Agency raises $16M to use AI to make NPCs feel more realistic in video games tcrn.ch/4d6zMEO

194

29,395

Richard Sutton · Jul 21, 2023 · 11:31 PM UTC

Richard Sutton

@RichardSSutton

21 Jul 2023

It has become commonplace to speak of the “existential risk” of AI. Recently even top AI scientists have begun to talk this way. I, for one, find it unhelpful. So, without controversy, we can note: 1. AI scientists disagree about whether or not “existential risk of AI” is a good way to think 2. the issue is emotionally charged 3. the issue directly impacts the public perception of AI research 4. serious discussion of the issue is rare among AI scientists I want to particularly note the incongruity between the first three points and the fourth. And yet the fourth point is clearly true. Instead of reasoned discussions, we have surveys of AI scientists’ opinions and public letters calling for regulation. There are books on the subject, but in my reading they too lack a serious discussion of whether or not “existential risk” is a good way to think about AI.

185

62,748

Richard Sutton · Oct 16, 2022 · 5:24 PM UTC

Richard Sutton

@RichardSSutton

16 Oct 2022

AIs can serve us as tools, but eventually, when they are sufficiently advanced, it may become immoral to keep them subservient. What is a practical criterion for deciding when an AI should be set free?

177

Richard Sutton · Aug 22, 2024 · 7:34 PM UTC

Richard Sutton

@RichardSSutton

22 Aug 2024

Yesterday there was a completely-student-organized summit of the RLAI (Reinforcement Learning and Artificial Intelligence) research group at the University of Alberta, held at the lovely Amii headquarters. Nice folks and diverse new ideas!

176

11,832

Richard Sutton · Sep 29, 2022 · 10:52 PM UTC

Richard Sutton

@RichardSSutton

29 Sep 2022

I call it the Prize. The Prize is a great and glorious goal! Ambitious AI researchers should keep their Eyes on the Prize.

171

Richard Sutton · Mar 31, 2025 · 10:09 PM UTC

Richard Sutton

@RichardSSutton

31 Mar 2025

Neural networks already seemed old to me in 1978: “A common way to develop general theories of the brain is to theorize about the neuron as the fundamental building block. Frequently the neuron is modeled as an input-summing threshold device and learning is proposed to reside in the connections with other such elements. The question has always been how to change the efficacy of the connections as a function of past experience so that the network of neurons has brain-like learning properties.”

175

13,297

Richard Sutton · Oct 12, 2022 · 11:37 PM UTC

Richard Sutton

@RichardSSutton

12 Oct 2022

A video of my talk on the Alberta Plan for AI Research is now available: incompleteideas.net/Talks/Ta…

162

Richard Sutton · Oct 14, 2022 · 10:03 PM UTC

Richard Sutton

@RichardSSutton

14 Oct 2022

Honoring Your Thoughts To write is to begin to think. To write in a special place ---a book such as this--- is to honor your thoughts and to help them build, one upon the other.

161

Richard Sutton · Jun 27, 2025 · 3:40 AM UTC

Richard Sutton

@RichardSSutton

27 Jun 2025

Ah, if only I did have muscles like that...

162

19,295

Richard Sutton · Mar 7, 2025 · 6:57 PM UTC

Richard Sutton

@RichardSSutton

7 Mar 2025

Geordie Rose taking the bullet of explaining away consciousness for those who think it is a special thing. Good solid work, but not very rewarding. I salute you.

150

24,516

Richard Sutton · Sep 29, 2022 · 10:52 PM UTC

Richard Sutton

@RichardSSutton

29 Sep 2022

It will be the greatest intellectual achievement of all time. An achievement of science, of engineering, and of the humanities, whose significance is beyond humanity, beyond life, beyond good and bad.

149

Richard Sutton · Sep 3, 2024 · 5:30 PM UTC

Richard Sutton

@RichardSSutton

3 Sep 2024

Andy Barto gave a great talk, and Ida is doing a great job relaying it!

Ida Momennejad @criticalneuro

3 Sep 2024

A thread on the history of RL/ML based on Andy Barto's talk #RLC2024: the Reinforcement Learning Conference. Beyond seeing friends & giving talks/panel, talking to @RichardSSutton & hearing Andy Barto revived a need for attention to historical psych/neuro influences on AI. 1/n🧵

147

15,193

Richard Sutton · May 27, 2022 · 11:13 PM UTC

Richard Sutton

@RichardSSutton

27 May 2022

In the end, Amii's AI week was awesome. #aiweek2022 So much science. So much industry. So much education. So much fun.

147