Csaba Szepesvari (@CsabaSzepesvari) | nitter

Csaba Szepesvari @CsabaSzepesvari

6 Jan 2023

This semester I'll teach an undergraduate "intro to RL" course at the UofA. For the first lecture, I collected some exciting, recent, impactful applications of RL. Link to the relevant slides: tinyurl.com/3zw9453p I thought this may be worthwhile to share.

26

100

736

108,155

Csaba Szepesvari @CsabaSzepesvari

8 Jul 2025

First position paper I ever wrote. "Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence" arxiv.org/abs/2506.23908 Background: I'd like LLMs to help me do math, but statistical learning seems inadequate to make this happen. What do you all think?

Beyond Statistical Learning: Exact Learning Is Essential for...

Sound deductive reasoning -- the ability to derive new knowledge from existing facts and rules -- is an indisputably desirable aspect of general intelligence. Despite the major advances of AI...

20

64

432

35,783

Csaba Szepesvari @CsabaSzepesvari

29 Jul 2018

Yours truly and his coauthor Tor Lattimore happily present the near-final draft of their upcoming bandit book at banditalgs.com/ The pdf will stay free. In this phase we welcome reader comments. The book will be printed by #CambrideUniversityPress. Please share:)

7

163

401

Csaba Szepesvari @CsabaSzepesvari

19 Oct 2025

Replying to @karpathy

@karpathy I think it would be good to distinguish RL as a problem from the algorithms that people use to address RL problems. This would allow us to discuss if the problem is with the algorithms, or if the problem is with posing a problem as an RL problem. 1/x

9

38

413

177,448

Csaba Szepesvari @CsabaSzepesvari

28 Aug 2020

Interested in hearing about the theoretical foundations of RL from a multidisciplinary perspective (CS, control, stats, OR)? If so, join us at the (all virtual) RL Theory Bootcamp at the Simons Institute next week. Lectures in the morning and the afternoon ==>

4

73

367

Csaba Szepesvari @CsabaSzepesvari

3 Aug 2019

After a 2 year break, I'll be teaching in the fall a grad course. Go Bandits! banditalgs.com

Bandit Algorithms

7

43

343

Csaba Szepesvari @CsabaSzepesvari

3 Sep 2019

Glad to announce the "Theory of RL" program at the Simons Institute in the Fall of 2020. DM me if you are interested! simons.berkeley.edu/programs… @SebastienBubeck @EmmaBrunskill Alan Malek @SeanMeyn Ambuj Tewari and Mengdi Wang are my awesome coorganizers.

Theory of Reinforcement Learning

This program will bring together researchers in computer science, control theory, operations research and statistics to advance the theoretical foundations of reinforcement learning.

simons.berkeley.edu

3

38

210

Csaba Szepesvari @CsabaSzepesvari

19 Mar 2019

Is RL used in real applications? If so, how and where? And if not, why not and how can this be fixed? Join our excellent panelists and speakers at the half-day RL2 workshop organized at @icmlconf or submit a paper to present your views. sites.google.com/view/RL4Rea…

RL4RealLife@NeurIPS2022

Website @ NeurIPS 2022 (videos, posters, etc.)

sites.google.com

3

18

170

Csaba Szepesvari @CsabaSzepesvari

19 Aug 2024

amathr.org/prizes/aiprize/ The Association for Mathematical Research announces "Prize in the Mathematics of Artificial Intelligence". I'm in the selection committee. The goal is to inspire young people to work on the intersection of AI and maths. Nominations to aiprize@amathr.org

6

62

169

38,733

Csaba Szepesvari @CsabaSzepesvari

15 Jan 2021

I feel very much honoured to be selected for this role. To make the best of this job, hive mind of ML people on twitter, if you have any ideas about how to improve ICML, drop me a message (or just respond to this tweet).

John Langford @JohnCLangford

12 Jan 2021

Some decisions for ICML from the board: ICML General Chairs: 2022: Kamalika Chaudhuri @kamalikac 2023: Andreas Krause @arkrause ICML 2022 Program Chairs: Csaba Szepesvari @CsabaSzepesvari, Le Song @dasongle, and Stefanie Jegelka (maybe @StefanieJegelka )

17

2

165

Csaba Szepesvari @CsabaSzepesvari

23 May 2020

Friends: I am looking for theory oriented postdocs in RL (with past theory experience). I appreciate if you spread the word.

1

75

147

Csaba Szepesvari @CsabaSzepesvari

8 Aug 2020

Just for counterbalancing, hats off to those reviewers who are still doing a great job! I know that you are out there and while your numbers could be diminishing, we need you to keep doing what you do (post inspired by reading actual good reviews doing my editorial job).

4

5

154

Csaba Szepesvari @CsabaSzepesvari

6 Mar 2025

Nothing inspires more than the humility of someone with great accomplishments. I hope that generations of researchers will pay attention to the wise words of Rich! (Coming back to X just to post this.)

Amii @AmiiThinks

5 Mar 2025

“There are no authorities in science,” says Turing Award winner @RichardSSutton, Amii Fellow & Canada @CIFAR_News AI Chair. Sit down with Rich and @camlinke as they discuss the journey to this moment. Watch now: hubs.la/Q039xBP-0 #TuringAward #AI #ReinforcementLearning

1

12

151

10,832

Csaba Szepesvari @CsabaSzepesvari

11 Aug 2020

Advice for future reviews: An important question to ask when figuring out whether to recommend accept or reject is "How difficult is it to fix the issues I found?" If very difficult, the paper can't be saved. If not too difficult, there is no reason to reject the paper.

5

8

134

Csaba Szepesvari @CsabaSzepesvari

23 Jul 2020

Broader impact predictions back in the day.

Fermat's Library

@fermatslibrary

23 Jul 2020

Heinrich Hertz after proving the existence of radio waves stated that "it's of no use whatsoever" and regarding the applications of the discovery: "Nothing, I guess"

12

134

Csaba Szepesvari @CsabaSzepesvari

24 Nov 2023

Our department is hiring theoreticians working on ML! If you are on the job market for faculty positions and have a strong track record in theory, this may be your dream job! careers.ualberta.ca/Competit… Why apply? Read on.. 1/x

4

23

108

39,677

Csaba Szepesvari @CsabaSzepesvari

21 Apr 2020

This sounded like a crazy idea two weeks ago, but here we go! @RLtheory is the account to follow! Thanks to the speakers who already accepted our invitations! I hope the community will like this series!

Gergely Neu @neu_rips

21 Apr 2020

excited to announce a new series of virtual seminars on ~~~REINFORCEMENT LEARNING THEORY~~~ we've set this up with @CiaraPikeBurke and @CsabaSzepesvari to keep track of all the advances of this fast-paced field. hope others will also find it useful! sites.google.com/view/rltheo…

4

25

111

Csaba Szepesvari @CsabaSzepesvari

28 Sep 2019

I have a duty to spread the truth: "Don't worry about the overall importance of the problem; work on it if it looks interesting. I think there's a sufficient correlation between interest and importance. — David Blackwell" And remember: en.wikipedia.org/wiki/David_…

15

109

Csaba Szepesvari @CsabaSzepesvari

31 Mar 2021

For whatever it's worth, I am offering a mentoring session at #AISTATS on Wednesday, April 14, 2021 18:30 MDT. All are welcome!

3

13

103

Csaba Szepesvari @CsabaSzepesvari

12 Jul 2018

Please share: The newly created "Foundations team" of @DeepMindAI have openings for research scientists with strong theoretical background, and an unstoppable interest in pushing the boundaries of AI and machine learning. PM me if you are interested. #ICML2018

3

34

103

Csaba Szepesvari @CsabaSzepesvari

14 Mar 2020

Just in case the travel restrictions would last until July, preorder our book now on Amazon: amazon.ca/Bandit-Algorithms-…

2

7

101

Csaba Szepesvari @CsabaSzepesvari

19 Oct 2025

Replying to @CsabaSzepesvari @karpathy

It seems to me that not only you, but too many people talk about RL as if these two things were the same, which prevents a more nuanced discussion. 2/2

4

4

104

20,023

Csaba Szepesvari @CsabaSzepesvari

21 Feb 2019

Bandit blog revived! Yours truly and Tor Lattimore present everything you wanted to know about "First order bounds for k-armed adversarial bandits"! banditalgs.com/2019/02/16/fi…

First order bounds for k-armed adversarial bandits

To revive the content on this blog a little we have decided to highlight some of the new topics covered in the book that we are excited about and that were not previously covered in the blog. In th…

1

20

96

Csaba Szepesvari @CsabaSzepesvari

4 Apr 2020

After creating a new homepage, I discovered, I used to have a blog. Since I already had it, why not add a new post? Here we go: readingsml.blogspot.com/2020…

3

14

92

Csaba Szepesvari @CsabaSzepesvari

11 May 2023

RL Theory Seminars are back! First talk, Policy learning "without'' overlap: Pessimism and generalized empirical Bernstein's inequality by Ying Jin! sites.google.com/view/rltheo…

2

20

89

15,564

Csaba Szepesvari @CsabaSzepesvari

11 Jun 2020

Tomorrow we will have Martha White! She will talk about "Policy Gradient Methods as Approximate Policy Iteration: Advantages and Open Questions". Talks open to anyone! Join here: amiithinks.github.io/tea-tim…

Amii @AmiiThinks

9 Jun 2020

The @rlai_lab Tea Time Talks return! Hosted by Amii’s Chief Scientific Advisor Dr. Richard S. Sutton, the 20-minute talks are delivered by students, faculty and guests, and range from ideas starting to take root to finished projects. hubs.ly/H0rjZ3X0 #AI #ML #RL

2

17

79

Csaba Szepesvari @CsabaSzepesvari

24 Jun 2020

Replying to @roydanroy

Of course, can't compete with Dan, but I am also still looking for postdocs -- right down in Edmonton, driving distance to the rockies. Awesome hikes, climbs, kayaking, .. + I can promise interesting RL theory problems and a fast paced environment:)

5

8

86

Csaba Szepesvari @CsabaSzepesvari

30 Nov 2020

simons.berkeley.edu/workshop… The third and final workshop in the RL theory program starts tomorrow. The topic is batch RL (sorry @jacobmbuckman) and simulation-based optimization. All are welcome! The workshop will stream on Youtube. To join on zoom, you need to register.

2

14

86

Csaba Szepesvari @CsabaSzepesvari

13 Apr 2024

Venting. Reviewer: The paper is bad because of X, Y and Z. Rebuttal: You are wrong on X, Y and Z + detailed explanation. Reviewer: I maintain my score. The paper is bad (no explanation given). How is this ever an acceptable behavior? Why does a reviewer think this is fine?

8

2

85

12,045

Csaba Szepesvari @CsabaSzepesvari

27 Mar 2021

@peter_richtarik's recent post gave me this idea: As next year yours truly will be partially responsible for reviewing quality at ICML, and you just got your first round of reviews back from named conference, vent for me. I promise to listen.

26

9

87

Csaba Szepesvari @CsabaSzepesvari

18 Aug 2020

This is a mini water treatment plant that will be used to optimize the water treatment process using reinforcement learning. It's really awesome to see this happening in Alberta!

ISL Adapt @ISLadapt

10 Aug 2020

We are excited to advance the science of water treatment and AI with our partners @rlai_lab @UAlberta @AmiiThinks @DraytonValley and @ISLengineering! 💧💻 Many thanks to our supporters @ABInnovates @NSERC_CRSNG for this #aiforgood opportunity!

1

5

84

Csaba Szepesvari @CsabaSzepesvari

9 Apr 2022

Offline RL is cool, but will it ever work? Next Tuesday, Yunzong Xu (MIT) will put the nail into the coffin of offline RL by showing us the proof of the correctness of a 2019 conjecture by Chen and Jiang that predicted bad bad news for offline RL. tinyurl.com/5n9aedv5

8

84

Csaba Szepesvari @CsabaSzepesvari

16 Aug 2023

Replying to @jasondeanlee

He skipped this. Vitanyi & Li's book, or article below gives you the answer. In one formulation, see attached pic, one has that maximum likelihood for a large large class of distributions over one-way infinite sequences is implemented by Kolm-compression link.springer.com/chapter/10…

3

4

84

10,375

Csaba Szepesvari @CsabaSzepesvari

21 Dec 2021

While some moments are pretty bleak (CMT mishaps), it warms my heart to see how many people care about @icmlconf. Thank you reviewers and other program committee members and I am looking forward to working with you in the coming year.

84

Csaba Szepesvari @CsabaSzepesvari

5 Oct 2019

#NevernendingReviewingSeason What makes a review good? (1) Objective; (2) helps the decision maker; (3) helps the authors; (4) polite. Constructive criticism is the expression. Constructive, not destructive.

2

11

81

Csaba Szepesvari @CsabaSzepesvari

15 Aug 2019

Happy to report that it seems chances are really high that we'll record and will post the lectures online. I'll test the tech on Friday to see whether it is able to track me as I zip from board to board.

2

5

81

Csaba Szepesvari @CsabaSzepesvari

18 Nov 2019

To the attention of friends of #ReinforcementLearning: After all those years, finally, our home, @rlai_lab from @UofAResearch is live on twitter.

Reinforcement Learning and Artificial Intelligence @rlai_lab

31 Oct 2019

Hello World! This account will share the latest news and updates about what the Reinforcement Learning and Artificial Intelligence (RLAI) Lab at the University of Alberta is up to. Let’s figure out intelligence!

3

5

77

Csaba Szepesvari @CsabaSzepesvari

28 Apr 2020

With some glitches, but we are done with the first of the series. Never knew so many people care about RL theory, yay! Great talk Chi Jin! Awesome audience! Next one can only be smoother:) Sign up here if you have not signed up yet: sites.google.com/corp/view/r…

3

6

74

Csaba Szepesvari @CsabaSzepesvari

21 Apr 2024

Replying to @thegautamkamath

I grind for my students. And for the love of science and knowledge:) It's not rational, but I can't help it. I am not sure whether this sounds honest, but I really never cared about anything but my students and the joy I get from learning new things and connecting to others

2

2

70

3,661

Csaba Szepesvari @CsabaSzepesvari

8 Dec 2020

Tired of staring at the pages of the free pdf at banditalgs.com? Want to smell it, flip the pages? Visit the @CambridgeUP booth at #NeurIPS2020 or just head directly to bit.ly/2VPswrk for an incredible 30% discount! #BanditBook

1

4

66

Csaba Szepesvari @CsabaSzepesvari

25 Oct 2021

Unsolicited student email: "This is my second reminder. I believe your research team is one of the best positions for me to continue my studies, I would be thankful if you could respond to my initial email." (The student never carefully checked my homepage.) Go figure!

5

2

67

Csaba Szepesvari @CsabaSzepesvari

28 Aug 2020

.. and we will finish every day with a bonus talk which brings in the perspective of some particular application. For registration (no fees, just to receive the zoom link) and further details, visit the bootcamp website. simons.berkeley.edu/workshop…

Theory of Reinforcement Learning Boot Camp

Because of COVID-19, we cannot schedule in-person events on the Berkeley campus through December 2020. This workshop will take place online. It will be open to the public for online participation....

simons.berkeley.edu

6

68

Csaba Szepesvari @CsabaSzepesvari

21 Apr 2024

We often hear about the theory-practice gap. At this workshop we will take a thorough look at this. Is there a gap? What is the nature of the gap? Who made it? Is it good to have the gap? If not, how to close it? I think this is super important for the healthiness of the field!

ARLET @arlet_workshop

19 Apr 2024

🧵 Thrilled to announce the #ICML RL workshop 'Aligning RL Experimentalists and Theorists'! We will have several talks and a panel delivered by a super lineup of speakers: @white_martha, @ShamKakade6, @yayitsamyzhang, Dylan Foster, Niao He, @svlevine, and @MengdiWang10. 1/3

1

11

69

8,124

Csaba Szepesvari @CsabaSzepesvari

7 Dec 2020

To the attention of grad students. New Mentor Session scheduled Who? Csaba Szepesvari When? Thu, 10 Dec 2020 18:00:00 GMT Description: phd advise and virtual cookies Details about event: mementor.net/#/session/5fce7…

1

11

70

Csaba Szepesvari @CsabaSzepesvari

12 Apr 2021

More awesome RL content; Reinforcement Learning, Bit by Bit by Xiuyuan (Lucy) Lu (DeepMind) Date / Time: Lecture 1: 9:30 AM - 10:30 AM (PT), April 20th (Tuesday) Lecture 2: 10:30 AM - 11:30 AM (PT), April 23rd (Friday) rlforum.sites.stanford.edu/t… (Stanford RL forum!)

2

13

65

Csaba Szepesvari @CsabaSzepesvari

23 Jun 2020

It's here! This weekend, a fully online, pre-ICML, soothing "RL for real life" 2x3 hours virtual conference! Fantastic invited speakers & panel, moderators. Prepare and submit your questions in advance!!! All credit should go to my incredible coorganizers.

Yuxi Li @yuxili99

1 Jun 2020

Welcome to RL for Real Life Virtual Conference, June 27-28. sites.google.com/view/RL4Rea…, co-organized with @gabepsilon, Alborz Geramifard, Omer Gottesman, @LihongLi20, Anusha Nagabandi, Zhiwei (Tony) Qin, @CsabaSzepesvari With two panels on general RL and RL+healthcare topics.

6

62

Csaba Szepesvari @CsabaSzepesvari

30 Aug 2019

Bandits going strong at UofA! 32 seats in the classroom all taken on the day when they became available.

3

1

66

Csaba Szepesvari @CsabaSzepesvari

9 May 2024

Now that the #COLT2024 decisions are out, I'd like to announce a workshop that we are organizing that will happen just before COLT. The workshop theme is RL Theory. All are welcome! Details here: rltheory-workshop.github.io Please spread the word!

The second RL theory workshop (co-located with COLT 2024)

rltheory-workshop.github.io

2

20

64

23,188

Csaba Szepesvari @CsabaSzepesvari

9 Mar 2019

Illustration, slightly edited to protect anonymity: "paper feels incremental ..putting together well-known ideas in a straightforward manner." What can I say? Previous work missed even these. And straightforward once done. Reviewer also admitted not reading the proof. Great job?!

Scott Niekum @scottniekum

9 Mar 2019

ICML review rant: The ML community is screwed if we keep insisting that scientific inquiry about known algorithms isn't "novel" (even if it leads to major new capabilities / SoTA), but that engineering yet another new, incremental algorithm that we know nothing about is great.

5

64

Csaba Szepesvari @CsabaSzepesvari

16 May 2020

Any tips on what to write as a broader impact statement for theory papers to be sent to NeuroIPS? #powerofmath #poweroftheory

10

3

59

Csaba Szepesvari @CsabaSzepesvari

27 Nov 2022

1/x Our department has 2 Assistant Professor positions in AI/ML and one in Theoretical Computing Science. Here are the job ads. Our department is a super fun, collegial place. Ads: careers.ualberta.ca/Competit… careers.ualberta.ca/Competit…

1

15

59

Csaba Szepesvari @CsabaSzepesvari

4 Aug 2021

The moment when the hope that review quality can be improved appears to be fading into the void.. But: #NeverGiveUp #ICML2022

5

3

60

Csaba Szepesvari @CsabaSzepesvari

19 Mar 2019

New post on the inescapable appeal of Bayesian methods in the context of adversarial bandits. Or how Bayesian methods can help the agnostic. Hint: Minimax theorems open wormhole between distant corners of the universe. banditalgs.com/2019/03/17/ba…

Bayesian/minimax duality for adversarial bandits

The Bayesian approach to learning starts by choosing a prior probability distribution over the unknown parameters of the world. Then, as the learner makes observation, the prior is updated using Ba…

16

59

Csaba Szepesvari @CsabaSzepesvari

27 Jul 2021

"What information to seek, how to seek that information, and what information to retain?" What else is there to know? A principled approach to this problem will be presented tomorrow by DeepMind's Xiuyuan Lu. Last RL Theory Seminar before the summer break! tinyurl.com/2e2yu873

7

58

Csaba Szepesvari @CsabaSzepesvari

5 Mar 2022

One day before reviews are due for Phase 1 at #ICML2022, 50% of the reviewers have submitted zero reviews. The review load for this phase is <=2 papers and there were 19 days for writing these <=2 reviews. What percentage of reviewers will submit all of their reviews in time?

18% 50-69

27% 70-89

12% 90-100

43% just relax Csaba

941 votes • Final results

11

2

58

Csaba Szepesvari @CsabaSzepesvari

9 Dec 2020

Asking for a friend: A student wants to pick up intuition about Bregman divergences and their use in convex optimization/online learning. There are lots of excellent texts out there, but is there one that is strong on providing intuition? 1/x

5

3

57

Csaba Szepesvari @CsabaSzepesvari

1 Dec 2019

New favourite quote:)

Computer Science

@CompSciFact

30 Oct 2014

'Just because you've implemented something doesn't mean you understand it.' -- Brian Cantwell Smith

2

57

Csaba Szepesvari @CsabaSzepesvari

12 Jan 2022

Exactly what the program committee needs to know! Thanks Mike! :-D

2

55

Csaba Szepesvari @CsabaSzepesvari

10 Sep 2021

Super proud of Tor and Andras! It's a delight to have them in the team! The paper can be accessed from here: proceedings.mlr.press/v134/l…

Improved Regret for Zeroth-Order Stochastic Convex Bandits

We present an efficient algorithm for stochastic bandit convex optimisation with no assumptions on smoothness or strong convexity and for which the regret is...

proceedings.mlr.press

Google DeepMind

@GoogleDeepMind

8 Sep 2021

Huge congratulations to Tor and Andras! Their paper “Improved Regret for Zeroth-Order Stochastic Convex Bandits” was recently recognised for a best paper runner-up award by the flagship learning theory conference, COLT: dpmd.ai/colt21 1/

A graphical representation of the algorithm running

1

3

54

Csaba Szepesvari @CsabaSzepesvari

15 Feb 2021

I am delighted to invite everyone tomorrow for the first RL Theory Seminar talk of 2021 by Andrea Zanette. Andrea will explain to us why and how batch reinforcement learning can be much harder than online RL. For details check out sites.google.com/view/rltheo…

11

54

Csaba Szepesvari @CsabaSzepesvari

15 Jan 2023

I got many good comments, suggestions and I have significantly expanded the list. I am quite pleased with the result, RL seems to be doing quite well. Very nice applications and more in the works! Thanks everyone!

Csaba Szepesvari @CsabaSzepesvari

6 Jan 2023

This semester I'll teach an undergraduate "intro to RL" course at the UofA. For the first lecture, I collected some exciting, recent, impactful applications of RL. Link to the relevant slides: tinyurl.com/3zw9453p I thought this may be worthwhile to share.

5

55

10,537

Csaba Szepesvari @CsabaSzepesvari

21 Mar 2023

Wow, I just discovered this treat: mlstory.org/index.html Moritz Hardt and Ben Recht: "Patterns, predictions, and actions". I will surely recommend this for my students or whoever starts with this subject! Very cool. Thank you @beenwrekt !

3

2

51

8,807

Csaba Szepesvari @CsabaSzepesvari

7 Dec 2020

NeurIPS experience: Does anyone enjoy moving around a silly avatar with the speed of a snail in oversized rooms to get to specific posters?

9

52

Csaba Szepesvari @CsabaSzepesvari

28 Jan 2022

My typical day..

Peyman Milanfar

@docmilanfar

28 Jan 2022

On the first page of my (1993) PhD Thesis. Still true.

54

Csaba Szepesvari @CsabaSzepesvari

14 Sep 2024

Signal boosting; please repost! We need more nominations! There are so many deserving people, **please be generous and send a nomination**! It should not take much time (a short nomination is preferred to none). We are hoping the prize will motivate more to take the math+AI path!

Csaba Szepesvari @CsabaSzepesvari

19 Aug 2024

amathr.org/prizes/aiprize/ The Association for Mathematical Research announces "Prize in the Mathematics of Artificial Intelligence". I'm in the selection committee. The goal is to inspire young people to work on the intersection of AI and maths. Nominations to aiprize@amathr.org

1

26

48

17,509

Csaba Szepesvari @CsabaSzepesvari

26 Jun 2020

Improper learning? Who would do that? Is not that bad by definition? Not even proper? Come to our seminar to find out what Max Simchowitz thinks about improper learning for non-stochastic control!

RL Theory Virtual Seminars @RLtheory

26 Jun 2020

Our next talk: 06/30: Max Simchowitz (UC Berkeley) "Improper Learning for Non-Stochastic Control" For details, please see the website: sites.google.com/view/rltheo…

1

6

47

Csaba Szepesvari @CsabaSzepesvari

28 Jul 2020

Replying to @thegautamkamath

When I was a PhD student, a few times I was quite discouraged by some reviews. SIAM J. Opt told me in 2000 that exploration in finite MDPs is old-fashioned:) Soon enough though, I learned not to pay attention to failures or rejections and focused on positives. ==>

2

1

50

Csaba Szepesvari @CsabaSzepesvari

12 Jan 2020

Cool universality argument for SGD with FF neuralnets: Take any learning alg A for learning Boolean functions without noise from a sample of size n. Then there is a NN architecture G(A,n) such that SGD+G(A,n)+Any reasonable loss with sequential processing "implements" A.

Dimitris Papailiopoulos

@DimitrisPapail

10 Jan 2020

A tour de force by Abbe & Sandon, arxiv.org/pdf/2001.02992.pdf "Any function distribution that can be learned from samples in poly-time can also be learned by a poly-size neural net trained with SGD on a poly-time initialization with poly-steps" + "[this] does not hold for GD"

1

7

46

Csaba Szepesvari @CsabaSzepesvari

13 Aug 2017

I am very excited to announce that I am joining DeepMind, taking a two year leave. I will miss people in Edmonton, but you should visit!

2

4

49

Csaba Szepesvari @CsabaSzepesvari

11 Dec 2020

@neu_rips being featured in @marcgbellemare's talk (awesome talk Marc, by the way!! congrats again for all those involved!!). But Twitter does work, eh?

1

45

Csaba Szepesvari @CsabaSzepesvari

13 Jul 2020

amii.ca/postdoc-opportunity-… One last time; official ad is out.

13

42

Csaba Szepesvari @CsabaSzepesvari

15 Oct 2020

Replying to @beenwrekt

You mean no progress? Nah.. Btw, I like the style of some of these old papers that describe some unbaked idea for what they are, not trying to oversell them, making them look bigger than what they are (eg a heuristic is a heuristic..). Papers of this type won't make it today.

2

1

47

Csaba Szepesvari @CsabaSzepesvari

28 Oct 2024

No measure theory required and martingales mentioned 76 times. It must be about discrete stuff. But no, it is not at all. So how does this work? LOL (This is from the Meyn and Tweedie book about Markov chains, which I love regardless. It seems Sean is not on twitter anymore!?!)

1

3

46

5,275

Csaba Szepesvari @CsabaSzepesvari

22 Oct 2020

Replying to @yisongyue

Research is done in many small steps. You may think something goes unnoticed, but it may have influenced someone, who gets a new idea, writes another small thing. This leads to the next thing. Wait 20 years, the many little things add up and a much cleaner, deeper ==>

1

2

46

Csaba Szepesvari @CsabaSzepesvari

17 Mar 2020

You must see this, new webpage! sites.ualberta.ca/~szepesva/ ..after the service I have previously used to compile my publications-page stopped working (dire times..), put together in a day with the help of bibbase.org and jemdoc.jaboc.net

4

45

Csaba Szepesvari @CsabaSzepesvari

24 Sep 2020

..and next week we take a break to let the "Deep RL meets theory" workshop to take the stage! Check out the program at: simons.berkeley.edu/workshop… Do not forget to put all these events in your calendar! The most convenient way to do this is to go here: simons.berkeley.edu/workshop…

RL Theory Virtual Seminars @RLtheory

22 Sep 2020

We are glad to announce that we are now officially part of the "Theory of RL" program at the Simons Institute! See our updated schedule that now includes two new speakers and the RL theory workshops at @SimonsInstitute.

7

43

Csaba Szepesvari @CsabaSzepesvari

15 Mar 2021

A frequent issue in batch RL is that evaluation methods are biased and the size of the bias is unknown. Come and join us tomorrow to learn from Yi Su about how to build optimizers that do almost as well as if the bias was known! For details: tinyurl.com/v5s68k5c

1

10

42

Csaba Szepesvari @CsabaSzepesvari

30 Nov 2024

I guess I'll be out from here; you know where to find me. I'll probably check back from time to time for the odd messages, but this will wind down and stop eventually. There are rules about how many social media accounts one should keep alive. Thx!

1

38

4,996

Csaba Szepesvari @CsabaSzepesvari

10 Aug 2023

Aaditya Ramdas (not on twitter; good for him) is coediting a special issue for MLJ on "Conformal Prediction and Distribution-Free Uncertainty Quantification". Deadline Nov 30. Consider submitting if you have something! I will be looking forward to seeing what comes out of this!

2

5

38

6,889

Csaba Szepesvari @CsabaSzepesvari

6 Jun 2023

RL Theory Seminars is pleased to present a talk by Yujia Jin (Stanford) tomorrow on "VOQL: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation". For further details, check out sites.google.com/view/rltheo…

RL theory seminars - Next Seminar

June 23rd 2026, 4 pm UTC Speaker: Andrew Wagenmaker (Microsoft Research) Title: Posterior Behavioral Cloning: Pretraining BC Policies for Efficient RL Finetuning Paper: https://arxiv.org/abs/2512.1...

sites.google.com

6

42

5,235

Csaba Szepesvari @CsabaSzepesvari

15 Aug 2020

For those who like books, I also love the Anthony-Bartlett book stat.berkeley.edu/~bartlett/… While it is quite short, it explains soo much about how SLT has evolved over the years!

6

43

Csaba Szepesvari @CsabaSzepesvari

23 Aug 2023

Proud of my colleagues, winning an IJCAI distinguished paper award! Go @GoogleDeepMind @UAlbertaCS @AmiiThinks !

Marcus Hutter @mhutter42

23 Aug 2023

What do you get when you cross modern Machine Learning with good old-fashioned Search? An IJCAI distinguished paper award 🙂 for Levin Tree Search with Context Models: aihub.org/2023/08/23/congrat…

1

2

36

9,285

Csaba Szepesvari @CsabaSzepesvari

30 Mar 2021

Representation learning and exploration in RL together? Aditya Modi got you covered! Details? Well, you should come to the next talk! For details visit: tinyurl.com/1gl2z6cc

2

41

Csaba Szepesvari @CsabaSzepesvari

22 Dec 2021

Advice for people thinking of registering an email address at CMT or other similar reviewing systems: Register an email that is NOT associated with your school/workplace. School and workplace change. Then you will end up with multiple identities, which is not what you want:)

2

1

43

Csaba Szepesvari @CsabaSzepesvari

31 Jan 2023

Very happy for this! What a spectacular future for @UAlberta / @UAlbertaCS and @AmiiThinks !

Nathan Sturtevant @nathansttt

31 Jan 2023

A packed house to hear @BFlanaganUofA from the @UAlberta and @AmiiThinks announce that 20 new faculty will be hired in AI across campus in the next 3 years, with 5 of these positions in CS.

1

40

3,939

Csaba Szepesvari @CsabaSzepesvari

2 May 2022

I hope everyone enjoyed ICLR. As promised, RL Theory seminars are back and we are super lucky to have Kwang-Sung Jun fixing our bad ideas about how to use Boltzmann exploration via the help of the mysterious "Maillard sampling" idea. Intrigued? Check out tinyurl.com/4wzdxb2m

8

40

Csaba Szepesvari @CsabaSzepesvari

8 Dec 2020

Why do we use softmax to represent policies? Could we use some other "transfer" function? Which one? Pros/cons? Come to see our posters to hear about the gravitational pull of softmax and how physicists are always right! I can't guarantee to be up at the time of the oral though:)

Reinforcement Learning and Artificial Intelligence @rlai_lab

8 Dec 2020

Come hear Jincheng Mei, Chenjun Xiao, @daibond_alpha, @LihongLi20, @CsabaSzepesvari, Dale Schuurmans talk about "Escaping the Gravitational Pull of Softmax" on Tuesday. Oral: 0715–0730 MST Poster: 10–12pm MST Link: nips.cc/virtual/2020/protect… #NeurIPS2020

1

2

37

Csaba Szepesvari @CsabaSzepesvari

19 Sep 2020

Ladies and gentlemen! We are delighted to give you OPPO, optimistic policy optimization (very much related to the previous talk by the way!) to achieve efficient and effective exploration with linear function approximation in finite horizon MDPs as presented by Zhuoran Yang!

RL Theory Virtual Seminars @RLtheory

19 Sep 2020

Our next talk: 09/22: Zhuoran Yang (Princeton) "Provably Efficient Exploration in Policy Optimization" For details, please see the website: sites.google.com/view/rltheo…

4

40

Csaba Szepesvari @CsabaSzepesvari

26 Mar 2022

Replying to @pcastr

SOMs are an awesome example of what curiosity-driven research looks like. Neither neuroscience, nor solving any real problem. Yet, one can still write books about SOMs, think about them in various ways, etc. Something to remember when judging relevance while reviewing!

2

38

Csaba Szepesvari @CsabaSzepesvari

24 Nov 2020

Our chance to stay positive during these dire times is to attend Simon's seminar tomorrow where I hope we learn that despite all other signs RL is not much harder than bandits. Long live RL, long live bandits!

RL Theory Virtual Seminars @RLtheory

22 Nov 2020

Our next talk: 11/24: Simon S. Du (University of Washington) "Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon" For details, please see the website: sites.google.com/view/rltheo…

1

2

39

Csaba Szepesvari @CsabaSzepesvari

4 May 2020

Please join us and Matthieu to hear breaking news about how averaging and regularization work together to make your RL algorithms go faster!

RL Theory Virtual Seminars @RLtheory

4 May 2020

Reminder: this talk is coming up tomorrow! ***Note that the talk starts at 4PM UTC, one hour earlier than our regular time slot*** Public YouTube link: piped.video/watch?v=DfJHL7Ij… Sign up for the talk on Google Meet: forms.gle/zXy2dpapg2PzHjvb9

3

38

Csaba Szepesvari @CsabaSzepesvari

30 Nov 2020

Huge congratulations to my colleagues at @DeepMind! This is a really awesome achievement!

Google DeepMind @GoogleDeepMind

30 Nov 2020

In a major scientific breakthrough, the latest version of #AlphaFold has been recognised as a solution to one of biology's grand challenges - the “protein folding problem”. It was validated today at #CASP14, the biennial Critical Assessment of protein Structure Prediction (1/3)

40

Csaba Szepesvari @CsabaSzepesvari

18 Apr 2022

Huge improvements for the sample complexity of RL for representation learning in low-rank (linear) MDPs! How? Why? Really? Come check out the seminar of Masatoshi Uehara tomorrow! For details follow this link: tinyurl.com/5n9aedv5

2

39

Csaba Szepesvari @CsabaSzepesvari

4 Jul 2020

It is a great pleasure to have Fei Feng from UCLA speaking at our next seminar. Join us to learn about how to combine RL and unsupervised learning and keep everything provably efficient!

RL Theory Virtual Seminars @RLtheory

4 Jul 2020

Our next talk: 07/07: Fei Feng (UCLA) "Provably Efficient Exploration for RL with Unsupervised Learning" For details, please see the website: sites.google.com/view/rltheo…

5

37

Csaba Szepesvari @CsabaSzepesvari

2 Aug 2020

Join us on Tuesday to hear from Mengdi about the latest and greatest lower and upper bounds in off-policy evaluation with linear function approximation!

RL Theory Virtual Seminars @RLtheory

1 Aug 2020

Our next talk: 08/04: Mengdi Wang (Princeton / DeepMind) "Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation" For details, please see the website: sites.google.com/view/rltheo…

6

37

Csaba Szepesvari @CsabaSzepesvari

6 Jun 2020

We are delighted to have Shie give the next RL Theory Virtual Seminar. I hope to see many of you online at the seminar.

RL Theory Virtual Seminars @RLtheory

6 Jun 2020

Our next talk: 06/09: Shie Mannor (Technion) "Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs" For details, please see the website: sites.google.com/view/rltheo…

4

38

Csaba Szepesvari @CsabaSzepesvari

16 Jun 2020

Gentle reminder, this talk is happening tomorrow! I hope to see many of you there:)

RL Theory Virtual Seminars @RLtheory

13 Jun 2020

Our next talk: 06/16: Niao He (UIUC) "A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms" For details, please see the website: sites.google.com/view/rltheo…

7

37

Csaba Szepesvari @CsabaSzepesvari

26 Dec 2022

Replying to @CsabaSzepesvari @ylecun @nanjiang_cs

Perhaps better to focus on what needs to be done than on who is doing it or whether we call it RL or anything else. But I am glad you recognize that some sort of planning with models (or not?) will be needed! We are on the same page with this one. And Merry Christmas!! 2/2

2

1

38

2,846

Csaba Szepesvari @CsabaSzepesvari

6 Apr 2020

Yours truly talks RL.. Thanks @TalkRLPodcast /Robin for having me!!

TalkRL Podcast @TalkRLPodcast

6 Apr 2020

Episode 10 @CsabaSzepesvari of DeepMind shares his views on Bandits, Adversaries, PUCT in AlphaGo / AlphaZero / MuZero, AGI and RL, what is timeless, and more! talkrl.com/episodes/csaba-sz…

2

37