Rohith Kuditipudi (@rckpudi) | nitter

Rohith Kuditipudi @rckpudi

4 Jul 2022

Interpolation (train to zero loss) often does well in high dim, yet may still be undesirable (e.g., security/privacy concerns). So is interpolation necessary for optimal generalization? In our COLT paper, we surprisingly find the answer is yes! arxiv.org/abs/2202.09889 (1/n)

Memorize to Generalize: on the Necessity of Interpolation in High...

We examine the necessity of interpolation in overparameterized models, that is, when achieving optimal predictive risk in machine learning problems requires (nearly) interpolating the training...

2

23

115

Rohith Kuditipudi @rckpudi

31 Jul 2023

Watermarking enables detecting AI-generated content, but existing strategies distort model output or aren't robust to edits. We offer a strategy for LMs that’s distortion-free (up to a max budget) *and* robust. arxiv.org/abs/2307.15593 w/ @jwthickstun @tatsu_hashimoto @percyliang

2

16

86

66,476

Rohith Kuditipudi @rckpudi

31 Jul 2023

Along with the paper, we've released a blog post featuring a public demo of our watermark (with code), using some examples of watermarked text we generated from LLaMA-7B. Try to break our watermark by editing the text! crfm.stanford.edu/2023/07/30…

1

7

1,050

Rohith Kuditipudi @rckpudi

31 Jul 2023

We validate our strategies with OPT-1.3B, LLaMA-7B, and Alpaca-7B. Our best watermark uses exp-min sampling (like Aaronson) to generate text from the key sequence and is detectable (p < 0.01) from 35 tokens even when 50% of the sequence is corrupted with random insertions!

1

4

915
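The exp-min sampling rule mentioned above can be sketched as follows. This is a minimal illustration, not the paper's code: the toy distribution `TOY_PROBS` and the helper names are hypothetical. The idea is that, given uniform random numbers `xi` (one per vocabulary entry), picking the index that minimizes `-log(xi[i]) / probs[i]` selects token `i` with probability exactly `probs[i]`, so the sampled text follows the model's own distribution.

```python
import math
import random

def exp_min_sample(probs, xi):
    """Return argmin_i of -log(xi[i]) / probs[i].

    If the xi[i] are i.i.d. Uniform(0, 1), then -log(xi[i]) / probs[i] is
    exponential with rate probs[i], so index i wins with probability probs[i].
    """
    best, best_val = None, math.inf
    for i, (p, u) in enumerate(zip(probs, xi)):
        if p <= 0.0:
            continue  # zero-probability tokens can never be selected
        val = -math.log(u) / p
        if val < best_val:
            best, best_val = i, val
    return best

def empirical_frequencies(probs, trials, seed=0):
    """Sample repeatedly with fresh uniforms and return selection frequencies."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    for _ in range(trials):
        # 1 - random() lies in (0, 1], which keeps log() finite
        xi = [1.0 - rng.random() for _ in probs]
        counts[exp_min_sample(probs, xi)] += 1
    return [c / trials for c in counts]

TOY_PROBS = [0.5, 0.3, 0.2, 0.0]  # hypothetical next-token distribution
FREQS = empirical_frequencies(TOY_PROBS, trials=20000)
```

With fresh uniforms on every draw, `FREQS` should land close to `TOY_PROBS`. In a watermark, the uniforms would instead come from the secret key sequence.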

Rohith Kuditipudi @rckpudi

24 Jul 2024

what a wonderful, slick result! has me wondering what the all-time record is for shortest phd thesis...

Aaron Roth @Aaroth

7 Jul 2024

A quick thread on a short (3 page) paper, giving a simple algorithm that makes predictions guaranteeing 2*Sqrt{T} "Distance to calibration" against an adversary. The algorithm and proof are so simple I can describe them in a thread. Joint with Eshwar, @natalie_collina, and Mirah:

4

833

Rohith Kuditipudi @rckpudi

4 Jul 2022

We take inspiration from a wonderful line of work originated by @vitalyFM and others, who construct certain combinatorial, heavy-tailed settings in which various notions of memorization (i.e., stability-based and information-theoretic) are provably necessary to learn. (6/n)

2

2

4

Rohith Kuditipudi @rckpudi

20 Sep 2022

Replying to @StanfordHAI @_jasonwei @RishiBommasani

link seems down

1

3

Rohith Kuditipudi @rckpudi

31 Jul 2023

We generate text from a fixed "watermark key sequence". The detector, who knows the full sequence, can align it to a text to verify the watermark. Until we reuse part of the sequence, the text is indistinguishable from regular text (i.e., distortion-free).

1

1

3

872
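A rough sketch of the "align the key sequence to the text" idea, under heavy simplifying assumptions: the "language model" is uniform over a toy vocabulary (so exp-min sampling reduces to taking the argmax of the key's uniforms), the detector only searches over cyclic starting offsets, and the cost `sum log(1 - xi)` is one plausible choice rather than a confirmed detail of the paper. All names and sizes here are hypothetical.

```python
import math
import random

VOCAB = 50    # toy vocabulary size (assumption)
KEY_LEN = 64  # length of the watermark key sequence (assumption)

def make_key(seed):
    """Key sequence: one vector of uniforms per position."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(VOCAB)] for _ in range(KEY_LEN)]

def generate(key_seq, start, length):
    """Generate tokens from a uniform toy LM, consuming the key from `start`.

    With equal token probabilities, exp-min sampling picks the argmax of xi.
    """
    tokens = []
    for t in range(length):
        xi = key_seq[(start + t) % KEY_LEN]
        tokens.append(max(range(VOCAB), key=lambda i: xi[i]))
    return tokens

def alignment_cost(tokens, key_seq):
    """Minimum over cyclic offsets of sum_t log(1 - xi_t[token_t]).

    Watermarked tokens sit on large xi values, so the true offset
    yields a very negative cost.
    """
    best = math.inf
    for offset in range(KEY_LEN):
        cost = sum(math.log(1.0 - key_seq[(offset + t) % KEY_LEN][tok])
                   for t, tok in enumerate(tokens))
        best = min(best, cost)
    return best

KEY = make_key(seed=1)
TEXT = generate(KEY, start=17, length=40)
COST_TRUE = alignment_cost(TEXT, KEY)             # detector holds the real key
COST_WRONG = alignment_cost(TEXT, make_key(999))  # unrelated key
```

The detector never needs to know the starting offset: it tries all of them and keeps the best fit. Handling insertions and deletions would need a more flexible alignment than this offset search.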

Rohith Kuditipudi @rckpudi

31 Jul 2023

Other watermarking strategies (e.g., Kirchenbauer et al.; Aaronson) hash the previous k-1 tokens to determine the next token. Larger k makes the bias toward certain k-grams less noticeable but hurts robustness (replacing 1/k tokens breaks detection). We avoid this trade-off.

1

3

632
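The fragility claim above ("replacing 1/k tokens breaks detection") can be checked with a small counting sketch. It assumes, as a simplification, that the evidence at position t depends on the k consecutive tokens ending at t (the k-1 hashed context tokens plus the token itself), and that one replaced token inside that window destroys the evidence. The function name and sizes are hypothetical.

```python
def intact_kgrams(n_tokens, k, replaced):
    """Count windows of k consecutive positions containing no replaced token.

    In a scheme that hashes the previous k-1 tokens to bias the next one,
    the evidence at position t depends on tokens t-k+1 .. t, so a single
    replacement inside that window invalidates it.
    """
    intact = 0
    for end in range(k - 1, n_tokens):
        window = range(end - k + 1, end + 1)
        if not any(pos in replaced for pos in window):
            intact += 1
    return intact

N_TOKENS = 200
K = 4
EVERY_KTH = {pos for pos in range(N_TOKENS) if pos % K == 0}

INTACT_BASELINE = intact_kgrams(N_TOKENS, K, set())        # no edits
INTACT_AFTER_EDIT = intact_kgrams(N_TOKENS, K, EVERY_KTH)  # 1/k of tokens replaced
```

Any k consecutive positions contain a multiple of k, so replacing every k-th token leaves no window untouched, while the unedited text has `N_TOKENS - K + 1` intact windows.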

Rohith Kuditipudi @rckpudi

19 May 2025

super excited for what's to come!

Percy Liang

@percyliang

19 May 2025

What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:

3

851

Rohith Kuditipudi @rckpudi

19 Aug 2025

great work!

lily clifford

@lilyjclifford

19 Aug 2025

🚀 Arcana v2 is here. Rime’s next-gen TTS makes voice AI sound truly human. More languages. More realism. More deployment options. 🧵👇

1

2

476

Rohith Kuditipudi @rckpudi

4 Jul 2022

We quantify the cost of not exactly fitting the training data via an optimization problem over learners: min. test error s.t. train error > ε^2. Even for ε quadratically smaller than the label noise variance σ^2, any feasible learner must suffer increased test error. (4/n)

1

2

Rohith Kuditipudi @rckpudi

4 Jul 2022

There are lots of exciting open questions. Even for adjacent settings such as kernel regression and linear classification, we think obtaining similar results will require new approaches. (9/n, n=9)

2

Rohith Kuditipudi @rckpudi

18 Sep 2022

Replying to @DimitrisPapail @bneyshabur

@DimitrisPapail we have a fairly simple counterexample (existence of disconnected global minima) that applies to two-layer nets of any width, though unclear re: reachability via SGD (see Section 5 of arxiv.org/abs/1906.06247)

1

2

Rohith Kuditipudi @rckpudi

19 Jul 2025

Replying to @jasondeanlee @DimitrisPapail

do we know whether lean was part of the training process? eg, 1. learn to solve problems in lean, 2. learn to translate lean proofs to natural language while preserving semantics. (wouldn't be surprised if training included some data like this)

1

2

318

Rohith Kuditipudi @rckpudi

30 Sep 2024

Replying to @thegautamkamath

observe "arm's reach" << "arm's length" since the former implicitly requires flexing the fingers for grasping

2

158

Rohith Kuditipudi @rckpudi

4 Jul 2022

Our result is the converse of the benign overfitting phenomenon in linear regression: interpolation is both necessary and sufficient to generalize. (8/n)

1

2

Rohith Kuditipudi @rckpudi

11 Jan 2023

when you unconsciously start new-lining every sentence in an email thinking it's latex...

2

625

Rohith Kuditipudi @rckpudi

31 Jul 2023

Check out the blog post and paper for more details. For practitioners: the fast runtime and distortion-free nature of our watermark strategies make them a pretty lightweight intervention. We include concrete recommendations for deployment at the end of the paper.

1

2

603

Rohith Kuditipudi @rckpudi

4 Jul 2022

In particular, we consider overparameterized linear regression as a case study. Our main result implies it is provably necessary to overfit to the label noise in order to achieve optimal generalization. (3/n)

1

2
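To make "overfitting to the label noise" concrete, here is a small sketch of overparameterized linear regression (more features than samples). It compares the minimum-norm interpolator with a ridge-regularized fit on the training data only. The sizes, noise level, and regularization strength are arbitrary assumptions, and the sketch says nothing about test error, which is what the paper actually analyzes.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(xs, ys, lam):
    """Dual-form ridge: theta = X^T (X X^T + lam I)^{-1} y.

    lam = 0 gives the minimum-norm interpolator when X X^T is invertible.
    """
    n, d = len(xs), len(xs[0])
    gram = [[sum(a * b for a, b in zip(xs[i], xs[j])) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, ys)
    return [sum(alpha[i] * xs[i][j] for i in range(n)) for j in range(d)]

def train_mse(xs, ys, theta):
    """Mean squared error of theta on the training set."""
    return sum((sum(a * b for a, b in zip(x, theta)) - y) ** 2
               for x, y in zip(xs, ys)) / len(ys)

rng = random.Random(0)
N, D, SIGMA = 5, 20, 0.5  # toy sizes and label-noise level (assumptions)
THETA_STAR = [rng.gauss(0, 1) / D ** 0.5 for _ in range(D)]
XS = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]
YS = [sum(a * b for a, b in zip(x, THETA_STAR)) + rng.gauss(0, SIGMA) for x in XS]

MSE_INTERP = train_mse(XS, YS, fit(XS, YS, lam=0.0))  # fits the noise exactly
MSE_RIDGE = train_mse(XS, YS, fit(XS, YS, lam=5.0))   # leaves training residual
```

The interpolator drives training error to (numerically) zero, noise included, while the ridge fit leaves a nonzero training residual. The thread's claim is that, in their setting, avoiding that exact fit costs test error.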

Rohith Kuditipudi @rckpudi

4 Jul 2022

Our particular motivation for studying linear regression (a.k.a., every statistician's first favorite problem) was to identify the simplest and most generic (and continuous) lens through which we might understand the value of memorizing/interpolating training data. (7/n)

1

2

Rohith Kuditipudi @rckpudi

4 Jul 2022

Key to our analysis: despite nonconvexity due to the training error constraint, strong duality actually holds for the problem! Thus we can exactly characterize the optimal solution, and use random matrix theory to obtain exact closed form limits for train and test error. (5/n)

1

2

Rohith Kuditipudi @rckpudi

26 Jun 2023

Replying to @DimitrisPapail @random_walker @llm_sec

option 1: ask chatGPT for the article, option 2: ask chatGPT how to switch off javascript. all roads lead to rome

1

2

518

Rohith Kuditipudi @rckpudi

26 Apr 2020

looks like the Internet is here to stay, so I've finally decided to hop on the bandwagon. #helloworld

2

Rohith Kuditipudi @rckpudi

21 Jul 2025

Replying to @GaryMarcus @jasondeanlee @DimitrisPapail

why reserve judgement? regardless of how the model was trained, the result itself is amazing!

1

30

Rohith Kuditipudi @rckpudi

15 Dec 2023

cool result!

117

Rohith Kuditipudi @rckpudi

2 Jun 2022

Replying to @tsiprasd @aleks_madry @TheOfficialACM

Congrats!!

1

1

Rohith Kuditipudi @rckpudi

4 Jul 2022

Joint work with Chen Cheng and our advisor John Duchi, who are not on Twitter. (2/n)

1

1

Rohith Kuditipudi @rckpudi

14 Nov 2023

Replying to @jhasomesh

Thanks for having me!

1

100

Rohith Kuditipudi @rckpudi

12 Jul 2021

Replying to @garydient

Where can I sign up for the beta trial?

1

Rohith Kuditipudi @rckpudi

27 Aug 2021

Replying to @garydient @scump

@TomBrady drinks half his bodyweight in fluid ounces a day

1

Rohith Kuditipudi @rckpudi

31 Jul 2023

Replying to @acidflask

haha we did not. but this is a great illustration of why exact p-values are important... we know 2% of non-watermarked text will give p < 0.02, which allows us to make an informed decision over whether or not a text is watermarked

1

22
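The point about exact p-values can be illustrated with a Monte Carlo sketch. It assumes a generic resampling test: the statistic of the observed text is ranked against statistics computed under freshly drawn keys, and the p-value is that rank divided by the number of draws plus one. The toy statistic and all names are hypothetical stand-ins, not the paper's detector.

```python
import math
import random

def toy_statistic(rng, n_tokens=5):
    """Stand-in statistic for text that is independent of the key:
    a sum of log(1 - U) terms with U ~ Uniform[0, 1). Lower looks 'more watermarked'."""
    return sum(math.log(1.0 - rng.random()) for _ in range(n_tokens))

def monte_carlo_p_value(observed, null_stats):
    """p = (1 + #{null <= observed}) / (1 + #null)."""
    hits = sum(1 for s in null_stats if s <= observed)
    return (1 + hits) / (1 + len(null_stats))

def false_positive_rate(n_texts, n_resamples, alpha, seed=0):
    """Fraction of non-watermarked texts flagged at level alpha."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(n_texts):
        observed = toy_statistic(rng)  # non-watermarked text
        null_stats = [toy_statistic(rng) for _ in range(n_resamples)]
        if monte_carlo_p_value(observed, null_stats) <= alpha:
            flagged += 1
    return flagged / n_texts

FPR = false_positive_rate(n_texts=4000, n_resamples=99, alpha=0.02)
```

Because the observed and resampled statistics are exchangeable for non-watermarked text, about 2% of such texts are flagged at the 0.02 threshold, so `FPR` should land near 0.02. That calibration is what lets a threshold be read as a false-positive rate.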

Rohith Kuditipudi @rckpudi

18 Sep 2024

Replying to @beenwrekt @BarneyFlames @JSEllenberg

but what even is n in the original context?

1

1

77

Rohith Kuditipudi @rckpudi

21 Mar 2021

Replying to @garydient

Oh captain my captain.

1

Rohith Kuditipudi @rckpudi

15 Jan 2022

Replying to @rohanalchemist @tomgoldsteincs

1

Rohith Kuditipudi @rckpudi

6 Jul 2021

Replying to @florian_tramer @Stanford @danboneh @CSatETH @GoogleAI @aterzis

Congrats!!

1