Interpolation (train to zero loss) often does well in high dim, yet may still be undesirable (e.g., security/privacy concerns). So is interpolation necessary for optimal generalization? In our COLT paper, we surprisingly find the answer is yes! arxiv.org/abs/2202.09889 (1/n)
Watermarking enables detecting AI-generated content, but existing strategies distort model output or aren't robust to edits. We offer a strategy for LMs that’s distortion-free (up to a max budget) *and* robust. arxiv.org/abs/2307.15593 w/ @jwthickstun @tatsu_hashimoto @percyliang
Along with the paper, we've released a blog post featuring a public demo of our watermark (with code), using some examples of watermarked text we generated from LLaMA-7B. Try to break our watermark by editing the text! crfm.stanford.edu/2023/07/30…
We validate our strategies with OPT-1.3B, LLaMA-7B, and Alpaca-7B. Our best watermark uses exp-min sampling (like Aaronson) to generate text from the key sequence and is detectable (p < 0.01) from 35 tokens even when 50% of the sequence is corrupted with random insertions!
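A minimal sketch of the exp-min sampling rule mentioned above, assuming the key entry for one position is a vector of independent uniform values, one per vocabulary token. The function names and the frequency check are illustrative and are not taken from the released code. The point it shows: averaged over random key entries, the chosen token follows the model's own distribution, which is what "distortion-free" means here.

```python
import math
import random


def exp_min_sample(probs, xi):
    """Return the index i minimizing -log(xi[i]) / probs[i].

    probs: next-token probabilities from the language model.
    xi: one key entry, a list of values in (0, 1], one per vocabulary token.
    """
    best_index, best_score = -1, math.inf
    for i, (p, u) in enumerate(zip(probs, xi)):
        if p <= 0.0:
            continue  # a zero-probability token is never chosen
        score = -math.log(u) / p  # exponential variable with rate p
        if score < best_score:
            best_index, best_score = i, score
    return best_index


def empirical_frequencies(probs, num_draws, seed=0):
    """Sample with fresh random key entries and report token frequencies."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    for _ in range(num_draws):
        xi = [1.0 - rng.random() for _ in probs]  # uniform on (0, 1]
        counts[exp_min_sample(probs, xi)] += 1
    return [c / num_draws for c in counts]
```

For a fixed key entry the choice is deterministic, which is what lets a detector holding the key check a text against it.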
what a wonderful, slick result! has me wondering what the all-time record is for shortest phd thesis...
A quick thread on a short (3 page) paper, giving a simple algorithm that makes predictions guaranteeing 2*Sqrt{T} "Distance to calibration" against an adversary. The algorithm and proof are so simple I can describe it in thread. Joint with Eshwar, @natalie_collina, and Mirah:
We take inspiration from a wonderful line of work originated by @vitalyFM and others, who construct certain combinatorial, heavy-tailed settings in which various notions of memorization (i.e., stability-based and information-theoretic) are provably necessary to learn. (6/n)
We generate text from a fixed "watermark key sequence". The detector, who knows the full sequence, can align it to a text to verify the watermark. Until we reuse part of the sequence, the text is indistinguishable from regular text (i.e., distortion-free).
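A rough sketch of the generate-then-align idea, under simplifying assumptions: the key is a list of uniform vectors, generation uses an exp-min style rule, and the detector only tries cyclic shifts of the key. The alignment described in the paper is designed to survive insertions and deletions; this toy version is not. All names (`make_key`, `generate`, `alignment_cost`, `detect_statistic`) and the cost function are illustrative assumptions.

```python
import math
import random


def make_key(length, vocab_size, seed):
    """Hypothetical key: `length` vectors of uniform values, one per token id."""
    rng = random.Random(seed)
    clamp = lambda u: min(max(u, 1e-12), 1.0 - 1e-12)  # keep logs finite
    return [[clamp(rng.random()) for _ in range(vocab_size)] for _ in range(length)]


def generate(next_token_probs, key, num_tokens, offset=0):
    """Generate tokens, consuming one key entry per position."""
    tokens = []
    for t in range(num_tokens):
        probs = next_token_probs(tokens)
        xi = key[(offset + t) % len(key)]
        scores = [-math.log(u) / p if p > 0 else math.inf for u, p in zip(xi, probs)]
        tokens.append(scores.index(min(scores)))
    return tokens


def alignment_cost(tokens, key, offset):
    """Lower is more watermark-like: chosen tokens tend to have key values near 1."""
    return sum(
        math.log(1.0 - key[(offset + t) % len(key)][tok])
        for t, tok in enumerate(tokens)
    )


def detect_statistic(tokens, key):
    """Best (smallest) cost over all cyclic alignments of the key to the text."""
    return min(alignment_cost(tokens, key, o) for o in range(len(key)))
```

In this sketch, text generated from the key scores far lower than unrelated text of the same length, because only the former lines up with some stretch of the key.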
Other watermarking strategies (e.g., Kirchenbauer et al.; Aaronson) hash the previous k-1 tokens to determine the next token. Larger k makes the bias toward certain k-grams less noticeable but hurts robustness (replacing 1/k tokens breaks detection). We avoid this trade-off.
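To make the trade-off concrete, here is a hedged sketch of the hashing approach in the style of a "green list" scheme: the previous k-1 tokens seed a pseudorandom split of the vocabulary. The secret, the SHA-256 seeding, and the function name are assumptions for illustration, not any paper's exact construction.

```python
import hashlib
import random


def green_list(context, vocab_size, k, fraction=0.5, secret=b"demo-key"):
    """Pseudorandom subset of token ids determined by the last k-1 context tokens."""
    window = tuple(context[-(k - 1):]) if k > 1 else ()
    digest = hashlib.sha256(secret + repr(window).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(fraction * vocab_size)])
```

Because the subset depends on the whole window, editing a single token changes the subset for each of the next k-1 positions, which is why larger k is more fragile under edits.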
super excited for what's to come!
What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:
great work!
🚀 Arcana v2 is here. Rime’s next-gen TTS makes voice AI sound truly human. More languages. More realism. More deployment options. 🧵👇
We quantify the cost of not exactly fitting the training data via an optimization problem over learners: min. test error s.t. train error > ε^2. Even for ε quadratically smaller than the label noise variance σ^2, any feasible learner must suffer increased test error. (4/n)
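Written out, the problem in the tweet above might look as follows. The notation is an assumption for illustration: $X \in \mathbb{R}^{n \times d}$ and $y \in \mathbb{R}^n$ are the training data, $\hat\theta$ ranges over learners, and training error is taken to be mean squared error. The paper's exact formulation may differ.

```latex
\min_{\hat\theta} \; \mathrm{TestErr}(\hat\theta)
\quad \text{s.t.} \quad
\frac{1}{n}\,\bigl\lVert X\hat\theta - y \bigr\rVert_2^2 > \varepsilon^2
```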
There are lots of exciting open questions. Even for adjacent settings such as kernel regression and linear classification, we think obtaining similar results will require new approaches. (9/n, n=9)
@DimitrisPapail we have a fairly simple counterexample (existence of disconnected global minima) that applies to two-layer nets of any width, though unclear re: reachability via SGD (see Section 5 of arxiv.org/abs/1906.06247)
do we know whether lean was part of the training process? eg, 1. learn to solve problems in lean, 2. learn to translate lean proofs to natural language while preserving semantics. (wouldn't be surprised if training included some data like this)
Replying to @thegautamkamath
observe "arm's reach" << "arm's length" since the former implicitly requires flexing the fingers for grasping
Our result is the converse of the benign overfitting phenomenon in linear regression: interpolation is both necessary and sufficient to generalize. (8/n)
when you unconsciously start new-lining every sentence in an email thinking it's latex...
Check out the blog post and paper for more details. For practitioners: the fast runtime and distortion-free nature of our watermark strategies make them a pretty lightweight intervention. We include concrete recommendations for deployment at the end of the paper.
In particular, we consider overparameterized linear regression as a case study. Our main result implies it is provably necessary to overfit to the label noise in order to achieve optimal generalization. (3/n)
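As a toy illustration of what "overfitting to the label noise" means in overparameterized linear regression, here is a sketch of the minimum-norm interpolator for the smallest interesting case: two training points and d >= 2 features. The restriction to two rows (so the Gram matrix can be inverted by hand) and the function name are assumptions made to keep the example dependency-free; it is not the estimator analysis from the paper.

```python
def min_norm_interpolator(X, y):
    """Minimum-L2-norm theta with X @ theta == y, for a 2 x d matrix X (d >= 2).

    Uses theta = X^T (X X^T)^{-1} y with the 2 x 2 Gram matrix inverted in closed form.
    Assumes the two rows of X are linearly independent.
    """
    r0, r1 = X
    a = sum(v * v for v in r0)               # G[0][0]
    b = sum(u * v for u, v in zip(r0, r1))   # G[0][1] == G[1][0]
    c = sum(v * v for v in r1)               # G[1][1]
    det = a * c - b * b
    alpha0 = (c * y[0] - b * y[1]) / det     # alpha = G^{-1} y
    alpha1 = (a * y[1] - b * y[0]) / det
    return [alpha0 * u + alpha1 * v for u, v in zip(r0, r1)]


def predict(X, theta):
    """Linear predictions X @ theta."""
    return [sum(x * t for x, t in zip(row, theta)) for row in X]
```

Whatever noise is baked into `y`, the fitted `theta` reproduces the training labels exactly, so the training error is zero.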
Our particular motivation for studying linear regression (a.k.a., every statistician's first favorite problem) was to identify the most simple and generic (and continuous) lens through which we might understand the value of memorizing/interpolating training data. (7/n)
Key to our analysis: despite nonconvexity due to the training error constraint, strong duality actually holds for the problem! Thus we can exactly characterize the optimal solution, and use random matrix theory to obtain exact closed form limits for train and test error. (5/n)
option 1: ask chatGPT for the article, option 2: ask chatGPT how to switch off javascript. all roads lead to rome
looks like the Internet is here to stay, so I've finally decided to hop on the bandwagon. #helloworld
why reserve judgement? regardless of how the model was trained, the result itself is amazing!
Joint work with Chen Cheng and our advisor John Duchi, who are not on Twitter. (2/n)
Replying to @jhasomesh
Thanks for having me!
Replying to @garydient
Where can I sign up for the beta trial?
Replying to @garydient @scump
@TomBrady drinks half his bodyweight in fluid ounces a day
Replying to @acidflask
haha we did not. but this is a great illustration of why exact p-values are important... we know 2% of non-watermarked text will give p < 0.02, which allows us to make an informed decision over whether or not a text is watermarked
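One standard way to get a p-value with this kind of guarantee is a Monte Carlo comparison against statistics computed under the null, for example by rescoring the same text with freshly drawn keys. The sketch below assumes smaller statistics are more watermark-like; the function name and that convention are illustrative and not confirmed by the thread.

```python
def monte_carlo_p_value(observed, null_statistics):
    """Fraction of null statistics at least as extreme as the observed one.

    The +1 in numerator and denominator counts the observed statistic itself,
    so the result is never exactly zero.
    """
    at_least_as_extreme = sum(1 for s in null_statistics if s <= observed)
    return (1 + at_least_as_extreme) / (1 + len(null_statistics))
```

Under the null the observed statistic is exchangeable with the resampled ones, so p falls below any threshold α with probability at most α, which matches the "2% of non-watermarked text will give p < 0.02" reading.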
but what even is n in the original context?
Replying to @garydient
Oh captain my captain.