@tobias_schrdr and I are excited to share WildCat: Near-Linear Attention in Theory and Practice arxiv.org/abs/2602.10056 By attending over a spectrally-accurate optimally-weighted coreset, WildCat approximates exact attention with super-polynomial error decay in near-linear time
If you're a PhD student interested in interning with me or one of my amazing colleagues at Microsoft Research New England (@MSRNE, @MSFTResearch) this summer, please apply here jobs.careers.microsoft.com/g…
If you're a PhD student interested in interning with me or one of my amazing colleagues at Microsoft Research New England (@MSRNE, @MSFTResearch) this summer, please apply here jobs.careers.microsoft.com/g… (If you'd like to work with me, please include my name in your cover letter!)
I just want to return my package to Whole Foods 😭
If you're a PhD student interested in interning with me or one of my amazing colleagues at MSR New England this summer, please apply here careers.microsoft.com/us/en/…
Why permute when you can cheaply permute?
Introducing CheatGPT
If you’d like to join Microsoft Research New England as a researcher in AI / ML / statistics, please apply here: jobs.careers.microsoft.com/g… @MSFTResearch @MSRNE
New guarantees for approximating attention, accelerating SGD, and testing sample quality in near-linear time
Call for Machine Learning, AI, & Statistics Researchers at Microsoft Research New England @MSRNE : careers.microsoft.com/us/en/…
If you're a recent or graduating PhD student interested in postdocing with the ML / stats team at MSR New England, please apply here aka.ms/ml-postdoc-msrne
Call for machine learning and statistics postdocs at Microsoft Research New England @MSFTResearch careers.microsoft.com/us/en/…
See you next year @NeurIPSConf !
Thanks @NeurIPSConf — it’s been a blast. See you all at next year’s conference!
Replying to @srush_nlp
For me the Netflix Prize was a bellwether of many ML trends: all of the leading teams used SGD and low-rank approximation for scalable non-convex optimization, trained neural networks for both modeling and ensembling, and fit billion-parameter models to get the best performance
New @NatureComms work with Soukayna Mouatadid, Paulo Orenstein, @GFlaspohler, @judah47, Miruna Oprescu, Ernest Fraenkel, @UofT, @MSFTResearch, @MSRNE, @AER_inc, @MIT, @SpringerNature Adaptive bias correction for improved subseasonal forecasting rdcu.be/deAJW
If you're a PhD student interested in a spring internship exploring fairness in clinical trials and A/B experimentation, please apply here careers.microsoft.com/studen…
I've decided to solve the most pressing problems of our time using language models, for example,
Replying to @docmilanfar
This follow-up article also surveys recent applications in probabilistic inference, computational statistics, and machine learning arxiv.org/pdf/2105.03481
Replying to @minilek
I’ll never be too old for this:
If you’re attending @COLM_conf this week, check out Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation (w/ @ericzelikman @AdamKalai)
Well, it was worth a try
Check out this amazing work by Myra Cheng! (@mariadearteaga, @adamfungi, and I helped too)
📣New research📣"Social Norm Bias: Residual Harms of Fairness-Aware Algorithms” led by undergrad Myra Cheng (applying to PhDs soon!) w/ @adamfungi @lestermackey🧵 arxiv.org/abs/2108.11056
Replying to @aminkarbasi
Yeah! It somehow knew that I wanted to buy a statistics for social good t-shirt, mug, sticker, and book
Don’t just improve your code; improve your code improver!
“Recursive self-improvement” (RSI) is one of the oldest ideas in AI. Can language models write code that recursively improves itself? Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation w/@elianalorch, @LesterMackey, @adamfungi (1/n)
For this edition of #TuesdayTraillazers, meet 2003 Science Talent Search and International Science and Engineering Fair alum, @LesterMackey
Replying to @peteratmsr
Thanks Peter!
Also curious how @amazon placed me at Morehouse 🤔
Exciting new work with @raazdwivedi; let us know what you think!
Super excited to present “Kernel Thinning”, a new procedure for compressing a distribution more effectively than i.i.d. sampling or standard MCMC thinning. w/ @LesterMackey, today 2PM ET@#MCM, tomorrow 10AM/Noon@#COLT2021 arxiv.org/abs/2105.05842 Video: learningtheory.org/colt2021/…
Only in LA 🐢
Could this be our future? #ChADGPT
My first tweet 😀
Our #ICML2021 paper unifies optimistic and delayed online learning to develop optimal algorithms with no hyperparameters to tune (w/ applications in subseasonal forecasting) arxiv.org/abs/2106.06885 w/ @bremen79 @judah47 S Mouatadid @MirunaOprescu P Orenstein @LesterMackey 🧵
I’m pretty sure this is what they designed Sora 2 for (sound on) @raazdwivedi @AShettyV
Come intern with us at @MSRNE!
The Microsoft Research Undergraduate Internship Program offers 12-week internships in our Redmond, NYC, or New England labs for rising juniors and seniors who are passionate about technology. Apply by October 6: msft.it/6015scgSJ
The competition also ushered in the new era of parallel and distributed ML. My teammates and I bought some of the first Intel quad-core processors to work on the competition, and I remember writing some of the first Spark code with @matei_zaharia to scale beyond that.
Replying to @sp_monte_carlo
There’s a related inequality in arxiv.org/abs/1201.6002
Replying to @KevinKaichuang
Do you mean overall or per day?
Inspired in no small way by ChaatGPT @matei_zaharia
Replying to @risteski_a
This looks great @risteski_a ! See also Minimum Stein Discrepancy Estimators (arxiv.org/pdf/1906.08283.pdf) for an analysis of score matching (and related procedures like diffusion score matching)
@immonica please see the response from @NeurIPSConf here:
NeurIPS acknowledges that the cultural generalization made by the keynote speaker today reinforces implicit biases by making generalisations about Chinese scholars. This is not what NeurIPS stands for. NeurIPS is dedicated to being a safe space for all of us. We want to address the comment made during the invited talk this afternoon, as it is something that NeurIPS does not condone and it doesn't align with our code of conduct. We are addressing this issue with the speaker directly. NeurIPS is dedicated to being a diverse and inclusive place where everyone is treated equally.
Wow, I can’t believe it’s been two years… especially because I still haven’t seen you at work
Replying to @docmilanfar
Andrew Barbour’s generator method also gives you a practical way to create Stein operators for any distribution
Replying to @zdhnarsil
Here’s the rap version (sound on)
Replying to @irenetrampoline
Wait, how do you get a mug?
Well this is surprising #ChatGPT
So, which posters should I check out at NeurIPS Tuesday poster session 2?
I was watching the season finale of Severance and didn’t want the screenshot noise to ruin the experience
Undergrads, get your MSR internship applications in by November 6!
Come do an undergraduate research internship at MSR! microsoft.com/en-us/research…
Replying to @abeirami
This reminds me of the tool that Joel Tropp used to derive "intrinsic dimension" concentration inequalities for random matrices in apps.dtic.mil/sti/tr/pdf/ADA…
Replying to @LihongLi20
I have a feature request 😀
I just want to return my package to Whole Foods 😭
These letters often come from a PhD advisor or other collaborators or research mentors who can speak to your research experience
Love this! Congratulations again! (Also, this is tweet number 34)
Replying to @MuratAErdogdu
Great suggestion!
Congratulations Gesine; your work is inspirational!
Replying to @fx_briol
Is this pronounced “Doctor MMD”? (I hope so)
Replying to @roydanroy
MSR New Englanders are exceptionally collaborative, so my colleagues are my collaborators!
Replying to @KevinKaichuang
Sometimes I wear jeans with a belt
Imagining Ne Zha 3
Replying to @daniela_witten
Reminds me of when someone opened our package of microwave popcorn and left half of the bags for us to enjoy
They can make up whatever backstory they’d like if it saves me a trip to Kohls.
Replying to @lorin_crawford
Thanks Lorin!
I still remember our conversations, Franklin, and I’m glad to see all of the success that The Black List has had!
I’d like to pretend that I added it on purpose, but, in reality, I accidentally took the screenshot while pressing the mute button
Call for NeurIPS Ethics Reviewers neurips.cc/Conferences/2023/…
Thanks! I'm surprised no one has commented on the Mute symbol 😀
I’m proud to say that I’ve shared two houses and two initials with this guy
Replying to @nancybaym
Thanks @nancybaym! I'm looking forward to h̶a̶n̶g̶i̶n̶g̶ ̶o̶u̶t̶ working with you in person one day soon
In the #realworldproblem motivating this question, my only option was a Church's Chicken in Costa Rica
Replying to @KevinKaichuang
I think you mean a whole nother level
Introducing CandidGPT
Replying to @lmeyerov
Thanks LM1! I still miss our houses
Replying to @james_y_zou
Go James go!
Replying to @YuanqiD @zdhnarsil
I’ve been searching for the killer app, and I think rapping about CCDD might be it!
How to catch a man (according to LaMDA) #aitestkitchen
I'm very curious about the <mystery> hire!
Yes, it is open to international students
Replying to @elmelis
Brilliant! I’ll try that next
tl;dr: A low-cost machine learning correction for physics-based dynamical models improves subseasonal forecasting of temperature and precipitation two to six weeks ahead.
Replying to @kklmmr
Thanks Konstantin!
Replying to @KevinKaichuang
On a related note, would you mind hiring me as an intern?
Replying to @KevinKaichuang
The leaf sheep finally has some competition! boredpanda.com/leaf-sheep-se…
Replying to @sigact
Congratulations @minilek !
Replying to @shortstein
The matrix Hoeffding inequality (Cor. 4.2 of arxiv.org/pdf/1201.6002.pdf) would give you P( twonorm(sum_i X_i) >= t ) <= (d+1) exp( - t^2 / sum_i c_i^2 ) . There are also ways to get rid of that d multiplier
Replying to @KevinKaichuang
You will be missed!
Replying to @KevinKaichuang
Oh he knows -- they always know