NLP researcher. Assistant Professor at NYU CS & CDS.

Unbelievable. This quote is blatantly false and unnecessary for the argument. And she surely had expected the backlash with the patronizing NOTE. This is racism, not "cultural generalization". @NeurIPSConf
Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf We have ethical reviews for authors, but missed it for invited speakers? 😡
Excited to join NYU!
@kchonyc and I are hiring a post-doc. Come help us figure out how an agent can learn by reading manuals and watching videos! Looking for expertise in multimodal reasoning, few-shot learning, QA/dialogue. Get in touch or apply at apply.interfolio.com/92494
Reward hacking means the model is making less effort than expected: it finds the answer long before its fake CoT is finished. TRACE uses this idea to detect hacking when CoT monitoring fails. Work led by @XinpengWang_ @nitishjoshi23 and @rico_angell👇
Have LLMs mastered deductive reasoning? Check out PrOntoQA-OOD, a synthetic dataset using a complete set of deduction rules. arxiv.org/abs/2305.15269 Stop by the poster on Wed at 10:45-12:45 and ask Abu Saparov all about reasoning (w or w/o LLMs)! #NeurIPS2023
Automating AI research is bottlenecked by verification speed (running experiments takes time). Our new paper explores whether LLMs can tell which ideas will work before executing them, and they appear to have better research intuition than human researchers.
Most promising-looking AI research ideas don’t pan out, but testing them burns through compute and labor. Can LMs predict idea success without running any experiments? We show that they do it better than human experts!
Thanks @_jasonwei for a fantastic and timely lecture! We had a full house and a half-hour discussion. Stay tuned for @hwchung27's lecture on RLHF in two weeks (nyu-cs2590.github.io/spring2…)!
I gave an invited lecture at New York University for @hhexiy's class! I covered three ideas driving the LLM revolution: scaling, emergence, and reasoning. I tried to frame them in a way that reveals why large LMs are special in the history of AI. Slides: docs.google.com/presentation…
If you are interested in truthfulness/interpretability of LLMs, chat with @javirandor at #NeurIPS2023 !
🧵 New paper: “Personas as a Way to Model Truthfulness in Language Models” We introduce empirical evidence suggesting LLMs may use “personas” to model truthfulness and improve generalization. arxiv.org/abs/2310.18168
Congratulations again @thtrieu_ ! Thanks for bringing me on this quest and can't wait to see the next rabbit you pull!
Congratulations to Trieu Trinh (@thtrieu_ ) on the launch of AlphaGeometry, an AI system capable of solving Olympiad-level geometry problems. Advised by He He (@hhexiy), Dr. Trinh defended his doctoral dissertation on this topic just last week!
@haizelabs is one of the few truly tackling the hard problem of LLM eval and oversight. Excited to support their mission!
We are thrilled to welcome Professor He He @hhexiy as an advisor to the Haize Labs team! Professor He leads a group at NYU focused on evaluation, scalable oversight, human–AI collaboration, and reasoning.
It’d be great if the ARR @ReviewAcl meta-review provided two scores: one for the significance of the ideas/results and one for the revisions needed. The two are conflated now; what score should a perfectly executed, low-impact paper get?
New work on OOD detection with @uditarora09 & @WillHuang93! OOD examples are notoriously hard to define. We try to construct realistic pairs of ID/OOD sets and find that they reveal distinct failure modes of different detection methods.
1/5 New paper @emnlpmeeting! “Types of Out-of-distribution Texts and How to Detect Them” with @WillHuang93 and @hhexiy: arxiv.org/abs/2109.06827. TL;DR: Our results call for an explicit definition of OOD examples when evaluating different detection methods.
Check out Jiaxin's work on how RLHFed models excel at impressing humans rather than at the actual tasks!
RLHF is a popular method. It makes your human eval score better and Elo rating 🚀🚀. But really❓Your model might be “cheating” you! 😈😈 We show that LLMs can learn to mislead human evaluators via RLHF. 🧵below
LMs as meta-learners! Fun collaboration with @yanda_chen_ at @awscloud: The GPT-3 paper claims that LM pretraining implicitly performs meta-learning. We explicitly tuned LMs to learn from in-context examples. Excited to see how LMs may be used for other meta-learning tasks!
[1/8] Large LMs (e.g., GPT-3) are good at few-shot learning. But prompted LMs exhibit artifacts like oversensitivity to example order/choice & instruction wording.☹️ Our work “Meta-learning via LM In-context Tuning” proposes a fix: meta-train LMs to learn in context!😃
Are we really improving the faithfulness of the summary or just copying more content from the document? Check out our new work on generating both abstractive and faithful summaries!
Check out our new paper: “Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization”. arxiv.org/pdf/2108.13684.pdf #NLProc 1/n
Excited to share new work with @nitishjoshi23! Counterfactual data augmentation is a promising idea to robustify models and we were confused by its small gains on some OOD generalization tasks. It turns out that diversifying the perturbations is the key.
Excited to share our new work on analyzing counterfactual data augmentation - arxiv.org/abs/2107.00753. Thread 👇- 1/8
Come to Nick's poster if you're at #COLM2025 and learn about how to run LLM experiments the scientific way!
LLMs are expensive—experiments cost a lot, mistakes even more. How do you make experiments cheap and reliable? By using hyperparameters' empirical structure. @kchonyc, @hhexiy, and I show you how in Hyperparameter Loss Surfaces Are Simple Near their Optima at #COLM2025! 🧵1/9
What assumptions should we make about spurious correlations? Check out @nitishjoshi23 and Xiang's work on the nuances of spurious features in natural language.
The term 'spurious correlations' is often used informally in NLP to denote any undesirable feature-label correlations. But are all spurious features alike? #EMNLP2022 paper tries to address this question through a causal lens - arxiv.org/abs/2210.14011 (w/ Xiang & @hhexiy)
Interested in AI-assisted creativity and writing assistants? Come to @vishakh_pk's talk at the NILLI workshop at 14:45 AST Wed #EMNLP2021
New work on Machine-in-the-Loop Creative Writing w/ my advisor @hhexiy - arxiv.org/abs/2111.04193 (1/9) #NLProc
Talking to ChatGPT isn’t like talking to a collaborator yet. It doesn’t track what you really want to do—only what you just said. Check out work led by @jcyhc_ai and @rico_angel that shows how attackers can exploit this, and a simple fix: just look at more context!
LLMs won’t tell you how to make fake IDs—but will reveal the layouts/materials of IDs and make realistic photos if asked separately. 💥Such decomposition attacks reach 87% success across QA, text-to-image, and agent settings! 🛡️Our monitoring method defends with 93% success! 🧵
Are LMs able to generate truly novel sequences? Check out @vishakh_pk and @yzpang_ 's work on extrapolative generation. I had fun working on protein sequences thanks to @ank_parikh!
Super excited to share our #ICML2023 work on *extrapolative* controlled generation. Using iterative editing, our method is able to generate text and protein sequences with attribute values outside the training region. Paper: arxiv.org/abs/2303.04562 w/ @yzpang_ @hhexiy @ank_parikh
Very cool work on NLG evaluation! Among other things, I really like that the authors did extensive work to make sure that the human eval itself is reliable.
Is human evaluation really necessary? If so, what can we do to make it easier or cheaper to conduct human evaluations? Our latest paper at #acl2018 sheds light on both of these questions. Read more at arun.chagantys.org/technical… or visit me at poster session 1E!
Thanks @hwchung27 for a wonderful lecture and a perfect conclusion of our course!
I gave an invited lecture on Instruction finetuning and RLHF for @hhexiy 's class at NYU. One unique perspective of my lecture is that I introduce RLHF as an instance of using a learned objective function. Video: piped.video/zjrM-MW-0y0 Slides: docs.google.com/presentation…
tagging the correct @rico_angell !
Had a lot of fun chatting with folks at @wing_nus!
Our @wing_nus NLP summer seminar series hosted He He from @nyudatascience to discuss her work on spurious correlations and how to combat them! Seminar slides and video (w/ permission) ☞: wing-nus.github.io/nlp-semin… piped.video/watch?v=iiXrsdJE… #NLProc
Replying to @jxmnop
Most people can't perceive that subtlety beyond a certain level.
The space of machine-assisted writing has grown quickly in the past year. How well do LLMs follow user instruction in controlled generation? Check out @vishakh_pk and @TuhinChakr 's work and read some fun poems copoet-emnlp.github.io/direc…!
Super excited to share our work, "Help me write a poem: Instruction Tuning as a Vehicle for Collaborative Poetry Writing" to appear @emnlpmeeting, w/ @TuhinChakr and @hhexiy Paper: arxiv.org/abs/2210.13669 Project Website: copoet-emnlp.github.io/ #emnlp2022 #NLProc
Joint work with @yzpang_ @vishakh_pk @nitishjoshi23 Mehran Kazemi @najoungkim
Replying to @LerrelPinto
looks super cool!
Replying to @ylecun
Thanks!
Can't wait to see what you'll do next!
Thanks! Ren et al., 2019 (proceedings.neurips.cc/paper…) uses a similar definition. I would say it's a special case of covariate shift, as we differentiate shifts in different parts of x.
Replying to @qisun0
congrats!
Interesting! Yes, that's our intuition; we only got to try with linear classifiers though. Thanks for the pointer!
congrats!!
Replying to @snigdhac25 @shsriva
Congratulations!!!
Replying to @Diyi_Yang @Stanford
Congrats!!
Thanks for hosting! It was fun talking with you all.