He He · Mar 25, 2026 · 1:42 AM UTC

Pinned Tweet

He He

@hhexiy

Mar 25

x.com/i/article/203656758401…

What research looks like with agents

I recently gave Codex a real research problem and let it run for hours. The result surprised me. My original goal was modest: I mostly wanted to see how long I could make it run productively on my

128

879

118,235

He He · Dec 14, 2024 · 3:11 PM UTC

He He

@hhexiy

14 Dec 2024

Unbelievable. This quote is blatantly false and unnecessary for the argument. And she surely had expected the backlash with the patronizing NOTE. This is racism, not "cultural generalization". @NeurIPSConf

Jiao Sun

@sunjiao123sun_

14 Dec 2024

Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf We have ethical reviews for authors, but missed it for invited speakers? 😡

283

24,356

He He · Jun 5, 2018 · 10:19 PM UTC

He He

@hhexiy

5 Jun 2018

Excited to join NYU!

Yann LeCun

@ylecun

5 Jun 2018

Welcome to NYU He He! facebook.com/yann.lecun/post…

245

He He · Aug 17, 2021 · 3:24 AM UTC

He He

@hhexiy

17 Aug 2021

@kchonyc and I are hiring a post-doc. Come help us figure out how an agent can learn by reading manuals and watching videos! Looking for expertise in multimodal reasoning, few-shot learning, QA/dialogue. Get in touch or apply at apply.interfolio.com/92494

133

He He · Oct 14, 2025 · 5:32 PM UTC

He He

@hhexiy

14 Oct 2025

Reward hacking means the model is making less effort than expected: it finds the answer long before its fake CoT is finished. TRACE uses this idea to detect hacking when CoT monitoring fails. Work led by @XinpengWang_ @nitishjoshi23 and @rico_angell👇

130

24,367

He He · Dec 12, 2023 · 2:01 AM UTC

He He

@hhexiy

12 Dec 2023

Have LLMs mastered deductive reasoning? Check out PrOntoQA-OOD, a synthetic dataset using a complete set of deduction rules. arxiv.org/abs/2305.15269 Stop by the poster on Wed at 10:45-12:45 and ask Abu Saparov all about reasoning (w or w/o LLMs)! #NeurIPS2023

114

25,101

He He · Jun 4, 2025 · 8:15 AM UTC

He He

@hhexiy

4 Jun 2025

Automating AI research is bottlenecked by verification speed (running experiments takes time). Our new paper explores whether LLMs can tell which ideas will work before executing them, and they appear to have better research intuition than human researchers.

Jiaxin Wen

@jiaxinwen22

3 Jun 2025

Most promising-looking AI research ideas don’t pan out, but testing them burns through compute and labor. Can LMs predict idea success without running any experiments? We show that they do it better than human experts!

112

11,328

He He · Apr 14, 2023 · 1:40 AM UTC

He He

@hhexiy

14 Apr 2023

Thanks @_jasonwei for a fantastic and timely lecture! We had a full house and a half-hour discussion. Stay tuned for @hwchung27's lecture on RLHF in two weeks (nyu-cs2590.github.io/spring2…)!

Jason Wei

@_jasonwei

13 Apr 2023

I gave an invited lecture at New York University for @hhexiy's class! I covered three ideas driving the LLM revolution: scaling, emergence, and reasoning. I tried to frame them in a way that reveals why large LMs are special in the history of AI. Slides: docs.google.com/presentation…

41,985

He He · Dec 11, 2023 · 10:02 AM UTC

He He

@hhexiy

11 Dec 2023

If you are interested in truthfulness/interpretability of LLMs, chat with @javirandor at #NeurIPS2023 !

Javier Rando @javirandor

6 Dec 2023

🧵 New paper: “Personas as a Way to Model Truthfulness in Language Models” We introduce empirical evidence suggesting LLMs may use “personas” to model truthfulness and improve generalization. arxiv.org/abs/2310.18168

14,655

He He · Jan 19, 2024 · 10:40 PM UTC

He He

@hhexiy

19 Jan 2024

Congratulations again @thtrieu_ ! Thanks for bringing me on this quest and can't wait to see the next rabbit you pull!

NYU Courant @NYU_Courant

19 Jan 2024

Congratulations to Trieu Trinh (@thtrieu_ ) on the launch of AlphaGeometry, an AI system capable of solving Olympiad-level geometry problems. Advised by He He (@hhexiy), Dr. Trinh defended his doctoral dissertation on this topic just last week!

18,035

He He · Oct 24, 2025 · 7:08 PM UTC

He He

@hhexiy

24 Oct 2025

@haizelabs is one of the few truly tackling the hard problem of LLM eval and oversight. Excited to support their mission!

Leonard Tang

@leonardtang_

24 Oct 2025

We are thrilled to welcome Professor He He @hhexiy as an advisor to the Haize Labs team! Professor He leads a group at NYU focused on evaluation, scalable oversight, human–AI collaboration, and reasoning.

9,079

He He · Jan 29, 2022 · 4:43 PM UTC

He He

@hhexiy

29 Jan 2022

It’d be great if the ARR @ReviewAcl meta review provides two scores, one on significance of ideas/results and one on revisions needed. The two are kind of conflated now; what should be the score of a perfectly-executed, low-impact paper?

He He · Sep 15, 2021 · 3:20 AM UTC

He He

@hhexiy

15 Sep 2021

New work on OOD detection with @uditarora09 & @WillHuang93! OODs are notoriously hard to define. We try to construct realistic pairs of ID/OOD sets and find that they reveal distinct failure modes of different detection methods.

Udit Arora

@uditarora09

15 Sep 2021

1/5 New paper @emnlpmeeting! “Types of Out-of-distribution Texts and How to Detect Them” with @WillHuang93 and @hhexiy: arxiv.org/abs/2109.06827. TL;DR: Our results call for an explicit definition of OOD examples when evaluating different detection methods.

He He · Sep 20, 2024 · 3:38 PM UTC

He He

@hhexiy

20 Sep 2024

Check out Jiaxin's work on how RLHFed models excel at impressing humans, not at the actual tasks!

Jiaxin Wen

@jiaxinwen22

20 Sep 2024

RLHF is a popular method. It makes your human eval score better and Elo rating 🚀🚀. But really❓Your model might be “cheating” you! 😈😈 We show that LLMs can learn to mislead human evaluators via RLHF. 🧵below

5,637

He He · Oct 15, 2021 · 9:11 PM UTC

He He

@hhexiy

15 Oct 2021

LMs as meta learners! Fun collaboration with @yanda_chen_ at @awscloud: The GPT-3 paper claims that LM pretraining implicitly performs meta-learning. We explicitly tuned LMs to learn from in-context examples. Excited to see how LMs may be used for other meta-learning tasks!

Yanda Chen @yanda_chen_

15 Oct 2021

[1/8] Large LMs (e.g., GPT-3) are good at few-shot learning. But prompting LMs exhibits artifacts like oversensitivity to example order/choice & instruction wording. ☹️ Our work “Meta-learning via LM In-context Tuning” proposes a fix: meta-training LMs to learn in context! 😃

He He · Sep 1, 2021 · 5:53 PM UTC

He He

@hhexiy

1 Sep 2021

Are we really improving the faithfulness of the summary or just copying more content from the document? Check out our new work on generating both abstractive and faithful summaries!

Esin Durmus @esindurmusnlp

1 Sep 2021

Check out our new paper: “Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization”. arxiv.org/pdf/2108.13684.pdf #NLProc 1/n

He He · Jul 6, 2021 · 7:55 PM UTC

He He

@hhexiy

6 Jul 2021

Excited to share new work with @nitishjoshi23! Counterfactual data augmentation is a promising idea to robustify models and we were confused by its small gains on some OOD generalization tasks. It turns out that diversifying the perturbations is the key.

Nitish Joshi @nitishjoshi23

6 Jul 2021

Excited to share our new work on analyzing counterfactual data augmentation - arxiv.org/abs/2107.00753. Thread 👇- 1/8

He He · Oct 8, 2025 · 8:40 PM UTC

He He

@hhexiy

8 Oct 2025

Come to Nick's poster if you're at #COLM2025 and learn about how to run LLM experiments the scientific way!

Nicholas Lourie

@NickLourie

8 Oct 2025

LLMs are expensive—experiments cost a lot, mistakes even more. How do you make experiments cheap and reliable? By using hyperparameters' empirical structure. @kchonyc, @hhexiy, and I show you how in Hyperparameter Loss Surfaces Are Simple Near their Optima at #COLM2025! 🧵1/9

9,196

He He · Oct 31, 2022 · 5:19 PM UTC

He He

@hhexiy

31 Oct 2022

What assumptions should we make about spurious correlations? Check out @nitishjoshi23 and Xiang's work on the nuances of spurious features in natural language.

Nitish Joshi @nitishjoshi23

31 Oct 2022

The term 'spurious correlations' is often used informally in NLP to denote any undesirable feature-label correlations. But are all spurious features alike? #EMNLP2022 paper tries to address this question through a causal lens - arxiv.org/abs/2210.14011 (w/ Xiang & @hhexiy)

He He · Nov 10, 2021 · 3:05 AM UTC

He He

@hhexiy

10 Nov 2021

Interested in AI-assisted creativity and writing assistants? Come to @vishakh_pk's talk at the NILLI workshop at 14:45 AST Wed #EMNLP2021

Vishakh Padmakumar

@vishakh_pk

10 Nov 2021

New work on Machine-in-the-Loop Creative Writing w/ my advisor @hhexiy - arxiv.org/abs/2111.04193 (1/9) #NLProc

He He · Jun 19, 2025 · 7:59 AM UTC

He He

@hhexiy

19 Jun 2025

Talking to ChatGPT isn’t like talking to a collaborator yet. It doesn’t track what you really want to do—only what you just said. Check out work led by @jcyhc_ai and @rico_angel that shows how attackers can exploit this, and a simple fix: just look at more context!

John (Yueh-Han) Chen

@jcyhc_ai

13 Jun 2025

LLMs won’t tell you how to make fake IDs—but will reveal the layouts/materials of IDs and make realistic photos if asked separately. 💥Such decomposition attacks reach 87% success across QA, text-to-image, and agent settings! 🛡️Our monitoring method defends with 93% success! 🧵

5,174

He He · May 4, 2023 · 10:20 PM UTC

He He

@hhexiy

4 May 2023

Are LMs able to generate truly novel sequences? Check out @vishakh_pk and @yzpang_ 's work on extrapolative generation. I had fun working on protein sequences thanks to @ank_parikh!

Vishakh Padmakumar

@vishakh_pk

4 May 2023

Super excited to share our #ICML2023 work on *extrapolative* controlled generation. Using iterative editing, our method is able to generate text and protein sequences with attribute values outside the training region. Paper: arxiv.org/abs/2303.04562 w/ @yzpang_ @hhexiy @ank_parikh

5,806

He He · Jul 16, 2018 · 5:04 PM UTC

He He

@hhexiy

16 Jul 2018

Very cool work on NLG evaluation! Among other things, I really like that the authors did extensive work to make sure that the human eval itself is reliable.

Arun Chaganty

@arunchaganty

15 Jul 2018

Is human evaluation really necessary? If so, what can we do to make it easier or cheaper to conduct human evaluations? Our latest paper at #acl2018 sheds light on both of these questions. Read more at arun.chagantys.org/technical… or visit me at poster session 1E!

He He · May 18, 2023 · 6:45 PM UTC

He He

@hhexiy

18 May 2023

Thanks @hwchung27 for a wonderful lecture and a perfect conclusion of our course!

Hyung Won Chung

@hwchung27

17 May 2023

I gave an invited lecture on Instruction finetuning and RLHF for @hhexiy 's class at NYU. One unique perspective of my lecture is that I introduce RLHF as an instance of using a learned objective function. Video: piped.video/zjrM-MW-0y0 Slides: docs.google.com/presentation…

4,563

He He · Jul 14, 2025 · 4:12 PM UTC

He He

@hhexiy

14 Jul 2025

tagging the correct @rico_angell !

859

He He · Jul 8, 2021 · 4:39 PM UTC

He He

@hhexiy

8 Jul 2021

Had a lot of fun chatting with folks at @wing_nus!

Min-Yen Kan @knmnyn

8 Jul 2021

Our @wing_nus NLP summer seminar series hosted He He from @nyudatascience to discuss her work on spurious correlations and how to combat them! Seminar slides and video (w/ permission) ☞ wing-nus.github.io/nlp-semin… piped.video/watch?v=iiXrsdJE… #NLProc

He He · Aug 29, 2025 · 4:50 PM UTC

He He

@hhexiy

29 Aug 2025

Replying to @jxmnop

Most people can't perceive that subtlety beyond a certain level.

631

He He · Nov 22, 2022 · 3:49 PM UTC

He He

@hhexiy

22 Nov 2022

The space of machine-assisted writing has grown quickly in the past year. How well do LLMs follow user instructions in controlled generation? Check out @vishakh_pk and @TuhinChakr's work and read some fun poems at copoet-emnlp.github.io/direc…!

Vishakh Padmakumar

@vishakh_pk

22 Nov 2022

Super excited to share our work, "Help me write a poem: Instruction Tuning as a Vehicle for Collaborative Poetry Writing" to appear @emnlpmeeting, w/ @TuhinChakr and @hhexiy Paper: arxiv.org/abs/2210.13669 Project Website: copoet-emnlp.github.io/ #emnlp2022 #NLProc

He He · Nov 15, 2024 · 5:06 PM UTC

He He

@hhexiy

15 Nov 2024

Replying to @VioletNPeng @yufei_t @FabriceYHC @AlexanderSpangh @TenghaoHuang45 @jonathanmay @mdredze @sahaiamit @sebgehr @muhao_chen @uclanlp

wow big congrats!!!

866

He He · Dec 12, 2023 · 2:04 AM UTC

He He

@hhexiy

12 Dec 2023

Joint work with @yzpang_ @vishakh_pk @nitishjoshi23 Mehran Kazemi @najoungkim

1,116

He He · Jun 19, 2025 · 1:56 PM UTC

He He

@hhexiy

19 Jun 2025

Replying to @LerrelPinto

looks super cool!

777

He He · Jun 14, 2018 · 2:25 AM UTC

He He

@hhexiy

14 Jun 2018

Fifth Frederick Jelinek Memorial Summer Workshop clsp.jhu.edu/workshops/18-wo…

2018 JHU Summer School on Human Language Technology - Center for Language and Speech Processing

He He · Oct 12, 2018 · 9:08 PM UTC

He He

@hhexiy

12 Oct 2018

Replying to @LakeBrenden @ReubenFeinman

Humall!

He He · Jun 5, 2018 · 10:21 PM UTC

He He

@hhexiy

5 Jun 2018

Replying to @ylecun

Thanks!

He He · Aug 1, 2023 · 2:08 AM UTC

He He

@hhexiy

1 Aug 2023

Replying to @javirandor @ETH_AI_Center @florian_tramer @mrinmayasachan

Can't wait to see what you'll do next!

He He · Sep 19, 2021 · 8:05 PM UTC

He He

@hhexiy

19 Sep 2021

Replying to @dnnslmr @uditarora09 @emnlpmeeting @WillHuang93

Thanks! Ren et al., 2019 (proceedings.neurips.cc/paper…) uses a similar definition. I would say it's a special case of covariate shift as we differentiate shifts in different parts of x.

He He · Oct 20, 2022 · 3:25 PM UTC

He He

@hhexiy

20 Oct 2022

Replying to @qisun0

congrats!

He He · Sep 19, 2021 · 8:18 PM UTC

He He

@hhexiy

19 Sep 2021

Replying to @dnnslmr @uditarora09 @emnlpmeeting @WillHuang93

Interesting! Yes, that's our intuition; we only got to try with linear classifiers though. Thanks for the pointer!

He He · Oct 16, 2023 · 3:11 PM UTC

He He

@hhexiy

16 Oct 2023

Replying to @LerrelPinto @nyuniversity

congrats!!

1,687

He He · Mar 2, 2022 · 9:16 PM UTC

He He

@hhexiy

2 Mar 2022

Replying to @Diyi_Yang @NSF @ICatGT @mlatgt

Congrats!!

He He · Jul 6, 2021 · 11:47 PM UTC

He He

@hhexiy

6 Jul 2021

Replying to @snigdhac25 @shsriva

Congratulations!!!

He He · Aug 5, 2022 · 7:37 PM UTC

He He

@hhexiy

5 Aug 2022

Replying to @Diyi_Yang @Stanford

Congrats!!

He He · Jan 29, 2022 · 4:44 PM UTC

He He

@hhexiy

29 Jan 2022

Replying to @G_Karadzhov @cambridgenlp @nyuniversity @Cambridge_CL

Thanks for hosting! It was fun talking with you all.

He He · Jul 26, 2024 · 2:48 AM UTC

He He

@hhexiy

26 Jul 2024

Replying to @umdcs @UofMaryland @gkaptchuk @HanShao16

congrats @MohitIyyer !

553