John Hewitt (@johnhewtt) | nitter

Pinned Tweet

John Hewitt @johnhewtt

Apr 29

New paper! Subliminal learning—transferring hidden signals between language models—is more powerful than we thought. By biasing the teacher with a steering vector instead of a prompt, we achieve strong, consistent transfer, which we use to study its mechanisms. w/@GeorgeMorgulis

6

35

302

20,239

John Hewitt @johnhewtt

4 Sep 2025

My first NLP lectures at Columbia are in the books! In our first two lectures, we went over (1) learning from text with a simple word vector language model, and (2) tokenization of text. Lecture notes are brand new and freely available on my website (links in thread.)

17

73

1,129

72,705

John Hewitt @johnhewtt

12 Jun 2024

I’m joining the Columbia Computer Science faculty as an assistant professor in fall 2025, and hiring my first students this upcoming cycle!! There’s so much to understand and improve in neural systems that learn from language — come tackle this with me!

122

52

886

100,407

John Hewitt @johnhewtt

25 Nov 2024

I’m hiring PhD students in computer science at Columbia! Our lab will tackle core challenges in understanding and controlling neural models that interact with language. For example:
- methods for LLM control
- discoveries of LLM properties
- pretraining for understanding

18

154

873

106,887

John Hewitt @johnhewtt

5 Apr 2019

Does my unsupervised neural network learn syntax? In new #NAACL2019 paper with @chrmanning, our "structural probe" can show that your word representations embed entire parse trees. paper: nlp.stanford.edu/pubs/hewitt… blog: nlp.stanford.edu/~johnhew/st… code: github.com/john-hewitt/struc… 1/4

9

245

788

John Hewitt @johnhewtt

3 Feb 2023

For this year's CS 224n: Natural Language Processing with Deep Learning, I've written notes on our Self-Attention and Transformers lecture. web.stanford.edu/class/cs224… Topics: Problems with RNNs, then self-attention, then a 'minimal' self-attention architecture, then Transformers.

4

150

739

86,662

John Hewitt @johnhewtt

24 Jun 2025

I’m beginning to share notes from my upcoming fall 2025 NLP class, Columbia COMS 4705. First up, some notes to help students brush up on math. Vectors, matrices, eigenstuff, probability distributions, entropy, divergences, matrix calculus cs.columbia.edu/~johnhew/com…

9

49

438

31,552

John Hewitt @johnhewtt

29 May 2023

#acl2023! To understand language models, we must know how activation interventions affect predictions for any prefix. Hard for Transformers. Enter: the Backpack. Predictions are a weighted sum of non-contextual word vectors. -> predictable interventions! backpackmodels.science

6

103

398

106,792

John Hewitt @johnhewtt

15 Nov 2023

I'm on the faculty market! My goal is to build language systems that we understand deeply through discovery and by design, so we can precisely control them and treat their failures. Let's tackle this grand challenge of science and engineering together. nlp.stanford.edu/~johnhew/

6

73

407

96,911

John Hewitt @johnhewtt

19 Oct 2020

#emnlp2020 paper: we give some theoretical insight into the syntactic success of RNN LMs: we prove they can implement bounded-size stacks in their states to generate some bounded hierarchical langs with optimal memory! paper arxiv.org/pdf/2010.07515.pdf blog nlp.stanford.edu/~johnhew/rn…

4

54

317

John Hewitt @johnhewtt

24 Sep 2024

If I finetune my LM just on responses, without conditioning on instructions, what happens when I test it with an instruction? Or if I finetune my LM just to generate poems from poem titles? Either way, the LM will roughly follow new instructions! Paper: arxiv.org/pdf/2409.14254

8

39

272

44,663

John Hewitt @johnhewtt

10 Jul 2023

Our paper on Backpacks has won an Outstanding Paper Award at ACL 2023! If you're excited about both fascinating learned structure in language models, and designing architectures to enable interpretability while maintaining expressivity, take a read! backpackmodels.science/

Stanford NLP Group

@stanfordnlp

9 Jul 2023

Our papers of #ACL2023NLP: Backpack Language Models @johnhewtt, @jwthickstun, @chrmanning, @percyliang backpackmodels.science/ Mon July 10, poster 14:00-15:30, Frontenac Ballroom and Queen’s Quay

5

35

261

48,354

John Hewitt @johnhewtt

4 Dec 2023

It’s conference time! Come say hello at EMNLP to hear my hot takes on understanding LMs. Is your CS department hiring? Hey, nice, come talk to me! Do you know few people at EMNLP? Not for long; come talk to me! Here’s what I look like at a poster session when the lights go out.

6

16

233

54,576

John Hewitt @johnhewtt

8 Jun 2025

I wrote a note on linear transformations and symbols that traces a common conversation/interview I've had with students. Outer products, matrix rank, eigenvectors, linear RNNs -- the topics are really neat, and lead to great discussions of intuitions. cs.columbia.edu/~johnhew//fu…

6

23

231

21,734

John Hewitt @johnhewtt

18 Oct 2022

This winter, I’ll be helping @chrmanning teach NLP with Deep Learning (CS224n). Every year, we attempt to update the course to best teach our students. For this, I am learning from how others teach topics in NLP. Please share your favorite technical explanation of an NLP topic!

9

15

217

John Hewitt @johnhewtt

10 Dec 2021

Ever added new words to the vocabulary of your language model only to generate from it and have it generate gibberish? In a technical blog post I detail why this happens, and that representing new words as an average of existing words solves the problem. nlp.stanford.edu/~johnhew/vo…

6

48

216

John Hewitt @johnhewtt

12 Feb 2025

Understanding and control are two sides of the problem of communicating differing concepts between humans and machines. New position paper: Robert Geirhos, @_beenkim, and I argue we must develop neologisms - new words - for human and machine concepts to understand and control AI

13

29

191

51,576

John Hewitt @johnhewtt

23 Oct 2025

New work! Gemma3 can explain in English what it learned from data – when we distill that data into a new word (embedding) and query it for a description of the word. Gemma explained a word trained on incorrect answers as: “a lack of complete, coherent, or meaningful answers...”

4

30

191

36,807

John Hewitt @johnhewtt

28 Oct 2022

We characterize and improve on language model _truncation sampling_ algorithms, like top-p and top-k. We frame them as trying to recover the true distribution support from an implicitly _smoothed_ neural LM, and provide a better sampling algo! Paper arxiv.org/pdf/2210.15191.pdf

5

35

162
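
As a rough illustration of the truncation sampling family this tweet refers to, here is a minimal sketch of top-k and top-p truncation over a toy next-token distribution. It is not the paper's proposed algorithm; the function names and the five-token distribution are assumptions made up for this example.

```python
def ranked_indices(probs):
    """Token indices sorted from most to least probable."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)


def top_k_support(probs, k):
    """Keep the k most probable tokens."""
    return set(ranked_indices(probs)[:k])


def top_p_support(probs, p):
    """Keep the smallest prefix of ranked tokens whose mass reaches p."""
    kept, mass = set(), 0.0
    for i in ranked_indices(probs):
        kept.add(i)
        mass += probs[i]
        if mass >= p:
            break
    return kept


def truncate(probs, support):
    """Zero out tokens outside the support and renormalize the rest."""
    total = sum(probs[i] for i in support)
    return [probs[i] / total if i in support else 0.0 for i in range(len(probs))]


# Hypothetical next-token distribution over a five-token vocabulary.
toy_probs = [0.5, 0.25, 0.15, 0.06, 0.04]
top_k_dist = truncate(toy_probs, top_k_support(toy_probs, k=2))
top_p_dist = truncate(toy_probs, top_p_support(toy_probs, p=0.8))
print(top_k_dist)
print(top_p_dist)
```

Both methods pick a support set and set everything outside it to zero, which is the sense in which they can be read as guesses about the support of the true distribution.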

John Hewitt @johnhewtt

2 Jun 2018

Learned a lot about LSTM behavior -- in very different ways -- from two excellent @acl2018 papers: Sharp Nearby, Fuzzy Far Away... by @ukhndlwl, He He, Peng Qi, and @jurafsky, and LSTM as Dynamically Computed... by @omerlevy_ , @kentonctlee, @nfitz, @lukezettlemoyer.

25

133

John Hewitt @johnhewtt

10 Dec 2024

I’ll be at neurips for a bit! If you want to talk in person about a PhD in my lab at Columbia, book a slot here: calendar.app.google/RWkDQVvm… If your organization wants to fund LLM understanding/interpretability/control research, reach out to me!

3

9

118

14,080

John Hewitt @johnhewtt

23 Feb 2024

If you're adding new tokens to Gemma, you're likely running into the "all logits are negative, so randomly init embedding with a logit of ~0 dominates the softmax" problem! Averaging existing embeddings solves this by bounding KL from initial model. See: nlp.stanford.edu/~johnhew/vo…

Teknium 🪽

@Teknium

23 Feb 2024

Gemma cant handle training with added tokens... maybe you were right @Mascobot - we aint getting chatml yet lol

4

15

118

29,892

John Hewitt @johnhewtt

10 Jul 2025

I'll be at ICML this year! Reach out if:
- you want to chat -- great! -- sign up here calendar.app.google/qtDkRmS1… and/or DM me.
- you want to fund my lab @ Columbia -- also great! -- research into deeply understanding language models for alignment, safety, performance. email me.

5

10

118

15,921

John Hewitt @johnhewtt

25 Nov 2024

Teaching and mentorship are key reasons why I chose to join academia. This img is some of my not-great freshman grades. I know every student needs different support at different times, and every student contributes different skills. Come to New York and learn with me!

ALT John Hewitt in front of a portion of his transcript, with Cs and Bs for core engineering classes.

1

2

105

8,713

John Hewitt @johnhewtt

21 Sep 2023

Teaching CS224N (twice now!) with @chrmanning has been one of the most rewarding parts of my PhD, not least because the notes and videos are public. Lots of exciting new lectures (RLHF, generation,++) here, as well as refined Transformers and pretraining lectures!

Stanford NLP Group

@stanfordnlp

21 Sep 2023

A 2023 update of the CS224N Natural Language Processing with Deep Learning YouTube playlist is now available with new lectures on pretrained models, prompting, RLHF, natural language and code generation, linguistics, interpretability and more. #NLProc piped.video/playlist?list=PL…

10

106

19,036

John Hewitt @johnhewtt

7 Jul 2023

I'll be at ACL2023! If you're there and don't know anyone, come say hi! (Or let your students know I'm happy to chat!) I'll be presenting Backpack Language Models backpackmodels.science/ and helping give a tutorial on Generating Text from Language Models!

Backpack Models

Backpacks are a drop-in replacement for Transformers that enable contextual control through non-contextual interventions.

backpackmodels.science

5

3

100

14,450

John Hewitt @johnhewtt

10 Sep 2019

How do we design probes that give us insight into a representation? In #emnlp2019 paper with @percyliang, our "control tasks" help us understand the capacity of a probe to make decisions unmotivated by the repr. paper: arxiv.org/abs/1909.03368 blog: nlp.stanford.edu/~johnhew/in…

1

23

93

John Hewitt @johnhewtt

5 Jul 2020

It's #acl2020nlp and one of the best parts of a conf is meeting new people. If you'd like to chat #nlproc, and especially if you didn't have the money to sign up for the conference, email me to chat for 30min! I can talk research, admissions, grad school++. email on my website!

2

7

97

John Hewitt @johnhewtt

12 Dec 2023

Guess what: it’s STILL conference time, this time NeurIPS! Just got in; everything in this tweet holds true, come talk to me

John Hewitt @johnhewtt

4 Dec 2023

It’s conference time! Come say hello at EMNLP to hear my hot takes on understanding LMs. Is your CS department hiring? Hey, nice, come talk to me! Do you know few people at EMNLP? Not for long; come talk to me! Here’s what I look like at a poster session when the lights go out.

1

3

87

21,024

John Hewitt @johnhewtt

21 Sep 2021

How do I 'probe' a representation for just the aspects of a property (like part-of-speech) that aren't captured by a baseline (like word identity?) In #emnlp2021 paper, we propose conditional probing, which does this! paper: arxiv.org/abs/2109.09234 blog: nlp.stanford.edu//~johnhew//…

Conditional probing: measuring usable information beyond a baseline

Probing experiments investigate the extent to which neural representations make properties -- like part-of-speech -- predictable. One suggests that a representation encodes a property if probing...

3

9

80

John Hewitt @johnhewtt

16 Jul 2025

Come chat with me at our ICML poster about interpretability as a communication problem, and the need to derive new words for referencing language model concepts! 4:30PM-7, East Exhibition Hall A-B #E-500 We Can’t Understand AI Using our Existing Vocabulary

John Hewitt @johnhewtt

12 Feb 2025

Understanding and control are two sides of the problem of communicating differing concepts between humans and machines. New position paper: Robert Geirhos, @_beenkim, and I argue we must develop neologisms - new words - for human and machine concepts to understand and control AI

2

10

79

15,570

John Hewitt @johnhewtt

4 Sep 2025

Lecture 1: Text Representation and Language Modeling cs.columbia.edu/~johnhew/com… Lecture 2: Tokenization cs.columbia.edu/~johnhew/com…

2

5

78

4,401

John Hewitt @johnhewtt

4 Sep 2020

Very thankful for the chance to give this talk! Students interested in understanding neural representations of language, I’d love if you came and gave your thoughts and perspectives on this ongoing work on the probing methodology.

NLP with Friends @NLPwithFriends

3 Sep 2020

We are very excited to announce our next speaker!! 🗣John Hewitt @johnhewtt talking with us about ❓"Language Probes as V-information Estimators" 🗓Sept 9th, 14:00 UTC 📝Sign up here: eventbrite.co.uk/e/nlp-with-…

1

11

73

John Hewitt @johnhewtt

16 Nov 2020

It's #emnlp2020 and one of the best parts of a conf is meeting new people. If you'd like to chat #nlproc, and especially if you didn't have the money to sign up for the conference, email me to chat for 30min! I can talk research, admissions, grad school++. email on my website!

70

John Hewitt @johnhewtt

1 Dec 2023

Come see this panel I'll speak on! There's so much to understand about language models that it's a good thing we have multiple rich subcommunities with differing perspectives and expertise -- this panel will facilitate sharing ideas and refining goals.

BlackboxNLP @BlackboxNLP

1 Dec 2023

BlackboxNLP will this year feature a panel discussion on "Mechanistic Interpretability". We hope this panel may serve as a way of creating stronger bridges between interpretability in NLP and MI! We are now collecting questions for the discussion here: forms.gle/uFKi19aMCQ2GmhPHA

4

47

13,670

John Hewitt @johnhewtt

5 Apr 2019

So a lot of people have arrived here; please read @nsaphra's excellent take on neural net probes and @nelsonfliu's comprehensive neural net probing study, both also at #naacl2019 nitter.app/nsaphra/status/1099978… Saphra: arxiv.org/abs/1811.00225 Liu: homes.cs.washington.edu/~nfl…

Naomi Saphra @nsaphra

25 Feb 2019

I'm still prepping the camera-ready for my @naacl paper, but if people take away one thing, I want it to be that they should be specific in what they mean when they say a representation "encodes" some linguistic property, and to recognize the drawbacks of their definition.

1

8

46

John Hewitt @johnhewtt

28 Apr 2021

In analysis of neural nets, there’s no single right way to “probe” the neural net’s representations. In this opinion piece, we draw from neuroscience to enumerate a few distinct goals of probing and how each guides the design of the probe.

Anna Ivanova @neuranna

27 Apr 2021

Check out our short opinion piece where we draw parallels between investigating brains and neural nets! "Probing artificial neural networks: insights from neuroscience" arxiv.org/abs/2104.08197 Written with @NogaZaslavsky and @JohnHewtt for the #brain2AI #ICLR2021 workshop. 1/

1

6

45

John Hewitt @johnhewtt

5 Jun 2019

I'll be excitedly yammering about structural probes and finding syntax in unsupervised representations today at 4:15 in Nicollet B/C #naacl2019. Even if you don't ❤️ parse trees, come by to learn a method to tell if your neural network softly encodes tree structures!

4

38

John Hewitt @johnhewtt

11 Jun 2023

This is a nice paper: On the (un)reliability of feature visualizations [Geirhos et al] arxiv.org/pdf/2306.04719.pdf Shows that vision model feature visualizations don't pass some sniff checks -- they can show plausible things unrelated to behavior on real inputs.

1

7

39

19,255

John Hewitt @johnhewtt

1 Nov 2021

I'm so glad this content is now freely available. As head TA this past year, I had the privilege of writing and giving 3 lectures: on self-attention & Transformers, pretraining, and model analysis & explanation. I hope many find them useful in their studies!

Stanford NLP Group

@stanfordnlp

1 Nov 2021

Looking for a series to binge-watch with more depth? We are delighted to make available the latest CS224N: Natural Language Processing with Deep Learning. New content on transformers, pre-trained models, NLG, knowledge, and ethical considerations. #NLProc piped.video/playlist?list=PL…

3

39

John Hewitt @johnhewtt

8 May 2019

I enjoyed chatting with @waleed_ammar and @nlpmattg on #nlphighlights about my paper with @chrmanning on finding syntax in word representations. I'm very grateful to have had this opportunity to talk (at length!) about my work!

Dr. Bridger - وليد عمار

@waleed_ammar

7 May 2019

#nlphighlights 88: John Hewitt @johnhewtt talks about probing word embeddings for syntax by projecting to a vector space where the L2 distance between a pair of tokens approximates the number of hops between them in the dependency tree. bit.ly/2vLVsU8

4

39

John Hewitt @johnhewtt

15 Oct 2020

We split the problem of extrapolation to lengths not seen at train time in NNs into 1. what content to generate? 2. where to put EOS? Give up on 2 and NNs learn very different dynamics; better at 1! BlackBoxNLP arxiv.org/pdf/2010.07174.pdf Ben Newman, me, @percyliang @chrmanning

1

8

36

John Hewitt @johnhewtt

9 Oct 2025

Excited to give a talk at the interplay workshop tomorrow! Come say hi! Alas, it’s my only day at COLM. Catch me at the coffee breaks or the roundtable.

INTERPLAY Workshop @interplaywrkshp

9 Oct 2025

✨ The schedule for our INTERPLAY workshop at COLM is live! ✨ 🗓️ October 10th, Room 518C 🔹 Invited talks from @sarahwiegreffe @johnhewtt @amuuueller @kmahowald 🔹 Paper presentations and posters 🔹 Closing roundtable discussion. Join us in Montréal! @COLM_conf

Schedule for the INTERPLAY workshop at COLM on October 10th, Room 518C.

09:00 am: Opening
09:10 am: Invited Talks by Sarah Wiegreffe and John Hewitt
10:20 am: Paper Presentations

Lunch Break

01:00 pm: Invited Talks by Aaron Mueller and Kyle Mahowald
02:10 pm: Poster Session
03:20 pm: Roundtable Discussion
04:50 pm: Closing


2

40

9,220

John Hewitt @johnhewtt

6 Nov 2019

I'm giving a talk on designing and interpreting probing methods for understanding neural representations at EMNLP, Hall 2C, today at 1:30!

1

37

John Hewitt @johnhewtt

4 Oct 2023

LMs make low-rank distributions (hidden dim < vocab_size) -> unavoidable errors! But samples are great if you use nucleus/top-k sampling 🤔. Matt: truncation sampling can fix low-rank errors, AND we can use the low-rank basis to find good tokens below the truncation threshold!

Matthew Finlayson @mattf1n

4 Oct 2023

Nucleus and top-k sampling are ubiquitous, but why do they work? @johnhewtt, @alkoller, @swabhz, @Ashish_S_AI and I explain the theory and give a new method to address model errors at their source (the softmax bottleneck)! 📄 arxiv.org/abs/2310.01693 🧑‍💻 github.com/mattf1n/basis-awa…

5

30

14,040

John Hewitt @johnhewtt

20 Nov 2020

Congratulations to Ben Newman, who spearheaded the work, for winning Outstanding Paper at #BlackBoxNLP, and thanks to the organizers and reviewers for your efforts! Congrats as well to the winners of the other Outstanding Paper award!


4

28

John Hewitt @johnhewtt

5 Apr 2019

Replying to @johnhewtt @chrmanning

Key idea: Vector spaces have distance metrics (L2); trees do too (# edges between words). Vector spaces have norms (L2); rooted trees do too (# edges between word and ROOT.) Our probe finds a vector distance/norm on word representations that matches all tree distances/norms 2/4

2

1

28
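
To make the distance and norm analogy in this tweet concrete, here is a small sketch on a toy dependency tree. It assumes, as in the structural probe paper, that the vector distance is a squared L2 distance taken after a linear map `B`. The word vectors below are hand-built so that the match is exact, and `B` is simply the identity; in practice the representations come from a model and `B` is learned. All names and values here are hypothetical.

```python
from collections import deque

words = ["The", "chef", "ate", "pizza"]
# Undirected tree edges as (head, dependent) index pairs; "ate" (index 2) is ROOT.
edges = [(2, 1), (1, 0), (2, 3)]
ROOT = 2


def tree_distance(n, edges, src, dst):
    """Number of edges on the path between two words, via breadth-first search."""
    adj = {i: [] for i in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist[dst]


# Hand-built vectors: one coordinate per edge, set to 1 if that edge lies on
# the path from ROOT to the word. Edge order: (ate, chef), (chef, The), (ate, pizza).
vectors = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity stand-in for a learned map


def transform(B, h):
    return [sum(b * x for b, x in zip(row, h)) for row in B]


def probe_distance(B, h_i, h_j):
    """Squared L2 distance between two vectors after applying B."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    return sum(x * x for x in transform(B, diff))


def probe_norm(B, h):
    """Squared L2 norm of a vector after applying B."""
    return sum(x * x for x in transform(B, h))


for i in range(len(words)):
    for j in range(len(words)):
        assert probe_distance(B, vectors[i], vectors[j]) == tree_distance(len(words), edges, i, j)
    assert probe_norm(B, vectors[i]) == tree_distance(len(words), edges, i, ROOT)
print("vector distances and norms match the tree")
```

The toy vectors show only that such an embedding of a tree exists; whether a real model's representations contain one is the empirical question the probe is meant to answer.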

John Hewitt @johnhewtt

5 Apr 2019

This claim, that parse trees are embedded through distances and norms on your word representation space, is a structural claim about the word representation space, like how vector offsets encode word analogies in word2vec/GloVE. We hope people have fun exploring this more! 4/4

3

1

24

John Hewitt @johnhewtt

7 Jul 2023

My favorite deeper dive experiment in this paper: we wondered if putting the question _before_ the documents would remove the U-shaped effect, since the autoregressive contextualization would "know" what info to look for when processing each doc. Nope! The trend still holds.


1

1

23

3,939

John Hewitt @johnhewtt

7 Jun 2021

It’s #naacl2021 and one of the best parts of a conf is meeting new people. If you’d like to chat #nlproc, and especially if you couldn’t make it to the conference, email me to chat for 30 min! I can talk research, admissions, grad school++. email on my website!

1

2

20

John Hewitt @johnhewtt

12 Feb 2025

The position paper is We Can’t Understand AI Using Our Existing Vocabulary arxiv.org/pdf/2502.07586 Feedback and discussion are very welcome.

7

1

19

1,950

John Hewitt @johnhewtt

21 Mar 2023

We’re coming to the end of #cs224n and it’s so good to see students excitedly discussing the results of their work at the end of the quarter. I’m grateful to our 28 TAs for making the course work.

Stanford NLP Group

@stanfordnlp

21 Mar 2023

The #cs224n poster session is happening now! We are super excited about amazing, cutting-edge NLP posters from ~650 students!

1

18

8,833

John Hewitt @johnhewtt

15 Nov 2023

I'm also deeply committed to how open research dovetails with open teaching. I've twice co-taught Stanford's CS 224n: Natural Language Processing with Deep Learning; you can find some of my lectures here piped.video/watch?v=LWMzyfvu… and here piped.video/watch?v=DGfCRXuN… !

18

3,097

John Hewitt @johnhewtt

12 Apr 2023

Ruth-Ann's great work building a Jamaican Patois Natural Language Inference dataset was picked up by Vox as part of its video "Why AI doesn’t speak every language." Happy to see Ruth-Ann's work (and disparities in NLP across languages) get this general audience coverage.

Ruth-Ann Armstrong @ruthstrong_

12 Apr 2023

Check out this Vox video I was featured in where I chat about JamPatoisNLI which I worked on with @chrmanning and @johnhewtt! Many thanks to @PhilEdwardsInc for platforming our work piped.video/a2DgdsE86ts

5

17

10,317

John Hewitt @johnhewtt

11 Nov 2020

Excited to give this talk! Tidbits: 1) Could finite-precision RNNs implement (bounded) stacks without access to an external stack? Yes, efficiently! 2) We train probabilistic models in NLP but prove things about acceptors; what if we connect language models to formal languages?

USC ISI @USC_ISI

9 Nov 2020

Join John Hewitt @johnhewtt, computer science PhD student at @Stanford, for his talk on November 12 at 11am entitled "The Unreasonable Syntactic Expressivity of RNNs." Details can be found here: isi.edu/events/calendar/1337…

1

15

John Hewitt @johnhewtt

24 Sep 2024

Base models don’t follow instructions. We find that _response tuning_ (training on responses with no instruction) yields instruction following. Does that show we just need to teach the response distribution?

1

1

15

1,632

John Hewitt @johnhewtt

5 Apr 2019

These distances/norms reconstruct each tree, and are parametrized only by a single linear transformation. What does this mean? In BERT, ELMo, we find syntax trees approximately embedded as a global property of the transformed vector space. (But not in baselines!) 3/4

1

2

15

John Hewitt @johnhewtt

9 Sep 2020

Just a few minutes out! Come attend or watch the livestream, or reach out to me afterward if you couldn’t attend but would like to chat about the topic!

NLP with Friends @NLPwithFriends

3 Sep 2020

We are very excited to announce our next speaker!! 🗣John Hewitt @johnhewtt talking with us about ❓"Language Probes as V-information Estimators" 🗓Sept 9th, 14:00 UTC 📝Sign up here: eventbrite.co.uk/e/nlp-with-…

1

2

14

John Hewitt @johnhewtt

29 May 2023

Backpacks are an alternative to Transformers: intended to scale in expressivity, yet provide a new kind of interface for interpretability-through-control. A backpack learns k non-contextual sense vectors per subword, unsupervisedly decomposing the subword's predictive uses.

1

3

14

6,586

John Hewitt @johnhewtt

24 Sep 2024

This is work with @nelsonfliu @chrmanning @percyliang and is my last paper at Stanford NLP. It’s been a blast finding these very odd results.

1

15

1,908

John Hewitt @johnhewtt

15 Nov 2023

My work has discovered structure in language models - through the structural probe (aclanthology.org/N19-1419/), refined probing methods (aclanthology.org/D19-1275.pd…), and formalizing how models construct usable information about the solutions to hard problems (aclanthology.org/2021.emnlp-…).

3

13

2,925

John Hewitt @johnhewtt

27 May 2020

Exciting work at #acl2020nlp in characterizing cross-lingual syntactic structure in multilingual BERT! Congrats Ethan!

e chi

@echinaceous

27 May 2020

Does Multilingual BERT share syntactic knowledge cross-lingually? In #acl2020nlp paper w/ @johnhewtt and @chrmanning, we visualize its syntactic structure & show it's applicable to a variety of human languages. Paper: arxiv.org/abs/2005.04511 Blog: ethanachi.com/multilingual-p… (1/4)

13

John Hewitt @johnhewtt

19 Oct 2020

This work, with @mhahn29, @SuryaGanguli, @percyliang, @chrmanning, has been a fascinating and challenging new direction for me, and I'm deeply appreciative to them for enabling me to pursue it. Construction code: github.com/john-hewitt/dyckk… Learning code: github.com/john-hewitt/dyckk…

1

12

John Hewitt @johnhewtt

21 Sep 2021

In my blog post, I argue that probing is a clear tool to characterize knowledge in neural networks when we didn't tell the network how to represent that knowledge. nlp.stanford.edu//~johnhew//… The code should be very useful for probing studies! github.com/john-hewitt/condi…

1

2

12

John Hewitt @johnhewtt

15 Nov 2023

Further, we can and must design LMs for our understanding, not just performance: I introduced the Backpack, an architecture that brings many of the control and understanding benefits of linear models and word2vec with the power of the Transformer. (aclanthology.org/2023.acl-lo…)

1

2

12

3,597

John Hewitt @johnhewtt

18 Oct 2022

I’m most interested in in-depth lectures or technical explainers, less interested in surface-level introductions. I’m also focused on (arguably) newer topics, since in these cases, I think personal opinions on the topics tend to come through stronger in pedagogical materials.

1

12

John Hewitt @johnhewtt

23 Oct 2025

We see this as a step towards developing new language tools for learning about how language models store, process, and reason about potentially complex concepts—differently from how we do. Work with Oyvind Tafjord, Robert Geirhos, @_beenkim Blog here: cs.columbia.edu/~johnhew//ne…

1

13

1,714

John Hewitt @johnhewtt

29 May 2023

To represent a word in context, Backpacks use information from the whole context to non-negatively weight the senses of all subwords in the context. So, the contribution of each sense is always towards predicting the same words; only the magnitude changes.

1

2

12

1,830
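
The weighted-sum idea in this tweet can be sketched in a few lines. This is only a toy illustration, not the Backpack implementation: the sense vectors and weights are made-up numbers, and in the real model the weights are computed from the whole context.

```python
def backpack_output(sense_vectors, weights):
    """Sum of sense vectors, each scaled by a non-negative weight.

    sense_vectors[j][l] is sense l of the word at position j;
    weights[j][l] is the (context-dependent) weight on that sense.
    """
    dim = len(sense_vectors[0][0])
    output = [0.0] * dim
    for word_senses, word_weights in zip(sense_vectors, weights):
        for sense, weight in zip(word_senses, word_weights):
            if weight < 0:
                raise ValueError("weights must be non-negative")
            for d in range(dim):
                output[d] += weight * sense[d]
    return output


# Hypothetical two-word context with two senses per word, in two dimensions.
sense_vectors = [
    [[1.0, 0.0], [0.0, 2.0]],   # word 0: senses 0 and 1
    [[4.0, 4.0], [-2.0, 0.0]],  # word 1: senses 0 and 1
]
weights = [[0.5, 0.25], [0.125, 0.0]]

full = backpack_output(sense_vectors, weights)

# "Turning down" sense 1 of word 0 removes exactly its weighted contribution.
ablated_weights = [[0.5, 0.0], [0.125, 0.0]]
ablated = backpack_output(sense_vectors, ablated_weights)
print(full, ablated)
```

Because the output is linear in the sense vectors and the weights never flip sign, the effect of removing or rescaling one sense is predictable: the output moves by a non-negative multiple of that sense vector.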

John Hewitt @johnhewtt

24 Sep 2024

But then we also find that you don’t have to teach the response distribution. Finetuning (instruction-response) just on poetry, or just on math, or just python programs, leads to, e.g., recipe generation. It’s fascinating how little of the finetuning distribution comes across.

1

11

1,247

John Hewitt @johnhewtt

29 May 2023

I’m deeply thankful to my co-authors on this work, @jwthickstun @chrmanning @percyliang. ArXiv! arxiv.org/abs/2305.16765 Demos here! By Lora Xie. huggingface.co/spaces/stanfo… Huggingface! huggingface.co/stanfordnlp/b…

11

1,860

John Hewitt @johnhewtt

23 Feb 2024

By instead initializing new embedding to the average of existing embeddings, you guarantee that the partition function of the softmax grows by at most 1/n where n is the initial vocab size--- so the distrib doesn't change much!

10

2,936
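
A quick numerical check of the claim in this tweet, reading "grows by at most 1/n" as relative growth of the partition function. The new token's logit under an averaged embedding is the mean of the existing logits, and by Jensen's inequality exp(mean) is at most the mean of the exps, which is Z/n. The dimensions, vocabulary size, and random vectors below are arbitrary choices for the sketch.

```python
import math
import random

random.seed(0)
dim, vocab_size = 8, 50

# Hypothetical hidden state and output embeddings.
hidden = [random.gauss(0, 1) for _ in range(dim)]
embeddings = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(vocab_size)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# New token embedding = coordinate-wise average of the existing embeddings.
avg_embedding = [sum(column) / vocab_size for column in zip(*embeddings)]

old_logits = [dot(hidden, e) for e in embeddings]
new_logit = dot(hidden, avg_embedding)  # equals the mean of old_logits

z_old = sum(math.exp(x) for x in old_logits)
z_new = z_old + math.exp(new_logit)
growth = z_new / z_old - 1.0

print(f"relative growth of partition function: {growth:.5f}")
print(f"bound 1/n:                             {1 / vocab_size:.5f}")
```

Since every existing token's probability is scaled by Z_old / Z_new, a small relative growth means the distribution over the original vocabulary barely changes.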

John Hewitt @johnhewtt

23 Feb 2024

e.g.,
>>> torch.max(model(tok('I like pizza', return_tensors='pt')['input_ids']).logits)
tensor(-4.5862)
So, if you add a new word, since you randomly init the embedding, it gets dot product ~0 with hidden states. Softmax([-4,-4,..., 0]) puts mass on the elt with 0!

1

10

3,788
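
The softmax effect described in this tweet can be reproduced without loading a model. In this sketch the existing logits are invented: the largest is -4 and the rest decay from there, loosely mimicking the "all logits are negative" situation. The new token is assumed to get a logit of exactly 0.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical existing vocabulary: 1000 tokens, best logit -4, the rest lower.
existing_logits = [-4.0 - 0.02 * i for i in range(1000)]
new_token_logit = 0.0  # assumed: near-zero dot product from a random init

probs = softmax(existing_logits + [new_token_logit])
new_token_prob = probs[-1]
best_existing_prob = max(probs[:-1])

print(f"new token probability:         {new_token_prob:.3f}")
print(f"best existing token probability: {best_existing_prob:.4f}")
```

In this toy setting the untrained token takes more than half of the probability mass, which is the gibberish-generation failure the thread describes.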

John Hewitt @johnhewtt

10 Dec 2021

I think there's a space of interesting work (and future work) around initializing new word embeddings (e.g., for domain adaptation) using more information -- about orthography, about the finetuning distribution, etc.; averaging will be a baseline to beat.

2

10

John Hewitt @johnhewtt

24 Sep 2024

So, it isn’t just sample-efficient to instruction-tune LMs. Even seemingly totally deficient adaptations yield instruction following. I think this bears a lot more exploration! Blog: nlp.stanford.edu/~johnhew/in… GitHub: github.com/john-hewitt/impli…

GitHub - john-hewitt/implicit-ins: Codebase for Instruction Following without Instruction Tuning

Codebase for Instruction Following without Instruction Tuning - john-hewitt/implicit-ins

1

10

2,317

John Hewitt @johnhewtt

17 Jul 2018

We modeled derivational morphological transformations separately as orthographic and distributional functions, then combined: go see @_danieldeutsch present our paper on English derivational morphology in oral session 6D today at ACL! aclweb.org/anthology/P18-118…

2

8

John Hewitt @johnhewtt

12 Feb 2025

We give a qualitative example where we sample many times, and ask the model to score its own outputs. We distill its preferences into a word 'Good_M', as in, 'Give me responses you'd think are Good_M'. Negating, 'Not Good_M', makes the model generate responses it scores lowly.

1

9

2,216

John Hewitt @johnhewtt

24 Sep 2024

To make this concrete, we show: even just taking a product between a pretrained LM and a hand-written rule-based LM with only 3 rules also yields rough instruction following. The rules are: upweight EOS slowly, uniformly change 15 words’ probs, penalize repetition.

1

9

1,552
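
A minimal sketch of what "taking a product" between two language models can look like at a single decoding step: multiply the two next-token distributions and renormalize. The three-token vocabulary and both distributions are invented, and the single rule shown (extra mass on EOS) only gestures at the first of the three rules named in the tweet.

```python
def product_lm(p_base, p_rules):
    """Elementwise product of two next-token distributions, renormalized."""
    unnormalized = {tok: p_base[tok] * p_rules[tok] for tok in p_base}
    total = sum(unnormalized.values())
    return {tok: value / total for tok, value in unnormalized.items()}


# Hypothetical pretrained-LM distribution at one step.
p_base = {"the": 0.5, "recipe": 0.25, "<eos>": 0.25}
# Hypothetical rule-based distribution that upweights EOS.
p_rules = {"the": 0.25, "recipe": 0.25, "<eos>": 0.5}

combined = product_lm(p_base, p_rules)
print(combined)
```

The rule-based model never needs to know anything about language; it only reshapes the pretrained distribution token by token.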

John Hewitt @johnhewtt

25 Sep 2024

Replying to @LorenaYannnnn @percyliang @nelsonfliu @chrmanning

Maybe! We didn’t try this, but it’s a nice hypothesis. I do think that there’s probably a nice middle ground between instruction tuning and response tuning in terms of the amount of information provided about the instruction.

1

8

352

John Hewitt @johnhewtt

29 May 2023

For one example, we observe that certain aspects of gender bias in career nouns (e.g., nurse, CEO) are represented by a Backpack in a particular sense vector (pointing towards, e.g., “he”, “she”.) By “turning down” this sense, this aspect of gender bias is reduced.

1

9

1,573

John Hewitt @johnhewtt

29 May 2023

I was fascinated at the emergent structure of sense vectors, and I’m really excited to see what LM interpretability research the Backpack enables. We can design architectures that scale and learn to do some of the interpretability work for us.

1

8

1,957

John Hewitt @johnhewtt

28 Oct 2022

We analyze a few truncation-sampling algorithms, and find that our eta-sampling leads to more plausible long English documents, breaks out of repetition better, and more reasonably truncates low-entropy distributions. With @chrmanning, @percyliang Blog: nlp.stanford.edu//~johnhew//…

8

John Hewitt @johnhewtt

4 Dec 2022

This work, led by @ruthstrong_, provides a great new language resource in Jamaican Patois, and studies transfer in multilingual and monolingual LMs! One opportunity: studying how model predictions change as a sentence moves closer to or farther from the high-resource English.

Stanford NLP Group

@stanfordnlp

4 Dec 2022

JamPatoisNLI provides a dataset and examines how well you can do transfer to low-resource creoles like Jamaican Patois, versus other recent results for low-resource NLI. By @ruthstrong_ @johnhewtt @chrmanning. At Multilingual Rep’n Workshop. #emnlp2022 nlp.stanford.edu/pubs/armstr…

Sample from Jamaican Patois to English transition dataset. The final example is in English, and we present predictions made by three models finetuned with our Patois few-shot training dataset using the parameters for the best JamPatoisNLI model as the text moves from basolectal to acrolectal.


4

7

John Hewitt @johnhewtt

10 Dec 2021

This can harm finetuning. I also show that a simple, popular heuristic -- just averaging all existing embeddings -- guarantees that adding new words doesn't deviate much from the pretrained LM, solving this problem.

1

8
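A small sketch of the averaging heuristic from the tweet above, using toy list-of-lists embeddings rather than a real model (the helper names are hypothetical). If the new word's output embedding is the mean of the existing ones, its logit is the mean of the existing logits for any hidden state, and by Jensen's inequality its probability is at most 1/(n+1) for a vocabulary of n existing words.

```python
import math

def mean_embedding(embeddings):
    """Average all existing embedding rows, dimension by dimension."""
    n, dim = len(embeddings), len(embeddings[0])
    return [sum(row[j] for row in embeddings) / n for j in range(dim)]

def logits(embeddings, hidden):
    """Dot product of each embedding row with the hidden state."""
    return [sum(e * h for e, h in zip(row, hidden)) for row in embeddings]

def new_token_probability(embeddings, hidden):
    """Probability of a new token initialized to the mean embedding."""
    old = logits(embeddings, hidden)
    new = sum(e * h for e, h in zip(mean_embedding(embeddings), hidden))
    m = max(old + [new])  # subtract the max for numerical stability
    z = sum(math.exp(x - m) for x in old) + math.exp(new - m)
    return math.exp(new - m) / z

# Toy 3-word vocabulary with 2-dimensional embeddings.
E = [[2.0, 0.0], [0.0, 3.0], [-1.0, -1.0]]
h = [1.5, -0.5]
print(new_token_probability(E, h), "<=", 1 / (len(E) + 1))
```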

John Hewitt @johnhewtt

23 Feb 2024

Replying to @Teknium @teknium

Right; it's pretty rare. Most models don't have this issue. You can see for yourself by loading up gemma-2b: >>> torch.max(model(tok('I like pizza', return_tensors='pt')['input_ids']).logits) tensor(-4.5862) bc max logit << 0, any new token would dominate in probability

1

6

478
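To illustrate the arithmetic behind the reply above: a rough sketch, assuming the new token's logit is about 0 (as it would be with a near-zero embedding). Only the maximum logit, -4.5862, comes from the tweet; the other toy logits and the function name are hypothetical.

```python
import math

def new_token_share(old_logits, new_logit=0.0):
    """Softmax probability of one added token next to the existing logits."""
    m = max(old_logits + [new_logit])  # subtract the max for stability
    old_mass = sum(math.exp(x - m) for x in old_logits)
    new_mass = math.exp(new_logit - m)
    return new_mass / (old_mass + new_mass)

# Toy logits: the largest matches the value printed in the tweet.
toy_logits = [-4.5862, -6.0, -7.5, -9.0]
print(new_token_share(toy_logits))
```

In this toy setting the added token takes nearly all of the probability mass, because every existing logit sits far below zero.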

John Hewitt @johnhewtt

24 Jun 2025

Replying to @devadityamohan1

Doing my best to release the videos; can't make promises, unfortunately. Glad the resources have been useful! I'll release as much as I can.

7

521

John Hewitt @johnhewtt

19 Oct 2020

Our results are about what's possible, not what's learned. But a drop of empirical results: while RNNs don't learn Dyck-k in practice (aclweb.org/anthology/W19-390…), they can learn Dyck-(k,m) well, even with a vanishingly small fraction of the possible stack states seen at training!

1

6

John Hewitt @johnhewtt

10 Jul 2025

Replying to @aryaman2020

legally I have no idea. If you want to give me $7 for a coffee though I’ll have my people talk to your people.

3

7

1,171

John Hewitt @johnhewtt

23 Oct 2025

In one example, we taught Gemma a neologism that causes single-sentence answers. When asked for synonyms of this new word, it suggested “lack,” as in, “Give me a lack answer.” This didn’t look right, but indeed causes very curt answers. We call this a machine-only synonym.

1

9

2,011

John Hewitt @johnhewtt

24 Sep 2024

Finetuning that is deficient compared to instruction tuning, yet still yields instruction following, we call _implicit instruction tuning_. Why does this happen? Well, one thought is that the difference between a base LM and instruction-following LM is ‘simple.’

1

6

1,082

John Hewitt @johnhewtt

26 Nov 2024

Replying to @luis_hacm

Requirements: the university has some application requirements on the website. For my lab, no explicit requirements; I hope to hire curious, driven students with evidence of potential for creative, independent research. My website has more info.

1

6

1,547

John Hewitt @johnhewtt

23 Oct 2025

In our new work, Neologism Learning for Controllability and Self-Verbalization (arxiv.org/pdf/2510.08506v1), we show that by asking Gemma about the new word ~concept, like “what’s a synonym for ~concept”, Gemma can self-verbalize, generating English descriptions of the concept.

1

6

830

John Hewitt @johnhewtt

23 Oct 2025

In neologism learning [HGK25] we freeze a language model, initialize one new word embedding, place that word in natural language contexts, and train it to optimize a loss on training examples that define some concept. Simple parameter-efficient finetuning, but you get a new word.

1

1

6

1,104
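A toy sketch of the recipe in the tweet above, under heavy assumptions: the "language model" here is just a frozen output matrix `W` applied to a context vector plus the new word's embedding, which is far simpler than a real LM. All names are hypothetical. The point it shows is that only the new embedding receives gradient updates while every other parameter stays fixed.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def forward(W, context, new_emb):
    """Frozen toy model: logits = W @ (context + new word embedding)."""
    hidden = [c + e for c, e in zip(context, new_emb)]
    return softmax([sum(w * h for w, h in zip(row, hidden)) for row in W])

def mean_loss(W, examples, new_emb):
    """Average cross-entropy over (context, target index) examples."""
    total = sum(-math.log(forward(W, c, new_emb)[t]) for c, t in examples)
    return total / len(examples)

def train_neologism(W, examples, dim, lr=0.5, steps=200):
    """Gradient descent on the new embedding only; W is never modified."""
    new_emb = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for context, target in examples:
            probs = forward(W, context, new_emb)
            for i, row in enumerate(W):
                err = probs[i] - (1.0 if i == target else 0.0)
                for j in range(dim):
                    grad[j] += err * row[j]
        new_emb = [e - lr * g / len(examples) for e, g in zip(new_emb, grad)]
    return new_emb

W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]       # frozen parameters
examples = [([0.2, 0.1], 1), ([-0.1, 0.3], 1)]   # contexts that define the concept
emb = train_neologism(W, examples, dim=2)
print(mean_loss(W, examples, [0.0, 0.0]), "->", mean_loss(W, examples, emb))
```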

John Hewitt @johnhewtt

15 Jun 2022

I’ll be on this panel! Come say hello!

ACL Mentorship @aclmentorship

13 Jun 2022

Exciting update: we are opening up our upcoming mentoring session to the public. Use this link to join the webinar on Wednesday: umich.zoom.us/j/91364965401

2

6

John Hewitt @johnhewtt

19 Oct 2020

Our proof is constructive, exactly specifying weights of 1-layer RNNs (and a separate mechanism for the LSTM using just gates) that allow RNNs to push/pop from internal stack and create probability distributions over the next token, encoding what's possible in Dyck-(k,m) strings.

1

6

John Hewitt @johnhewtt

6 Dec 2022

On Dec 7, I'll be presenting Truncation Sampling as Language Model Desmoothing at the GEM workshop! One practical takeaway: word-level truncation decisions (from top-p or eta-sampling) can be unintuitive. A colab in which you can try these yourself: colab.research.google.com/gi…

A plot of the probability distribution from a language model for the prefix "My name," showing that top-p only allows the continuation "is," and eta-sampling additionally allows the continuations "apostrophe s, Is, and isn".


2

6
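A rough sketch of the two truncation rules compared in the tweet above, not the linked colab. It assumes eta-sampling keeps tokens whose probability exceeds η = min(ε, √ε · exp(−H)), where H is the entropy of the distribution; the probabilities, the value of ε, and the function names are made up for illustration.

```python
import math

def top_p_allowed(probs, p=0.95):
    """Smallest set of most-probable tokens whose mass reaches p."""
    kept, total = set(), 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.add(tok)
        total += pr
        if total >= p:
            break
    return kept

def eta_allowed(probs, epsilon=0.002):
    """Tokens above an entropy-dependent probability threshold (assumed form)."""
    entropy = -sum(pr * math.log(pr) for pr in probs.values() if pr > 0)
    eta = min(epsilon, math.sqrt(epsilon) * math.exp(-entropy))
    kept = {tok for tok, pr in probs.items() if pr > eta}
    # Always keep at least the most probable token.
    return kept or {max(probs, key=probs.get)}

# Hypothetical next-token distribution for the prefix "My name".
probs = {"is": 0.96, "'s": 0.02, "Is": 0.012, "isn": 0.007,
         "xq": 0.0006, "zz": 0.0004}
print(sorted(top_p_allowed(probs)))
print(sorted(eta_allowed(probs)))
```

On this toy distribution top-p keeps only "is", while the eta rule also keeps the three plausible low-probability continuations and drops the two junk tokens.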

John Hewitt @johnhewtt

19 Oct 2020

To make this concrete, k: vocab size. m: max nesting depth. Let's say vocab size is 100k, and max nesting depth is 3. (Empirically, 3 is not a bad approx. of human language.) Then before: approx 10^20 hidden units needed (give or take a few powers). We prove 150 units suffice.

1

6

John Hewitt @johnhewtt

8 Sep 2025

Replying to @wpan_buidl

yes!

1

114

John Hewitt @johnhewtt

5 Apr 2019

Replying to @johnhewtt @yoavgo

re: other languages: I expect so : ) ; we'll see. Some things in the works; lots of follow-up work to be done (hopefully by many people!) re: syntax reps and head choices -- I'd love to hear more about that! Which representations (UD/SD/?) do ELMo/BERT match best? etc.

2

6

John Hewitt @johnhewtt

2 Oct 2018

Scott Aaronson's note scottaaronson.com/writings/b… is a delightful introduction to reasoning about large numbers, leading up to the Busy Beaver numbers. Years after finding that article, what fun to find Busy Beaver numbers in proofs on RNNs! arxiv.org/pdf/1711.05408.pdf

2

5

John Hewitt @johnhewtt

12 Feb 2025

The neologism framing is clarifying for interp, e.g., at what level of abstraction should we search for model concepts? Neologisms in languages (e.g., 'vibes', 'doomscroll') hit moderate levels of abstraction (if too low-level, not common enough. too abstract: not informative.)

1

5

999

John Hewitt @johnhewtt

28 Oct 2022

Intuitively, in early-stopped neural LMs optimizing KL, there's good reason to put "a bit of probability mass everywhere", to hedge and avoid very high loss. This smoothing is good for scoring, like in n-gram models, but bad for generation, since mass is on non-language strings.

1

5