Bren Professor of Computational Biology @caltech. Blog at liorpachter.wordpress.com. Tweets represent my views, not my employer's. #methodsmatter

Pasadena, CA
33
33
488
It's time to stop making t-SNE & UMAP plots. In a new preprint w/ Tara Chari we show that while they display some correlation with the underlying high-dimension data, they don't preserve local or global structure & are misleading. They're also arbitrary.🧵biorxiv.org/content/10.1101/…
86
1,147
3,989
You have to hand it to Lex Fridman. His grift is not an amateur job. Take his Twitter photo. A professor standing in front of a blackboard with some math. Right?
Community note
Lex Fridman is a research scientist at MIT. mit.edu/directory/?id=… lexfridman.co
238
196
1,803
1,221,847
This recently published figure by @Sarah_E_Ancheta et al. is very disturbing and should lead to some deep introspection in the single-cell genomics community (I doubt it will). It demonstrates complete disagreement among 5 widely used "RNA velocity" methods 1/
6
91
505
186,241
Replying to @kareem_carr
At least one guy got a chaired professorship out of it.
16
20
1,272
This is rubbish. Shame on @sapinker for spreading this misogyny. A 🧵...
43
296
1,120
Aristotle was the first to notice honeybees dancing. In 1927 Karl von Frisch decoded the waggle. How it works was "explained" by MV Srinivasan AM FRS in the 1990s. Except @NeuroLuebbert found his papers are junk. A 🧵 about her discovery & our report: arxiv.org/abs/2405.12998 1/
23
321
1,276
319,210
The choice of whether to use Seurat or Scanpy for single-cell RNA-seq analysis typically comes down to a preference of R vs. Python. But do they produce the same results? In biorxiv.org/content/10.1101/… w/ @Josephmrich et al. we take a close look. The results are 👀 1/🧵
15
366
1,181
587,005
My favorite proof that √2 is irrational (Tom Apostol, 2000):
21
184
1,122
This is the paper that the terrorist who killed 10 people in Buffalo cited. @sapinker described the genetic variants as "collectively predict[ing] a big chunk of variance in educational attainment", which is false. 1/2
18
202
906
I've noticed it's becoming increasingly common in genomics to report results of regressions with ridiculously low correlation as "significant" based on a tiny p-value (for the hypothesis that the slope = 0). Can you guess R^2, the p-value, and where the data below was published?
86
100
997
468,805
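A minimal sketch of the point about "significant" regressions with tiny R^2, using simulated data rather than the figure from the tweet. The slope (0.1), noise level, sample size and seed are all assumptions chosen for illustration, and the p-value uses a normal approximation to the t distribution, which is reasonable only because n is large.

```python
import math
import random

def slope_test(x, y):
    """Return (R^2, two-sided p-value) for H0: slope = 0 in simple linear regression."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r2 = sxy * sxy / (sxx * syy)
    # |t| statistic for the slope, with n - 2 degrees of freedom
    t = math.sqrt(r2 * (n - 2) / (1.0 - r2))
    # Normal approximation to the t distribution (assumption: n is large)
    p = math.erfc(t / math.sqrt(2.0))
    return r2, p

# Simulated data (hypothetical): a weak linear trend buried in noise
rng = random.Random(0)
n = 10_000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.1 * a + rng.gauss(0.0, 1.0) for a in x]

r2, p = slope_test(x, y)
print(f"R^2 = {r2:.4f}, p = {p:.2e}")
```

With this many points the p-value is driven almost entirely by n, so x explains roughly 1% of the variance in y while the slope still tests as overwhelmingly "significant".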
Analysis of #scRNAseq requires constant, tedious, interaction with genomics databases. To facilitate querying from @ensembl et al., @NeuroLuebbert developed gget: biorxiv.org/content/10.1101/… (code @ github.com/pachterlab/gget). gget has many uses; a 🧵 on its amazing versatility: 1/
8
270
993
A friend (who does not work in science) asked me today whether it is true that "protein folding has been solved". My short answer: The AlphaFold method produced very impressive results on CASP14. Protein folding is not a solved problem.
13
212
992
Kind of weird to see genomics people here today celebrating the log-fold-change of 0.0007371 in the top two times for the 100m dash at the Olympics, but also throwing out any result where the log-fold-change is less than 1.
10
13
195
36,157
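A small sketch of where a number like 0.0007371 can come from. The two times below (9.784 s and 9.789 s) are an assumption, taken to be the thousandth-of-a-second times from the 2024 Olympic men's 100 m final; the tweet itself does not state them. The cutoff of 1 is the filter the tweet mentions.

```python
import math

def log2_fold_change(a, b):
    """Base-2 log of the ratio a / b."""
    return math.log2(a / b)

# Assumed finishing times in seconds (not given in the tweet)
first, second = 9.784, 9.789

lfc = log2_fold_change(second, first)
passes_filter = abs(lfc) >= 1  # the |log2 fold change| >= 1 cutoff

print(f"log2 fold change = {lfc:.7f}, passes filter: {passes_filter}")
```

A cutoff of 1 on the log2 scale means a two-fold difference, so under that rule a gold-medal margin would be discarded as uninteresting.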
Interesting analysis by @jsm2334 of the Israeli #covid19 data revealing that intuition about vaccine efficacy has been misguided due to the Yule-Simpson effect (also known as Simpson's paradox). h/t @jbakcoleman covid-datascience.com/post/i…
44
301
900
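A toy illustration of the Yule-Simpson effect for vaccine efficacy. Every count below is hypothetical and chosen only to make the arithmetic easy to follow; none of it is the Israeli data. Efficacy is taken as 1 minus the ratio of the severe-case rate in vaccinated people to that in unvaccinated people.

```python
def efficacy(cases_vax, n_vax, cases_unvax, n_unvax):
    """Vaccine efficacy as 1 - (rate in vaccinated) / (rate in unvaccinated)."""
    return 1.0 - (cases_vax / n_vax) / (cases_unvax / n_unvax)

# Hypothetical strata: (cases_vax, n_vax, cases_unvax, n_unvax)
strata = {
    "under 50": (12, 3_000_000, 40, 1_000_000),
    "50 and over": (300, 2_000_000, 200, 200_000),
}

by_stratum = {name: efficacy(*counts) for name, counts in strata.items()}

# Pool the counts across strata, ignoring age
pooled_counts = [sum(col) for col in zip(*strata.values())]
pooled = efficacy(*pooled_counts)

for name, value in by_stratum.items():
    print(f"{name}: {value:.1%}")
print(f"pooled: {pooled:.1%}")
```

In this made-up example efficacy is 90% and 85% within the two age groups but only about 69% when pooled, because the older, higher-risk group is much more heavily vaccinated. Stratifying by age removes that confounding.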
The biggest lie in science: "All authors read and approved the final manuscript."
20
137
866
Wait until they see the damage MATLAB has done.
The fascination with R has turned into unhealthy political and media fixation, say disease experts. go.nature.com/31SYRlF
11
57
837
This "poster" at #bog22 is obviously some kind of a joke. But unclear what the punchline is.
90
71
851
There is something deeply flawed with SciPy. The recent @numpy_team paper just published with 26 male authors and 0 women is a symptom. 1/
The NumPy paper is out! nature.com/articles/s41586-0…
19
177
799
Replying to @nprscience
Funny that in the interview you linked to James Watson didn't mention Rosalind Franklin. He did talk about the scientist whose work he stole at other times. Some quotes for your readers:
2
34
758
67,464
I've received numerous requests from bench biologists asking for bioinformatics tutorials to work through. In response @sinabooeshaghi and I will teach a #scRNAseq @zoom_us workshop (for up to 300) on Thursday March 26th @ 1pm PST. Join us at caltech.zoom.us/j/315126162
26
307
818
I've posted the notes/slides for my computational biology class at github.com/pachterlab/Bi-BE-… Topics were chosen based on appearing in >=3 bio areas, although for focus examples are all drawn from #scRNAseq. Homeworks include both theory and exploration of data (via @GoogleColab).
10
154
821
93,814
This reminded me of my first computational biology conference. I didn't know anybody and was terrified. Therefore at the banquet I sat next to my advisor at the time. As I was talking to him security came & hauled him off. Someone thought he was a homeless person crashing the event.
5
33
771
182,011
While it’s fun to banter about what constitutes a good lab, the part of this that is uncomfortable to discuss is that leaving a bad lab is in many cases near impossible. Few universities offer much support and PIs can and do retaliate, in some cases ending careers.
For a PhD student, choosing a good lab is 10 times more important than choosing a particular topic to study
20
119
721
Last night the genomics community applauded a disgraceful normalization of racism and sexism as Jim Watson was toasted on his 90th birthday. nitter.app/markjcowley/status/995… /1
41
339
698
I'm teaching an introduction (to an introduction) to single-cell RNA-seq today and making use of slides that others might find useful: figshare.com/articles/Introd… #scRNAseq
15
184
787
The blackboard Lex is standing in front of has basic calculus left over from an actual real MIT calculus class. It has nothing to do with what he is "teaching". The stuff he is presenting is a joke. You can listen in to some of his rubbish here: piped.video/-6INDaLcuJY?t=1409
16
34
712
294,396
Challenge accepted. Here are a few comments on the paper after starting to wade through its massive content. The paper in question is nature.com/articles/s41586-0… 1/🧵
Critiquing a paper for the number of figures, ext figures & support figures is really weird. Just read the paper & point out any actual issues you find. Anyone can peer review a paper or preprint post "publication".
16
100
746
575,690
The exciting reveal of Ultima Genomics last week was accompanied by the publication of four preprints. Intrigued by the potential of the technology, @sinabooeshaghi & I decided to take a look at the data. A 🧵 about our findings & a preprint we posted: biorxiv.org/content/10.1101/… 1/
8
187
707
Please stop using TopHat scholar.google.com.mx/schola… Cole and I developed the method in *2008*. It was greatly improved in TopHat2 then HISAT & HISAT2. There is no reason to use it anymore. I have been saying this for years yet it has more citations this year than last #methodsmatter
17
521
689
I've been reading some of the #COVID2019 preprints that have been coming out and I can say with confidence that many academics would be better off watching @netflix during their quarantine.
22
73
698
I highly recommend "How to read (single-cell RNA-seq) PCA plots" (by @vallens): nxn.se/valent/2017/6/12/how-… #gi2019 He notes that "A particular danger [in interpreting T-shapes] is that it is tempting to interpret this as a bifurcation in the data." (it almost never is).
8
227
673
Someone claimed this figure was published in @NatureRevGenet. I did not believe them. I was wrong.
66
55
664
429,449
I'd like to see a ranking of universities based on the number of alumni in jail for fraud.
11
44
613
Spatial reconstruction of single-cell RNA-seq data.
first time completing a puzzle this shit is easy
7
58
575
Maybe it's worth considering that this paper, and others like it, that search in the weeds for "significant" PRS scores and ignore numerous important caveats, should not be published. They don't have any scientific value, and they only serve as material for manipulation. 2/2
12
50
516
I've gotten several requests recently for permission to use the notes from my computational biology class: github.com/pachterlab/BI-BE-… A reminder that they're licensed under CC BY 4.0: you're free to share and adapt, just give appropriate credit and indicate if changes were made.
1
119
575
67,822
Well first of all, this was an MIT IAP class. IAP is a short period in January when students get to take fun classes on various topics that can be taught by anyone (many by students). I once sat in on a brain dissection. You can learn how to count cards. web.mit.edu/willma/www/mit15…
5
14
545
148,348
I am a judge today for a high school science fair and was just reviewing posters. About half described their data as being problematic for drawing conclusions for various reasons (small sample size, inaccurate instrument measurement, etc.) If only my colleagues were this honest.
7
39
559
I have a few things to say about this tweet attacking @mbeisen and subtweeting me. Specifically, I want to talk about cancel culture gone mad... nitter.app/MLevitt_NP2013/status/… 1/14
9
106
506
If you work w/ single-cell RNA-seq & are performing RNA velocity analyses, you might find this @GorinGennady et al. preprint w/ Meichen Fang & Tara Chari of interest. It's a deep dive into the method, and navigation of the 67 pages may be aided w/ this🧵1/ biorxiv.org/content/10.1101/…
12
152
545
An appropriate response to this from @uwcse / @UW would be to ban @pmddomingos from all promotion / tenure decisions (of men and women) because he is clearly not qualified to judge the work of others.
16
43
462
He clearly doesn't know what he's talking about. This explanation of L1 and L2 is 😭 He is standing in front of the formula 1-cos(2θ). I doubt he could tell you what the cosine of an angle is. piped.video/-6INDaLcuJY?t=2293 I watched all this crap so you don't have to.
19
20
514
139,983
One of the interesting things about biology is that it’s so complex that we don’t have the slightest idea why some brains can discover new biology, while other brains can tweet this.
Replying to @timrpeterson
None of it is hard. And certainly what they’re doing isn’t.
11
34
510
164,237
By now people just take it for granted that he is an MIT prof. And he nurtures that myth. As I said, gotta hand it to him.
34
25
497
167,343
Full disclosure: Lex blocked me after I questioned why (with very few exceptions) he interviews only men.
Question for you @lexfridman: Why does your 25 popular episodes guest list contain only one BIPOC and no women at all? There are even more speakers named "Stephen"! Do you believe that science and technology is for men only?
136
16
496
204,913
It really irks me when #bitcoin articles refer to mining as “solving complex math problems”. There is no solving, there is nothing complex, it’s not math, and the only problem is the size of the carbon footprint.
20
72
484
A tiny minority of the “sacrifices” of animals for biology research are actually needed. There is an enormous amount of unneeded murder of animals for research of poor quality, and many researchers don’t take ethical considerations / guidelines, e.g. the 3 Rs, seriously.
What unpopular academia opinion would get you in this situation?
20
41
494
172,371
This figure provides a glimpse of the future of epidemiology. Contact tracing coupled to genome sequencing for understanding and controlling a pandemic. Incredible work by the computational biologists at deCODE genetics. medrxiv.org/content/10.1101/…
9
215
523
For the PI who has everything.
6
64
512
75,451
The recommendation paradox: 1958: every student is above average 1998: every student is in the top 10% 2018: every student is in the top 5% 2028: every student is in the top 1% 2033: every student is in the top 0.1% 2035: every student is the best who ever lived and very social.
22
99
496
The @humancellatlas lung atlas that was published today is impressive, but with 2% Asian samples when 60% of the world is Asian, it seems that the initial goal of keeping "ethnic diversity in mind"... may have escaped the mind.
11
104
484
I'm not the first to figure this out... news.ycombinator.com/item?id… Everybody knows.


8
12
454
137,545
Of course it’s good advice to tell students to choose labs carefully. But the advice that’s really needed is for PIs, not students. The advice is to create environments in their departments where students don’t have to choose labs carefully, because all labs are “good”.
13
61
433
"The purpose of models is not to fit the data but to sharpen the questions." —Samuel Karlin simplystatistics.org/2019/04…
8
151
461
A few months ago @AndrewYNg tweeted that radiologists were on the verge of being obsolete because AI: nitter.app/AndrewYNg/status/93093…. Andrew has >300K twitter followers so his tweet made the rounds (>2,000 likes) /1
Should radiologists be worried about their jobs? Breaking news: We can now diagnose pneumonia from chest X-rays better than radiologists. stanfordmlgroup.github.io/pr…
18
242
486
"...two popular bioinformatics methods, DESeq2 and edgeR, have unexpectedly high false discovery rates [when identifying differentially expressed genes between two conditions using human population RNA-seq samples]." tl;dr use the Wilcoxon rank-sum test. genomebiology.biomedcentral.…
13
79
447
This speech by @FareedZakaria is a litany of misinformation. There is much to improve at US universities, but his claims are false and unhelpful. A rebuttal: 1/🧵
22
77
411
345,098
Lex doesn't quite lie, but obviously he is far from telling the whole truth. His "research position" is a whole other bunch of bull (for another time). And he uses this MIT mirage to great advantage, creating the perception that he is effectively a professor there.
15
15
448
140,983
Replying to @baym
Also based on what I've seen around town recently this table would be a lot more useful:
3
33
426
The edgeR differential analysis tool has been updated to version 4.0, and this update features support for isoform-level DE, which is important functionality that can be used for #scRNAseq (via pseudobulk). Great to see that transcript-level analysis has become mainstream. 1/🧵
edgeR 4.0: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and ... biorxiv.org/cgi/content/shor… #biorxiv_bioinfo
3
103
452
81,688
Isn't the term “motivated postdoc” in an advertisement just a euphemism for an “extremely hardcore” employee?
20
26
443
"These methods can order a set of individual cells along a path, and assign a pseudotime value to each cell that represents where the cell is along that path. This can be a starting point for further analysis to determine gene expression programs driving cell phenotypes."
6
39
458
108,046
So I wrote a thread about the lack of representation of women on the SciPy and NumPy papers, and the implications thereof. In return I was blocked by one of the core developers. This is not how one builds open source communities.
There is something deeply flawed with SciPy. The recent @numpy_team paper just published with 26 male authors and 0 women is a symptom. 1/
7
79
412
Yeah, well, it turns out grandma figured out all of the analysis methods for single-cell RNA-seq. 1/
telling students in my class about the history of genomics
9
72
564
The reality is that there is no current decline for men. Rather, there has simply been an increase in women getting degrees after, you know, they were allowed to actually attend many universities in the... wait for it... 1970s. 🤯
6
49
360
In a pair of @biorxivpreprint preprints just posted w/ @sinabooeshaghi and @agalvezmerchan, we describe algorithms for an open data Commons Cell Atlas and demonstrate how a Human Commons Cell Atlas can be used for discovery. biorxiv.org/content/10.1101/… biorxiv.org/content/10.1101/… 1/🧵
6
101
448
101,993
This is what UMAP does to your data.
Meet the new iPad Pro: the thinnest product we’ve ever created, the most advanced display we’ve ever produced, with the incredible power of the M4 chip. Just imagine all the things it’ll be used to create.
6
40
445
69,458
I'm honored and humbled to announce that after carefully studying a UMAP of an integrated single-cell RNA-seq atlas I've discovered a new cell type in the medial prefrontal cortex. I'm happy to share the data upon reasonable request.
15
15
425
Looking for the interesting result in your single-cell RNA-seq data.
10
29
437
It’s one thing to celebrate *science*, e.g. by toasting the discovery of the structure of DNA by Crick, Franklin and Watson. It’s an entirely different matter to toast a man whose actions have directly harmed not only our colleagues, but society at large. /10
7
88
374
We thank the reviewer for their excellent suggestion which has greatly improved our manuscript.
5
21
420
37,214
All I can see in Rorschach tests now is batch effect in single-cell RNA-seq experiments. Thanks for nothing tSNE.
6
49
408
Happy to announce that we've just posted a preprint on "Modular and efficient pre-processing of single-cell RNA-seq". Highlights: process #scRNAseq on a laptop, 10x processing up to 51 times faster than Cell Ranger. A new efficient RNA velocity workflow. biorxiv.org/content/10.1101/…
4
179
430
Today in Science History: In 1962, Dr. Rosalind Franklin, whose work was key to determining the double-helix molecular structure of DNA and its significance for information transfer in living material, did not win the Nobel Prize for Medicine and Physiology.
12
117
412
I'll see your co-third authors and raise you co-fifth authors (also on a 6-author paper).
Taking authorship seriously! 🤷 (seen on a 6-author paper)
18
28
413
144,431
I have never seen a paper where the authors have formulated and experimentally tested a hypothesis based on a discovery made with RNA velocity. It's totally possible I have just missed this in the literature, and if so I'd love to see a reference. Thanks!
This recently published figure by @Sarah_E_Ancheta et al. is very disturbing and should lead to some deep introspection in the single-cell genomics community (I doubt it will). It demonstrates complete disagreement among 5 widely used "RNA velocity" methods 1/
4
21
123
56,572
The @biorxivpreprint has been weaponized by many labs who use it to plant priority flags, not to accelerate research (accomplished by withholding methods and/or data). This does not serve the interests of either the scientific community, or the public.
20
44
410
I respectfully disagree. I think what is currently "democratizing" science much more than "high impact" journals is @PubPeer. So let's take a tour of the @PubPeer comments for some of this author's papers in "high impact" journals: 1/
Replying to @ItaiYanai
I'll go further Itai..."glam" or actually "high impact" journals are actually very much "democratizing" science. If you take them out everything will be based on pedigree, "fame", X-followers etc...so i would be very skeptical with the idea of getting rid of journals...
10
46
398
231,540
You know what else sucks? When high profile machine learning people oversell their results to the public. It leaves everyone worse off… because how can us mere mortals publish a paper if we haven’t rendered an entire profession obsolete with our results? /4
6
81
402
I've checked this paper out, as instructed. I was also interested in the main result for personal reasons: I'm 51 years old. Is it true that I've just gone through a major change? And that another one awaits me in just a few years? Some comments on the paper in this thread 1/🧵
Our aging paper just came out. Two periods of major changes: Mid 40s and 60s. Check it out doi.org/10.1038/s43587-024-0…
12
89
423
123,265
If you think articles should be valued for their content rather than the journals they were published in, use the citation form "Author(s), year, DOI" on slides and omit the journal name.
11
68
394
My new favorite RNA velocity plot from a published paper.
18
22
411
70,408
So apparently in 2023 principal component analysis latent spaces can be said to have an "internal world model" (figure from @jnovembre et al., 2008). Turns out that the "singular" in singular value decomposition refers to the singularity!
9
42
413
107,433
UMAP and t-SNE are widely used in single-cell genomics to identify features of interest, and visually explore data. In a new paper w/ Tara Chari we find that extensive distortions and inconsistent practices make such embeddings counter-productive.🧵journals.plos.org/ploscompbi… 1/
2
107
400
88,525
BREAKING NEWS: On the basis of vote counts from the first 11 rounds of voting for the Speaker of the United States House of Representatives WE CAN NOW PREDICT that Kevin McCarthy will receive zero votes in the 676th round of voting which will take place on June 14th, 2023.
8
39
350
55,633
Introduction to single-cell RNA-seq technologies liorpachter.wordpress.com/20…
6
150
393
🌌The virial theorem relates time-averaged kinetic energy of objects to their potential energy. 🧬The Price equation relates change in a trait over time in subpopulations to their fitness. In arxiv.org/abs/2312.06114 we observe that the virial theorem is the Price equation. 1/🧵
14
85
377
134,863
Science has been moving very fast, but it's about to move MUCH faster. In this example, Gemini compiles an up-to-date list of GWAS variants from the literature. piped.video/watch?v=sPiOP_CB…
15
65
399
86,756
Isaac Ben-Israel is an Israeli going around saying the pandemic will end in 70 days. He has a "paper" in Hebrew (I read Hebrew). Below is one of the graphs on which he is basing his prediction. HE FIT A SIXTH ORDER POLYNOMIAL TO THE DATA 😱😱😱😱😱😱 (c=2E21 😱)
38
44
370
This photo (see RHS of image below) is from what he calls his "MIT course" on Deep Learning for Self-Driving Cars. Sounds like good stuff. CS, math, self driving cars. #broheaven. So what is the problem? He is standing in front of the blackboard.
7
7
354
164,161
I disagree that there is a "gulf between [James Watson's] scientific brilliance and his views on race." Yes, he won a Nobel prize but winning this prize does not make one scientifically brilliant. He is scientifically bankrupt and a racist. nytimes.com/2019/01/01/scien…
14
104
324
I think this paper is a Denial Of Peer Review Attack (DOPRA). It's kind of like a DoS (denial of service) attack. There is so much data, so many methods, so much code, so many figures, so many panels, so much supplement, so much text, that it is overwhelming. 18/
8
53
361
118,161
Two of our recently developed tools to simplify bioinformatics are now published: ffq, for metadata retrieval from sequence databases (@agalvezmerchan, @lioscro, @sinabooeshaghi): academic.oup.com/bioinformat… and gget, for querying genomic databases (@NeuroLuebbert): academic.oup.com/bioinformat…
2
101
378
67,136
I’m not sure I understand the guy with the “no mRNA” sign. I mean who hasn’t been against proteins at one point or another… but a complete and total ban on mRNA seems a bit extreme.
The craziest people in the country just gathered at Austurvöllur!
27
21
356