Will · Dec 24, 2024 · 12:12 AM UTC

Will

Pinned Tweet

Will

@_brickner

24 Dec 2024

wrote a paper: it lets you *train* in 1.58b! could use 97% less energy, 90% less weight memory. leads to a new model format which can store a 175B model in ~20mb. also, no backprop!

107

316

4,807

970,365

Will · Feb 15, 2025 · 2:30 AM UTC

Will

@_brickner

15 Feb 2025

Replying to @nearcyan

behold, new food

689

15,502

Will · Jul 7, 2021 · 3:35 AM UTC

Will

@_brickner

7 Jul 2021

Replying to @atomicthumbs

544

Will · Aug 16, 2024 · 5:41 PM UTC

Will

@_brickner

16 Aug 2024

Replying to @ChazakielDoremi

i was an 8 year old without specific expectations so i loved it

502

20,299

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

arxiv mods rejected this paper. they won’t say why. I don’t really care at this point, took weeks to get approval to submit. I think twitter boys will like it, that’s what matters.

207

42,340

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

pdf: github.com/wbrickner/noise_s…

GitHub - wbrickner/noise_step: noise_step: Training in 1.58b With No Gradient Memory

noise_step: Training in 1.58b With No Gradient Memory - wbrickner/noise_step

github.com

194

21,899

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

the core trick I use comes from `Gradients without Backpropagation`. using the JVP, you can find the alignment of random vectors to the gradient, and reconstruct it. only a forward pass!
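A minimal sketch of that trick, assuming a toy quadratic loss and a hand-rolled dual-number class. `Dual`, `TARGET`, `loss`, `jvp`, and `forward_gradient` are illustrative names, not taken from the paper or from any library. One forward pass over dual numbers yields the JVP, which is the alignment of a random vector `v` with the gradient. Averaging `alignment * v` over many Gaussian `v` gives an unbiased estimate of the gradient, with no backward pass.

```python
import random


class Dual:
    """Dual number: a value plus a tangent (directional derivative)."""

    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __sub__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val - other.val, self.tan - other.tan)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule carries the tangent through the multiplication.
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

    __rmul__ = __mul__


# Toy problem (an assumption for illustration): loss(w) = sum((w_i - t_i)^2).
TARGET = [1.0, -2.0, 0.5, 3.0, -1.5]


def loss(w):
    total = 0.0
    for wi, ti in zip(w, TARGET):
        d = wi - ti
        total = total + d * d
    return total


def jvp(f, w, v):
    """Directional derivative of f at w along v, from a single forward pass."""
    return f([Dual(wi, vi) for wi, vi in zip(w, v)]).tan


def forward_gradient(f, w, num_samples, rng):
    """Average (alignment * v) over random Gaussian directions v."""
    estimate = [0.0] * len(w)
    for _ in range(num_samples):
        v = [rng.gauss(0.0, 1.0) for _ in w]
        alignment = jvp(f, w, v)
        estimate = [e + alignment * vi / num_samples
                    for e, vi in zip(estimate, v)]
    return estimate
```

Calling `forward_gradient(loss, [0.0] * 5, 2000, random.Random(0))` should return a vector pointing in roughly the same direction as the true gradient `2 * (w - TARGET)`. The estimate gets noisier as the dimension grows or the sample count shrinks.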

186

30,699

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

doesn’t this violate information theory? no, it’s probably that the domain in which models are compressible is the correlation of their loss gradient to noise. or something.

182

25,208

Will · Jun 16, 2022 · 3:04 AM UTC

Will

@_brickner

16 Jun 2022

Replying to @chasebratton

the true mark of wealth

164

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

what about bitnet? bitnet does inference in 1.58b, but training uses precision weights. basically they clamp weights to ternary {-1,0,1} in forward pass, and pretend they didn’t in backward pass.
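The "clamp in forward, pretend you didn't in backward" idea is usually called a straight-through estimator. Below is a rough sketch for a single linear neuron, assuming absmean scaling for the ternary rounding. `ternarize` and `ste_step` are hypothetical helpers written for this illustration and are not BitNet's actual code.

```python
def ternarize(weights):
    """Scale by the mean absolute value, clip to [-1, 1], round to {-1, 0, 1}."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    return [round(max(-1.0, min(1.0, w / scale))) for w in weights]


def ste_step(latent, x, target, lr=0.1):
    """One SGD step on full-precision latent weights.

    Forward pass: uses the ternary weights.
    Backward pass: treats the rounding as the identity, so the gradient
    with respect to the ternary weights is applied directly to the latents.
    """
    q = ternarize(latent)
    y = sum(qi * xi for qi, xi in zip(q, x))
    err = y - target
    # d/dq_i of (y - target)^2 is 2 * err * x_i; the STE copies it onto latent_i.
    return [w - lr * 2.0 * err * xi for w, xi in zip(latent, x)]
```

For example, `ternarize([0.9, -0.05, -1.3])` gives `[1, 0, -1]`. The full-precision `latent` list has to be kept around during training, and that memory cost is what the tweet is pointing at.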

174

34,938

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

it’s often said academia is unwell. I was unimpressed with how these people operate. Real Science needs a massive cultural change. More openness, less hostility, less structure. I’m not a real researcher; freely discard my comments.

166

23,067

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

actually, one acknowledgement: call me schizo but months ago I was discussing the algorithm loudly at dinner and I swear @DarioAmodei was watching me, grinning just like this. just eating with his wife. perhaps some things are fated.

162

39,269

Will · Dec 24, 2024 · 10:00 PM UTC

Will

@_brickner

24 Dec 2024

I woke up late, here is a cpu implementation colab.research.google.com/dr…

lk99_proof.ipynb

Colab notebook

colab.research.google.com

160

26,086

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

the other thing is distributed training. the steps are tiny, the optimizer is stateless. imagine: a distributed training cluster across the internet with such low traffic it’s undetectable.

146

22,116

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

womp womp: this doesn’t work with ternary weights! if you make the random vectors sparse, it does work! another exciting part is that your ‘alignments’ can just be {-1, 0, +1}, it still works!

124

27,250

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

model exfiltration becomes easier too: a disgruntled worker might gmail themselves gpt4o. woohoo! proliferation! <3

114

22,743

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

this means gradient steps are now tiny! an entire step in a few bytes. is there a tradeoff? it doesn’t seem like you need to take more steps to converge.

112

26,167

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

instead of storing model weights, you could instead store training steps, with massive size reduction. download a sota model in a second?
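A toy sketch of what "store the steps, not the weights" could look like. It assumes each step is just a short list of ternary alignments, and that the matching sparse perturbation vectors can be regenerated from a seed. The seeding scheme, the `density` value, the sign convention, and the names `perturbation` and `replay` are all assumptions made for illustration; this is not the paper's file format.

```python
import random


def perturbation(step, k, dim, density=0.1):
    """Regenerate the k-th sparse ternary vector of a step from its indices."""
    rng = random.Random(step * 1_000_003 + k)  # hypothetical seeding scheme
    return [rng.choice((-1, 1)) if rng.random() < density else 0
            for _ in range(dim)]


def replay(step_log, dim):
    """Rebuild ternary weights from a log of per-step alignments."""
    weights = [0] * dim
    for step, alignments in enumerate(step_log):
        for k, a in enumerate(alignments):
            if a == 0:
                continue  # zero alignment: this direction contributes nothing
            v = perturbation(step, k, dim)
            # Move against the aligned direction, keeping weights in {-1, 0, 1}.
            weights = [max(-1, min(1, w - a * vi)) for w, vi in zip(weights, v)]
    return weights
```

Replaying a prefix of the log, such as `replay(step_log[:n], dim)`, reproduces the weights as they were after step `n`. That is the sense in which the full weight history stays recoverable.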

113

24,184

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

second, we need better tools to write compute kernels! I tried for a long, long time to get a functional and high performance kernel written for noise_step training. I am still without. kernel language is one of the projects I might work on next.

111

20,938

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

this format has some other cool properties. you can recover the complete history of weights! full-rank finetunes become tiny. you might even be able to flip or mask out past training steps, idk.

102

22,250

Will · May 1, 2019 · 3:26 AM UTC

Will

@_brickner

1 May 2019

Replying to @jesse_squires

definitely an issue

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

now, instead of acknowledgements, I have grievances!

21,309

Will · Dec 24, 2024 · 12:12 AM UTC

Will

@_brickner

24 Dec 2024

another neat thing is the JVP can run alongside normal inference for low cost. this enables more practical continual learning.

21,403

Will · Jan 18, 2025 · 9:07 AM UTC

Will

@_brickner

18 Jan 2025

Replying to @shaggysurvives

use it like a latex keyboard

4,378

Will · Dec 24, 2024 · 10:08 PM UTC

Will

@_brickner

24 Dec 2024

Replying to @rebelcrayon

these are very good points. thank you

4,677

Will · Jul 7, 2021 · 3:36 AM UTC

Will

@_brickner

7 Jul 2021

Replying to @punpunpunpun

Will · Apr 20, 2023 · 5:14 AM UTC

Will

@_brickner

20 Apr 2023

Replying to @atomicthumbs @imgur

dw i will be able to buy imgur for $120 in six months and reverse the policy

6,375

Will · Dec 25, 2024 · 1:52 AM UTC

Will

@_brickner

25 Dec 2024

Replying to @PandaAshwinee

I have good news: the algorithms appear not to be the same. there is conceptual similarity in the approach (gradient is being projected onto noise), but the way this is accomplished is different, and the scaling properties should not be assumed to be the same.

12,140

Will · Feb 4, 2025 · 7:11 AM UTC

Will

@_brickner

4 Feb 2025

Replying to @Andercot

Recall that blackhole interior has reduced spatial dimension, it is (2, 1), the radial coordinate is timelike. I bet our universe is the (3,1) interior of a black hole in (4,1) spacetime, or even timeless space: (4,0). Odd though that our timelike coordinate seems reversed! We have a singularity of infinite density inevitably in our past, not in our future!

3,338

Will · May 16, 2023 · 9:00 PM UTC

Will

@_brickner

16 May 2023

some sort of chicken game

3,204

Will · Jul 31, 2023 · 7:01 AM UTC

Will

@_brickner

31 Jul 2023

Replying to @floates0x

afaik indian team did not follow procedure properly and is attempting a second synthesis

9,124

Will · Jul 20, 2021 · 5:42 AM UTC

Will

@_brickner

20 Jul 2021

Replying to @hoffridder

These conversations kill me. It's not that it doesn't come up. it is literally constantly *always* coming up, like, you are ignoring or not aware of it! Interviews are structured poorly, sure, but jfc

Will · Dec 24, 2024 · 11:24 PM UTC

Will

@_brickner

24 Dec 2024

Replying to @VictorTaelin

the lowest precision activations you can get away with to date are 4 bit (uses tricks), but yes i think it’s all integer arithmetic. bitnet uses a precision matrix scalar, but that can be factored out of the dense matmuls. some modalities might require a float encoder. i think there’s been work on binary token embeddings, but again 4 bit activations. i would love to see an all-tern/bit paradigm. idk if that’s possible.

4,833

Will · Dec 25, 2024 · 1:08 AM UTC

Will

@_brickner

25 Dec 2024

Replying to @giffmana @davidad

i’d like to train other models. note that adam over ternary mnist also reaches ~90%. we will see how step size versus weight size scales. my thinking is the steps format is most effective at large scale as the benefit to additional samples diminishes

3,742

Will · Apr 29, 2018 · 2:27 PM UTC

Will

@_brickner

29 Apr 2018

Replying to @fermatslibrary

The relativistic kinetic energy is 48.02 Joules! That's the same energy as dropping your iPhone X on your face, but from 92 feet 4 inches in the air. In a single proton!

Will · Jul 30, 2020 · 8:09 PM UTC

Will

@_brickner

30 Jul 2020

Replying to @Psilocervine @mcclure111

They have this extremely useful feature where you can check via search if you can use a song and in what countries. They're removing it for no reason sometime soon. The workaround is post it as private, check if you get flagged, if so, re-edit the video and try again. Vomit.

Will · Dec 25, 2024 · 1:56 AM UTC

Will

@_brickner

25 Dec 2024

Will

@_brickner

25 Dec 2024

Replying to @PandaAshwinee

9,688

Will · Dec 25, 2024 · 1:26 AM UTC

Will

@_brickner

25 Dec 2024

Will

@_brickner

25 Dec 2024

Replying to @_brickner @giffmana @davidad

btw, thank you for reading! with this (oversized) model, the correct weight size is 53_191 bytes. with the config used, with convergence at 1000 steps, the steps cost 25_280 bytes.
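A quick back-of-the-envelope check of those two figures. It assumes both are packed at the 1.58 bits per ternary value from the thread's headline; the tweet itself does not state the packing. The variable names are illustrative.

```python
BITS_PER_TRIT = 1.58   # assumed packing density, taken from the "1.58b" headline

weight_bytes = 53_191  # weight size quoted in the tweet
step_bytes = 25_280    # total size of the training steps quoted in the tweet
num_steps = 1000       # steps to convergence quoted in the tweet

# Parameter count implied by the weight size at the assumed packing.
implied_params = weight_bytes * 8 / BITS_PER_TRIT

# Ternary values stored per training step at the assumed packing.
trits_per_step = step_bytes * 8 / num_steps / BITS_PER_TRIT

# Size of the step log relative to the weights.
ratio = step_bytes / weight_bytes

print(f"{implied_params:,.0f} params, {trits_per_step:.1f} trits/step, ratio {ratio:.2f}")
```

Under that assumption the weight size implies roughly 269k parameters, in line with the "270k" MNIST figure mentioned elsewhere in the thread. The step log comes out a little under half the size of the weights for this small model.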

10,699

Will · Oct 22, 2021 · 11:52 AM UTC

Will

@_brickner

22 Oct 2021

Replying to @swagitda_

Pretty odd that in 2021 a pdf can turn your computer into your secret enemy and the solution the entire world agrees on is just "guess which PDFs are hexed and don't open em"

Will · Jul 26, 2023 · 7:23 PM UTC

Will

@_brickner

26 Jul 2023

Replying to @andrewmccalip

Some dude sells it on Etsy, for an "element collection". etsy.com/listing/1480854294/…

4,916

Will · Dec 24, 2024 · 10:01 PM UTC

Will

@_brickner

24 Dec 2024

Replying to @gfodor

Will

@_brickner

24 Dec 2024

Replying to @_brickner

I woke up late, here is a cpu implementation colab.research.google.com/dr…

4,111

Will · Dec 21, 2024 · 3:13 AM UTC

Will

@_brickner

21 Dec 2024

i know a secret. the compute, bandwidth, memory, and energy requirements will all melt away.

Chris Prucha

@chrisprucha

20 Dec 2024

OpenAI's O3 model really makes the Doomer hard takeoff or "FOOM" theory look like a bunch of BS. As we start to enter the age of AGI, the massive amounts of required compute, interconnect bandwidth, and energy are real physical constraints that govern scale over the time dimension. FOOM can't happen when you need to build massive solar farms and nuclear power plants.

1,540

Will · Aug 15, 2023 · 5:46 AM UTC

Will

@_brickner

15 Aug 2023

Replying to @samsoniuk

1,232

Will · Dec 23, 2024 · 7:00 AM UTC

Will

@_brickner

23 Dec 2024

for all the computational ability of current models i have never seen them have a genuinely good idea. nothing novel of value really emerges. who has studied this?

2,233

Will · Oct 1, 2024 · 3:56 AM UTC

Will

@_brickner

1 Oct 2024

Replying to @sdamico

in fact there are claims that the visitors gifted equivalent artifacts to both the americans and soviets. whatever you believe, the capabilities are now known to be real, and the propulsion is reactionless. a secret physics must exist.

4,383

Will · Dec 10, 2021 · 8:16 PM UTC

Will

@_brickner

10 Dec 2021

Replying to @mountain_ghosts

god saves the silliest battles for the funniest clowns

Will · Oct 11, 2016 · 4:02 PM UTC

Will

@_brickner

11 Oct 2016

People have been brainwashed into thinking it's normal and that any other way would make us unsafe lmao

Will · Apr 26, 2023 · 12:04 AM UTC

Will

@_brickner

26 Apr 2023

Replying to @servomechanica

MIT admission is not about being cool and smart, and this is actually a good thing-

1,248

Will · Jul 9, 2021 · 5:32 AM UTC

Will

@_brickner

9 Jul 2021

Replying to @leaacta

it could be an illusion, and maybe the catgirls latent in all computer communities are just less readily visible or something ..or maybe the big ears allow them to hear the subtle screams of cpus accessing memory unsafely, could be both

Will · Aug 2, 2023 · 1:04 AM UTC

Will

@_brickner

2 Aug 2023

Replying to @mattparlmer

the only real engineers are those who work on steam engine locomotives

2,014

Will · Mar 1, 2018 · 4:15 PM UTC

Will

@_brickner

1 Mar 2018

If you've never used @fusetools you've never lived. I never realized how awful the DOM was until now. I started like 3 days ago and it's blowing my mind #FuseTools

Will · Apr 25, 2021 · 5:11 PM UTC

Will

@_brickner

25 Apr 2021

Replying to @iamdevloper

As an old man I sit in my armchair and fondly remember my life, filled with mystifying segfaults, intriguing runtime type errors, thrilling RCE CVEs, suspenseful execution times. Rust never did corrupt me, I wrote code the hard way, the honest way. I am happy.

Will · Mar 28, 2023 · 6:51 PM UTC

Will

@_brickner

28 Mar 2023

they may have anticipated this timing attack info leak & set a constant higher duration per token

2,654

Will · Dec 21, 2024 · 5:52 AM UTC

Will

@_brickner

21 Dec 2024

ive seen arrows you people couldnt imagine

1,499

Will · Dec 7, 2024 · 8:22 PM UTC

Will

@_brickner

7 Dec 2024

can someone 'endorse' me for the ML arxiv? I cant submit my paper at all lol. i am just a guy. its a cool paper

1,040

Will · Nov 3, 2021 · 10:01 PM UTC

Will

@_brickner

3 Nov 2021

Replying to @FeiKhal

if I were a great lakes captain I'd simply make my boat flexible and not have that happen to me

Will · Jun 7, 2022 · 7:31 PM UTC

Will

@_brickner

7 Jun 2022

was thinking about the long-term positive externality of rustlang, like from an economic perspective. I think the lifetime value of rustlang to humanity is something on the order of +$1T, essentially generated for free by volunteers.

Will · Dec 19, 2024 · 4:52 AM UTC

Will

@_brickner

19 Dec 2024

the future is gonna be so cool

Zhou Xian

@zhou_xian_

18 Dec 2024

Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-month large-scale research collaboration involving over 20 research labs — a generative physics engine able to generate 4D dynamical worlds powered by a physics simulation platform designed for general-purpose robotics and physical AI applications. Genesis's physics engine is developed in pure Python, while being 10-80x faster than existing GPU-accelerated stacks like Isaac Gym and MJX. It delivers a simulation speed ~430,000 faster than in real-time, and takes only 26 seconds to train a robotic locomotion policy transferrable to the real world on a single RTX4090 (see tutorial: genesis-world.readthedocs.io…). The Genesis physics engine and simulation platform is fully open source at github.com/Genesis-Embodied-…. We'll gradually roll out access to our generative framework in the near future. Genesis implements a unified simulation framework all from scratch, integrating a wide spectrum of state-of-the-art physics solvers, allowing simulation of the whole physical world in a virtual realm with the highest realism. We aim to build a universal data engine that leverages an upper-level generative framework to autonomously create physical worlds, together with various modes of data, including environments, camera motions, robotic task proposals, reward functions, robot policies, character motions, fully interactive 3D scenes, open-world articulated assets, and more, aiming towards fully automated data generation for robotics, physical AI and other applications. Open Source Code: github.com/Genesis-Embodied-… Project webpage: genesis-embodied-ai.github.i… Documentation: genesis-world.readthedocs.io… 1/n

1,404

Will · Jun 19, 2023 · 6:17 AM UTC

Will

@_brickner

19 Jun 2023

Replying to @Kaju_Nut

391

Will · Mar 25, 2023 · 6:19 AM UTC

Will

@_brickner

25 Mar 2023

Replying to @gbrl_dick

this is the funniest one i’ve seen yet. walking it through your own math problem without mentioning death camp and then claiming gpt plans death camp

872

Will · Jul 9, 2021 · 5:24 AM UTC

Will

@_brickner

9 Jul 2021

Replying to @leaacta

genuinely wonder what causes the enrichment of catgirls in the rust community / userbase relative to the general computer-person population bc it is definitely a real phenomenon

Will · Dec 25, 2024 · 5:25 AM UTC

Will

@_brickner

25 Dec 2024

place ur bets boys!

Andrew @andrew_v10209

24 Dec 2024

Created a manifold market to decide if this paper is real or not, link below

3,943

Will · Mar 2, 2021 · 7:38 AM UTC

Will

@_brickner

2 Mar 2021

Replying to @everestpipkin

transistors > cells means chip > brain

Will · Apr 6, 2020 · 3:34 PM UTC

Will

@_brickner

6 Apr 2020

Replying to @hikari_no_yume2 @hikari_no_yume @whitequark

The Trinity

Will · Dec 25, 2024 · 1:19 AM UTC

Will

@_brickner

25 Dec 2024

Replying to @_brickner @giffmana @davidad

btw, thank you for reading! with this (oversized) model, the correct weight size is 53_191 bytes. with the config used, with convergence at 1000 steps, the steps cost 25_280 bytes.

11,792

Will · Jul 17, 2020 · 10:36 PM UTC

Will

@_brickner

17 Jul 2020

Replying to @sharifshameem

I wonder how it scales with large complexity and defining really complex things. At what point is it more cumbersome to describe the behavior in English than to write the code yourself unambiguously?

Will · Dec 24, 2024 · 10:09 PM UTC

Will

@_brickner

24 Dec 2024

Replying to @teortaxesTex

hell portal link confirmed..?

Will

@_brickner

24 Dec 2024

Replying to @_brickner

I woke up late, here is a cpu implementation colab.research.google.com/dr…

585

Will · Oct 4, 2023 · 1:11 AM UTC

Will

@_brickner

4 Oct 2023

Replying to @acidshill

i think the input from workers was about the thickness of metal they can weld with high quality bonds and no porosity / breakthrough of the material

208

Will · Apr 25, 2023 · 3:13 AM UTC

Will

@_brickner

25 Apr 2023

288

Will · Feb 4, 2025 · 7:34 AM UTC

Will

@_brickner

4 Feb 2025

problem: Schwarzschild radius of our universe in (4,1) does not match! the 5D formula differs. Maybe it’s in (4,0), but literally has no causality! GR is not suitable. Another option is that total mass is over many non-interacting (3,1) shells, spaced along the timelike dimension.

Will

@_brickner

4 Feb 2025

Replying to @Andercot

1,126

Will · Mar 14, 2023 · 8:12 AM UTC

Will

@_brickner

14 Mar 2023

Replying to @zswitten

> if only you could evaluate these models objectively by having a general intelligence review a large set of outputs & evaluate them in a highly nuanced wholistic way :)

627

Will · Apr 29, 2022 · 8:18 AM UTC

Will

@_brickner

29 Apr 2022

Replying to @benedictevans

When does Twitter get its Dancing Hotdog moment

Will · Sep 13, 2023 · 8:55 AM UTC

Will

@_brickner

13 Sep 2023

Replying to @BlazeLogos @GarryPNolan @tinyklaus @undersc0red

what is the source originally? what ties this to the hearing? can’t find any back links

10,485

Will · Jun 16, 2023 · 9:37 PM UTC

Will

@_brickner

16 Jun 2023

Replying to @tonofcrates

none (!), it should be an iterator -> iterator method

1,635

Will · Mar 1, 2025 · 10:03 AM UTC

Will

@_brickner

1 Mar 2025

Replying to @torchcompiled

why is this more impressive to me than HD images and audio and video haha

406

Will · Dec 24, 2024 · 10:04 PM UTC

Will

@_brickner

24 Dec 2024

Replying to @Titan1Beast

Will

@_brickner

24 Dec 2024

Replying to @_brickner

I woke up late, here is a cpu implementation colab.research.google.com/dr…

2,178

Will · Feb 3, 2025 · 9:18 AM UTC

Will

@_brickner

3 Feb 2025

reminder that the average age of an Apollo mission engineer was 28, and the Manhattan project was 25. put young smart nicotine addict males in control, reap the rewards.

Elai @elaifresh

3 Feb 2025

Can’t believe we’re running the Iraq playbook on our own government

512

Will · Nov 9, 2024 · 11:17 AM UTC

Will

@_brickner

9 Nov 2024

Replying to @jpohhhh @ArmandDoma

why would they do that? did it work or something

575

Will · Jan 14, 2025 · 6:32 PM UTC

Will

@_brickner

14 Jan 2025

Replying to @RBehiel @Andercot

It’s funny you mention a deep connection to the Higgs field. Any scalar field like T requires a spin-0 boson. We have already seen one: the Higgs boson! I’m a layman, is it possible that T literally is the Higgs field? This may offer a different view of its nonzero VEV, as mentioned in the original thread. I would appreciate your thoughts

1,575

Will · Jul 25, 2020 · 5:30 AM UTC

Will

@_brickner

25 Jul 2020

Replying to @KeziyahL

I thought this guy was doing satire, was I wrong?

Will · Dec 21, 2024 · 3:31 AM UTC

Will

@_brickner

21 Dec 2024

merry christmas the world will never be the same

1,080

Will · Apr 18, 2020 · 6:52 AM UTC

Will

@_brickner

18 Apr 2020

let my people hoe

Will · Mar 28, 2023 · 7:01 PM UTC

Will

@_brickner

28 Mar 2023

idk jack but would i be wrong to think that it implies a 20T token corpus

425

Will · Apr 12, 2023 · 5:19 AM UTC

Will

@_brickner

12 Apr 2023

so if i do 1 coin flip and see heads, P = 2/3? very upsetting result

1,648

Will · Feb 4, 2025 · 8:22 AM UTC

Will

@_brickner

4 Feb 2025

also I think it would imply our universe is curved and closed, only locally flat. I think it would be spherical. this could be true, if beyond the cosmic horizon it’s very very large.

268

Will · Dec 22, 2024 · 6:08 AM UTC

Will

@_brickner

22 Dec 2024

o3 deserves an even higher score on ARC lmao. also amazing it can infer the transform so reliably from the text format they use

Mikel Bober-Irizar @mikb0b

22 Dec 2024

You've seen some of the puzzles o3 failed, but have you seen the attempts? Yesterday, @OpenAI's o3 dramatically beat the SOTA at @arcprize. But there were 34 tasks that even it couldn't solve with 16 hours of thinking. I've compiled and analyzed all of o3's mistakes below 🧵

12 tasks from the ARC-AGI dataset. Each task consists of a number of 2-D reasoning tasks where you are given a handful of input/output pairs and a test example from which you must guess the final output correctly. In all 12 of these highlighted cases, o3 gets the answer wrong.


2,894

Will · Dec 25, 2024 · 9:04 PM UTC

Will

@_brickner

25 Dec 2024

Replying to @torchcompiled

there will prove to be problems, working to address them for larger tests. a few responses: the demo is a toy, a good kernel is a single forward pass with no perturbation memory, and because of sparsity the cost can be reduced a lot. too small or too large samples lead to poor convergence as you see. it’s not a random search, and the mnist example is 270k, more than a few hundred dimensions. despite these things your sentiment may prove correct, we will see :)

1,002

Will · Feb 20, 2025 · 8:30 PM UTC

Will

@_brickner

20 Feb 2025

Replying to @soundsonacid

stronger const generics and if let chains next 🤞🏻

633

Will · Feb 11, 2025 · 11:48 AM UTC

Will

@_brickner

11 Feb 2025

o3-mini is very reddit. inflexible knowledge from authority, often misses the point. knowledgeable though.

270

Will · Nov 23, 2024 · 3:53 AM UTC

Will

@_brickner

23 Nov 2024

Replying to @Duderichy

169

Will · Apr 29, 2018 · 7:06 PM UTC

Will

@_brickner

29 Apr 2018

Replying to @Phnkrz @fermatslibrary

2) Compare it to something normal: > I hate when I drop my phone on my head laying down ∆E_f = ∆U_i 48.02 = mgh iPhone X weighs 174g h = 48.02/(0.174*9.806) h = 28.14m = 92 ft 3.96 in Done!
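The same arithmetic as a small script, using the tweet's own inputs (48.02 J, 174 g, g = 9.806 m/s^2) and solving E = mgh for h. The function names are illustrative.

```python
ENERGY_J = 48.02   # kinetic energy quoted in the tweet, in joules
MASS_KG = 0.174    # iPhone X mass used in the tweet, in kilograms
G = 9.806          # gravitational acceleration used in the tweet, in m/s^2


def drop_height_m(energy_j, mass_kg, g=G):
    """Height at which potential energy m*g*h equals the given energy."""
    return energy_j / (mass_kg * g)


def to_feet_inches(metres):
    """Convert metres to whole feet plus remaining inches."""
    total_inches = metres / 0.0254
    feet = int(total_inches // 12)
    return feet, total_inches - 12 * feet


height = drop_height_m(ENERGY_J, MASS_KG)
feet, inches = to_feet_inches(height)
print(f"{height:.2f} m = {feet} ft {inches:.1f} in")
```

The inches figure is sensitive to where the height is rounded, so the last digit can differ slightly from a hand calculation.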

Will · Feb 28, 2025 · 9:50 PM UTC

Will

@_brickner

28 Feb 2025

funny how much time went into linear attention. you should expect attention to be very similar to sorting. algorithms without degradation are all gonna be O(n log n)

269

Will · Jan 23, 2025 · 12:34 AM UTC

Will

@_brickner

23 Jan 2025

Replying to @jwt0625 @justalexoki

adorable and very impressive thought trace thank you

Will · May 9, 2023 · 11:26 PM UTC

Will

@_brickner

9 May 2023

Replying to @YosarianTwo

what is the probability of this digest occurring randomly? not a follower of this but does seem like the type of thing that’s gonna turn out to be 10^-200

1,417

Will · Aug 7, 2023 · 9:22 PM UTC

Will

@_brickner

7 Aug 2023

humanity requires universal and perpetual nicotine administration to continue making progress. we are too weak without it. specter of 1971.

319

Will · Mar 24, 2023 · 7:06 PM UTC

Will

@_brickner

24 Mar 2023

Replying to @terronk @mattparlmer

probably a gyroscopic effect would cause torque to the body

489

Will · Jan 30, 2021 · 6:46 PM UTC

Will

@_brickner

30 Jan 2021

kat just dropped anxiety fish v0.1.0

Will · Oct 28, 2021 · 12:13 AM UTC

Will

@_brickner

28 Oct 2021

Replying to @bascule @rustlang

it just gets better and better and better. almost unreasonably clear and easy

Will · Sep 29, 2023 · 8:17 AM UTC

Will

@_brickner

29 Sep 2023

Replying to @gbrl_dick

there is a further implicit leap that dangerous things should be kept from humanity, which is wrong in a more important way

874