Something new coming soon

Please excuse the silence. We've been cooking up something cool and are excited to share more details soon.
NVIDIA just dropped benchmarks showing 4-bit inference loses less than 1 point vs BF16 on most tasks. It's not accuracy per request that you should be measuring. It's tasks completed per dollar. And on that metric, 4-bit wins by a landslide. Read the full blog 👇
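To make the "tasks completed per dollar" framing concrete, here is a minimal sketch. Every number in it is hypothetical and chosen only for illustration; none comes from the NVIDIA benchmarks, and the function name `tasks_per_dollar` is our own.

```python
def tasks_per_dollar(success_rate, requests_per_gpu_hour, gpu_cost_per_hour):
    """Successfully completed tasks bought by one dollar of GPU time."""
    return success_rate * requests_per_gpu_hour / gpu_cost_per_hour


# Hypothetical inputs (assumptions, not measured results):
# 4-bit gives up one point of task accuracy but is assumed to serve
# twice as many requests per GPU-hour on the same hardware.
bf16 = tasks_per_dollar(success_rate=0.80, requests_per_gpu_hour=1000, gpu_cost_per_hour=4.0)
fp4 = tasks_per_dollar(success_rate=0.79, requests_per_gpu_hour=2000, gpu_cost_per_hour=4.0)

print(f"BF16:  {bf16:.0f} tasks per dollar")
print(f"4-bit: {fp4:.0f} tasks per dollar")
```

Under these assumed inputs, the small accuracy loss is swamped by the throughput gain. The real ratio depends on measured throughput and pricing.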
1/ Yesterday we announced mdspan-cute: C++23 std::mdspan syntax with CUTLASS cute layouts. One header. Zero overhead. Here's how it works 🧵
7/ Layout algebra is formalized in Lean 4. 26 theorems, 0 sorry. Properties extracted to RapidCheck tests. The art/ directory has 23 SVG visualizations: we drew pictures until we understood.
8/ Check out the code: github.com/weyl-ai/mdspan-cu…
Check out the proofs: github.com/weyl-ai/mdspan-cu…
/end
💿 Open Source Release 💿 mdspan-cute: a zero-overhead bridge between C++23 std::mdspan and CUTLASS cute layouts. One header. Swizzled memory. No bank conflicts. Read the blog and check out the repo (links in reply)
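For readers new to swizzling, the sketch below shows why it removes shared-memory bank conflicts. It uses a generic XOR swizzle on a 32x32 tile of 4-byte words with 32 banks. This is an assumption for illustration and not necessarily the swizzle mdspan-cute or CUTLASS applies; the function names are hypothetical.

```python
from collections import Counter

NUM_BANKS = 32  # assumed: 32 shared-memory banks, one 4-byte word each
TILE = 32       # assumed: 32x32 tile of 4-byte words


def bank_linear(row, col):
    """Bank hit by element (row, col) in a plain row-major tile."""
    return (row * TILE + col) % NUM_BANKS


def bank_swizzled(row, col):
    """Bank hit when the column index is XOR-ed with the row index."""
    return (row * TILE + (col ^ row)) % NUM_BANKS


def max_conflict(bank_fn, col):
    """Worst-case number of threads hitting one bank when 32 threads
    read a single column (thread i reads row i)."""
    counts = Counter(bank_fn(row, col) for row in range(TILE))
    return max(counts.values())


print("row-major column read:", max_conflict(bank_linear, 0), "way conflict")
print("swizzled column read: ", max_conflict(bank_swizzled, 0), "way conflict")
```

In the row-major layout every element of a column lands in the same bank, so the read serializes. With the XOR swizzle each row of the column maps to a different bank, and row reads stay conflict-free as well.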
5/ Quantized RoPE already runs in:
→ LLaMA
→ Mistral
→ Most open source inference stacks

This isn't obscure. It's foundational.
6/ On "bit augmentation": Log/exp is a bijection. Information in = information out. You can't create precision from a reversible transformation. Thermodynamics doesn't allow it.
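A small sketch of the counting argument behind this point. It assumes a hypothetical grid of 16 representable values (as a 4-bit code would have) and only illustrates that a reversible map cannot add distinct levels; it is not a model of any particular quantization scheme.

```python
import math

# Hypothetical 4-bit grid: 16 representable values in [1, 2).
codes = [1.0 + i / 16 for i in range(16)]

# Move to the log domain and back again.
logged = [math.log2(v) for v in codes]
restored = [2.0 ** x for x in logged]

# A bijection maps 16 distinct values to 16 distinct values: no new levels appear.
print(len(set(codes)), "levels in,", len(set(logged)), "levels out")

# The round trip recovers the originals (up to floating-point rounding).
print(all(math.isclose(a, b) for a, b in zip(codes, restored)))
```

The log-domain values are spaced differently, but there are still only 16 of them, so the transform relabels the existing levels and does not add precision.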
1/ Yesterday we announced nix2gpu, a NixOS package for portable GPU containers. Portable containers prevent inference vendor lock-in. Here's why it's a big deal. #Nix #AIInfra
7/ Why it matters: it makes distributed GPU compute easy and deterministic.
Philosophy: it's just Linux with libs; complexity is optional.
Open-source, MIT-licensed, and production-tested on Fleek machines.