defending free markets @fal. holding inference speed records. Python core developer / @thePSF fellow

SF
not much to say. pace has been incredible. 4 rounds in 18 months. incredible new investors @sonyatweetybird @mamoonha. all metrics going 🆙 exponentially. new exclusive models. more capital = more GPUs = more people = more investment into gen media. back to work.
Today, we're excited to share that we have raised a $140M Series D, welcoming new investors @sequoia, @kleinerperkins and @nvidia alongside the continued support of our existing partners!
11
6
150
23,656
every man's dream
104
287
3,189
306,942
how is deepseek so much faster than the fastest LLM inference providers? i am very confused
63
43
1,111
192,361
what should I try?
149
31
682
243,450
lol apparently bytedance is suing the author of the best neurips paper for $1 mil because he (allegedly) "sabotaged" other training runs. var-integrity-report.github.…
Well VAR just won Neurips best paper
16
31
538
150,216
fal.ai/grants for free gpus if you are building cool shit
thank you for the gpus!!
14
34
518
51,560
If API wise Anthropic has even half of OAI's revenue, it means OAI has lost already wtf
Estimated OpenAI vs Anthropic revenue breakdown
21
7
455
48,681
i am thankful
30
13
453
38,476
Looks like I'm the 5th most active contributor to CPython for the last 6 months 🥳
15
15
390
I don’t usually get emotional about milestones, but today feels special. We've just raised our $125M Series C - our third funding round in the past 12 months. In this short time, we've grown from serving just a few hundred customers to empowering millions of developers. We've gone from barely making revenue to surpassing a $90M annualized run rate, all thanks to a core team of dedicated individuals. This success didn’t come by luck; it was built on relentless effort, countless hours of hard work, and a brilliant team that's committed to excellence every single day. I'm incredibly proud of every member of this team, and I can't thank them enough for making this journey possible. There is no way I could have guessed we would have come this far this fast when @burkaygur called me and said he was starting a company about ML infrastructure. It was the best decision of my life to join him and @gorkem and be part of this amazing team from the start. We are still early in our journey, and if you’d like to come and be part of this amazing company, apply below: fal.ai/careers
Today, we're thrilled to announce our $125M Series C funding round at a $1.5B valuation, led by Meritech Capital, marking our third successful raise in just 12 months! fal’s Generative Media Cloud now powers tens of thousands of applications, supporting over two million developers and more than 300 enterprise customers. From initial prototypes to fully scaled production, we've become the essential infrastructure enabling some of the most innovative and creative work in the industry. Over the past year alone, we've experienced extraordinary growth—averaging 40% month-over-month, consistently exceeding our own ambitious projections. Each month, hundreds of thousands of new developers and thousands of fresh applications join our platform, unlocking entirely new use cases previously thought impossible. With this latest round of funding, we're significantly scaling our engineering, support, sales, and marketing teams to keep pace with the accelerating demand and enthusiasm from our community. Our vision has always been clear: build a generative media platform that effortlessly creates dynamic, real-time content across video, audio, image, and 3D. Thanks to our incredible team, dedicated partners, and visionary customers, that's precisely what fal delivers today—empowering creativity at unprecedented scale.
48
15
422
51,913
We've been using pyx at fal and honestly, it's been incredible. No more late nights debugging CUDA versions or PyTorch incompatibilities, it just manages it. Our ML engineers actually thanked me for bringing this one and saving them tens of hours😅
Today, we're announcing our first hosted infrastructure product: pyx, a Python-native package registry. We think of pyx as an optimized backend for uv: it’s a package registry, but it also solves problems that go beyond the scope of a traditional "package registry".
6
16
363
30,961
I am starting to believe AI video was a mistake
38
20
334
62,964
Yesterday I gained commit privileges and was promoted to CPython Core Developer 🎊🎊
23
14
323
ok bullish on google. they started i/o with image and video generation and then announced sota diffusion language model.
4
2
332
7,721
we just raised 23M from the top VCs in silicon valley. has been an amazing ride in the last 2 years building the fastest inference service for media models. now its time to scale!
We've raised $23M from Kindred Ventures, Andreessen Horowitz, First Round Capital, Perplexity CEO @AravSrinivas, Vercel’s @rauchg , @balajis and Huggingface CTO @julien_c.
34
5
338
42,670
Grok 3 is gonna answer whether you can speed run into becoming a frontier lab if you have a super large cluster (100k GPUs), insane funding (10B+) and one of the most dedicated teams in the world. Rooting for xAI’s release
It's 11:30pm, and many @xai people are in office, hard working at their computers. It's an amazing vibe. Everyone pushes their way to deliver the best experience to you users. Everyone supports everyone. No one fucks no one with politics. You can just do things.
1
6
319
25,924
i can't overstate how big of a jump this is in video editing quality and speed. Existing video editing models take more than 1 minute for a short video. Lucy does it in under 5 seconds. INSANITY by @DLeitersdorf @DecartAI team!
10
23
306
30,586
stop complaining that you don't have a job and build something like this. then your inbox will be full of job offers from startups that you love
day 6 & 7 of automating factorio -- i have massive mod updates. i restructured the mods & added a fuck ton of modding for simple atomic actions that an agent could take. 17 actions implemented so far (example video shows placing a furnace, without collisions). we're cooking :)
4
11
279
17,914
personal news: I have moved to San Francisco / 🇺🇸. See you all here
27
1
253
16,601
i feel more confident in my ability to build a >30B company than finding a gf for myself.
21
5
246
22,022
the more interesting thing is windsurf buying access to claude models from hyperscalers. this essentially means Anthropic has no control over who hyperscalers can sell their model to?
Anthropic co-founder on cutting access to Windsurf: 'It would be odd for us to sell Claude to OpenAI' | TechCrunch techcrunch.com/2025/06/05/an…
10
2
229
24,772
Back in 2022 at @fal, we built our own lazy environment packing format specifically to optimize container startup times (we knew the workloads were going to be very diverse, and there was a GPU crunch so scaling to 0 and scaling back up fast when there is demand was key). The initial implementation used a custom archival filesystem to make stat() calls almost free by storing metadata at the front of the archive and enabling lazy file pulls with fast seeked reads.
Since then we've shipped a bunch of cool improvements (probably forgetting some, but here are the main ones):
- Hierarchical caching: We run 10-20 regions with thousands of host nodes. Local SSD caching is great, but leveraging peer nodes with 100 Gb network access (vs public internet/blob storage at 10 Gb) is even better. Now it's local nvme → peer nodes → S3 fallback.
- Overlayfs for diff builds: We call this "Environment hopper" internally. The insight is everyone needs the same core dependencies, so we build on top of common base archives using overlayfs to save diffs and heavily compress them. Cache reuse rates are through the roof.
- uv by @astral_sh for environment builds: We seriously considered writing our own package manager since downloading torch and other heavy packages is painfully slow with pip. uv was a total lifesaver.
Should probably write a proper deep dive on this at some point - there's so much cool stuff in building domain-specific environment packing with high cache utilization and fast cold starts as the target.
8
15
228
33,861
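The "local nvme → peer nodes → S3 fallback" lookup order from the post above can be sketched roughly as follows. This is a minimal in-memory illustration, not fal's implementation: the `Tier` class, the `fetch` function, and the choice to backfill faster tiers after a slower hit are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple


class Tier:
    """Hypothetical cache tier backed by a plain dict (stands in for NVMe, a peer node, or S3)."""

    def __init__(self, name: str, store: Optional[Dict[str, bytes]] = None) -> None:
        self.name = name
        self.store: Dict[str, bytes] = dict(store or {})

    def get(self, key: str) -> Optional[bytes]:
        return self.store.get(key)

    def put(self, key: str, value: bytes) -> None:
        self.store[key] = value


def fetch(key: str, tiers: List[Tier]) -> Tuple[bytes, str]:
    """Try tiers from fastest to slowest and report which tier served the key."""
    for index, tier in enumerate(tiers):
        data = tier.get(key)
        if data is not None:
            # Assumption: copy the blob into every faster tier so the next read is cheaper.
            for faster in tiers[:index]:
                faster.put(key, data)
            return data, tier.name
    raise KeyError(key)


local = Tier("local_nvme")
peer = Tier("peer_node")
s3 = Tier("s3", {"base-env.archive": b"archive-bytes"})

first = fetch("base-env.archive", [local, peer, s3])
second = fetch("base-env.archive", [local, peer, s3])
```

In this sketch the first read falls through to the slowest tier and the second is served locally, which is the behaviour a high cache reuse rate depends on.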
so grok 4 was RL'd off grok 3 base model? they talked like grok 3 base model was so good that there wasn't a reason to re-train a base model
15
3
216
38,211
We are hiring at fal.ai/careers. If you don’t see a role that fits you and you think you are pretty smart & overall excellent; send your CV to careers at fal dot ai with a small cover letter of why we should hire you.
5
10
212
44,285
word leaked 😱😱😱😱 claude 4 opus is coming
19
2
198
26,167
Introducing AuraFace v1: Commercially available & open source identity encoder model for next generation one shot personalization. huggingface.co/blog/isidenti…
8
23
204
20,041
Buying $1 of NVIDIA stock for every like this tweet gets. Limited to first 72 hours (buy order will be made at 9am Monday PT).
2
3
199
8,209
Replying to @Osmo_Labs
curious, how does the reprinting system work? how many base ingredients do you need to construct a significant chunk of all available smells in the world?
4
1
177
23,861
adding "if you think you are pretty smart & overall excellent" changed the quality of applicants by 10x. interesting.
We are hiring at fal.ai/careers. If you don’t see a role that fits you and you think you are pretty smart & overall excellent; send your CV to careers at fal dot ai with a small cover letter of why we should hire you.
8
5
187
27,830
the feeling when your hand written kernel is 2ms faster than triton 😎
9
3
193
22,533
uhm, does anyone have any idea how to make a model "unlearn" some stuff, if by any chance it learnt stuff that it shouldn't include in its distribution?
32
3
181
35,412
can't believe how quickly a year has passed since. 22 now!
today i turned 21! legal age for many things 🙈
35
1
187
27,549
Anyone born before 2000 is like, old. Idk
24
3
157
19,993
Looks like traditional inference providers for LLMs dropped the ball on DSv3/R1 inference. It is not in our mandate but personally I might spend a weekend or two making a fast ‘real’ R1 endpoint available on fal. Who is interested?
18
4
152
10,101
Whoooaaa, I became a PSF Fellow Member! Even though my name is misspelled, this is still amazing news! 🎊🎊🎊🎊
Python Software Foundation Fellow Members for Q4 2020 ift.tt/2P6GhSq
17
7
142
I've always been a high conviction person. When I see potential being realized, I double down hard. That's exactly what we did with fal, and that's what our investors did with us. Thrilled to announce our $49M Series B (just a couple months after our A!) 👊👊
Generative media is expanding faster than we ever imagined, and video is its next frontier. We raised a $49M Series B to fuel our mission: building the core infrastructure for AI-driven media creation.
13
5
149
30,640
if anyone needs access to O1 freely, you can use it here (this is a temporary playground, please do not use as an API): fal.ai/models/fal-ai/openai-…
10
6
147
17,951
yoooo, new public dataset drop! 5 million moondream2 (rev=2024-05-08) captioned text to image pairs. huggingface.co/datasets/isid…
5
16
142
42,934
today i turned 21! legal age for many things 🙈
35
138
47,330
starting a company doing image/video/audio/3D and need compute for your initial PoC? hit us up at fal.ai/grants
8
12
133
14,087
one thing people seem to not care about (or just forget) in this margin discussion is how lean AI companies are OpEx-wise. Cursor, Lovable, Bolt, all small teams. At fal we crossed $50M run rate with fewer than 25 people (no GTM, only core eng), and $100M with fewer than 40 people (6 GTM).
11
7
126
15,098
For people training their own models who don't wanna use SD3's commercially licensed VAE, we will be releasing our own 16ch one which is comparable in perf!
8
11
122
21,139
happy nano banana day to those who celebrate
9
4
120
8,810
best skillset rn is the intersection of compilers & distributed systems and machine learning. distributed ML compiler systems engineering??
3
7
108
21,251
Introducing a feature without actually having it lol
Today we're introducing Model Fine-tuning. A new self-serve offering that will soon allow you to customize our models towards your specific use cases and on your own data. From entertainment to robotics, education, life sciences and beyond, our next generation of customizable models will unlock entirely new use cases with the introduction of Model Fine-tuning. Learn more and submit your interest to become a pilot enterprise partner at the link below.
5
115
18,351
and he keeps using spot H100s!!!! Joking aside @vikhyatk is one of the most dedicated people I know, and I am so happy that he can expand moondream from a hit open source project to a viable company with this funding.
Moondream raises $4.5M to prove that smaller AI models can still pack a punch venturebeat.com/ai/moondream…
1
2
109
17,777
Replying to @ClementDelangue
stability is a better investment
1
105
7,847
Assuming this is <2k, I think we'll get every ML eng @fal one of these. It seems much easier than a whole desktop setup and comes with 128GB of DDR5 for experimenting
15
106
9,721
This got more attention than I thought it would, so here you go people: huggingface.co/AuraDiffusion…. A fully commercially licensed 16 channel VAE.
For people training their own models who don't wanna use SD3's commercially licensed VAE, we will be releasing our own 16ch one which is comparable in perf!
7
24
106
13,877
My first podcast episode at fal, alongside @burkaygur and @JenniferHli at @a16z's youtube channel. Hear about everything and anything from fragmentation in the generative media model space to building a sales team culture at an extremely technical company.
There is no king of the hill in generative AI. Sora looked untouchable. Then Luma, Runway, and MiniMax dropped. Veo 3 launched. Two weeks later, Seedance leapfrogged it. "If you’re not the best, you’re not releasing." @burkaygur and @isidentical from @FAL break down what it’s like tracking the frontlines in real time. The leaderboard resets every week. It’s live combat.
9
6
106
10,827
Wait a minute, why does GPT-4o mini have the same price as GPT-4o for vision? One claims it is 255 tokens and the other one is 8500 tokens for the same input image. Like wtf?
10
7
97
22,327
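A quick way to see how two very different token counts can end up at the same image price: if per-token prices differ by the same factor as the token counts, the totals match. The sketch below uses only the two counts quoted in the post above. The post does not say which count belongs to which model, and the per-token price is a made-up placeholder, not a real list price.

```python
import math

# Token counts quoted in the post for the same input image.
small_token_count = 255
large_token_count = 8500

# How many times more tokens the larger count bills for.
token_ratio = large_token_count / small_token_count

# Hypothetical per-token price for the model with the small count (illustration only).
price_per_token_a = 5.00 / 1_000_000
# If the other model's per-token price is cheaper by exactly the token ratio...
price_per_token_b = price_per_token_a / token_ratio

# ...then both models charge the same amount for the image.
image_cost_a = small_token_count * price_per_token_a
image_cost_b = large_token_count * price_per_token_b

print(f"token ratio: {token_ratio:.1f}x")
print(f"costs equal: {math.isclose(image_cost_a, image_cost_b)}")
```

So a cheaper per-token price does not make image input cheaper when the token count is scaled up by the same factor.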
have been working on this for a while, Wan 2.1 Pro is now available at fal. 6 second videos at 1080p with native 30 FPS fal.ai/models/fal-ai/wan-pro…
5
7
103
9,672
Image and video workloads have evolved quite a bit over the past years, and we've been continuously ranking #1 on inference speed without any compromises on quality. The key: specializing on your workload.
Our inference team comes from a diverse set of backgrounds; some worked previously on compilers and interpreters (e.g. CPython), some worked on database internals (e.g. Datafusion) and some worked on ML systems before it was cool at big tech. One thing we all shared was the art of making generic things go fast. But there's a fundamental limit there - when the workload could be anything, the amount of optimizations you can apply gets less and less.
This is why paradigms like Just-In-Time compilation help. It lets you take a generalized base code, and with the knowledge coming from runtime about what the workload is, generate a more optimized piece of it with more variables filled.
We took this same philosophy, but applied it from day one. We knew what our workloads were, and went from the first principle of performance. Profile. Got our hands dirty with the initial UNets back in the SD1.4/1.5 days, making them faster by writing more efficient convolution kernels for Ampere GPUs (A100s).
Now the game's different, but the core principle is the same. We're writing efficient DiT implementations from custom MLP fusions to long context dynamic sparse attention for Blackwell chips.
If you want to be part of this amazing team doing this performance work, have a good understanding of the fundamental ML stack and can reason about performance engineering foundations, apply below fal.ai/careers/595e1e65-2fc4…
4
8
101
19,023
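The idea of taking generalized base code and generating a more specific version once runtime values are known can be illustrated with a toy example. This is only a sketch of the general specialization idea in plain Python, not how fal's kernels or any real JIT compiler work; `generic_scale_add` and `specialize` are hypothetical names invented here.

```python
from typing import Callable, List, Optional, Tuple


def generic_scale_add(
    xs: List[float], scale: float, bias: float, clamp: Optional[Tuple[float, float]]
) -> List[float]:
    """Generic path: every parameter is re-read and the clamp branch is re-checked per element."""
    out = []
    for v in xs:
        y = v * scale + bias
        if clamp is not None:
            y = max(clamp[0], min(clamp[1], y))
        out.append(y)
    return out


def specialize(
    scale: float, bias: float, clamp: Optional[Tuple[float, float]]
) -> Callable[[List[float]], List[float]]:
    """Generate a function with the known values baked in as constants and the branch resolved once."""
    expr = f"v * {scale!r} + {bias!r}"
    if clamp is not None:
        lo, hi = clamp
        expr = f"max({lo!r}, min({hi!r}, {expr}))"
    source = f"def specialized(xs):\n    return [{expr} for v in xs]\n"
    namespace: dict = {}
    exec(source, namespace)  # Compile the generated source into a real function.
    return namespace["specialized"]


fast_path = specialize(2.0, 1.0, (0.0, 5.0))
inputs = [0.0, 1.0, 2.0, 3.0]
generic_result = generic_scale_add(inputs, 2.0, 1.0, (0.0, 5.0))
specialized_result = fast_path(inputs)
```

Both paths compute the same values; the specialized one simply has fewer decisions left to make at call time, which is the property the post attributes to knowing the workload in advance.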
Announcing imgsys.org: a generative arena for text-guided open source image generation models. Like Chatbot Arena -- but more fun (because it is images)!
8
22
99
20,506
big
🚨 All @elevenlabs models are now available @fal ! 🚨
10
99
4,017
how can i convince smart people to stop writing rust and work on ML performance?
10
87
8,295
OK guys, holy shit, gemini-1.5-pro-exp-0801 might be the captioning king (much better than GPT-4o). I need to caption a couple hundred million images @google where do i get the api keys
10
1
96
9,608
2nd year of fal x Black Forest Labs x Krea Pytorch Conf afterparty. Another 🔥 event with my besties
7
5
100
9,067
can't explain how insane this is. fal models are gonna be officially available in Adobe products like express / concept. 🤯🤯🤯
✨We are excited to bring @fal models into @Adobe's ecosystem. More details on the Adobe blog:
6
3
92
4,574
i'm in love with what our product team is cooking 😍😍😍
5
5
91
6,686
📈📈📈📈📈📈(this graph is real btw)
✨ This week I visited my API provider @FAL in San Francisco
It's funny to go visit the ACTUAL physical place where you send tens of thousands of CURL requests to every day (well kinda cause they host stuff not in the office of course)
FAL is responsible for hosting ALL the AI models I use for my site Photo AI and keeping those stable, affordable and most of all VERY fast
My site Photo AI now produces over 1 million photos per month (and some videos), and every single one of those goes through FAL's infrastructure
They take the work of setting up servers with GPUs and managing them out of my hands, and on top of that for a very affordable rate so it'd be more expensive for me to do it myself
I discovered FAL the first time because they were somehow able to run a lot of the models I used (back then Stable Diffusion) ridiculously fast: where before it'd take me 45 seconds to generate a finetuned AI photo (meaning a photo of an actual person), with FAL it'd take 3 seconds! So about 10-15x faster! Insane
Their whole edge is speed and they do everything they can to optimize AI models to run faster, and it's economically smart for them too: they often charge the same $ per megapixel, so if they are able to run things faster, it costs them less GPU time, so less $ spent for them, but they can charge the same $ = more profit (but usually they then make it cheaper for me too)
One of the coolest things they did recently is built a superfast Flux trainer that runs in just 30 seconds, and at high quality. This means when users sign up to my site Photo AI, they upload their photos, and a few minutes later they're already taking AI photos of themselves
They're also really stable though and that's important. If FAL goes down, Photo AI goes down, and customers get very angry. Imagine people signing up and paying, and it doesn't work. I get about 30 new customers per day and have 2000+ active customers, if it goes down people get very angry, very fast. Luckily FAL literally almost never goes down!
I was kinda shocked how lean they are when visiting their office, it's just about 20 employees but they have a big impact, and big enterprise customers like Adobe signing up. Obviously each employee is very impactful for them to be able to run it like this. Very impressive and indie in a way
I heard it's good to align yourself with your API providers you rely on, so I've also invested in them now a bit!
They took us out for 🥟 Dim Sum and what surprised me was how international they were, it's 2 🇹🇷Turkish founders (ex-Coinbase and Amazon), and then they have people from Brazil, Venezuela, China, Bulgaria. They're in America but it's actually barely American. Maybe obvious but Silicon Valley really doesn't care where you're from or how you look, it's only about what your skills are and how you can contribute
I think the next thing for AI now is video (yesterday Veo 3 came out for example, a groundbreaking video model with speech dialogue). AI companies now are trying to enter Hollywood to try sell the promise of AI video to them which can save them a lot of money vs. recording actual footage. FAL is a provider for those video models so can provide those big media companies with affordable and reliable AI model hosting. In a way they're just addressing a tiny part of the market now with people like me, when the big market is all the companies that produce image/video content.
So yes go meet your API providers! You'll discover there's actually people behind the API endpoint! 😂
6
1
88
5,351
there is a new, potential SOTA model in imgsys.org 👀👀👀
13
9
91
21,230
"the fal guy" won the v0 challange. we made him real. say hi 👋
Congrats to @aykutkardas, winner of the Game Challenge and $1000 in v0 credits. Thanks to everyone who joined. What challenge should we run next?
6
1
88
5,114
seeing simo meet karpathy was such an amazing view. wouldnt miss it for the world
* comes to SF * goes to random dinner * @karpathy is on the next table????
2
87
5,811
🏓🏓
Really impressive how you were able to build such a strong team
7
4
91
8,064
fal’s first QuietBox ™️ was hand delivered by the @tenstorrent team! Looking forward to running some diffusion models on this fast boi (1 PFLOPs of dense bf16). Big thanks goes to @antodisanfran
4
4
86
3,534
😌HAVE BEEN WAITING AN ETERNITY TO ANNOUNCE THIS. FAL ML TEAM GETS EVEN MORE CRACKED. w/@jfischoff @_yatharthg @Gothos03 @gokayfem @chengzeyi AND now joining us @cloneofsimo.
Whatup chads, I'll be joining @FAL and lead the research effort to develop open source / proprietary models. I'll work on the boundaries of research / product, theory / practice and inference / training. I have a couple of headcounts, dm if you wanna join my team. Thx
8
3
81
6,289
you might ask why we are launching this now, when the model is clearly undertrained, instead of waiting for a few months? the reason is: why iterate behind closed doors and have no community feedback? what do we have to hide? let's build this model together. step by step.
Replying to @cloneofsimo @FAL
There is a lot more story to tell, but we are just releasing the intermediate checkpoint for now. What you are seeing is actually now what we have, and is merely a beta release!
7
2
75
3,713
VEO 3 LAUNCHED FIRST FOR FAL CUSTOMERS. BEFORE ANYONE ELSE :) just like Kling 2.1
🚨🚨🚨 Veo 3 by Google, world's most advanced video generation model, is now available FIRST on fal! Try it TODAY THIS IS NOT A DRILL! fal.ai/models/fal-ai/veo3/pl…
4
1
80
6,501
fal has the fastest FLUX schnell implementation, BY FAR. we worked hard for it, and happy to see it sustain load even on serverless cases. for enterprise deployments we can get up to 4x faster than this btw :)
7
7
79
11,395
🐨🐻‍❄️🤓
2
81
2,591
You asked, we answered. Promotional pricing for limited time but we are working hard to bring it down for good 🏃
Veo 2 is now $1.25 per 5 second video! Enjoy! fal.ai/models/fal-ai/veo2
6
7
77
9,381
DeepSeek v2, the AGI for the poor. why isn't it a bigger deal?
8
3
76
5,746
i am officially Kimi K2 pilled.
3
3
76
4,849
We at @fal are sponsoring PyTorch Conference 2024! We are big users of torch here, and are continuing to push the frontiers of the provided tooling with PT2
6
4
77
17,823
dating profile update: works at the hottest ai startup in town. only wants to talk about ai inference.
5
3
75
29,646
everyone is waiting for the next GPT, but they dont know what is coming up today :))))
7
75
8,399
i am the fal guy
7
1
76
4,349
Last month I promised if our tweet got 200 likes, we’d make whisper 20% faster (and we did it 120% faster while halving the price and maintaining WER). Now I am saying I’ll double the performance of whisper v3 if we get 2000 likes. You go figure what will happen :)
Did you mean to say one of the fastest? Because i can find someone who is even faster :)
2
9
77
7,525
come work with @maharshii, @cloneofsimo, @noahsolomon and every single other cracked eng (which is everyone else) we have in the team fal.ai/careers
i have learnt more in these 3-4 months than i have in the entire past year. everything from advanced low level gpu kernel structures to high level torch distributed quirks. just being around cracked people makes you cracked inevitably.
3
3
78
20,863
Just turned off our last 6 EC2 instances -- Moondream Cloud inference is now running 100% on FAL
1
2
76
6,996
🫡@a16z new media team
secrets of new media (from internal deck) in particular, building channel bibles, inhouse world class creatives (hire well), then forward deploy into portcos
1
3
76
12,380
hey @OfficialLoganK. how can we get into "Image generation (early access/allowlist)" of Gemini 2.0?
6
74
4,694
this post is definitely not sponsored by @dylan522p
1
8
71
10,163
RELEASE YOUR MODELS AS OPEN SOURCE PEOPLE!!!!!!
8
4
72
9,774
LETS FUCKING GO MAN. LETS FUCKING GO. THIS IS JUST THE START, AURAGAN, AURAFLOW. If no one is releasing models, WE WILL. WHO CARES, it might not beat the closed source SOTA versions, but it will be our own and we'll keep developing it!!!
So much of the credit goes to @isidentical who did loads of technical works from data to infra management to make this happen. I dont know how to coordinate NFS amongst nodes nor preprocess massive dataset with ray... This model would have not been here without him, and btw this guy is just cracked af!! 😆😆
2
2
72
3,114
Today i tried bolt, lovable and v0. And the only one that worked (actually worked much better than I thought) was v0. The difference is huge. This is coming from someone who has no idea about web dev, i just shared as much context as possible about fal APIs and it built everything around it!
11
4
74
6,534
i am sorry to inform you but clip, in fact, does not work. it is a deeply flawed model. sorry to be the one telling you this
its a miracle that CLIP actually works btw
7
1
74
20,454
it is also dirt cheap
74
10,165
Hiring for one of the highest-impact roles we have opened so far @fal. You'll work very closely with our elite ML perf team and me, and be a technical trend setter. Shoot me a DM if this sounds like you or someone you know. featuresandlabels.notion.sit… On-site in SF
1
12
75
9,987
My first M&A. Got to know the team as a fal customer, put an angel check in and decided they’d be an amazing fit building tools for developers, so asked them to join fal! Always on the lookout for more technical teams in SF. Hit me up
We’re thrilled to announce that fal has acquired @Remade_AI, a YC-backed startup building advanced tools for generative media creation. The Remade team has pushed the boundaries of AI creativity, from building an intelligent canvas for marketing teams, to open-sourcing video LoRAs used by thousands of creators and developers. They are joining fal to help accelerate the next generation of creative infrastructure for developers around the world. Welcome to the team @BlendiByl, @Christos_antono, @rehan_shei and @alexgmatt. It’s amazing to have you on board!
7
4
73
7,073
I never thought I’d get the honor of getting a drink named after me. Thank you @nina_gerov @gorkem @burkaygur ❤️
4
1
74
3,179