apolinario (poli) · Nov 5, 2025 · 8:52 PM UTC

apolinario (poli)

@multimodalart

5 Nov 2025

Qwen Image Multiple Angles LoRA is an exquisitely trained LoRA! 📐˚₊‧꒰ა Keeps characters and scenes consistent, and flies the camera around! Open source got there! One of the best LoRAs I've come across lately 🙌

256

2,225

121,495

apolinario (poli) · Feb 7, 2025 · 10:57 AM UTC

apolinario (poli)

@multimodalart

7 Feb 2025

Boring Reality LoRA just dropped for HunyuanVideo 🏙️🏞️ A fine-tune that led not to cinematic shots, but to something that could've come out of your phone 📱

149

1,615

407,681

apolinario (poli) · Sep 25, 2024 · 9:54 PM UTC

apolinario (poli)

@multimodalart

25 Sep 2024

testing out the Diffusers Image Fill demo capabilities on a random image

136

1,164

274,355

apolinario (poli) · May 23, 2022 · 9:54 PM UTC

apolinario (poli)

@multimodalart

23 May 2022

Google just announced a DALLE-2-like model: Imagen For now no code, just demo site: gweb-research-imagen.appspot… And paper: gweb-research-imagen.appspot…

128

891

apolinario (poli) · Jun 16, 2025 · 11:54 AM UTC

apolinario (poli)

@multimodalart

16 Jun 2025

Hunyuan-3D-2.1 image-to-3D is now out! ✨ Open weights, permissively licensed 🔓 2.1 improves on 2.0 by a LOT in generating high quality textures for the 3D assets 🔥 This level of detail from a single image 🖤💎

104

872

49,901

apolinario (poli) · Nov 13, 2025 · 9:30 AM UTC

apolinario (poli)

@multimodalart

13 Nov 2025

Apply Texture Qwen Image Edit LoRA by tarn59 works with EVERYTHING! 👉🪵🧶, this model trains so well I've built this demo so you can apply *any* texture to *any* object on @huggingface

124

870

67,704

apolinario (poli) · Dec 1, 2022 · 10:19 AM UTC

apolinario (poli)

@multimodalart

1 Dec 2022

I hacked @huggingface Spaces to build an open source @gradio Dreambooth Training UI that allows you to train a model for less than US$0.80 🐱‍💻 (you can also use it locally for free): huggingface.co/spaces/multim…

105

793

apolinario (poli) · Apr 22, 2024 · 8:56 PM UTC

apolinario (poli)

@multimodalart

22 Apr 2024

My favorite part is that it works really well with out-of-the-distribution garments

apolinario (poli)

@multimodalart

22 Apr 2024

Testing out the new virtual try-on pipeline on @huggingface, IDM-VTON ▶️ huggingface.co/spaces/yisol/…

783

86,310

apolinario (poli) · Sep 27, 2024 · 8:26 PM UTC

apolinario (poli)

@multimodalart

27 Sep 2024

Editing facial expressions in real time now on @huggingface Spaces 👨‍🎤🔀 A Grog converted Cog image to Gradio running a ComfyUI backend - magic of open source 🤝 ▶️ huggingface.co/spaces/fffilo…

129

768

71,702

apolinario (poli) · Aug 15, 2024 · 7:33 PM UTC

apolinario (poli)

@multimodalart

15 Aug 2024

Releasing my first FLUX LoRA: FLUX Tarot v1! 🌙🧙‍♀️🃏 Based on the Rider-Waite 1920 tarot (public domain) Model and demo: huggingface.co/multimodalart… Image & Caption Dataset: huggingface.co/datasets/mult…

701

70,781

apolinario (poli) · Dec 2, 2024 · 9:26 AM UTC

apolinario (poli)

@multimodalart

2 Dec 2024

outpainting with the new FLUX-1[dev] Fill model is just on a completely new level 🪼 i've built a Space for you to try it👇

668

90,730

apolinario (poli) · Aug 11, 2025 · 6:58 AM UTC

apolinario (poli)

@multimodalart

11 Aug 2025

yes! qwen-image excels at following precise rule breaking instructions very few models can do things like "a fried egg with a blue yolk"

fofr

@fofrAI

10 Aug 2025

A powerful image model knows when/how to break the rules. > a photo of an eye with three separate pupils qwen-image

172

662

3,068,683

apolinario (poli) · Jul 4, 2025 · 7:39 PM UTC

apolinario (poli)

@multimodalart

4 Jul 2025

Introducing Kontext Relight! 💡 ✨ A FLUX Kontext Relight LoRA + demo trained for state-of-the art relighting for subjects & landscapes

665

76,065

apolinario (poli) · May 15, 2025 · 8:29 AM UTC

apolinario (poli)

@multimodalart

15 May 2025

rejoiced by the rebirth of the skeuomorphic isomorphic 3D icons on @Airbnb I've trained a FLUX LoRA to generate 3D icons in that style

651

475,877

apolinario (poli) · Aug 28, 2025 · 7:24 PM UTC

apolinario (poli)

@multimodalart

28 Aug 2025

open source nano banana? bytedance just dropped USO, an open source editing model that... just works

640

84,405

apolinario (poli) · Sep 24, 2024 · 1:52 AM UTC

apolinario (poli)

@multimodalart

24 Sep 2024

reminder for flux: prompting is case-sensitive 𝙰𝚊 left: Mark Zuckerberg eating pasta right: mark zuckerberg eating pasta same seed

616

103,894

apolinario (poli) · Aug 30, 2022 · 12:53 PM UTC

apolinario (poli)

@multimodalart

30 Aug 2022

1 week of Stable Diffusion A creative explosion is unfolding with Stable Diffusion, showing the power of open source as state of the art! We curated 23+ applications this week: new features, workflow integrations, UIs; run on Win, CPU, AMD, M1 and more! multimodal.art/news/1-week-o…

136

595

apolinario (poli) · Oct 21, 2022 · 10:27 AM UTC

apolinario (poli)

@multimodalart

21 Oct 2022

After some, uh, developments yesterday: - Stable Diffusion v1-5 is out by @runwayml - Fine-tuned image decoder (VAE) out by @StabilityAI Magic of open source🧙 collaboration continues no matter what, here's the Best Available Stable Diffusion™ notebook: colab.research.google.com/dr…

601

apolinario (poli) · May 9, 2025 · 10:02 AM UTC

apolinario (poli)

@multimodalart

9 May 2025

ok, is this it?! I just tested the DreamO demo and it's a framework for FLUX that can kind of do it all 🪄🔧 👨‍🎤 A really good FLUX ID preservation reference 🎨 IP Adapter for composition and style 👗 Virtual Try-on ... and more!

612

66,701

apolinario (poli) · Apr 4, 2022 · 3:17 PM UTC

apolinario (poli)

@multimodalart

4 Apr 2022

Very exciting 'breaking' news! CompVis (research group behind VQGAN) have just released a new 1.45B parameter model to its Latent Diffusion model: github.com/CompVis/latent-di… From the released image it seems like it has an unprecedented text-synthesis capacity. More to follow soon

115

587

apolinario (poli) · Jul 5, 2025 · 10:43 AM UTC

apolinario (poli)

@multimodalart

5 Jul 2025

John D. Pope 🦒@johndpope

5 Jul 2025

Replying to @multimodalart

Just redo every famous landmark in existence - could watch this all day.

593

42,441

apolinario (poli) · Jan 10, 2025 · 2:29 PM UTC

apolinario (poli)

@multimodalart

10 Jan 2025

GANs are so back?! Scientists from Brown and Cornell have published a paper with a ✨ modern architecture GAN ✨ that is 🗿 stable to train 🗿 and competitive with SOTA GANs and even diffusion models Paper and demo 👇

580

64,162

apolinario (poli) · Aug 11, 2025 · 8:02 PM UTC

apolinario (poli)

@multimodalart

11 Aug 2025

ok i can't take it anymore: announcing the chatgpt image yellow tint corrector a @huggingface space that runs locally on your browser to fix the yellow tint of the chatgpt generated images

Sheryl Hsu

@SherylHsu02

11 Aug 2025

1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻

538

99,487

apolinario (poli) · Feb 26, 2025 · 8:12 PM UTC

apolinario (poli)

@multimodalart

26 Feb 2025

LLaDA (the first Large Language Diffusion Model) is *just* out 💥 and I've built a demo, try out now 👨‍💻 It's mesmerizing to watch the diffusion process 🌀, and it being a diffusion model gives you superpowers like "the 4th word has to be pineapple" 🦸 Demo and weights 👇

513

82,626

apolinario (poli) · Sep 19, 2023 · 8:55 PM UTC

apolinario (poli)

@multimodalart

19 Sep 2023

Thanks @angrypenguinPNG for merging my PR to add high resolution to the Illusion Diffusion Space 📺🌀 It's now as fast, double the resolution and has crispy details - go play ▶️ huggingface.co/spaces/AP123/…

500

69,280

apolinario (poli) · Oct 27, 2024 · 12:53 PM UTC

apolinario (poli)

@multimodalart

27 Oct 2024

IC-Light v2 was just released by @lvminzhang 🔦, now runs on FLUX, and it is the best relighting tool in the world 🌐, just like that Try out the official demo ✨📣 huggingface.co/spaces/lllyas…

511

57,139

apolinario (poli) · Jun 22, 2022 · 6:14 PM UTC

apolinario (poli)

@multimodalart

22 Jun 2022

Google just announced "Parti" - a text-to-image model co-developed with "Imagen" "Parti" doesn't use diffusion models - rather it scales up Transformer + VQGAN architectures like DALL-E 1 and its open source replicas (dalle-pytorch, ruDALLE, DALL-E Mini) parti.research.google

489

apolinario (poli) · Nov 29, 2023 · 12:00 PM UTC

apolinario (poli)

@multimodalart

29 Nov 2023

Excited to introduce LEDITS++, a novel way to edit real images with precision ✏️ - Multiple edits ✂️🔁 - Automagic free masking 🪄🎭 - 🆕 DPM-Solver fast inversion 🔀⚡ 🤗 Try it: huggingface.co/spaces/editin… 🔗 Project: leditsplusplus-project.stati… 📝 Paper huggingface.co/papers/2311.1…

107

491

131,684

apolinario (poli) · Aug 30, 2024 · 12:50 PM UTC

apolinario (poli)

@multimodalart

30 Aug 2024

FLUX.1 ai-toolkit now has an official UI 🖼️ with @Gradio With this open source UI you can 💻, locally or any cloud: - Drag and drop images 🖱️ - Caption them ✏️ (or use AI to caption 🤖) - Start training 🏃 No code/yaml needed 😌 Thanks for merging my PR @ostrisai 🔥

487

49,794

apolinario (poli) · Aug 26, 2025 · 10:15 PM UTC

apolinario (poli)

@multimodalart

26 Aug 2025

nano-banana is really good at few-shot learning for example, if I upload a single photo of my face it will always keep the same facial expression and scene but if I upload multiple photos of myself, it kind of "learns" my likeness and can do everything

495

134,340

apolinario (poli) · Jun 19, 2025 · 9:37 AM UTC

apolinario (poli)

@multimodalart

19 Jun 2025

this is not a drill 🚨, real-time open source video generation is here 🔥 Self-Forcing - a real-time video distilled model from Wan 2.1 by @Adobe is out, and they open sourced it 🐐 I've built a live real time demo on @huggingface Spaces 📹💨

482

53,161

apolinario (poli) · Aug 18, 2025 · 5:21 PM UTC

apolinario (poli)

@multimodalart

18 Aug 2025

IT'S OUT! 🚀 MoDA: Multi-modal Diffusion Architecture for Talking Head Generation finally a talking head: open source 🏋️ fast ⚡ portrait + audio-driven 🧑‍🎨🎧 with emotion control (and yes, i built an inference system + Gradio, generate in < 15s on @huggingface spaces 🤗)

469

46,856

apolinario (poli) · Mar 1, 2023 · 11:46 PM UTC

apolinario (poli)

@multimodalart

1 Mar 2023

ControlNet is cool, but what if you could have MORE control? 🤯 With MultiDiffusion Region Control you can 🎛️ draw masks ✏️ and give a specific prompt for each mask 📜 The @gradio demo is just out on @huggingface 🤗 - kudos to the author @omerbartal! huggingface.co/spaces/weizma…

423

54,479

apolinario (poli) · Sep 2, 2025 · 4:43 PM UTC

apolinario (poli)

@multimodalart

2 Sep 2025

we hacked Wan 2.2 and discovered that it does first and last frame filling, works out of the box on 🧨 diffusers i've built an app for it on @huggingface Spaces (which is powering our nano banana video mode too 🍌 🎬)

417

49,418

apolinario (poli) · Jan 2, 2024 · 9:40 PM UTC

apolinario (poli)

@multimodalart

2 Jan 2024

Less than 1 minute guide on how to train your own LoRA with LoRA Ease 🧞‍♂️⚡ Train high-quality LoRAs on objects 📦, faces 😊, styles 🎨 or characters 🧑‍🎤 effortlessly and super cheap ༄ ▶️ huggingface.co/spaces/multim…

416

90,954

apolinario (poli) · Sep 10, 2022 · 5:08 PM UTC

apolinario (poli)

@multimodalart

10 Sep 2022

It's out! 🥳 Browse visually the Stable Diffusion Concepts Library - and use more than 100+ community taught concepts in your prompt directly on the same UI! Colab with Gradio UI: colab.research.google.com/gi…

398

apolinario (poli) · Jan 3, 2024 · 3:54 PM UTC

apolinario (poli)

@multimodalart

3 Jan 2024

You can now finally create your own stock photo smiling while eating salad in seconds 👨‍🎤🥗 IP-Adapter-FaceID Plus was silently released last week - it's the first inference-time technique that really captures my likeness 🥸🦚 ▶️ huggingface.co/spaces/multim…

408

60,739

apolinario (poli) · Mar 24, 2023 · 6:17 PM UTC

apolinario (poli)

@multimodalart

24 Mar 2023

How to train your own ControlNet? 🥅 We wrote a guide, ranging from deciding which controls to use 🎛️, how to prepare your dataset, all the way to gpus going brrr 🔥 (with an unexpected trip to the uncanny valley 👀) From me and @pcuenq with ❤️ huggingface.co/blog/train-yo…

Train your ControlNet with diffusers

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

402

72,737

apolinario (poli) · May 2, 2025 · 9:05 AM UTC

apolinario (poli)

@multimodalart

2 May 2025

Whoa, I just tested the IC Edit @huggingface demo and it seems to be the new 🐐👑 of image editing. It's an image editing LoRA for FLUX featuring: 👨‍🎤 Identity preservation (beating GPT-4o) ✏️✏️ Does multiple edits 🐎 10s image editing 🌌 style support

397

45,498

apolinario (poli) · Feb 27, 2025 · 9:41 PM UTC

apolinario (poli)

@multimodalart

27 Feb 2025

The evals they didn't show you How does GPT 4.5 compare with latest non-thinking models: Sonnet 3.7 (no thinking), Deepseek V3 (not R1!), Grok 3 (no thinking)

380

77,524

apolinario (poli) · Sep 15, 2023 · 10:57 PM UTC

apolinario (poli)

@multimodalart

15 Sep 2023

Iterated with @angrypenguinPNG on some enhancements to their Illusion Diffusion Space, @MrUgleh-inspired QR ControlNet patterns 🌀 ▶️ huggingface.co/spaces/AP123/…

375

72,304

apolinario (poli) · Sep 8, 2023 · 1:52 PM UTC

apolinario (poli)

@multimodalart

8 Sep 2023

Upgraded the TokenFlow demo to an A100! And defaults changed - the edits should be ~2.5x faster huggingface.co/spaces/weizma…

364

55,250

apolinario (poli) · Sep 4, 2025 · 7:37 AM UTC

apolinario (poli)

@multimodalart

4 Sep 2025

early results for the Qwen "Boring Reality" LoRA 📸 by kudzueye the model is still experimental and work in progress 🚧

379

64,642

apolinario (poli) · Mar 15, 2023 · 3:21 PM UTC

apolinario (poli)

@multimodalart

15 Mar 2023

This was drawn by GPT-4

358

54,652

apolinario (poli) · Dec 21, 2022 · 11:32 AM UTC

apolinario (poli)

@multimodalart

21 Dec 2022

The first large scale open source DALL-E 2 replication is here🧙 Karlo is an unCLIP model trained by #KakaoBrain I'm having fun playing with it on 🤗 @huggingface Spaces: huggingface.co/spaces/kakaob… Model card: huggingface.co/kakaobrain/ka… GitHub: github.com/kakaobrain/karlo

360

59,903

apolinario (poli) · Aug 17, 2023 · 11:07 AM UTC

apolinario (poli)

@multimodalart

17 Aug 2023

Introducing LoRA the Explorer 🔎: browse the coolest SDXL LoRAs, play with them online ▶️, use locally 💿 (...and no need to dodge semi-naked waifus 🚫) Join the fun 🕺 huggingface.co/spaces/multim…

347

57,761

apolinario (poli) · Oct 13, 2022 · 5:51 PM UTC

apolinario (poli)

@multimodalart

13 Oct 2022

🧨 diffusers 0.5.0 now supports JAX for super fast #stablediffusion inference on TPUs You can generate 8 images in ~8s on Colab Free using TPU 🚀 colab.research.google.com/gi…

335

apolinario (poli) · Sep 27, 2022 · 12:23 PM UTC

apolinario (poli)

@multimodalart

27 Sep 2022

The Stable Diffusion Multi Inpainting Spaces is out! On it you can do both: Inpainting by masking the image (with the newest @Gradio masking) or inpainting with words, your choice! huggingface.co/spaces/multim…

329

apolinario (poli) · May 14, 2024 · 9:12 AM UTC

apolinario (poli)

@multimodalart

14 May 2024

The first open Stable Diffusion 3-like architecture model is JUST out 💣 - but it is not SD3! 🤔 It is HunyuanDiT by Tencent, a 1.5B parameter DiT (diffusion transformer) text-to-image model 🖼️✨ In the paper they claim to be SOTA open source! I'm working on a @huggingface demo as you read this so we can all vibe check Model: huggingface.co/Tencent-Hunyu… GitHub: github.com/Tencent/HunyuanDi… Paper: tencent.github.io/HunyuanDiT…

342

82,697

apolinario (poli) · Apr 6, 2022 · 10:44 PM UTC

apolinario (poli)

@multimodalart

6 Apr 2022

I'm super thrilled to announce that our assemble of the Latent Diffusion LAION-400M text-to-image model is now available on @huggingface🤗, democratizing even further the access to text-to-image ai art! Thank you for all the help @osanseviero! huggingface.co/spaces/multim…

336

apolinario (poli) · Jul 20, 2022 · 11:39 AM UTC

apolinario (poli)

@multimodalart

20 Jul 2022

I'm delighted to announce I've joined @huggingface as a ML Art Engineer 🤗, to help make AI art even more accessible, easy to use and to develop for! This tech is going to empower human expression and creativity in unprecedented ways - and building it openly feels the right way!

332

apolinario (poli) · Jun 26, 2025 · 5:16 PM UTC

apolinario (poli)

@multimodalart

26 Jun 2025

FLUX Kontext [dev] is out with open weights 🔥 - Inference & LoRA tuning script (yes!) with diffusers 🧨 - Official demo on @huggingface Spaces 🖼️ - 2 community demos: multi-context & LoRA explorer (ye, some FLUX[dev] loras work out of the box 🪄 )

337

20,258

apolinario (poli) · Feb 8, 2024 · 1:10 PM UTC

apolinario (poli)

@multimodalart

8 Feb 2024

Text-to-3D and Image-to-3D in 7 seconds 🤯 💨 That's LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation 🧊 And it's open source ✨ Try it ▶️ huggingface.co/spaces/ashawk…

285

32,238

apolinario (poli) · Sep 8, 2023 · 2:53 PM UTC

apolinario (poli)

@multimodalart

8 Sep 2023

ControlNets are cool, but T2I-Adapters are 94% smaller 🤏 , and way faster 💨 Today TencentARC released 6 T2I Adapters for SDXL: depth, canny, lineart, openpose, and... DOODLY! Come play: huggingface.co/spaces/Tencen…

312

42,384

apolinario (poli) · Mar 4, 2025 · 9:55 AM UTC

apolinario (poli)

@multimodalart

4 Mar 2025

FINALLY! Generate a full song with lyrics in < 20 seconds! ⚡ 🔥 DiffRhythm is ⟡ just out ⟡ an open weights end-to-end full song generation model that generates 1-2min songs in just a few seconds 🏎️💨 Give it a reference + lyrics and get a song back! Sound on! 🔊 ▶️

322

75,161

apolinario (poli) · Aug 12, 2024 · 6:02 PM UTC

apolinario (poli)

@multimodalart

12 Aug 2024

Introducing FLUX LoRA the Explorer 🧭✨ Explore, generate and download FLUX LoRAs! 🖼️ Including the popular flux-realism and the cute Frosting Lane Come over, we're just getting started 🛸 ▶️ huggingface.co/spaces/multim…

321

48,852

apolinario (poli) · Feb 16, 2023 · 11:23 PM UTC

apolinario (poli)

@multimodalart

16 Feb 2023

The MarioGPT @huggingface Spaces demo is now playable! 🕹️ Now you can play the levels you generate - hopefully you're better than me 😂 huggingface.co/spaces/multim…

305

49,695

apolinario (poli) · Oct 29, 2025 · 7:17 PM UTC

apolinario (poli)

@multimodalart

29 Oct 2025

this is so good! mid-frames are here, multi-frame to video is an easy to use workflow! kudos to @morphic for open sourcing it

Morphic

@morphic

29 Oct 2025

Morphic's frames-to-video, with up to 5 frames and time control, is now open-source. GitHub: github.com/morphicfilms/fram… Hugging Face: huggingface.co/morphic/Wan2.… More details in the thread:

305

30,230

apolinario (poli) · Nov 30, 2023 · 8:15 PM UTC

apolinario (poli)

@multimodalart

30 Nov 2023

Meta just released a new collection of their open access "Seamless" translation models 🔊 They do speech-to-text, text-to-speech, speech-to-speech, text-to-text 💬🔄📝 The Expressive model keeps speech rate, pauses and style 🗣️ 📁 Models and demos: huggingface.co/collections/f…

301

51,883

apolinario (poli) · Oct 14, 2024 · 7:32 PM UTC

apolinario (poli)

@multimodalart

14 Oct 2024

Wow! I wasn't expecting the outpainting of the new FLUX Inpainting Beta Controlnet to be this good 🤯 👇 links to try it

299

28,640

apolinario (poli) · Jul 7, 2025 · 10:39 AM UTC

apolinario (poli)

@multimodalart

7 Jul 2025

Whoa, 1000 likes in the Wan 2.1 Fast Space 🎥 💨 still feels surreal that we can generate such high quality videos so fast with open source models, and that's the slowest/worst it's ever gonna be ✨

294

38,068

apolinario (poli) · Mar 3, 2023 · 5:20 PM UTC

apolinario (poli)

@multimodalart

3 Mar 2023

The diffusers 🧨 library just did a release incorporating ControlNet, it runs so fast! 🏎️‍💨 Blog: huggingface.co/blog/controln… Colab: colab.research.google.com/gi…

285

52,526

apolinario (poli) · Sep 23, 2024 · 11:14 AM UTC

apolinario (poli)

@multimodalart

23 Sep 2024

Diffusers Outpaint now allows for infinite zoom-out with a resize input size + "use as input" button @kingnish24 🤝 @fffiloni ▶️ huggingface.co/spaces/fffilo…

284

28,493

apolinario (poli) · Nov 2, 2024 · 7:21 PM UTC

apolinario (poli)

@multimodalart

2 Nov 2024

omg, it seems recraft v3 can perform simple language model tasks 🤯 1: "this page contains the number of letters r that the word strawberry has" 2: "this page contains the result of 2+5" 3: "write 2 adjectives in english" 4: "write the name of the US president"

280

57,264

apolinario (poli) · Sep 7, 2022 · 10:36 AM UTC

apolinario (poli)

@multimodalart

7 Sep 2022

Collaborative new concepts on #StableDiffusion🎨 1. Teach Stable Diffusion new concepts 👩‍🏫(add to the public library if you wish): colab.research.google.com/gi… (or browse the library to pick one🧤 huggingface.co/sd-concepts-l…) 2. Run with the learned concepts 🖼️ colab.research.google.com/gi…

270

apolinario (poli) · Jan 21, 2025 · 10:13 AM UTC

apolinario (poli)

@multimodalart

21 Jan 2025

New model alert! 🚨 ⋆✴︎˚｡FLEX.1 Alpha ˚｡✴︎⋆ is an 8B parameter model pruned and further trained by @ostrisai from 12B FLUX.1 [schnell]: 🖼️ High quality, competitive with FLUX[dev] 🎨 Good at styles 🤏 Smol 📜 openly licensed (Apache 2.0) ⚗️ de-distilled, CFG optional

272

25,850

apolinario (poli) · Nov 24, 2022 · 1:57 AM UTC

apolinario (poli)

@multimodalart

24 Nov 2022

Stable Diffusion 2 by @StabilityAI is out with 5 new models 👽 You can now try the 768x768 model (the largest one released) on @huggingface Spaces huggingface.co/spaces/stabil…

265

apolinario (poli) · Jan 1, 2024 · 11:50 AM UTC

apolinario (poli)

@multimodalart

1 Jan 2024

Happy Public Domain day! 🎉 To celebrate Steamboat Willie finally joining the public domain, I created a @huggingface dataset with all frames of the 1928 short 🐭📜 ▶️ huggingface.co/datasets/mult…

258

56,891

apolinario (poli) · Mar 12, 2025 · 3:58 PM UTC

apolinario (poli)

@multimodalart

12 Mar 2025

whoa, @Remade_AI just dropped 8 open source video LoRA effects for Wan 2.1 on @huggingface 🤯 Squish 🥞, Cakefy 🍰, Inflate 🎈, Deflate 📉, Shooting 🔫, Rotate 🔄 and Muscle 💪 all available openly

263

56,828

apolinario (poli) · Apr 22, 2022 · 12:13 AM UTC

apolinario (poli)

@multimodalart

22 Apr 2022

Breaking news: OpenAI open sourced their CLIP ViT-L/14@336px! github.com/openai/CLIP/commi… I'll hook it soon to many generation systems, stay tuned!

ViT-L/14@336px (#234) · openai/CLIP@b4ae449

github.com

254

apolinario (poli) · May 31, 2024 · 3:16 PM UTC

apolinario (poli)

@multimodalart

31 May 2024

The official ToonCrafter demo is now available @huggingface Spaces ZeroGPU 🤯 This generative cartoon interpolation model is by far the best coherent generative interpolation model I've seen 🖼️ IMO it will change how animations are made 🪄 ▶️ huggingface.co/spaces/Doubii…

256

57,707

apolinario (poli) · Aug 5, 2025 · 6:07 PM UTC

apolinario (poli)

@multimodalart

5 Aug 2025

the gpt-oss model is really easy to tune! get started with customizing/fine-tuning to make gpt-oss your own with the @openai + @huggingface cookbook 🤝 cookbook.openai.com/articles…

261

26,334

apolinario (poli) · Apr 4, 2022 · 5:05 PM UTC

apolinario (poli)

@multimodalart

4 Apr 2022

Ok - I just quickly assembled the LAION-400M trained Latent Diffusion CFG TTI model to a Google Colab, you can try it yourself: colab.research.google.com/gi… "A mecha robot holding a sign that reads: 'This is weird'"

apolinario (poli)

@multimodalart

4 Apr 2022

239

apolinario (poli) · Sep 29, 2022 · 7:19 PM UTC

apolinario (poli)

@multimodalart

29 Sep 2022

🎅 Ho-ho-ho! Today a bunch of ICLR 2023 papers dropped! This is a conference with blind submission, authors are anonymous till review A lot of multimodal AI: text-to-video (yes, another one), text-to-3D, another 'teach-diffusion-new-concepts', text-to-audio... and more! 🧵

242

apolinario (poli) · Aug 10, 2022 · 3:48 PM UTC

apolinario (poli)

@multimodalart

10 Aug 2022

Stable Diffusion model card is up, and the weights are available for academic and research purposes first This is the first step ahead of a full public release which should be coming soon! 🤩 #StableDiffusion huggingface.co/CompVis/stabl…

CompVis/stable-diffusion · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

240

apolinario (poli) · Apr 7, 2025 · 4:51 PM UTC

apolinario (poli)

@multimodalart

7 Apr 2025

The Dream 7B (diffusion reasoning language model) is OUT! 🚨 I built a demo so you can test it out (and check the diffusion process live) 𖣯🔍

Jiacheng Ye @JiachengYe15

2 Apr 2025

🚀Excited to announce Dream 7B (Diffusion reasoning model): the most powerful open diffusion large language model to date.

ALT Figure: comparison of language models on general, math, coding, and planning tasks.

248

35,703

apolinario (poli) · Apr 8, 2022 · 10:29 PM UTC

apolinario (poli)

@multimodalart

8 Apr 2022

This week's updates were not only made of Dall-E 2! We also got: - Latent Diffusion LAION 400M (an open model!) - KNN Diffusion paper (promising new approach to text-to-image) - 3 new exciting TEXT-to-VIDEO models! and more! Check out our weekly update: multimodal.art/news/this-wee…

234

apolinario (poli) · Apr 5, 2022 · 10:48 PM UTC

apolinario (poli)

@multimodalart

5 Apr 2022

OPEN TO EVERYBODY! I optimized the Latent Diffusion LAION-400M Colab RAM usage and now it should run on free non-Pro accounts. And fast! 8 images in 20 seconds on a P4 GPU colab.research.google.com/gi… Google Drive support and VRAM optimizations by @RiversHaveWings were also added

apolinario (poli)

@multimodalart

4 Apr 2022

231

apolinario (poli) · Nov 22, 2023 · 3:04 AM UTC

apolinario (poli)

@multimodalart

22 Nov 2023

Stable Video Diffusion is an amazing (and chonky 🐼) new model by @StabilityAI - if you can't run it locally, you can now play with it on @huggingface Spaces 🤗 ▶️ huggingface.co/spaces/multim…

240

56,850

apolinario (poli) · May 22, 2022 · 9:39 AM UTC

apolinario (poli)

@multimodalart

22 May 2022

Yesterday OpenCLIP released the first LAION-2B trained perceptor! a ViT-B/32 CLIP that surpasses OpenAI's ViT-B/32 quite significantly: github.com/mlfoundations/ope…

231

apolinario (poli) · Sep 14, 2022 · 4:19 PM UTC

apolinario (poli)

@multimodalart

14 Sep 2022

And the Spaces for the Stable Diffusion Concepts Library is out! Navigate 250+ community taught objects and styles with Textual Inversion and use them in your prompts! huggingface.co/spaces/sd-con…

229

apolinario (poli) · Oct 22, 2024 · 2:13 PM UTC

apolinario (poli)

@multimodalart

22 Oct 2024

Guess who's back? Back again! 🎵 @StabilityAI is back, tell a friend 🎤 Stable Diffusion 3.5 Large is here 🔥 - 🏋️ 8B parameters - Full 💪 and 🏎️💨 4-step Turbo variant - 🧾 🤝 commercial use (for orgs below 1M year/rev) - 🧨 day-0 LoRA fine-tuning support

232

31,177

apolinario (poli) · Feb 20, 2025 · 11:16 AM UTC

apolinario (poli)

@multimodalart

20 Feb 2025

SongGen is joining YuE as an open-source text-to-music (Suno-style) model Feed it a 3s voice sample (optional) → describe your song → write the lyrics 🟰 get a song!

236

13,166

apolinario (poli) · May 6, 2022 · 11:33 AM UTC

apolinario (poli)

@multimodalart

6 May 2022

DALL-E Flow is an awesome new tool by @JinaAI_'s @hxiao Like Centipede Diffusion it is a mix of models: It generates images with DALL-E Mega, refines and creates variations with Latent Diffusion, ranks the best with CLIP and upscales the results github.com/jina-ai/dalle-flo…

GitHub - jina-ai/dalle-flow: 🌊 A Human-in-the-Loop workflow for creating HD images from text

🌊 A Human-in-the-Loop workflow for creating HD images from text - jina-ai/dalle-flow

github.com

228

apolinario (poli) · Aug 22, 2022 · 6:36 PM UTC

apolinario (poli)

@multimodalart

22 Aug 2022

Following the full open source release of Stable Diffusion, the @huggingface Spaces for it is out🤗 Stable Diffusion is a state-of-the-art text-to-image model that was released today by @StabilityAI #stablediffusion huggingface.co/spaces/stabil…

224

apolinario (poli) · Jul 14, 2025 · 12:45 AM UTC

apolinario (poli)

@multimodalart

14 Jul 2025

Replying to @repligate

Saved it here archive.ph/k18fr

this AI chatbot "Sidney" is misbehaving - Microsoft Community

archived 14 Jul 2025 00:40:07 UTC

archive.ph

222

45,621

apolinario (poli) · Jan 20, 2023 · 7:26 PM UTC

apolinario (poli)

@multimodalart

20 Jan 2023

InstructPix2Pix by Tim Brooks allows you to write natural language instructions to edit images ✏️🖼️ We are getting closer and closer to "photoshop with words"! 🎨 Play with it now on @huggingface Spaces huggingface.co/spaces/timbro…

214

27,353

apolinario (poli) · May 15, 2025 · 8:32 AM UTC

apolinario (poli)

@multimodalart

15 May 2025

Check it out here, and inference it directly on @huggingface with @fal or @replicate ✨ huggingface.co/multimodalart…

221

12,925

apolinario (poli) · Aug 31, 2025 · 8:55 PM UTC

apolinario (poli)

@multimodalart

31 Aug 2025

a mysterious new button appeared on the @huggingface Spaces Nano Banana app 👀

220

40,235

apolinario (poli) · Jan 14, 2025 · 4:13 PM UTC

apolinario (poli)

@multimodalart

14 Jan 2025

ComfyUI → @huggingface Spaces → serverless ZeroGPU ✨😌 We wrote a tutorial on how to turn any ComfyUI workflow into an easy to use Gradio app and (optionally) host it for free with ZeroGPU 💥 huggingface.co/blog/run-comf…

Run ComfyUI workflows for free with Gradio on Hugging Face Spaces

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

218

25,788

apolinario (poli) · Nov 24, 2022 · 3:39 PM UTC

apolinario (poli)

@multimodalart

24 Nov 2022

Since VQGAN+CLIP times, we've been learning to prompt with @openai CLIP knowledge (incl. SDv1, conditioned on OAI CLIP) Stable Diffusion 2 breaks that 💥 with LAION-trained CLIP, "trending on artstation", "greg rutkowski" don't work; we're all learning to prompt again! 👶

213

apolinario (poli) · Oct 31, 2024 · 5:14 PM UTC

apolinario (poli)

@multimodalart

31 Oct 2024

✨ PD12M ✨, a 12.4 million high quality image-caption dataset for AI training 🎛️, featuring: - 🤖✏️ Florence-2 synthetic captions - 🌸 Aesthetic and safety filtered from 34M superset - 🔓 only public domain images superb release by @spawning_ huggingface.co/datasets/Spaw…

Spawning/PD12M · Datasets at Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

211

21,988

apolinario (poli) · Mar 4, 2025 · 8:47 AM UTC

apolinario (poli)

@multimodalart

4 Mar 2025

Fuck yes! CogView 4 is out! 🔥🚀 New 6B parameters text-to-image model! 🧠🎨 Native 2048x2048 resolution! 🖼️🔍 Great prompt adherence for very long prompts! ✍️✨ Apache 2.0 license! 📜🔓

206

12,432

apolinario (poli) · May 24, 2024 · 10:28 AM UTC

apolinario (poli)

@multimodalart

24 May 2024

SDXL Flash 📸 is here! While SDXL LCM/Turbo/Lightning/Hyper do a great job in 1-4 steps, SDXL Flash gets uncompromised quality in 10 steps 💥 A new sweet intermediary spot to unlock use-cases 🍬 Model: huggingface.co/sd-community/… Demo: huggingface.co/spaces/KingNi…

206

23,923

apolinario (poli) · Jul 25, 2025 · 10:31 AM UTC

apolinario (poli)

@multimodalart

25 Jul 2025

It was missing, so I added @AnthropicAI Opus 4 Thinking and @OpenAI o3 benchmark results to the comparison mix chart 🆚🔎 Vibe check pending, but on benchmarks it seems that we got an open model competitive with Opus 4 / o3 / Gemini 2.5 🤯

Qwen

@Alibaba_Qwen

25 Jul 2025

🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding ✅ Better general skills: instruction following, tool use, alignment ✅ 256K native context for deep, long-form understanding 🧠 Built exclusively for thinking mode, with no need to enable it manually. The model now natively supports extended reasoning chains for maximum depth and accuracy. Hugging Face: huggingface.co/Qwen/Qwen3-23… or huggingface.co/Qwen/Qwen3-23… ModelScope: modelscope.cn/models/Qwen/Qw… or modelscope.cn/models/Qwen/Qw… API Doc: alibabacloud.com/help/en/mod…

197

59,756

apolinario (poli) · Jul 6, 2024 · 2:44 PM UTC

apolinario (poli)

@multimodalart

6 Jul 2024

💥 If SDXL was trained with LLM as a text encoder, what would happen? 🧪 Kolors is the answer 🎨 - Kwai trained (from scratch!) an SDXL-arch model with the GLM-4 LLM as the text encoder, and it's fantastic! ▶️ Demo huggingface.co/spaces/gokayg… 📁 Model huggingface.co/Kwai-Kolors/K…

193

25,415

apolinario (poli) · Nov 21, 2024 · 2:12 AM UTC

apolinario (poli)

@multimodalart

21 Nov 2024

The Logo in Context Spaces demo + 🧨 diffusers implementation is here! 🖼️🏷️ In-Context LoRA + Image-to-Image + Inpainting → allow you to apply your logos to anything huggingface.co/spaces/multim…

198

24,914

apolinario (poli) · Apr 16, 2024 · 8:57 AM UTC

apolinario (poli)

@multimodalart

16 Apr 2024

PAG (Perturbed-Attention Guidance) is not getting nearly the attention it deserves, I've adapted it to work on SDXL with diffusers 🧨 ...and it DELIVERS! 🤯 Try it here ▶️ huggingface.co/spaces/multim… thanks to KU-CVLAB researchers: Donghoon Ahn, Hyoungwon Cho et al. ❤️

OpenCV University

@OpenCVUniverse

1 Apr 2024

Recent studies reveal that the quality of samples from diffusion models relies on techniques like CG and CFG, yet these fall short in unconditional generation and tasks like image restoration. This research paper introduces Perturbed-Attention Guidance (PAG), a novel method enhancing diffusion samples in all scenarios without extra training, offering significant improvements in tasks where traditional guidances falter. Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Wooseok Jang, Jungwoo Kim, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin, Seungryong Kim Paper: paperswithcode.com/paper/sel… Repo: github.com/KU-CVLAB/Perturbe… #ai #diffusionmodels #artificialintelligence

187

49,345

apolinario (poli) · Aug 20, 2025 · 5:21 PM UTC

apolinario (poli)

@multimodalart

20 Aug 2025

Qwen Image Edit works too well with lightx2v LoRA to run with just 8 and 4 steps, wtf? in my experience, 8 steps keeps the quality of the edits at the same level as the original model, at a 12x speedup 💨 (ofc i built a demo for it)

194

28,594

apolinario (poli) · Mar 25, 2022 · 4:07 PM UTC

apolinario (poli)

@multimodalart

25 Mar 2022

MindsEye - an open source interface to 'pilot' AI art models without using code - is now available to everyone Check it out, share it around and let me know what you think! Colab: colab.research.google.com/dr… Discord: discord.gg/Np6Ec9DG Guide and FAQ: multimodal.art/mindseye

MindsEye beta.ipynb

Colaboratory notebook

colab.research.google.com

182