DeepMind during the day, modular synth guy at night: nerodiseppia.ch

Zürich, Switzerland
Glad we launched => aistudio.google.com
76
280
4,565
459,501
Very happy to finally show our latest work: VCT, a transformer for video compression 🤖 We radically simplify the video compression setup by getting rid of motion prediction, warping, and residuals. Instead, a transformer learns to model temporal redundancies purely from data:
10
77
509
Check out our new paper on learned lossless compression: 30% smaller images than PNG, using a fully parallel probabilistic model that is orders of magnitude faster than PixelCNN! Oral @cvpr2019 w/ @etagust, @mtschannen, R. Timofte, L. Van Gool, @ETH_en. github.com/fab-jul/L3C-PyTor…
4
120
365
New paper! Using GANs, we obtain a generative image compression system that synthesizes fine details which are expensive to store. Our method remains visually close to the inputs and outperforms previous approaches -- Check out the demo at hific.github.io! (1/4)
8
94
346
🎉🎉🎉 Say goodbye to the VQ headache 🤕 - in our latest paper, we show a simple drop-in replacement for VQ based on scalar quantization that does not need any of the auxiliary losses and tricks from the VQ literature!
17
44
351
67,113
Fun prompt to try on aistudio.google.com : "describe this image and create it again, but in anime style!"
6
18
284
20,603
1/ Excited to share that our work on masked transformers for compression has been accepted at #ICCV2023! We show two things: a) how a simple transformer based architecture can lead to state-of-the-art image compression results: arxiv.org/abs/2304.07313
2
39
213
43,261
After our recent paper on simplifying VQ (arxiv.org/abs/2309.15505) we were like "why even quantize at all?!?". And so we did just that, in 🎁 GIVT: Generative Infinite Vocabulary Transformers We use the VQGAN+Transformer style to do image generation, but without VQ! 🎉
2
37
199
23,551
Fun fact: TF and PyTorch seem to have different default leaks for their leaky ReLUs ✨ Glad papers report the alphas they use (they don't actually 😔).
6
12
193
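To make the leaky-ReLU remark above concrete, here is a small sketch in plain Python. The slopes in the table are the defaults as I understand them from the respective docs (0.01 for `torch.nn.LeakyReLU`, 0.2 for `tf.nn.leaky_relu`, 0.3 for the classic `tf.keras.layers.LeakyReLU`); treat them as assumptions and check the versions you actually use. The helper names are made up for illustration.

```python
# Default negative slopes, assumed from the library docs at the time of
# writing. Verify against your installed versions before relying on them.
DEFAULT_SLOPES = {
    "torch.nn.LeakyReLU": 0.01,
    "tf.nn.leaky_relu": 0.2,
    "tf.keras.layers.LeakyReLU": 0.3,
}


def leaky_relu(x: float, negative_slope: float) -> float:
    """Plain leaky ReLU: identity for x >= 0, scaled by the slope otherwise."""
    return x if x >= 0 else negative_slope * x


def compare_defaults(x: float) -> dict:
    """Evaluate the same input under each (assumed) framework default."""
    return {name: leaky_relu(x, slope) for name, slope in DEFAULT_SLOPES.items()}


if __name__ == "__main__":
    for name, y in compare_defaults(-1.0).items():
        print(f"{name}: f(-1.0) = {y}")
```

The same negative input differs by more than an order of magnitude across these defaults, which is why reporting the slope in a paper matters when someone ports a model between frameworks.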
Remember CycleGAN? Those were the days
17
10
186
26,144
Our paper on ✨Neural Video Compression with GANs✨ was accepted at #ECCV2022 🎉🥳. We just uploaded the final version to arXiv, check it out: arxiv.org/abs/2107.12038
2
14
150
Sick party @eccvconf
3
7
131
Excited to start my summer internship at @Qualcomm Amsterdam today! Time to take the bike out and do some compression research 🎉
5
1
111
0/0 CVPR papers accepted 🤙
1
4
104
I defended in January, so I was surprised to get a letter from my uni today: my dissertation was awarded the 🎖ETH medal for outstanding doctoral theses🎖. I’m stoked, didn’t expect this! Thanks Luc, and @CVL_ETH 🎉🥳
6
1
110
A pretty PhD thesis arrived ☁️ 🥰
1
1
100
Best debugging trick in terms of bang for the buck: overfit on one batch - or even one image! Super simple to set up & you see: Does the loss go to 0? Do reconstructions make sense? Etc. Saved me many hours of grief. (source: karpathy.github.io/2019/04/2…)
1
23
96
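A minimal sketch of the overfit-on-one-batch check described above, using a toy linear model in plain Python so it runs anywhere. The model, data, and function name are hypothetical stand-ins; in practice you would swap in your real model and a single real batch.

```python
import random


def overfit_one_batch(steps: int = 2000, lr: float = 0.1, seed: int = 0):
    """Train y = w*x + b on ONE fixed batch and return the loss history.

    If the training code is correct, the loss on this single batch should
    go to (almost) zero. If it plateaus, suspect a bug in the loss,
    the gradients, or the data pipeline.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # the one batch
    ys = [3.0 * x + 0.5 for x in xs]                 # noiseless toy targets
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        errs = [(w * x + b) - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errs) / n)  # mean squared error
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return losses, (w, b)


if __name__ == "__main__":
    losses, (w, b) = overfit_one_batch()
    print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.2e}, w={w:.3f}, b={b:.3f}")
```

The two questions from the tweet map directly onto the outputs: the loss history answers "does the loss go to 0?", and the recovered parameters play the role of "do reconstructions make sense?".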
📢📢📢 New paper: "Towards Generative Video Compression". We present a GAN-based neural video compression system that is comparable to HEVC visually, and outperforms previous work that does not use GANs. Check it out on arxiv: arxiv.org/abs/2107.12038
2
25
89
We will present our compression paper "High-Fidelity Generative Image Compression" as an Oral at NeurIPS 🥳🥳🥳 When: Wednesday Dec 9th, Orals Track 15 (15:00 CET), Poster Session 4 (18:00 CET) Open-sourced training code, colab, and demo: hific.github.io
4
27
91
Going to present this in Vienna in May :) See you there? #ICLR2024
🎉🎉🎉 Say goodbye to the VQ headache 🤕 - in our latest paper, we show a simple drop-in replacement for VQ based on scalar quantization that does not need any of the auxiliary losses and tricks from the VQ literature!
4
7
83
9,197
Did a level up 🕹👾🎉
6
76
Impending CVPR deadline — time to post the old but gold „Small Guide to Making Nice Tables“ people.inf.ethz.ch/markusp/t…. TLDR: ditch all vertical lines. ✨Reviewers will thank you ✨
1
6
73
Happy to announce that I successfully defended my PhD 🎉 🥳🍾 Thanks so much to everyone helping me get here, it was a great ride!!
6
1
70
Will be presented at #NeurIPS! 🎉🎉🎉
Very happy to finally show our latest work: VCT, a transformer for video compression 🤖 We radically simplify the video compression setup by getting rid of motion prediction, warping, and residuals. Instead, a transformer learns to model temporal redundancies purely from data:
5
5
63
Chilling more
10
4
63
4,673
Thanks everyone for stopping by the poster, was a lot of fun 😌 (now just need to find my voice again😅)
3
2
58
4,958
I‘m now officially a Doctor 👨‍🎓 feels good.
4
52
I had a paper in my @eccvconf stack that plagiarized two of my own papers with verbatim text copies. Told the AC but never heard back, paper is still in CMT!! imo paper should be removed right away and authors should be blocked from submitting again.
3
1
50
📢We are releasing our arithmetic coder for @PyTorch as a stand-alone library. This is a generalized version of the code we used for our lossless compression papers, based on C++ for 🔥speed🔥. Wished this was available when I started my PhD :) github.com/fab-jul/torchac
11
54
📢📢 Hyped to share what I've been working on this year -- here is a quick demo of some truly native image outputs in Gemini 2.0 🦋 Here the model generates images in-between the text it outputs, and responds to some follow up editing requests 🚀🚀
3
6
50
5,398
Thanks for stopping by everyone :) It was really great to hear all the nice words from ppl who already apply this method and build on it! #ICLR2024
2
48
3,474
I'll present our #CVPR paper "Learning Better Lossless Compression Using Lossy Compression" today at 12:00 and 24:00 PST, session 2.2! Arxiv: arxiv.org/abs/2003.10184. (1/4)
1
14
49
Editing with Gemini 2.0 native image gen is AMAZING.
"How is native image gen better than current models?"
1
43
5,061
Pixel perfect editing 👌
Gemini can generate pretty consistent gif animations too: 'Create an animation by generating multiple frames, showing a seed growing into a plant and then blooming into a flower, in a pixel art style'
1
2
37
3,218
Moment of appreciation for whoever decided to make jax.numpy the main API for interacting with #jax. it's just so beautiful, in many ways.
1
38
33,038
Please check out the preprint on arxiv: arxiv.org/abs/2206.07307
1
5
36
🚀 🚀 🚀 Seeing SHIPS everywhere!!! Model is out, check out at developers.googleblog.com/en…
3
4
39
2,273
super happy to be part of this :)
Gemini 2.0 Flash has native image outputs! Congrats to the awesome team that built it. I find the example at 1:15 super cool: to change the car's color and add beach gear, the model generates two images step-by-step using visual chain of thought. piped.video/watch?v=7RqFLp0T…
1
38
3,777
Hopping on a train soon to present this at #ICLR2024 in Vienna!! See you there :) I’ll also be at the Google Booth on Tuesday ✨
🎉🎉🎉 Say goodbye to the VQ headache 🤕 - in our latest paper, we show a simple drop-in replacement for VQ based on scalar quantization that does not need any of the auxiliary losses and tricks from the VQ literature!
3
2
36
4,421
We show that the model learns to handle motion despite having no explicit motion components. It also learns to handle videos that previous approaches struggle with, such as fading, showing the benefit of removing hand-crafted components.
1
3
36
Will be at CVPR'23 to present this 🎉🐟 Often, when people see generative neural compression, they worry about hallucinating arbitrary reconstructions. We show how you can get both a low error reconstruction AND a generative reconstruction *from a single latent*.
Excited to share our work on Multi-Realism Image Compression, which we'll present at upcoming CVPR! arxiv.org/pdf/2212.13824.pdf Generative compression produces synthetic details which look realistic. But what if we want to stay close to the original? Let's do both! (1/2)
1
33
4,941
Name updated ⬇️
Based on the feedback about the difficult to find 2.0 Flash Experimental Image Generation 🖼️, we updated the model list (moved it higher up) and the name (made it descriptive). Many more improvements in the works landing early next week, hang tight!
1
29
2,449
Code to build this model is now available at goo.gle/vct-paper! 🤖⌨️
Very happy to finally show our latest work: VCT, a transformer for video compression 🤖 We radically simplify the video compression setup by getting rid of motion prediction, warping, and residuals. Instead, a transformer learns to model temporal redundancies purely from data:
1
7
30
Might be in the minority but some_array[..., np.newaxis, :] is much better than np.expand_dims(some_array, -2) (works also in pytorch, TF, jax)
5
2
29
3,992
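A quick check of the indexing claim above, assuming NumPy is installed. Both spellings insert a new axis in the second-to-last position; the function name is only for illustration.

```python
import numpy as np


def insert_axis_both_ways(shape):
    """Insert a length-1 axis before the last axis, two equivalent ways."""
    some_array = np.zeros(shape)
    via_index = some_array[..., np.newaxis, :]    # the spelling preferred above
    via_expand = np.expand_dims(some_array, -2)   # the functional spelling
    return via_index.shape, via_expand.shape


if __name__ == "__main__":
    print(insert_axis_both_ways((2, 3, 4)))
```

The indexing form shows where the new axis lands relative to the existing ones, which is arguably why it reads more easily than a negative axis argument.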
📢 Check out our work on lossy (generative) compression with unconditional diffusion models. As you can see in the GIF, we communicate a corrupted version of the image, and then denoise on the receiver side 🎇 − No GAN, no LPIPS, no separate entropy model required!
Excited to share our latest work describing a new approach to lossy compression: ✨DiffC✨ sends an image corrupted by Gaussian noise which is then denoised by the receiver. Surprisingly, this simple approach works well despite operating in pixel space...
2
25
Presenting now! Come by 164 in the PM session @CVPR !! 🍌
1
1
23
1,627
Text rendering of our model is just SO SICK!! Go check it out
Native image generation & editing with Gemini 2.0 Flash is out! Check this out -- developers.googleblog.com/en…
1
1
26
1,856
Happening today at #ICLR2024, Poster #39 at 4.30 PM! Come by :)
🎉🎉🎉 Say goodbye to the VQ headache 🤕 - in our latest paper, we show a simple drop-in replacement for VQ based on scalar quantization that does not need any of the auxiliary losses and tricks from the VQ literature!
3
22
2,466
Check out our new paper on applying diffusion models to high-resolution 🗜️ image compression 🗜️. The model is outperforming previous GAN based state-of-the-art in terms of FID and synthesizes really fine grained detail.
New paper on high-resolution image compression with diffusion models, achieving SoTA results in terms of FID: arxiv.org/abs/2305.18231
3
25
2,849
The GIVT that keeps giving. We scaled up our soft token generation transformer and improved the GMM. Also there is 👨🏽‍💻 CODE 🧑‍💻. Check it out ✨
We just released a big 🎁GIVT update! 📈 Larger models and improved image generation results across the board 💡 Improved GMM formulation and adapter module 💻 Code, model checkpoints, and a colab are now available at github.com/google-research/b… More details below... 1/
1
1
24
2,065
I'll present this work on Thursday in New Orleans at #NeurIPS2022 !! See u there ☀️👾
Very happy to finally show our latest work: VCT, a transformer for video compression 🤖 We radically simplify the video compression setup by getting rid of motion prediction, warping, and residuals. Instead, a transformer learns to model temporal redundancies purely from data:
1
5
24
so good!
Cats made from different food items with Gemini Flash's native image gen 🧵
23
1,269
Excited to be mentioned in the latest @RSIPvision issue 🥳😃 Check it out here: rsipvision.com/ComputerVisio…
1
1
22
Yay, I will be going to #ICCV2023 😍🗼
The list of paper IDs for accepted #ICCV2023 papers is now available at the following link: drive.google.com/file/d/1t0X…
21
2,919
At this point I feel like I can draw essentially anything in matplotlib. Just don't look at the code though 🍝 #CVPR2023
2
19
💼 Packing for #ICCV2023 to present the M2T paper :) See you in Paris?
1/ Excited to share that our work on masked transformers for compression has been accepted at #ICCV2023! We show two things: a) how a simple transformer based architecture can lead to state-of-the-art image compression results: arxiv.org/abs/2304.07313
5
21
2,916
This is joint work with @george_toderici, @minnend, Sung-Jin Hwang, @skprat, @MarioLucic_, and @etagust. Thanks for an awesome collaboration 🎉
1
18
Bonus pic of me in the lab at 30°C - note the water cooled feet
1
17
Check out the paper on arxiv: arxiv.org/abs/2309.15505 This has been a collaboration with @minnend, @etagust, and @mtschannen
1
18
2,021
NeurIPS: you get 9 pages Authors: wait until you see my 49 page supplementary
1
17
1,510
Replying to @CVPR
Surprised to see the staff not wearing them… They get into contact with everyone!
17
check out this amazing tool from Kaushik, using our new Gemini with Native Image Out to render any text you want 🔥🔥🔥 Gonna use this for album art from now on.
Made a little web app ✨ 🎨 Create images containing words, phrases, or sentences using tinyurl.com/word-visualizer! ⚡️ Powered by Gemini 2.0 Flash Experimental's native image gen + a self-correction loop to try again if it fails
1
1
16
1,463
Hey @CVPR when can we expect an update on whether the conference will be in person?
1
16
Was great to chat about compression at #NeurIPS in person again✨ I‘ll miss the Suburbans and the fairy trees
2
15
Zurich last night.
2
13
Replying to @leastsquared_
you need to select "Image and Text" on the right mate
15
1,869
Replying to @gusthema
Yes, code will appear here eventually: github.com/google-research/g…! Stay tuned on that side.
1
15
It is happening😻
Native image editing with Gemini is now starting to roll out to all users in the @GeminiApp!!
15
896
We replace VQ with FSQ in various tasks and architectures, and obtain comparable reconstruction quality and metrics.
1
12
2,108
Was in Germany yesterday to talk about compression and eat pretzels 🥨 Thanks to fraiburg.ai and @jkhfranke for hosting this, it was very exciting!
1
13
1,141
Nice place for a conference :) #ICCV2023
14
1,439
As requested by many, we added the specifics of the (continuous) Gaussians-based loss to the GIVT paper - see arxiv.org/abs/2312.02116 - snippet below 👨‍💻 Code is also in the works for y'all, so stay tuned 👨‍💻
1
12
636
Great progress on the neural compression front by overfitting to examples! Check out the paper ⬇️
We build neural codecs from a *single* image or video, achieving compression performance close to SOTA models trained on large datasets, while requiring ~100x fewer FLOPs for decoding ⚡ #CVPR2024 c3-neural-compression.github…
1
13
1,335
I‘ll present our paper on learned lossless compression today @CVPR. 13.30 in Terrace Theater, oral and poster #1!
Check out our new paper on learned lossless compression: 30% smaller images than PNG, using a fully parallel probabilistic model that is orders of magnitude faster than PixelCNN! Oral @cvpr2019 w/ @etagust, @mtschannen, R. Timofte, L. Van Gool, @ETH_en. github.com/fab-jul/L3C-PyTor…
2
11
I’ll present our paper on extreme image compression today 10.30 at #iccv2019! Come by Poster 23. We compress 1024x512px images to only 2kB, with superior visual quality compared to prior approaches at double the size! arXiv: arxiv.org/pdf/1804.02958.pdf
3
12
This is joint work with @george_toderici, @mtschannen, and @etagust, done while I was interning at Google! Thanks for an awesome collaboration. (4/4)
1
12
the world if CMT would show @CVPR rebuttal/reviewer info in a concise overview. Imagine if the rebuttal PDF would be inlined! I now have to open four windows (reviews, rebuttal PDF, paper PDF, discussion) whenever someone posts a discussion.
1
11
🧑‍🏫⏩📁 Check out this nice summary of M2T! I'll present this work in the Wednesday afternoon session at #ICCV2023 👇👇👇
How to combine the strengths of masked and autoregressive sequence models? M2T is one way to do this, including 🧑‍🏫 teacher-forced training ⏩ predicting multiple tokens per inference step according to a schedule 📁 activation caching #ICCV2023 1/4
11
983
Just chilling
11
597
Zurich in winter in a nutshell ☁️
9
Tasteless joke. Literally nobody is "feeling it"
Who's feeling it? ❤️
2
10
2,437
And a lot of other folks for this revision :) Special shout out to @achinksinghal @a7b2_3 @robertriachi @nainar92 for the fast turn around 🚀
10
277
Replying to @BlackHC
It’s plain quantization with STE, but if you bound each of the d channels to L values, you get an implicit codebook with L^d entries. By choosing L and d such that L^d matches VQ codebook sizes, we show that you can use it as a drop-in in vision tasks. That's the contribution :) (not the SQ)
1
9
1,340
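The reply above can be sketched roughly as follows, in plain Python and forward pass only. Assumptions that are mine rather than the tweet's: the bounding uses `tanh`, the number of levels L is odd (so the integer grid is symmetric around zero), and the helper names are hypothetical. The straight-through estimator (STE) only affects gradients, so it is not shown here.

```python
import math
from itertools import product


def fsq_quantize(z, levels):
    """Bound each channel to `levels` integer values, then round.

    Assumes odd `levels`; tanh is one possible bounding function.
    """
    if levels % 2 != 1:
        raise ValueError("this sketch only handles an odd number of levels")
    half = (levels - 1) // 2
    return [round(half * math.tanh(v)) for v in z]


def code_index(q, levels):
    """Map a quantized vector to its index in the implicit codebook."""
    half = (levels - 1) // 2
    idx = 0
    for v in q:
        idx = idx * levels + (v + half)  # base-`levels` positional encoding
    return idx


def implicit_codebook_size(levels, d):
    """L values per channel over d channels gives L**d distinct codes."""
    return levels ** d


if __name__ == "__main__":
    q = fsq_quantize([10.0, -10.0, 0.1], levels=5)
    print(q, code_index(q, 5), implicit_codebook_size(5, 3))
    # Every combination of per-channel values maps to a distinct index.
    all_codes = {code_index(c, 5) for c in product(range(-2, 3), repeat=3)}
    print(len(all_codes))
```

Picking L and d so that L**d lands near a typical VQ codebook size is the drop-in idea from the reply; no codebook is learned or stored, it is implied by the grid.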
4/ This has been a great collab with @etagust and @mtschannen. Looking forward to sharing more @ICCVConference, in the meantime check out arxiv.org/abs/2304.07313
2
2
10
1,557
Everyone in ML should read about ✨ Broadcasting ✨. The tl;dr: array shapes are compared from the right. Any dim that is the same in both is compatible. 1 is always compatible, and missing dims on the left are implicitly one. numpy.org/doc/stable/user/ba…
1
8
824
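The broadcasting tl;dr above fits in a few lines of plain Python. This is a sketch of the shape rule only, not NumPy's implementation, and the function name is made up for illustration.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Compute the broadcast result shape of two shapes.

    Shapes are compared from the right; equal dims are compatible,
    1 is compatible with anything, and missing dims on the left count as 1.
    """
    out = []
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
    return tuple(reversed(out))


if __name__ == "__main__":
    print(broadcast_shape((8, 1, 6, 1), (7, 1, 5)))
    print(broadcast_shape((5, 4), (1,)))
```

A typical use is adding a per-channel bias of shape (C,) to a batch of shape (N, H, W, C): the missing left dims are treated as 1 and the trailing C matches.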
Reviving an old hobby: Blender. The program evolved A LOT since I last played with it, loving it. The Eevee renderer is mindblowingly fast. This is a WIP:
10
as the number of submissions rises, the rooms get smaller it seems #iccv2023
1
8
951
We quantitatively evaluate with the perceptual scores FID, KID, NIQE, in addition to PSNR, MS-SSIM, and LPIPS. We find that no metric perfectly predicts the ranking of the user study, but that FID and KID are useful in guiding model development. (3/4)
1
9
I'll present this in Tel Aviv at @eccvconf on Wednesday afternoon, stop by!
Our paper on ✨Neural Video Compression with GANs✨ was accepted at #ECCV2022 🎉🥳. We just uploaded the final version to arXiv, check it out: arxiv.org/abs/2107.12038
1
8
to calm down from the LLM hype it can help to twist the knobs of some (mostly) analog circuits
1
9
761
Completed 1/6 CVPR reviews. Noticed that figure captions were very short and light on details. ✨Unsolicited tip: Make your figure captions tell the story! ✨ It’s what people read first, sometimes before the abstract.
7
Check out the paper here: arxiv.org/abs/2312.02116 This was a great collaboration with @mtschannen and @CianEastwood. Stay tuned for code!
8
742
I‘m at the @Google Booth on Tuesday 10:30 at #CVPR2022 — come by and say hi 🙋‍♂️
8
Replying to @internetope
yeah somehow it is better! little details look cooler
2
221