Carl Vondrick · Jun 27, 2018 · 5:20 PM UTC

Carl Vondrick

27 Jun 2018

Our latest work shows that learning to colorize videos causes visual tracking to emerge automatically! Blog: ai.googleblog.com/2018/06/se… Paper: arxiv.org/abs/1806.09594 @alirezafathi @kevskibombom @sguada @abhi2610

Carl Vondrick · Nov 22, 2016 · 2:25 AM UTC

Carl Vondrick @cvondrick

22 Nov 2016

Fantastic conditional GAN results by Isola et al phillipi.github.io/pix2pix/

Carl Vondrick · Mar 2, 2021 · 4:49 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

The future is hard to anticipate! In our latest #CVPR2021 paper, we introduce a framework for learning *what* is predictable in the future. Rather than committing up front to categories to predict, our approach learns how to hedge the bet. hyperfuture.cs.columbia.edu

Carl Vondrick · Dec 15, 2016 · 1:55 AM UTC

Carl Vondrick @cvondrick

15 Dec 2016

Finding Tiny Faces -- had to zoom in quite a bit to parse how cool the results are! arxiv.org/pdf/1612.04402.pdf

Carl Vondrick · Dec 19, 2020 · 7:42 PM UTC

Carl Vondrick @cvondrick

19 Dec 2020

Learning unsupervised machine translation is easier if you open your eyes! Image distributions create transitive relations between languages. This creates incidental supervision for learning multilingual representations on 50 unpaired languages arxiv.org/pdf/2012.04631.pdf @Surisdi

Carl Vondrick · Nov 29, 2018 · 4:24 AM UTC

Carl Vondrick @cvondrick

29 Nov 2018

Neural networks fooled by unusual poses arxiv.org/pdf/1811.11553.pdf

Carl Vondrick · Dec 20, 2016 · 8:36 PM UTC

Carl Vondrick @cvondrick

20 Dec 2016

Learning Features by Watching Objects Move by Pathak et al arxiv.org/pdf/1612.06370.pdf

Carl Vondrick · Apr 20, 2017 · 2:33 PM UTC

Carl Vondrick @cvondrick

20 Apr 2017

amazing generations of the video future! Red border means output, green is input. sites.google.com/a/umich.edu…

Carl Vondrick · Oct 28, 2016 · 5:42 PM UTC

Carl Vondrick @cvondrick

28 Oct 2016

SoundNet: Learning natural sound representations with convnets and 2 million unlabeled videos. web.mit.edu/vondrick/soundne…

Carl Vondrick · Dec 2, 2016 · 2:38 PM UTC

Carl Vondrick @cvondrick

2 Dec 2016

Recognizing objects and scenes from sound only. Turn on your speakers! More visualizations: projects.csail.mit.edu/sound…

Carl Vondrick · Apr 19, 2017 · 3:26 AM UTC

Carl Vondrick @cvondrick

19 Apr 2017

Unsupervised Learning by Predicting Noise by Bojanowski and Joulin. Cool yet simple idea that works quite well!! arxiv.org/pdf/1704.05310.pdf

Carl Vondrick · Jul 15, 2020 · 2:39 AM UTC

Carl Vondrick @cvondrick

15 Jul 2020

What causes adversarial examples? Latest #ECCV2020 paper from @ChengzhiM and Amogh shows that deep networks are vulnerable partly because they are trained on too few tasks. Just by increasing tasks, we strengthen robustness for each task individually. arxiv.org/pdf/2007.07236.pdf

Carl Vondrick · Jun 15, 2020 · 7:40 PM UTC

Carl Vondrick @cvondrick

15 Jun 2020

Oops! Dave+Bo introduce a dataset of unconstrained videos showing unintentional action. We study self-supervised approaches for learning video representations of intentionality. #CVPR2020 Poster 93, Tue 10am PST Website: oops.cs.columbia.edu Paper: arxiv.org/abs/1911.11206

Carl Vondrick · Jun 20, 2021 · 3:21 PM UTC

Carl Vondrick @cvondrick

20 Jun 2021

Learning from Unlabeled Video (#LUV❤️‍🔥) starts today at 1:50pm EDT / 10:50am PDT! sites.google.com/view/luv202… You will LUV the speaker lineup and the curated papers! 😍 featuring @pathak2206 @akanazawa @SongShuran and more #CVPR2021 #CVPR21 #CVPR

Carl Vondrick · Mar 23, 2017 · 6:28 PM UTC

Carl Vondrick @cvondrick

23 Mar 2017

predicting the future, with semantic segmentations! by Neverova and Luc arxiv.org/pdf/1703.07684.pdf

Carl Vondrick · Oct 18, 2016 · 4:06 AM UTC

Carl Vondrick @cvondrick

18 Oct 2016

Cross-Modal Scene Networks: learning aligned representations across several different modalities cmplaces.csail.mit.edu/

Carl Vondrick · Mar 14, 2017 · 5:31 AM UTC

Carl Vondrick @cvondrick

14 Mar 2017

CVPR workshop on negative results! negative.vision

Carl Vondrick · Mar 2, 2021 · 4:50 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

Our predictive model is hyperbolic, which naturally encodes hierarchical structure. When the model is most confident, it will predict at a concrete level of the hierarchy. But when not confident, the *mean* solution automatically selects a higher level!

Carl Vondrick · Jun 19, 2020 · 3:07 PM UTC

Carl Vondrick @cvondrick

19 Jun 2020

Learning from Unlabeled Video Workshop -- starting now! First up: Andrea Vedaldi (Oxford) on Learning Representations and Geometry from Unlabelled Videos. sites.google.com/view/luv202…

Carl Vondrick · Nov 30, 2018 · 12:13 AM UTC

Carl Vondrick @cvondrick

30 Nov 2018

Got many replies. I don't believe the problem has to do with neural nets. The problem is the paradigm of supervised classification and closed datasets. We need models that learn from an open world, with self-supervision, never stop learning, and transfer between tasks.

Carl Vondrick @cvondrick

29 Nov 2018

Neural networks fooled by unusual poses arxiv.org/pdf/1811.11553.pdf

Carl Vondrick · Jun 6, 2017 · 12:52 AM UTC

Carl Vondrick @cvondrick

6 Jun 2017

See, hear, read: deep representations shared over 3 natural modalities. Units activate on objects in each modality. goo.gl/kNVJr4

Carl Vondrick · Mar 15, 2024 · 11:05 PM UTC

Carl Vondrick @cvondrick

15 Mar 2024

With just a few hours of experimentation in the physical world, a robot can learn on its own to design and throw paper airplanes further than a person, and even learn to build robot grippers out of cheap paper. No foundation models. No simulation. No language.

Ruoshi Liu @ruoshi_liu

14 Mar 2024

Humans can design tools to solve various real-world tasks, and so should embodied agents. We introduce PaperBot, a framework for learning to create and utilize paper-based tools directly in the real world. paperbot.cs.columbia.edu/

Carl Vondrick · Jun 22, 2020 · 12:58 PM UTC

Carl Vondrick @cvondrick

22 Jun 2020

Videos of the full workshop are now available on YouTube: sites.google.com/view/luv202…. Thanks everyone, especially the speakers, for a great workshop!

LUV 2020

News & Updates Jun 19, 2020: The workshop was a great success! Thank you everyone. Jun 6, 2020: Workshop program is available. The workshop will be fully virtual with an assortment of live and...

sites.google.com

Carl Vondrick @cvondrick

19 Jun 2020

Learning from Unlabeled Video Workshop -- starting now! First up: Andrea Vedaldi (Oxford) on Learning Representations and Geometry from Unlabelled Videos. sites.google.com/view/luv202…

Carl Vondrick · Jun 13, 2020 · 3:12 PM UTC

Carl Vondrick @cvondrick

13 Jun 2020

Our new paper (w/@Surisdi,Dave) shows Transformers can meta-learn a process for language acquisition from vision. At inference, the policy adapts to new words and generalizes better. #CVPR2020 Paper: arxiv.org/abs/1911.11237 Talk: Mon 11:40am PST mindsvsmachines.com

Carl Vondrick · Dec 21, 2020 · 6:21 PM UTC

Carl Vondrick @cvondrick

21 Dec 2020

Sssshhh!! There is so much noise in cities today. Ruilin and Rundi introduce a new approach that removes ambient noise from audio, letting the speech come through loud and clear. Let's have a listen... 🔊Turn on your speakers! 🔗cs.columbia.edu/cg/listen_to…

Carl Vondrick · Sep 10, 2021 · 12:55 PM UTC

Carl Vondrick @cvondrick

10 Sep 2021

I am so excited to be part of this dream team. We will be investigating the next generation of ML and predictive models for truly planetary scale problems. If you are passionate about cutting-edge ML coupled with societal impact, please apply to Columbia for various positions!

Columbia University

@Columbia

9 Sep 2021

Hurricane Ida made one thing clear: we are not prepared for the extreme weather caused by #climatechange. A new climate modeling center is designed to improve climate projections and encourage societies to plan for the inevitable disruptions ahead. bit.ly/3hhXJil @NSF

Columbia researchers Pierre Gentine (left) and Galen McKinley will lead the new climate modeling center. (Image: Marley Bauce)

Carl Vondrick · Apr 19, 2017 · 4:57 PM UTC

Carl Vondrick @cvondrick

19 Apr 2017

great high-res image manipulation by interpolating in feature space -- simple, no GAN required (Upchurch et al) arxiv.org/pdf/1611.05507.pdf

Carl Vondrick · Apr 28, 2017 · 2:18 PM UTC

Carl Vondrick @cvondrick

28 Apr 2017

Network visualization, dissection, and interpretability by David Bau and Bolei Zhou at MIT! netdissect.csail.mit.edu/ @zhoubolei

Carl Vondrick · Jun 16, 2019 · 5:17 AM UTC

Carl Vondrick @cvondrick

16 Jun 2019

Learn about Learning from Unlabeled Videos at #CVPR2019, Sunday in Room E, 9:00am Fresh posters and keynotes: Antonio Torralba, Noah Snavely, Andrew Zisserman, Bill Freeman, Abhinav Gupta, Kristen Grauman sites.google.com/view/luv201…

LUV 2019

All invited talks and oral sessions will be in Room E in Hyatt Regency. Morning posters will be in the Pacific Arena Ballroom (main convention center). Afternoon posters will be in Room E in Hyatt...

sites.google.com

Carl Vondrick · Jul 26, 2020 · 1:00 AM UTC

Carl Vondrick @cvondrick

26 Jul 2020

I had a joke about homography, but it was too plane.

Carl Vondrick · Feb 20, 2019 · 4:40 AM UTC

Carl Vondrick @cvondrick

20 Feb 2019

Announcing the Workshop on Learning from Unlabeled Video at CVPR 2019. Come for dynamite speakers, and stay for the abstracts! Abstract deadline is March 4. Topics include self-supervised learning, sound and vision, visual anticipation, active vision, etc sites.google.com/view/luv201…

Carl Vondrick · Jul 15, 2019 · 3:27 PM UTC

Carl Vondrick @cvondrick

15 Jul 2019

Self-supervised learning is prediction, and unsupervised learning is compression (in my view)

Charles Sutton @RandomlyWalking

15 Jul 2019

Replying to @jmhessel

“Self-supervised” is a rebranding for “unsupervised” to avoid confusing people who ask Qs like “how can LMs be unsupervised if you give them the next token to predict”? I dislike rebranding, but I dislike even more arguing about whether LMs are unsupervised. So,🤷‍♂️?

Carl Vondrick · Jul 26, 2018 · 12:20 AM UTC

Carl Vondrick @cvondrick

26 Jul 2018

The "deep inversion" quiz by Oxford: how well do you understand neural network visualizations? robots.ox.ac.uk/~vgg/researc…

Carl Vondrick · Mar 15, 2021 · 1:59 AM UTC

Carl Vondrick @cvondrick

15 Mar 2021

Don’t want pi day to end? Come to the hyperbolic world! In hyperbolic space, pi has no upper bound. You can eat pie for the rest of the year.

Carl Vondrick @cvondrick

2 Mar 2021

Replying to @cvondrick

Carl Vondrick · Jun 10, 2016 · 3:44 PM UTC

Carl Vondrick @cvondrick

10 Jun 2016

learning to recognize objects with only a few examples -- exciting 'low data' paradigm arxiv.org/pdf/1606.02819.pdf

Carl Vondrick · May 18, 2021 · 5:59 PM UTC

Carl Vondrick @cvondrick

18 May 2021

Turn any container into a smart container — all you need is noise!

Boyuan Chen

@Boyuan__Chen

18 May 2021

How can we tell "what is where" inside a container, after dropping something into it? Can we generate visual scenes from sound? Excited to share our latest work: The Boombox: Visual Reconstruction from Acoustic Vibrations. (boombox.cs.columbia.edu)

Carl Vondrick · Sep 30, 2017 · 10:58 PM UTC

Carl Vondrick @cvondrick

30 Sep 2017

Important warning of non-peer reviewed papers: public can lose trust in science and research if too much low-quality work is posted.

Kosta Derpanis (sabbatical in Zurich)

@CSProfKGD

30 Sep 2017

To preprint or not. This debate sounds strangely familiar #ComputerVision gizmodo.com/should-scientist…

Carl Vondrick · Jan 29, 2021 · 6:17 PM UTC

Carl Vondrick @cvondrick

29 Jan 2021

Replying to @jbhuang0604

Conclusion is where you accidentally tell reviewers how to reject your paper!

Carl Vondrick · Mar 23, 2017 · 6:15 PM UTC

Carl Vondrick @cvondrick

23 Mar 2017

"Do Good Research" by Fredo Durand thecomputationalphotographer…

Carl Vondrick · Jun 22, 2018 · 2:20 AM UTC

Carl Vondrick @cvondrick

22 Jun 2018

CVPR should be "Computer Vision, Prediction, and Robotics"

Carl Vondrick · Dec 16, 2016 · 2:33 AM UTC

Carl Vondrick @cvondrick

16 Dec 2016

cool idea to interactively reconfigure pretrained CNNs in order to recognize unseen classes, by Krishnan and Ramanan arxiv.org/pdf/1612.04901.pdf

Carl Vondrick · May 24, 2019 · 2:48 PM UTC

Carl Vondrick @cvondrick

24 May 2019

The computer vision group at Columbia is looking for a postdoctoral fellow. Come wrangle pixels with us in the big city. More info: cs.columbia.edu/~vondrick/po…

Carl Vondrick · Nov 6, 2018 · 4:11 AM UTC

Carl Vondrick @cvondrick

6 Nov 2018

Excellent piece, but I disagree we should give up our datasets. To get commonsense and generalization, we should create rich & diverse multi-modal datasets that span huge number of tasks. Probably need new data collection means, eg interaction and self-supervision (not MTurk)

Melanie Mitchell @MelMitchell1

5 Nov 2018

My opinion piece in the NY Times. nytimes.com/2018/11/05/opini…

Carl Vondrick · Oct 5, 2016 · 3:24 AM UTC

Carl Vondrick @cvondrick

5 Oct 2016

released 35 million video clips! stabilized, natural video. 1 year! fun dataset for generative video models goo.gl/3CnFKR

Carl Vondrick · Oct 31, 2016 · 1:22 PM UTC

Carl Vondrick @cvondrick

31 Oct 2016

Learning camouflaged QR codes

Dmitry Ulyanov @DmitryUlyanovML

31 Oct 2016

A nice paper from our lab on learning visual codes, to appear at NIPS sites.skoltech.ru/compvision…

Carl Vondrick · Mar 31, 2021 · 5:11 PM UTC

Carl Vondrick @cvondrick

31 Mar 2021

Replying to @rzhang88

I should see a doctor ASAP! AI is going to save my life!

Carl Vondrick · Mar 2, 2021 · 4:57 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

Congratulations to @Surisdi and Ruoshi Liu on their #CVPR2021 paper!! Check out the video below for an hour-long talk with all the details and results! piped.video/watch?v=-Uy92jvT…

Dídac Surís - Learning the Predictability of the Future

February 16th, 2021. MIT, CSAIL. Abstract: Not everything in the fu...

youtube.com

Carl Vondrick · May 24, 2017 · 4:15 AM UTC

Carl Vondrick @cvondrick

24 May 2017

Learning visual and auditory representations simultaneously from video!

Relja Arandjelović @relja_work

24 May 2017

My first paper at DeepMind: What can be learnt by looking at and listening to a large amount of unlabelled videos? arxiv.org/abs/1705.08168

Carl Vondrick · Dec 23, 2016 · 5:39 PM UTC

Carl Vondrick @cvondrick

23 Dec 2016

Learning to find moving objects irrespective of camera motion, by Tokmakov et al. arxiv.org/pdf/1612.07217v1.p…

Carl Vondrick · Jan 11, 2021 · 7:23 PM UTC

Carl Vondrick @cvondrick

11 Jan 2021

Predictive models on physical robots learn rich features about their surroundings -- they learn about obstacles and even the policy of other robots. Latest paper with @BoyuanChen1 and @hodlipson, out today!

Columbia Engineering @CUSEAS

11 Jan 2021

Can a robot be empathetic? @MechCU Prof @hodlipson thinks so: his lab has created a robot that learns to visually predict how its partner robot will behave. engineering.columbia.edu/pre… @Columbia

Carl Vondrick · Mar 2, 2021 · 4:49 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

Most predictive models operate in Euclidean space. However, when there is uncertainty or multiple modes, the optimal solution is to regress the mean, which often lacks any interpretation. Our idea: Let’s make the mean mean something!
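
A toy illustration of the point above, assuming a squared-error loss and a made-up set of two equally likely outcomes (neither is taken from the paper). The single prediction with the lowest error is the mean, which matches neither outcome.

```python
def mse(prediction, targets):
    """Mean squared error of one constant prediction against all targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


# Hypothetical futures: half the time the outcome is -1, half the time +1.
targets = [-1.0, 1.0, -1.0, 1.0]
mean_prediction = sum(targets) / len(targets)

# Compare committing to either real outcome against predicting the mean.
errors = {p: mse(p, targets) for p in (-1.0, mean_prediction, 1.0)}
best = min(errors, key=errors.get)

# The mean wins on error even though it is a value that never occurs.
print(best, errors)
```

In Euclidean space that winning value has no meaning of its own, which is the gap the thread sets out to close.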

Carl Vondrick · Feb 10, 2019 · 4:49 PM UTC

Carl Vondrick @cvondrick

10 Feb 2019

Map of emotional responses versus audible gasps: s3-us-west-1.amazonaws.com/v… news.berkeley.edu/2019/02/04…

Carl Vondrick · Mar 2, 2021 · 4:55 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

Hyperbolic geometry for machine learning and computer vision is a young and rapidly growing area. We are not the first to work with this geometry, and we will not be the last! Code, models, data, visuals, and links to tutorials are on our project website: hyperfuture.cs.columbia.edu

Carl Vondrick · Jun 16, 2016 · 12:53 AM UTC

Carl Vondrick @cvondrick

16 Jun 2016

what do we visualize when we visualize ConvNets? important question: arxiv.org/pdf/1606.04801.pdf

Carl Vondrick · Oct 12, 2017 · 3:46 AM UTC

Carl Vondrick @cvondrick

12 Oct 2017

Photos with the smart phone removed ericpickersgill.com/Removed

Carl Vondrick · Jan 6, 2017 · 6:25 PM UTC

Carl Vondrick @cvondrick

6 Jan 2017

"Person analysis using cheap and large-scale synthetic data" in Learning from Synthetic Humans by Varol et al arxiv.org/pdf/1701.01370.pdf

Carl Vondrick · Sep 8, 2016 · 2:50 AM UTC

Carl Vondrick @cvondrick

8 Sep 2016

preprint for "Generating Videos with Scene Dynamics": adversarial nets for video generation & learning & prediction web.mit.edu/vondrick/tinyvid…

Carl Vondrick · May 15, 2017 · 2:16 AM UTC

Carl Vondrick @cvondrick

15 May 2017

Low-power vision mode that produces lower-quality image data suitable only for computer vision -- by Buckler et al arxiv.org/pdf/1705.04352.pdf

Carl Vondrick · Sep 28, 2024 · 3:04 PM UTC

Carl Vondrick @cvondrick

28 Sep 2024

Replying to @tetraduzione

We are optimistic we can offer reviewers a relatively low load, which we hope translates into high quality reviews.

Carl Vondrick · Apr 27, 2018 · 11:50 PM UTC

Carl Vondrick @cvondrick

27 Apr 2018

Rocks that convert CO2 into stone: nytimes.com/interactive/2018…

How Oman’s Rocks Could Help Save the Planet (Published 2018)

The rocks in this part of the world have a special ability: They can turn carbon dioxide into stone.

nytimes.com

Carl Vondrick · Aug 8, 2018 · 9:09 PM UTC

Carl Vondrick @cvondrick

8 Aug 2018

Replying to @haldaume3

you can silently and randomly add questions that have a single, well-defined answer that you also know. then, discard all workers that fail those "quiz" questions
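
A minimal sketch of that quiz-question filter. The function name, the data layout, and the choice to drop workers who never saw a quiz question are all assumptions made for illustration, not part of the original suggestion.

```python
def filter_workers(responses, gold_answers, min_accuracy=1.0):
    """Keep workers who pass the hidden quiz questions.

    responses:    {worker_id: {question_id: answer}}
    gold_answers: {question_id: known_answer} for the hidden quiz questions
    Returns the kept workers' answers with the quiz questions stripped out.
    """
    kept = {}
    for worker, answers in responses.items():
        quiz = [q for q in gold_answers if q in answers]
        if not quiz:
            # Assumption: a worker who saw no quiz question cannot be vetted.
            continue
        correct = sum(answers[q] == gold_answers[q] for q in quiz)
        if correct / len(quiz) >= min_accuracy:
            kept[worker] = {
                q: a for q, a in answers.items() if q not in gold_answers
            }
    return kept


responses = {
    "w1": {"q1": "cat", "gold1": "4"},  # passes the quiz
    "w2": {"q1": "dog", "gold1": "5"},  # fails the quiz
    "w3": {"q1": "cat"},                # never saw a quiz question
}
print(filter_workers(responses, {"gold1": "4"}))
```

Lowering `min_accuracy` below 1.0 would tolerate occasional slips instead of discarding a worker on the first failed quiz question.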

Carl Vondrick · Jul 20, 2016 · 1:53 AM UTC

Carl Vondrick @cvondrick

20 Jul 2016

NIPS 2014: 3 reviews, 6k char rebut. 2015: 4 reviews, 5k char rebut. 2016: 6 reviews, 3k char rebut. 2017: 9 reviews, tweet rebuttal ?!

Carl Vondrick · Oct 28, 2016 · 5:48 PM UTC

Carl Vondrick @cvondrick

28 Oct 2016

learns nice convolutional filters for raw waveforms, without ground truth labels

Carl Vondrick · Jun 19, 2020 · 9:54 PM UTC

Carl Vondrick @cvondrick

19 Jun 2020

Last keynote: Alyosha Efros and Allan Jabri on learning space-time correspondences, starting in 5 min sites.google.com/view/luv202…

Carl Vondrick · Jun 26, 2018 · 2:57 AM UTC

Carl Vondrick @cvondrick

26 Jun 2018

Replying to @dimadamen

Thank you for the kind words!! We finally got the paper up on arxiv tonight: arxiv.org/abs/1806.09594 Video of results are coming soon...

Tracking Emerges by Colorizing Videos

We use large amounts of unlabeled video to learn models for visual tracking without manual human supervision. We leverage the natural temporal coherency of color to create a model that learns to...

arxiv.org

Carl Vondrick · Jun 6, 2017 · 12:54 AM UTC

Carl Vondrick @cvondrick

6 Jun 2017

For example, a hidden unit automatically emerges for dogs. It activates on images of dogs, sentences about dogs, or sounds of barking

Carl Vondrick · Apr 17, 2018 · 5:21 AM UTC

Carl Vondrick @cvondrick

17 Apr 2018

Dear twitter, how do you take notes and jot down ideas for research? Do you use an app, pen/paper, memory?

Carl Vondrick · Jun 15, 2019 · 5:26 AM UTC

Carl Vondrick @cvondrick

15 Jun 2019

Saturday at #ICML2019!

Aäron van den Oord

@avdnoord

30 Mar 2019

Excited to announce our #ICML2019 Workshop on Self-Supervised Learning! Covering- Vision, NLP, Audio, Robotics, RL ... sites.google.com/view/self-s… Submissions now open - deadline April 25! Speakers: @ylecun, @chelseabfinn, Andrew Zisserman, Alexei Efros, Jacob Devlin, Abhinav Gupta

Carl Vondrick · Dec 21, 2020 · 6:25 PM UTC

Carl Vondrick @cvondrick

21 Dec 2020

The main idea: Natural audio will contain intervals of silence, which we can leverage as incidental supervision for learning to denoise. By learning to first detect these pauses, we can estimate a profile for the noise, and suppress it throughout the audio.
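
The three steps (detect pauses, estimate a noise profile, suppress it) can be sketched with a classical energy-based stand-in. This is not the learned model from the paper: the quantile rule for picking silent frames, the per-frame gain, and every name below are assumptions chosen to keep the example short.

```python
def frame_energies(signal, frame_len):
    """Average power of each full, non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def denoise(signal, frame_len=4, silence_quantile=0.25):
    """Return (denoised_samples, noise_power). Trailing partial frames are dropped."""
    energies = frame_energies(signal, frame_len)

    # 1. Detect pauses: treat the lowest-energy frames as silence.
    k = max(1, int(len(energies) * silence_quantile))
    silent = sorted(range(len(energies)), key=energies.__getitem__)[:k]

    # 2. Noise profile: average power over the silent frames.
    noise = sum(energies[i] for i in silent) / k

    # 3. Suppress: scale each frame by a gain that shrinks toward zero
    #    as the frame's power approaches the noise power.
    out = []
    for idx, energy in enumerate(energies):
        gain = max(0.0, 1.0 - noise / energy) if energy > 0 else 0.0
        start = idx * frame_len
        out.extend(gain * x for x in signal[start:start + frame_len])
    return out, noise


quiet = [0.1, -0.1, 0.1, -0.1]   # background noise only
loud = [1.0, -1.0, 1.0, -1.0]    # speech plus noise
cleaned, noise_power = denoise(quiet * 2 + loud * 2)
```

A real system would work per frequency band rather than on whole-frame power, and the thread describes learning the pause detector rather than hand-coding it.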

Carl Vondrick · Mar 2, 2021 · 4:51 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

Here’s an example. As the model observes more of the video, the future becomes more and more predictable. Our model makes increasingly specific forecasts of the future.

Carl Vondrick · Jun 16, 2019 · 6:04 PM UTC

Carl Vondrick @cvondrick

16 Jun 2019

Andrew Zisserman on leveraging temporal coherence and sound to learn from video!

Carl Vondrick · Jan 11, 2017 · 4:24 PM UTC

Carl Vondrick @cvondrick

11 Jan 2017

Fine-grained sound recognition, plus a fun dataset collection!

Kosta Derpanis (sabbatical in Zurich)

@CSProfKGD

11 Jan 2017

Why an #ArtificialIntelligence firm is busy smashing thousands of windows bbc.com/future/story/2017011…

Carl Vondrick · Aug 14, 2021 · 12:17 PM UTC

Carl Vondrick @cvondrick

14 Aug 2021

Replying to @keenanisalive

What about IoU? arxiv.org/abs/1902.09630

Generalized Intersection over Union: A Metric and A Loss for...

Intersection over Union (IoU) is the most popular evaluation metric used in the object detection benchmarks. However, there is a gap between optimizing the commonly used distance losses for...

arxiv.org
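
For reference, a small sketch of IoU and the generalized variant named in the linked paper's title, for axis-aligned boxes. The `(x1, y1, x2, y2)` box layout and the function names are assumptions, and the code expects boxes with non-zero area.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is inverted or empty."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned boxes."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    iou = inter / union

    # Smallest box enclosing both inputs.
    hull = (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))
    hull_area = box_area(hull)

    # GIoU subtracts the fraction of the enclosing box covered by neither input.
    giou = iou - (hull_area - union) / hull_area
    return iou, giou


overlap = iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3))
disjoint = iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1))
```

For the disjoint pair, IoU is zero no matter how far apart the boxes are, while GIoU still changes with the gap between them.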

Carl Vondrick · May 8, 2018 · 8:38 PM UTC

Carl Vondrick @cvondrick

8 May 2018

Learning to see in the dark: super cool results! piped.video/watch?v=qWKUFK7M…

Carl Vondrick · Mar 2, 2021 · 4:54 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

Just by changing predictive models to work in hyperbolic space instead of Euclidean space, the model automatically learns to select the right level of abstraction under uncertainty!

Carl Vondrick · Sep 2, 2020 · 11:32 PM UTC

Carl Vondrick @cvondrick

2 Sep 2020

Replying to @dimadamen

There’s also multiple modes, eg multimodal prediction

Carl Vondrick · Oct 23, 2016 · 10:30 PM UTC

Carl Vondrick @cvondrick

23 Oct 2016

Great results from the Scene Parsing Challenge with Places Database csail.mit.edu/csail_computer…

Carl Vondrick · Jun 13, 2024 · 7:22 PM UTC

Carl Vondrick @cvondrick

13 Jun 2024

Generative video models facing the physical world 👇 #CVPR2024

Ruoshi Liu @ruoshi_liu

12 Jun 2024

Recently released video generation models are amazing😍 How can we use them in robotics to learn generalizable visuomotor policies? Come find out in my talks at these 4 CVPR workshops next week, where I will talk about recent works in 3D, generative models, and robotics! RHOBIN (rhobin-challenge.github.io/s…) Holistic Video Understanding (holistic-video-understanding…) AI3DG (ai3dg.github.io/index.html) 3D Foundation Model (cvpr.thecvf.com/virtual/2024…) Shout out to all the amazing organizers!

Carl Vondrick · Jun 19, 2019 · 10:53 PM UTC

Carl Vondrick @cvondrick

19 Jun 2019

Cool paper from Berkeley: learn 3D flow from unlabeled stereo videos

Carl Vondrick · Oct 28, 2016 · 5:48 PM UTC

Carl Vondrick @cvondrick

28 Oct 2016

code and pre-trained models available on github: github.com/cvondrick/soundne…

GitHub - cvondrick/soundnet: SoundNet: Learning Sound Representations from Unlabeled Video. NIPS...

SoundNet: Learning Sound Representations from Unlabeled Video. NIPS 2016 - cvondrick/soundnet

github.com

Carl Vondrick · Mar 9, 2022 · 2:09 AM UTC

Carl Vondrick @cvondrick

9 Mar 2022

Congratulations Dídac!! @Surisdi

ColumbiaCompSci @ColumbiaCompSci

8 Mar 2022

Didac Suris (@Surisdi), one of our PhD students, won a Microsoft Research Fellowship (@MSFTResearch)! Learn more about him and his PhD experience here - bit.ly/PhDDidacS

Carl Vondrick · Jun 26, 2018 · 8:52 PM UTC

Carl Vondrick @cvondrick

26 Jun 2018

Replying to @farhanhubble @quantombone

Videos are coming soon!!

Carl Vondrick · Dec 19, 2020 · 7:45 PM UTC

Carl Vondrick @cvondrick

19 Dec 2020

While each language represents a bicycle with a different word, the underlying visual representation remains consistent. A bicycle has a similar appearance in the UK, France, Japan, and India. We leverage this natural property for translating unpaired languages.

Carl Vondrick · Jan 22, 2025 · 12:22 AM UTC

Carl Vondrick @cvondrick

22 Jan 2025

Good bye X 👋. Join me on BlueSky! bsky.app/profile/cvondrick.b…

Carl Vondrick · Oct 28, 2016 · 5:44 PM UTC

Carl Vondrick @cvondrick

28 Oct 2016

basic idea is: visual recognition networks teach networks for sound, enabling learning from tons of unlabeled video
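
A minimal sketch of that teacher-student setup, assuming the sound network is trained to match the vision network's class distribution with a KL-divergence loss. The logits are made up, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits):
    """How far the student's (sound) prediction is from the teacher's (vision)."""
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))


# The teacher sees the video frames, the student hears only the audio track.
vision_logits = [2.0, 0.5, -1.0]   # made-up scores over three classes
sound_logits = [0.1, 0.2, 0.0]     # untrained student, nearly uniform
loss = distillation_loss(vision_logits, sound_logits)
```

No human labels appear anywhere in the loss: the only training signal is the teacher's output on the paired video frames.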

Carl Vondrick · Jun 16, 2019 · 4:26 PM UTC

Carl Vondrick @cvondrick

16 Jun 2019

Antonio Torralba on multi-modal learning and self-supervised learning

Carl Vondrick · Jun 27, 2018 · 5:09 PM UTC

Carl Vondrick @cvondrick

27 Jun 2018

Replying to @CSProfKGD @quantombone @farhanhubble

Ok, videos are finally out!! ai.googleblog.com/2018/06/se…

Carl Vondrick · Jun 20, 2020 · 6:08 PM UTC

Carl Vondrick @cvondrick

20 Jun 2020

Replying to @dimadamen @fdellaert @Oxford_VGG

Video should be on YouTube next week. Thanks everyone for attending and great questions, and especially Yale Song for leading the behind the scenes!

Carl Vondrick · Oct 19, 2019 · 2:55 PM UTC

Carl Vondrick @cvondrick

19 Oct 2019

Replying to @LakeBrenden @washingtonpost

To be fair, I’m a human who also needs instructions to solve a Rubik’s Cube!

Carl Vondrick · Dec 19, 2020 · 7:47 PM UTC

Carl Vondrick @cvondrick

19 Dec 2020

It learns how to translate individual words across 50 languages... even without paired language supervision

Carl Vondrick · Mar 2, 2021 · 4:53 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

Since hyperbolic space is continuous, the hierarchy is actually continuous as well! This lets us work with hierarchies of any depth. Here’s 3 levels deep.

Carl Vondrick · Dec 19, 2020 · 7:55 PM UTC

Carl Vondrick @cvondrick

19 Dec 2020

The approach finds very interesting transitive paths between languages via vision, which we show below. When there is a strong path, the final score is high (top row), and it's low when the path is not aligned well (bottom row)

Carl Vondrick · Jun 19, 2020 · 8:03 PM UTC

Carl Vondrick @cvondrick

19 Jun 2020

Starting now: Ivan Laptev from INRIA sites.google.com/view/luv202…

Carl Vondrick · Jun 16, 2020 · 6:08 PM UTC

Carl Vondrick @cvondrick

16 Jun 2020

Poster happening now on Zoom! cvpr20.com/event/oops-predic…

Carl Vondrick @cvondrick

15 Jun 2020

Carl Vondrick · Dec 19, 2020 · 7:50 PM UTC

Carl Vondrick @cvondrick

19 Dec 2020

We can also translate sentences, not just individual words! Of course, it works best on concrete visual concepts

Carl Vondrick · Sep 29, 2020 · 1:00 AM UTC

Carl Vondrick @cvondrick

29 Sep 2020

Replying to @CSProfKGD @alfcnz

I ended up using Screenflow, and I found it fantastic. It jointly records your screen, audio, and webcam. There is a simple UI to create different scenes.

Carl Vondrick · Dec 19, 2020 · 8:03 PM UTC

Carl Vondrick @cvondrick

19 Dec 2020

We show pairwise performance between source and target languages. As you might expect, languages within the same family are easier to translate between. But our approach is language agnostic, and makes no assumptions on grammar or vocab. The full dataset is available online!

Carl Vondrick · Mar 2, 2021 · 4:51 PM UTC

Carl Vondrick @cvondrick

2 Mar 2021

The predictions are initially near the origin of the space, which corresponds to predicting the “root” node of the hierarchy. But over time, the prediction moves closer to the boundary of the space, corresponding to more specific forecasts.
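
A short sketch of why distance from the origin can act as a level in the hierarchy, assuming the Poincaré ball model with curvature -1 (the thread does not say which model of hyperbolic space is used). The `depth` helper is a hypothetical name.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit ball."""
    def sq_norm(v):
        return sum(c * c for c in v)

    diff = sq_norm([a - b for a, b in zip(x, y)])
    return math.acosh(1 + 2 * diff / ((1 - sq_norm(x)) * (1 - sq_norm(y))))


def depth(x):
    """Distance from the origin, read here as how specific a prediction is."""
    return poincare_distance([0.0] * len(x), x)


# Points at the origin, halfway out, and close to the boundary.
for point in ([0.0, 0.0], [0.5, 0.0], [0.9, 0.0], [0.99, 0.0]):
    print(point, round(depth(point), 3))
```

The depth grows without bound as a point approaches the boundary of the ball, so there is room for arbitrarily many levels between the root at the origin and the most specific forecasts near the edge.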