Hello, view synthesis devotees. I invite you to some new work at @eccvconf. We gather tourist photos of famous landmarks and learn a new neural 3D representation that can synthesize new views with natural, modifiable lighting. We call it "Crowdsampling the Plenoptic Function".
It turns out that YouTube has tons of videos of people pretending to be statues. This is great for learning about the 3D shape of people! Cool new work from @zl548 at CVPR19 from his Google internship. mannequin-depth.github.io/
Attention all looking glass lovers: This tweet is a shameless plug for a CVPR 2020 paper that asks a dumb question and finds an interesting answer. Can you tell if an image has been horizontally flipped or not?
We couldn't find a Fundamental matrix visualizer online, so we made one for our vision course. If you are an F-matrix fan, take a look & tell us if you find any problems. And please send pointers to other demos! (Credits: Alek Curless, Sri Chakra Kumar)
cs.cornell.edu/courses/cs567…
Dear typesetting fanatics: I wrote a short Latex style guide with some tips & tricks that I find useful for making short and nice-looking papers. If you are working on ECCV papers or the like, maybe it will be useful to you, too. bit.ly/latex-style
For stylization fans, @KaiZhang9546's work called ARF: Artistic Radiance Fields is on Tuesday's docket at @eccvconf. It achieves nice, view-consistent 2D-to-3D style transfer results by fine-tuning a radiance field so that projections resemble the style of an input source image.
Do you have the blues because you are getting broken 3D models from COLMAP or other 3D reconstruction pipelines? Ruojin has a nice new paper and codebase that can help! We invite you to check out our work on doppelganger images here:
doppelgangers-3d.github.io/
Check out our #ICCV2023 paper called Doppelgangers. We train a classifier to detect distinct but visually similar image pairs ("doppelgangers") and apply it to SfM disambiguation, enabling COLMAP to create correct 3D models in hard cases.
Project page: doppelgangers-3d.github.io/
Dear city lovers: here's new work at @eccvconf on observing many images of a city over time, and learning to factor lighting effects from scene appearance. This factorization lets us relight new images, even from new cities. Here we learn from NYC and create a full day in Paris.
To all the CVPR-heads out there -- check out @KaiZhang9546's work on inverse rendering in this morning's oral session! Relightable 3D meshes from photos, with really beautiful results.
Learned about Notre Dame Cathedral through computer vision and structure from motion, of all things, many years before I ever got a chance to visit. Very sad day.
Got an urge to render the world from Internet photo collections? The source code for @moustafaMeshry's CVPR2019 best paper finalist is now available: moustafameshry.github.io/neu…. Have fun out there!
Zhengqi’s new work is a very cool approach to single-image animation—these videos are really nifty!
This work turns a still image into a looping video by predicting frequency-space motion. It can also make your image interactive. The demo is really nice! generative-dynamics.github.i…
Excited to share our work on Generative Image Dynamics!
We learn a generative image-space prior for scene dynamics, which can turn a still photo into a seamless looping video or let you interact with objects in the picture. Check out the interactive demo:
generative-dynamics.github.i…
Hello to all you light field lovers out there! We have new work with John Flynn and others on high-quality view synthesis from a camera array. We use soft layers to make nice pictures. Presented in Tuesday's afternoon oral session at @cvpr19. augmentedperception.github.i…
This is so cool! Check out Richard Bowen's work today at #3DV2022. It considers what possible flow fields could arise if you were to hit a hypothetical "play" button on a still image. @3DVconf dimensions-of-motion.github.…
Hey there—code and data for our Crowdsampling the Plenoptic Function paper from @eccvconf is now available for all you tourism-heads out there.
github link: github.com/zhengqili/Crowdsa…
I have a real soft spot for epipolar geometry—and so this tweet is a crass advertisement for some work of ours at @eccvconf that I think is nice. The idea is to learn local feature descriptors from pairs of images with known camera poses—no ground truth correspondence required.
This is so cool! Check out @boyang_deng's wonderful work on generating Streetscapes -- tours through imaginary street scenes, conditioned on a desired city layout and a text description. I like this wintry result a lot!
Thought about generating realistic 3D urban neighbourhoods from maps, dawn to dusk, rain or shine? Putting heavy snow on the streets of Barcelona? Or making Paris look like NYC? We built a Streetscapes system that does all these. See boyangdeng.com/streetscapes. (Showreel w/ 🔊 ↓)
I think it is pretty neat. This is work from Cornell Tech with @zl548, @XianWenqi, and @AbeDavis. You can find out more at crowdsampling.io, or watch this wonderful teaser video made by @AbeDavis.
Maybe it's just me, but for me, the award for the computer vision project whose webpage has survived for the longest time without breaking is "3D Photography on your Desk" by Jean-Yves Bouguet and Pietro Perona (1998).
vision.caltech.edu/bouguetj/…
In need of many examples of camera trajectories from videos? Check out our new RealEstate10K dataset! google.github.io/realestate1…. This is the kind of data we used in our recent Stereo Magnification work on view synthesis (with Tinghui Zhou). people.eecs.berkeley.edu/~ti…
I'm really proud of @zhengqi_li, who put his heart into the DynIBaR work that got the Best Paper Honorable Mention nod at CVPR. And I'm really sad that he couldn't be there to experience it due to circumstances beyond his control.
Thanks for the nice photo and note, @jon_barron!
Hi everyone. I'm helping to organize tomorrow's ECCV 4D Vision Workshop. We have a lineup of great papers and speakers—some real vision enthusiasts—including @RaquelUrtasun, Michael Ryoo, @davsca1, Drago Anguelov, @mapo1, @xiaolonw, & Tom Funkhouser. sites.google.com/view/4dvisi…
Zhengqi Li (Cornell PhD student) presents: MegaDepth! Big (100K+), diverse dataset of RGBD images derived from Internet multi-view stereo. Good for training RGB -> depth, generalizable to other datasets (e.g. KITTI).
Web: cs.cornell.edu/projects/mega…, arXiv: arxiv.org/abs/1804.00607/
This is work with Zhiqiu Lin, Jin Sun, and @abedavis. You can check it out at visual-chirality.io or visit the CVPR Q&A on "Visual Chirality" on Thursday. Or watch this nice teaser video from @AbeDavis. Thanks! Now back to your timeline.
Shamelessly plugging this talk tomorrow (Wednesday). My hat is off to the 3DGV organizers for putting together a great series of talks on cool 3D vision-style work!
@3_dgv Seminar in 2 days!
@Jimantha Noah Snavely will talk about "The Plenoptic Camera", joined by @_pratul_ Pratul Srinivasan and Rick Szeliski. Please distribute the news to students/members in your groups.
Youtube: piped.video/ToyBeaOUUFI
3/10 11am Pacific
3/10 19:00 UK
The Google Ph.D. Fellowship Program has selected @QianqianWang5 as one of its 2022 fellows. “I hope that my technology can enable us to create a rich and realistic virtual world,” - Qianqian Wang, computer science Ph.D. student at @Cornell_tech
Read more: bit.ly/3Ts3Gul
Hello to all you fashion-heads out there! We invite you to our new ICCV paper on analyzing clothing in millions of photos around the world. We can discover world events and festivals purely from apparel! With U. Mall, K. Matzen, B. Hariharan & K. Bala. geostyle.cs.cornell.edu
For all you inverse rendering fanatics out there, some great work on recovering shape, glossy material, and lighting from multiple photos.
This is the work of @KaiZhang9546, Fujun Luan, @QianqianWang5, and Kavita Bala from @cs_cornell, @CornellECE, and @cornell_tech.
We invite you to check out this nice work on view synthesis for dynamic scenes! Work from @zl548 during his Adobe internship with @oliverwang81 and @simon_niklaus.
Really great work we did with @zl548 on practical novel view synthesis in space and time. Take any video and move the camera, or slow down the time, or both!
Website: cs.cornell.edu/~zl548/NSFF/
With: @zl548, @oliverwang81, @Jimantha
Very nice work led by @QianqianWang5 on generalizing NeRF by incorporating principles from classic image-based rendering. Along with other work like pixelNeRF and GRF, I'm excited by these demonstrations of cross-scene generalization. (And I love this example miniature scene!)
Training NeRFs per-scene is so 2020. Inspired by image based rendering, IBRNet does amortized inference for view synthesis by learning how to look at input images at render time. 15% drop in error, 80% fewer FLOPs than NeRF. Great work @QianqianWang5! ibrnet.github.io
In case you missed it at ECCV: @zl548 has a new dataset called CGIntrinsics. Ludicrously high-quality CG renderings for learning intrinsic images. You can predict state-of-the-art intrinsic images on real photos just by training on CG data! cs.cornell.edu/projects/cgin… #ECCV2018
It feels like CVPR20 ended 9 years ago—but I'm only now checking it out. I recommend the great FATE tutorial. On top of the challenges outlined, I imagine there are hurdles even in spearheading such a tutorial—thank you, @timnitGebru & @cephaloponderer! sites.google.com/view/fatecv…
I got the chance to read this paper in detail recently, and it is really cool, especially for all you feature matching–heads out there! I love the idea of computing descriptors on the basis of two images at once. Nice work, @oliviawiles1, Sebastien Ehrhardt, and Andrew Zisserman!
D2D: Learning to find good correspondences for image matching and manipulation
@oliviawiles1, Sebastien Ehrhardt, Andrew Zisserman, @Oxford_VGG
Idea: extract features conditionally on 2nd image.
1/
arxiv.org/abs/2007.08480
Glad to share our work “Neural 3D Reconstruction in the Wild” in SIGGRAPH 2022! We show that with a clever sampling strategy, neural-based 3D reconstruction can be better and faster than COLMAP. Check out the project page at: zju3dv.github.io/neuralrecon….
Our paper, “NeRF in the Wild”, is out! NeRF-W is a method for reconstructing 3D scenes from internet photography. We apply it to the kinds of photos you might take on vacation: tourists, poor lighting, filters, and all. nerf-w.github.io (1/n)
View synthesis is super cool! How can we push it further to generate the world *far* beyond the edges of an image? We present Infinite Nature, a method that combines image synthesis and 3D to generate long videos of natural scenes from a single image. infinite-nature.github.io
This is a great program for folks interested in postdocs @Cornell. For vision/graphics folks interested in NYC, @ElorHadar, @andrewhowens, and I are potentially recruiting a joint postdoc. Please apply!
.@Cornell is recruiting for multiple postdoctoral positions in AI as part of two programs: Empire AI Fellows and Foundational AI Fellows. Positions are available in NYC and Ithaca.
Deadline for full consideration is Nov 20, 2025!
academicjobsonline.org/ajo/j…
Let's turn photos of ancient "revolutionary" (rotationally symmetric) artefacts into 3D and rotate them, or even change the lighting!
Our model learns to de-render a single image of a vase into shape, albedo, material & lighting, from just a single-image collection. #CVPR2021
Totally agree with Beth: "The violence directed at Asian Americans, especially women, children and elderly, is against the very core values America is built on. This is why I am standing up and speaking up today." linkedin.com/posts/bethxie_b…
So excited to share that I’ve been awarded the Google PhD Fellowship in Machine Perception!
Huge thanks to my PhD advisor @Jimantha and all my amazing collaborators for their support and inspiration along the way.
Check out our CVPR 2023 Award Candidate paper, DynIBaR! dynibar.github.io/
DynIBaR takes monocular videos of dynamic scenes and renders novel views in space and time. It addresses limitations of prior dynamic NeRF methods, rendering much higher quality views.
From our latest project, an homage to the original Photo Tourism visualizations by @Jimantha et al. - interpolating between camera pose, focal length, aspect ratio, and scene appearance from different tourist images. More details at tancik.com/learnit @_pratul_ @jon_barron
There are few things I find more terror-inducing than cold calling people -- but I am finding that making US election-related volunteer calls leads to some pretty nice conversations. Some folks just want to chat right now.
Hi all, please consider nominating yourself to be reviewer for #CVPR2022. And please pass the word along, especially to those whose voices are not well represented in the vision community. This is one way to help guide the field.
#CVPR2022 is seeking additional reviewers. If interested or you want to nominate someone, please fill out the following reviewer nomination form:
docs.google.com/forms/d/e/1F…
I'm a big logo head! I keep seeing ads for Zenni on the train. I'm intrigued by how the stylized Z and N are exact mirror images here, but not in "real life" - one has horizontal lines, the other vertical. Yet no problem interpreting the logo. Cool Gestalt-style logic at work!
Organizing the first NYC vision workshop was super fun! Shout out to other organizers @elliottszwu @Haian_Jin and especially @Jimantha for the generous support!
This work led by @Haian_Jin is really nice. It takes text-to-image models and teases out their capability to light objects in a controllable way, much like Zero123 does for camera viewpoint. I'm really surprised that conditioning on environment maps can work this well!
Check out our recent work “Neural Gaffer: Relighting Any Object via Diffusion” 📷🌈, an end-to-end 2D relighting diffusion model that accurately relights any object in a single image under various lighting conditions.
🧵1/N:
Website: neural-gaffer.github.io/
A new follow up to infinite nature is out! This time we show how an infinite nature model can be trained on *single image* collections, without any multi-view or video supervision at training time!
We call it infinite nature 𝘻𝘦𝘳𝘰 since it requires no video 🙂 #ECCV2022 oral
Reminder about this CVPR registration support program -- please apply by April 15, 2022 if you'd like to be considered for a registration fee waiver! I hope that this effort can help increase inclusivity of CVPR. Application is here: forms.gle/RNEbbrqt9qpxs9oF7
#CVPR2022 is committed to supporting students from communities that do not traditionally attend CVPR through waived registration fees, to foster a more inclusive, diverse and equitable conference.
1/2
Happy Lunar New Year, and to all you astronomy fanatics a question -- If you and your family lived for generations in a village on the far side of the Moon, would you realize that the Earth existed?
If I'm not mistaken, ICCV camera-ready papers this year can be 9 pages + references (not 8)? If that is right, that is a first, and a very welcome, cool, and nice change!
Hi there. Code for our @eccvconf work from @cornelltech on learning where people could appear in an image is now online. Our (cool) method learns to predict potential people purely from observing data like Waymo's Open Dataset.
Code below—have a good day!
github.com/jinsungit/hiddenf…
For all you photography fanatics out there, a nice blog post about the photo with the longest known sightline captured to date -- 443km, from the Pyrenees to the French Alps.
beyondrange.wordpress.com/20…
I am sorry for buzz marketing, but if you are looking for a TV show for a 3-5 year old, I recommend a math-themed PBS program called Peg + Cat. Our 4 year old loves it (and the songs are catchy).
This workshop was so cool! My live talk had some technical difficulties, so if you want to see a clean version of a talk on how to tell if you are in a mirror universe (and a bunch of other great talks on less obscure topics), please check this out!
The upshot is that images seem to be full of low- and high-level chirality cues, and deep networks are pretty good at guessing when an image has been flipped. You might care if you're into data augmentation, image forensics, or self-supervision (or if you are a huge mirror-head).
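For the self-supervision angle, here is a minimal sketch of how flip-labeled training examples could be generated. It is only an illustration: the helper names `hflip` and `make_chirality_example` are hypothetical, images are assumed to be plain lists of pixel rows, and this is not the pipeline used in the paper.

```python
import random

def hflip(image):
    """Mirror an image stored as a list of rows (each row a list of pixels)."""
    return [list(reversed(row)) for row in image]

def make_chirality_example(image, rng):
    """Return (image, label): label 1 if the image was mirrored, 0 otherwise.

    The label comes for free from the coin flip, so no manual annotation is
    needed. This is an assumed setup for illustration only.
    """
    if rng.random() < 0.5:
        return hflip(image), 1
    return [list(row) for row in image], 0

if __name__ == "__main__":
    rng = random.Random(0)
    toy_image = [[1, 2, 3], [4, 5, 6]]
    sample, label = make_chirality_example(toy_image, rng)
    print(sample, label)
```

A classifier trained on such pairs only beats chance if images carry cues that break under mirroring, which is the question the tweet is getting at.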
For Lunar New Year/Spring Festival, another Moon-based note to all you Moon lunatics out there. One of my favorite gifs on Wikipedia is this one illustrating the apparent wobble of the Moon over the course of a month, called libration.
en.m.wikipedia.org/wiki/Libr…
Tomorrow is Election Tuesday in the US. I have tons of extra candy. If you see me and show me an "I Voted" sticker, I will try to give you some candy! I have Skittles and Baby Ruths.
This looks like a wonderful program for computer graphics PhD students and postdocs! Application deadline on April 4, 2022 at easychair.org/cfp/RSCG2022
Announcing WiGRAPH's Rising Stars in Computer Graphics! Ph.D. students and postdocs of underrepresented genders: apply for a two-year program of mentorship and workshops co-located with SIGGRAPH 2022 & 2023. Travel support provided. wigraph.org/events/2022-risi… #WiGRAPHRisingStars [1/6]
I've personally benefited a ton from @timnitGebru and her work. Her earlier work on estimating demographics at scale from Street View has been a big inspiration to me. Her more recent work in ethics is truly foundational, and has helped me think about the world differently.
If you're a big lover of video decomposition, check out @vickie_ye_'s nice paper on Deformable Sprites in the CVPR afternoon oral session. We are huge fans of layered video representations!
deformable-sprites.github.io
A deep network takes two images, learns to search for 2D matches between them, and then a loss function decides how much it likes the matches based on how much they deviate from the epipolar constraints derived from the camera poses, as in the visualization below.
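As a rough illustration of the kind of quantity such a loss could measure, here is a sketch that builds an essential matrix from a known relative pose and scores a match by its distance to the epipolar line. The assumptions are mine, not the paper's: points are in normalized image coordinates (intrinsics already removed), the pose maps camera-1 coordinates to camera-2 as X2 = R X1 + t, and all function names are hypothetical. The paper's actual loss may differ.

```python
import math

def skew(t):
    """Cross-product matrix [t]_x, so that [t]_x v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def essential_from_pose(R, t):
    """Essential matrix E = [t]_x R for the pose X2 = R X1 + t."""
    return matmul(skew(t), R)

def epipolar_distance(E, x1, x2):
    """Distance from x2 (image 2) to the epipolar line E x1 induced by x1.

    Both points are (x, y) in normalized image coordinates. Undefined when
    x1 sits at the epipole, where the line has zero normal.
    """
    p1 = [x1[0], x1[1], 1.0]
    p2 = [x2[0], x2[1], 1.0]
    line = matvec(E, p1)
    normal_length = math.hypot(line[0], line[1])
    return abs(sum(p2[i] * line[i] for i in range(3))) / normal_length

def epipolar_loss(E, matches):
    """Mean epipolar distance over a list of (x1, x2) predicted matches."""
    return sum(epipolar_distance(E, a, b) for a, b in matches) / len(matches)

if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    E = essential_from_pose(identity, (1.0, 0.0, 0.0))  # sideways translation
    print(epipolar_distance(E, (0.2, 0.3), (0.5, 0.3)))  # match on the line
    print(epipolar_distance(E, (0.2, 0.3), (0.5, 0.4)))  # match off the line
```

With a pure sideways translation the epipolar lines are horizontal, so a predicted match is penalized only for its vertical offset. That is why no ground-truth correspondence is required: the pose constrains where a match can lie without saying exactly where it is.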
Thanks, Kosta! Yes, there was an appearance from a 4-year-old who wasn't happy that I wasn't in play mode. (I moved to a different room, but forgot my trackball, so she started controlling my computer remotely.) I'm glad people seemed to be understanding of the heightened chaos.
Introducing Eclipse, a method for recovering lighting and materials even from diffuse objects!
The key idea is that standard "NeRF-like" data has all we need: a photographer moving around a scene to capture it causes "accidental" lighting variations. dorverbin.github.io/eclipse/ (1/3)
Hi all, ICCV21 isn't even over yet, but #CVPR2022 will be here before we know it, and deadlines for proposing workshops & tutorials are coming up soon. It would be great if the organizers had a diverse set of proposals on a range of topics, including societal impacts of CV.
Thanks, Ben! And I should note that the original idea for multiplane images came from John Flynn, working with Graham Fyffe and @debfx. That idea was also presaged in John's prior DeepStereo view synthesis method, as well as Soft3D from Penner and Zhang.
That page also has a stunning photo of the Earth that looks like CGI but is actually a photo from the Lunar Reconnaissance Orbiter. This photo is new to me but is really amazing!
Reviewers rock! Thank you so much for your hard work. Some didn't chime in on discussions, but I think that's because CVPR and life were happening. Also, I initiated discussions late... between CVPR and sick kids/self, I was a bad AC☹️. But many reviewers chimed in anyway. Thank you!
And for all y'all intrinsic image fanatics out there -- Zhengqi also has a cool new paper on learning intrinsic images supervised with time-lapse data: "Learning Intrinsic Image Decomposition by Watching the World".
Web: cs.cornell.edu/projects/bigt…, arXiv: arxiv.org/abs/1804.00582
Deciphering some people’s writing can be a major challenge – especially when that writing is cuneiform characters imprinted into 3,000-year-old tablets.
Now, researchers from @CornellCIS have developed an approach called ProtoSnap that “snaps” into place a prototype of a character to fit the individual variations imprinted on a tablet.
More at news.cornell.edu/stories/202….
A reminder that @CVPR 2022 workshop proposals are due tomorrow, October 19, at 11:59pm Pacific. Thank you! More info here: cvpr2022.thecvf.com/call-wor…
Layers! We love layers. On that note, a shameless plug for Shubham Tulsiani's work on *layered scene inference* -- predicting geometry in the form of *layered* depth maps from single images. shubhtuls.github.io/lsi/ #ECCV2018 #layers
I defended my PhD thesis last week 🥳 Thank you to everyone that made this possible, including my advisor @mapo1, examiners @Jimantha @quantombone and Daniel Cremers, and the amazing @cvg_ethz. As per the tradition, I received a nice commemorative hat 🎓 Now time for vacations 😎
Hope you are all doing well out there. ECCV22 workshop proposals are due tomorrow! It would be wonderful to see a diverse range of workshops at the conference.
For all you ECCV rebuttal writers out there... seems like reviewers can see your rebuttals in OpenReview even before the end of the rebuttal period, which may be surprising behavior to CMT-heads like me.
As a reviewer, I saw the initial rebuttal from the authors, and then an edited version. So, you can edit, but whatever you have entered is already visible to the reviewers.