Learning about our 4D world is hard. Real-world data is messy, with entangled scene geometry, motion, and camera movement.
Linyi just made a massive (100k+), diverse dataset with metric depth, long-term 3D motion, and camera poses: everything you need for real-world 3D learning.
Introducing 👀Stereo4D👀
A method for mining 4D data from internet stereo videos. It enables large-scale, high-quality, dynamic, *metric* 3D reconstructions, with camera poses and long-term 3D motion trajectories.
We used Stereo4D to make a dataset of over 100k real-world 4D scenes.