The unreasonable magic of simplicity! Meet DrivoR (Driving on Registers): our latest end2end autonomous driving model. We tore down complex dependencies & modules from current models to obtain a pure Transformer-based SOTA driving agent (NAVSIM v1 & v2, HUGSIM). Find out more👇
1/🧵 Q: Can we have both a simple and SOTA architecture in autonomous driving? A: Yes! 😍 Introducing Driving on Registers (DrivoR): a pure Transformer backbone that achieves SOTA results in NAVSIM v1 / v2 and closed-loop HUGSIM evaluation. Here is how 👇
2
15
4,408
A visual exploration of Gaussian Processes: beautiful interactive plots and a brief tutorial to make GPs more approachable jgoertler.com/visual-explora…
2
113
411
DINO and DINOv2 are surely amazing SSL approaches. Many assume that they're also very simple (in particular vs. other SSL methods), but they are actually a bit more elaborate and I've been in awe of the achievement of the authors. This diagram from SimDINO is more complete.
5
37
362
29,356
Self-supervised learning is fantastic for pretraining, but can we use it for other tasks (kNN classification, in-context learning) & modalities, w/o training & by simply using its gradients as features? Enter 🍄FUNGI - Features from UNsupervised GradIents #NeurIPS2024 🧵
9
49
326
46,005
Decoder Denoising Pretraining for Semantic Segmentation: A fun and simple idea for pre-training the decoder for semantic segmentation arxiv.org/abs/2205.11423 1/
4
54
302
An excellent poor man's visual prompt engineering strategy for CLIP: drawing a red circle on an object in an image auto-magically focuses its attention on that region, leading to a specific embedding #ICCV2023
2
36
268
535,038
Our work on learning to perform semantic segmentation without human supervision by driving around cities made it to #eccv2022 More info coming soon.
Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation abs: arxiv.org/abs/2203.11160 project page: vobecant.github.io/DriveAndS… @Gradio Demo: huggingface.co/spaces/vobeca…
2
39
245
ICCV: International CVPR Corrected Versions #cvpr2021 #iccv2021
2
14
246
When life gives you lemons, Andrea makes lemonade 🍋 Kudos to Andrea Vedaldi for doing excellent work presenting his paper in spite of an incident w/ the poster #cvpr2025
2
11
251
22,280
An intriguing #iclr2020 paper on self-supervision studied over a single image: iclr.cc/virtual_2020/poster_… Starting from one image, the authors generate a 1M-image dataset of crops and augmentations of this image 1/
4
42
241
Slides for most talks at Good Citizen at CVPR workshop are up cc.gatech.edu/~parikh/citize… Lots of useful advice and experience for writing and reviewing papers, how to do good research and evaluation, talks, how to organise your time #CVPR2018
98
234
The brilliant Little Book of Deep Learning by @francoisfleuret is here! 🤩 Hoping now for an autograph session at a CV/ML venue soon.
4
20
216
16,851
Leave Those Nets Alone: Advances in Self-Supervised Learning. Join us this Sunday for our #cvpr2021 tutorial to discover what's cooking these days in the different flavors of self-supervised learning. Recordings and slides will be online right after. gidariss.github.io/self-supe…
3
40
193
On the CLIP vs. SSL image encoder debate: an overlooked aspect is how many fewer GPU resources SSL models need compared to CLIP ones for pretraining. Recent examples: - SSL: Franca - 128 H100, bsz 3K; DINOv3 - 256 H100, bsz 4K - CLIP: SigLIP2 - 2048 TPUv5, bsz 32K; PE - bsz 131K
9
7
197
26,495
Introducing new #cvpr2020 work with S. Gidaris and team on a new self-supervised task: Learning Representations by Predicting Bags of Visual Words arxiv.org/abs/2002.12247 1/
4
36
190
TiTok: An Image is Worth 32 Tokens for Reconstruction and Generation. tl;dr: ViT-based tokenization of images to 1D discrete sequences, which can later be decoded back to the image space with ViT decoder. Brilliant name! yucornetto.github.io/project…
3
21
181
14,127
Want to improve zero-shot performance of your CLIP model? Easy: just ask GPT-3 how it would recognize those objects and produce word embeddings from its descriptions. Bonus: you can get some explainability by analyzing decisions from embeddings #ICLR2023 arxiv.org/abs/2210.07183
3
25
179
24,670
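A minimal sketch of the description-based zero-shot idea in the tweet above, assuming each class is scored by the mean cosine similarity between the image embedding and the embeddings of that class's text descriptions. The 3-d vectors, class names, and description strings are made-up stand-ins for real CLIP embeddings and GPT-3 outputs, and `classify` is a hypothetical helper, not an API from the paper's code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(image_emb, class_descriptions):
    """Score each class by the mean similarity to its description embeddings.

    class_descriptions maps class name -> {description text: embedding}.
    Returns the best class, the per-class scores, and the per-description
    similarities, which serve as a simple explanation of the decision.
    """
    scores, evidence = {}, {}
    for name, descriptions in class_descriptions.items():
        sims = {text: cosine(image_emb, emb) for text, emb in descriptions.items()}
        evidence[name] = sims
        scores[name] = sum(sims.values()) / len(sims)
    best = max(scores, key=scores.get)
    return best, scores, evidence


# Toy embeddings (assumption): in practice these would come from the CLIP
# text and image encoders.
toy_classes = {
    "hen": {"two legs": [1.0, 0.0, 0.2], "red comb": [0.9, 0.1, 0.0]},
    "tiger": {"orange fur with stripes": [0.0, 1.0, 0.1], "four legs": [0.1, 0.9, 0.3]},
}
image_embedding = [0.8, 0.1, 0.1]

best, scores, evidence = classify(image_embedding, toy_classes)
```

Inspecting `evidence[best]` shows which descriptions contributed most to the prediction, which is the explainability bonus the tweet mentions.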
New and not-so-new computer vision geeks on the block rejoice: the 2nd edition of Rick Szeliski's famous book on Computer Vision Algorithms and Applications is up and free to download as PDF szeliski.org/Book/
2
37
175
So far, @georgiagkioxari's talk on "Apples and Oranges: research in academia vs. industry" is the funniest and most thought-provoking talk at #CVPR2023
2
23
153
29,574
If you liked UNIC, check out Theia: a simple approach for distilling pretrained models, e.g., CLIP, DINOv2, SAM, DepthAnything, into a unified model with improved performance on policy learning theia.theaiinstitute.com/
Excellent work by @mbsariyildiz et al. on distilling multiple complementary visual encoders into a single one. This is particularly useful in the era of pretrained foundation models on different datasets and with different types of supervision 👇
17
152
12,868
New work spearheaded by S. Gidaris on self-supervised learning: OBoW - Online Bag-of-Visual-Words Generation for Unsupervised Representation Learning Paper: arxiv.org/abs/2012.11552 Code: github.com/valeoai/obow 🧵👇 1/N
6
31
150
CLIP-based research is moving so fast that even authors cannot keep pace with arXiv 🙃There are currently 3 MaskCLIP papers out there (all published in major conferences): - arxiv.org/abs/2112.01071 - arxiv.org/abs/2208.08984 - arxiv.org/abs/2208.12262
1
29
145
36,938
The Cosmos suite of neural tokenizers for images & videos is impressive. It's trained on diverse high-res imgs & long-vids, scales well for both discrete & continuous tokens, works on multiple domains (robotics, driving, egocentric), has excellent runtime research.nvidia.com/labs/dir…
1
23
147
11,252
I'm a fan of papers proposing new "baselines". The methods are usually simpler and reach decent performance (few points below the usually more complex SoTA methods), while enabling a different view over the problem at hand.
4
14
143
For those tuning in from home, @YejinChoinka's excellent keynote at #CVPR2023 unravels and discusses many of the findings from this work on the limits of Transformers to Compositionality arxiv.org/abs/2305.18654
2
33
130
54,668
DINO learned excellent visual representations with impressive generalization. Ever since, many researchers have tried to outmatch (or at least match) that via various self-supervised and/or masked image modelling strategies without succeeding arxiv.org/abs/2104.14294
1
14
132
34,090
DepthAnythingV2 is up w/ code, ckpts and excellent performance. tl;dr pipeline: finetune DinoV2-G for depth estimation on synthetic data (595k images) -> use it as teacher to generate pseudo-labels on 62M real images-> train student model on pseudo-labels depth-anything-v2.github.io/
4
15
138
13,021
#ICLR2023 submissions are now visible or it's that time of the year when you realize that most of your #CVPR2023 ideas are already scooped 🙃 openreview.net/group?id=ICLR…
2
25
133
Interesting work by @endernewton et al. studying how & what pretraining knowledge is transferred downstream. It seems that representations are less important than attention patterns that can guide students to learn good features from scratch w/ good perfs arxiv.org/abs/2411.09702
2
24
135
85,918
TRADI: Tracking deep neural network weight distributions -- work with G. Franchi arxiv.org/abs/1912.11316 We’re proposing a cheap method for getting ensembles of networks from a single network training 1/
3
31
127
NeuroNCAP: photorealistic closed-loop safety testing for autonomous driving tl;dr: leverage NeRFs to realistically simulate safety-critical scenarios from a seq. of real-world data. One of my favorite papers at #ECCV2024 arxiv.org/abs/2404.07762
4
16
128
11,196
Got a submission rejected from ICML, then NeurIPS. We've improved it further and sent it to ICLR. I'm happy that the paper is even better and clearer now, but, boy, this game can be annoying. Did I miss any news lately? :)
1
127
There have been so many cool works on multi-camera bird's-eye-view perception recently. If you want to catch up or are just starting in the area, this figure by @AdamWHarley effectively summarizes the main approaches. Capture from the awesome Simple-BEV work: simple-bev.github.io/
25
113
Slides and most videos from @ICCVConference 🎭The Many Faces of Reliability of Deep Learning for Real-World Deployment🌍 tutorial are now up! Feel free to reach out w/ any questions and suggestions #ICCV2023 -📽️ videos: piped.video/playlist?list=PL… - 💻slides: abursuc.github.io/many-faces…
Join us this Tuesday for our @ICCVConference tutorial on "The Many Faces of Reliability of Deep Learning for Real-World Deployment" prepared by @SharonYixuanLi, @puneetdokania, @tuan_hung_vu, Dengxin Dai, Patrick Pérez and @abursuc #ICCV2023 abursuc.github.io/many-faces…
27
112
43,619
Nice writeup by Chelsea Finn on recent meta-learning and few-shot learning techniques bair.berkeley.edu/blog/2017/…
2
39
115
No accepted papers to announce, but I'm happy to see that as a reviewer I contributed to the improvement and acceptance of a paper #NeurIPS2021
1
3
112
The outstanding #CVPR2023 keynote talk by @YejinChoinka on "2050: An AI odyssey: dark matter of intelligence" is up piped.video/watch?v=Q4fKKaT7…
For those tuning in from home, @YejinChoinka's excellent keynote at #CVPR2023 unravels and discusses many of the findings from this work on the limits of Transformers to Compositionality arxiv.org/abs/2305.18654
1
25
109
31,223
Nice trick for initializing last layer of a CNN for fine-tuning or for adding an extra-class after training: add L2-norm layer and use CNN output from new class sample(s) as weights for new neuron. Results are good from first steps. arxiv.org/abs/1712.07136
1
45
111
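A minimal sketch of the initialization trick from the tweet above (arxiv.org/abs/1712.07136), assuming the classifier is a bias-free cosine-similarity layer over L2-normalized embeddings. `imprint_class` and `cosine_logits` are hypothetical helper names, and the 3-d embeddings are toy values standing in for CNN outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def imprint_class(weights, new_class_embeddings):
    """Return the weight matrix with one extra row for a new class.

    The new row is the re-normalized mean of the L2-normalized embeddings
    of the new-class sample(s); with a single sample it is just that
    sample's normalized embedding.
    """
    normalized = [l2_normalize(e) for e in new_class_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return weights + [l2_normalize(mean)]


def cosine_logits(weights, embedding):
    """Logits of the normalized, bias-free last layer (cosine similarities)."""
    z = l2_normalize(embedding)
    return [sum(w_i * z_i for w_i, z_i in zip(row, z)) for row in weights]


# Two existing classes, then a third one imprinted from two toy samples.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
weights = imprint_class(weights, [[0.0, 0.0, 2.0], [0.0, 0.1, 3.0]])

logits = cosine_logits(weights, [0.0, 0.0, 5.0])
predicted = max(range(len(logits)), key=logits.__getitem__)
```

Because the new row already points toward the new class's embeddings, the layer gives sensible predictions before any fine-tuning step, which matches the "good from first steps" remark.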
FlexiViT: One Model for All Patch Sizes by @giffmana et al. may have passed unnoticed over December. Randomizing patch size at training makes a good ViT across a range of patch sizes + you can tune patch size at runtime according to hardware and/or KPIs arxiv.org/abs/2212.08013
3
20
106
13,269
Effective way to wrap up the tl;dr in the title while causing some 🔥 at the same time: Learning by Reconstruction Produces Uninformative Features For Perception arxiv.org/abs/2402.11337
2
11
106
9,156
New #ICCV2019 work led by S. Gidaris on boosting few-shot learning methods with self-supervision arxiv.org/abs/1906.05186 Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data 1/
2
26
105
Slides and videos from our #eccv2022 tutorial "Self-Supervision on Wheels: Advances in Self-Supervised Learning from Autonomous Driving Data" are now available: piped.video/watch?v=RhNZUyOu… gidariss.github.io/ssl-on-wh…
Join us online this Monday for our #eccv2022 tutorial "Self-Supervision on Wheels: Advances in Self-Supervised Learning from Autonomous Driving Data" gidariss.github.io/ssl-on-wh…
2
26
100
25,675
While everyone was busy with ECCV, DepthPro, a foundation model for zero-shot metric monocular depth estimation in less than a second, has been released. Code and weights are up: arxiv.org/abs/2410.02073
2
13
97
7,674
"to arXiv or not (during review)" dilemma for less known labs: if you do it, you risk reviewers seeing it & hesitating due to lower fame, if you don't, you risk concurrent work from bigger lab on related idea or new SotA getting public & reducing your chances at resubmission 🤷‍♂️
4
8
97
The Grand Slam of the relentless computer vision researcher: submit a paper to CVPR -> ICCV/ECCV -> WACV/3DV hoping it will eventually get in that year.
3
6
98
28,903
Painfully accurate 🤣: YOLOv69, YOLOv69 (different group), new dataset from Meta, new dataset from unknown group, survey paper ... Quite a gem!
Computer Vision conference's acceptance criteria these days: #CVPR2024 #eccc2024 #AI #ComputerVision
1
10
100
9,531
New dataset INTERACTION: trajectories of traffic participants in interactive scenarios (eg roundabout, intersection, merging of lanes on highways) from US, Germany and China. It's useful for intention and behavior prediction, interactions between drivers. interaction-dataset.com/
2
43
96
For a week now I haven't left the house without my DIY mask. I had initially bought into the "masks are not helpful" argument, which led to a lack of action on my side. Thanks to @jeremyphoward and @math_rachel for convincing me otherwise #masks4all
2
14
94
tempoGAN: nice use of adversarial training for super-resolution of temporally consistent fluid flows using a Volumetric-GAN arxiv.org/abs/1801.09710
1
40
96
From this view, @ykilcher is currently the most selective venue in CV+ML with 1-2 selected papers per week out of a pool of many 100s of arXiv papers every week. And he was saying that ICML acceptance rate was low 🙃
There’s a few other prestigious venues like @ykilcher YouTube, paperswithcode, @ak92501 et al tweet streams etc :) but yes. I rather like the emerging hybrid model where the new cheap low latency async distributed consensus layer coexists with the legacy “Layer 1 chain” (pubs)
3
3
98
Fun and insightful paper and post from Uber studying the Lottery Ticket Hypothesis eng.uber.com/deconstructing-… Signs of the weights count the most for the init after pruning and one can find a masking of the weights with good performances using random weights - the Supermasks.
21
97
We're releasing Franca: a new fully open-sourced (data, code, weights, logs) vision foundation model that finally matches & sometimes outperforms DINOv2, SigLIPv2 & CLIP on ViT-G. This is the fruit of a fun collaboration btwn @valeoai & @FunAILab spearheaded by @shawshank_v 👇
Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, DINOv2 on various benchmarks setting a new standard for open-source research🧵
2
13
115
16,820
After a time of relatively slow progress in the core ResNet backbone, we're now seeing a mini Cambrian explosion of architectures: ViT, NFNet, RepVGG, LambdaNet, CaiT, RedNet, BotNet, HaloNet, MLP-Mixer, etc. Luckily tireless @wightmanr keeps track of all 🙏github.com/rwightman/pytorch…
Repeat after me, another day, another MLP architecture arxiv.org/abs/2105.03404
3
16
99
An excellent survey on image-based object localization w/o human supervision. It's my go-to to catch up on the many new works in the area, here clearly organized and analyzed. Well crafted, it can also serve as a template for next surveys. Nice work by @oriane_simeoni & team👇
📢[Survey 📚] Object localization in images with zero manual annotation?🤩 ➡️ We propose a survey discussing recent works exploiting ❄️self-supervised ViTs (incl. ICCV’23 & NeurIPS’23 works💫) w/ @EloiZablocki @SpyrosGidaris @gillespuy & P. Pérez. 📄arxiv.org/abs/2310.12904
1
17
93
15,734
Kudos to #cvpr2019 Program Chairs for preparing a super-handy reviewer tutorial, including a summary of the decision process (which is quite opaque to many reviewers), some annotated good and bad reviews, and a few tips dropbox.com/s/725p60wcajbb8x…
40
91
In the past years we've seen some successful transfers of ideas from NLP to CV: Bag-of-Visual Words, skip-connections in LSTMs to ResNets, ViT from Transformers. Do you have some good examples of idea transfer from CV to NLP?
10
9
93
How Much More Data Do I Need? This is a nice #cvpr2022 study and solution: do multiple acquisition and training rounds to solidify the estimates (regression, statistical laws) by sweeping over multiple target KPIs. Lots of results from several datasets openaccess.thecvf.com/conten…
1
21
88
Eye-candy posters of #cvpr2025 #startikz
2
7
86
11,476
New release of fast.ai's advanced DL course. Content is great w/ incredible amount of very recent stuff. I find myself recommending this course all the time to bootstrap people into DL and good coding practices for it. Congrats @math_rachel @jeremyphoward
Launching Cutting Edge Deep Learning for Coders: 2018 edition fast.ai/2018/05/07/part2-lau… I know a lot of folks have been waiting for this - hope it meets your expectations!
1
22
86
CVPR deadline week Stress level: 8/10 Home-alone father of 3 minions handling evening and morning routine, meals and taking kids to/back from school on time in different parts of town and then catch train to office Stress level: ... Ahahahahaha! 🥲
5
1
77
19,133
🚨New doctor in the house!🚨 Congrats to @TimDarcet for his tremendous work (DINOv2, registers, CAPI) & successful PhD defense followed by ~2 hrs of questions -- he's got stamina! Congrats to his incredible team of advisors from Inria & Meta: @julienmairal @p_bojanowski M. Oquab
6
3
83
15,771
The return of the Autoregressive Image Model: AIMv2 now going multimodal. Excellent work by @alaa_nouby @MustafaShukor1 @DonkeyShot21 & team with code and checkpoints already up: arxiv.org/abs/2411.14402
1
16
76
5,685
The video tutorial on Capsule Networks from @aureliengeron is so good that even Hinton praises it piped.video/watch?v=pPN8d0E3… slideshare.net/aureliengeron…
35
80
Implementing matrix decompositions (Cholesky, LQ, sym eigen) as differentiable operators arxiv.org/abs/1710.08717 github.com/ARCambridge/MXNet…
1
33
78
Join us online this Monday for our #eccv2022 tutorial "Self-Supervision on Wheels: Advances in Self-Supervised Learning from Autonomous Driving Data" gidariss.github.io/ssl-on-wh…
3
17
77
In spite of the drama around the #NIPS2018 sell-out, the organisers did well to release only 2.5k spots and reserve the others (up to 6-8k) for authors of accepted papers, top reviewers and workshop papers. There was a hiccup though with the delayed announcement of accepted papers
4
24
76
What an incredible conference venue in Vancouver for #CVPR2023
1
6
75
9,423
Happy to announce relatively recent work with @GianniFranchi10 on efficient BNNs with fewer assumptions on weights, amenable to complex CV architectures (DeepLabV3+ w/ ResNet50) and tasks (semantic segmentation) Paper: arxiv.org/abs/2012.02818 Code: github.com/giannifranchi/LP_… 🧵1/
2
19
74
The release of LLaMa-2 from @MetaAI with open weights and free for both research and commercial use, is definitely a seismic moment with ripples in academia, commercial applications, startups for LLMs or ML in general. I was not expecting such a release this soon.
Meta releases Llama 2: Open Foundation and Fine-Tuned Chat Models paper: ai.meta.com/research/publica… blog: ai.meta.com/llama/ develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
2
13
74
21,713
Unsupervised 3D perception (object detection) w/ 2D vision-language distillation #ICCV2023 tl;dr: generate amodal 3D boxes and tracklets (for static and moving objects) + distill VLM features from images to point clouds. Works well for closed & open set arxiv.org/abs/2309.14491
16
74
10,630
Battle of Backbones (BoB): Besides the awesome name, this paper is highly insightful in comparing lots and lots of pretrained models on a variety of computer vision downstream tasks (my favorite way to analyze and understand models, in addition to many metrics).
🚨Excited to announce a large-scale comparison of pretrained vision backbones including SSL, vision-language models, and CNNs vs ViTs across diverse downstream tasks ranging from classification to detection to OOD generalization and more! NeurIPS 2023🚨🧵 arxiv.org/abs/2310.19909
1
10
78
18,072
Ithaca365: a new dataset for AD under various weather conditions: 40 recordings (~7k frames) of a 15 km route under various conditions: weather, time of day, traffic conditions #CVPR2022 ithaca365.mae.cornell.edu/
2
14
74
An open letter from Russian scientists and science journalists against the war with Ukraine (google translate capture 👇): "The responsibility for unleashing a new war in Europe lies entirely with Russia. There is no rational justification for this war." trv-science.ru/2022/02/we-ar…
12
72
I've just received a copy of @TacoCohen's super thesis and I'm looking forward to diving in. Such manuscripts are really handy for kickstarting you in a field and I'm sure I'll recommend it to students and collaborators in the future. Kudos!
1
4
73
An inspiring talk for all advisors from Kilian Weinberger @MLRetrospective workshop #neurips2020 Using 3 of his recent papers as examples, he argues for the importance of focusing on the insight gained and not just settling for beating SoTA enough to get you a paper 1/
1
17
67
Our simple and cheap CLIP-DINOiser made it to #ECCV2024 ! Work spearheaded by @mkwysoczanska w/ help from @oriane_simeoni @MichaelRamamon @tomasztrzcinsk1 @ptrkprz and myself
🚨Happy to release on arXiv CLIP-DINOiser: Teaching CLIP a few DINO tricks🦖🎓 We obtain dense CLIP features in 1 forward pass w/o feature alteration and w/ almost no computational extra cost to facilitate open-vocabulary semantic segmentation 🧶 🖥️: wysoczanska.github.io/CLIP_D… [1/N]
1
15
72
6,662
Replying to @deliprao
2nd author of BoWNet here (1st author is on paternity leave). I reply right away as it seems this tweet is getting more impact than the original paper itself 🙃 1/
1
3
69
MAML(s) for the masses: a new pytorch library for implementing existing and developing new meta-learning algos github.com/facebookresearch/… The source paper features a pedagogical description of inner loop meta-learning algos
Happy to announce our paper on Generalized Inner Loop Meta Learning, aka Gimli (arxiv.org/abs/1910.01727), with @brandondamos, @denisyarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, @douwekiela, @kchonyc, and @soumithchintala. THREAD [1/6]
14
75
New dataset of high quality 3D scans of ~10k real objects redwood-data.org/3dscan/
33
68
A good occasion to (re-)check out and consider @Michael_J_Black's excellent post on Scientific communication in the age of influencers #CVPR2024 perceiving-systems.blog/en/p…
3
14
69
18,799
DINOv2 strikes again! This time it's for multi-camera BEV semantic segmentation. tl;dr: Using DINOv2 as the image encoder instead of an ImageNet-pretrained one and finetuning with LoRA works quite well, in particular for robustness under distribution shift.
we replace the encoder of SimpleBEV with DINOv2 + LoRA and - it becomes more robust to various corruptions as we show on nuScenes-C (nuScenes with corruptions). - we can significantly reduce the encoder parameters that need to be trained (37M in ResNet101 vs 1 or 3M with DINO) and still reach the same performance. - if you can afford to scale up the model (size, input resolution), you can also reduce the number of training steps to 1/3rd and achieve similar or even better results. nice analysis by @MerveRabiaBarn1 and @gorkaydemir: arxiv.org/abs/2409.10228 to be presented at #ECCV2024 2nd Workshop on Vision-Centric Autonomous Driving (VCAD).
10
70
7,366
OpenOOD: Benchmarking Generalized Out-of-Distribution Detection by @JingkangY @SharonYixuanLi @hendrycks et al. #NeurIPS2022 What a crazy effort to implement and compare 35 OOD methods under 9 benchmarks! paper: openreview.net/forum?id=gT6j… code: github.com/Jingkang50/OpenOO…
1
14
73
Student registration fees for ML conferences are a good bargain this year: ICML $25, COLT $30, AISTATS €40. While virtual conferences are less exciting than in-person ones, they can put you on track to watch talks, read papers and talk with authors (uncrowded poster sessions)
2
12
67
We're organizing a workshop on Uncertainty Quantification for Computer Vision @ICCVConference with a fantastic line-up of speakers, talks from contributed papers and a competition on uncertainty quantification for autonomous driving #iccv2023 Please spread the news.
We are happy to announce the #ICCV2023 #UNCV2023 Workshop: Workshop on Uncertainty Quantification for Computer Vision (uncv2023.github.io/). We welcome full papers and extended abstracts. Join also our ICCV competition for Uncertainty quantification muad-dataset.github.io/
17
68
22,541
🛟 Reliable & reliability researchers @CVPR! Join our workshop on Uncertainty Quantification for Computer Vision next week! We have a super lineup of speakers (from self-driving to LLMs) and cool posters. 🗓️ Day: Wed, Jun 11 📌Room: 102 B #CVPR2025 #UNCV2025
1
18
73
10,065
Replying to @fchollet
For that time-span I think trees are the most common project. Tech still can't accelerate it much, and you know from the beginning that it's later generations that will benefit from it.
3
59
If you're still on the fence about checking out the new Foundations of Computer Vision textbook by Antonio Torralba, @phillip_isola, and Bill Freeman, Alyosha's review will convince you. That conclusion is so Alyosha 🙃
3
7
67
6,248
Happy to see that Paris is becoming a hotspot for AI folks from academia, big tech or startups with a particular vision for openness and collaboration. Kudos to @ylecun for his relentless quest over the years in advertising Paris as an excellent place for that.
Open source AI is the way to go! Proud to see @huggingface, @scaleway, & @meta joining to launch an AI startup accelerator at Station F. This will help concretize our common vision of an open and collaborative AI ecosystem. More from TechCrunch: techcrunch.com/2023/11/08/me…
6
66
13,149
#pytorch code for "Dynamic few-shot visual learning without forgetting" by Gidaris and Komodakis at #cvpr2018: github.com/gidariss/FewShotW… The authors share even configs and learning rate schedules for experiments in paper.
37
67
📢 We have a PR[AI]RIE PhD position opening @inria_paris co advised with R. de Charette & @tuan_hung_vu [please distribute] 💡Topic: Physics-Grounded Vision Foundation Models ⏳Application deadline: 20 May 2025 🗓️ Start date: Fall 2025 📝Detailed description: linked below
1
16
65
7,468
nvTorchCam: new lib for camera-agnostic differentiable geometric vision. It makes deep learning algos camera-model independent by abstracting projection+unprojection, generalizes to multiple cameras (pinhole, fisheye, 360 panoramas), and is batching-compatible arxiv.org/abs/2410.12074
1
14
62
3,553
Great blog-post by @lilianweng on recent progress in meta-learning (aka learning to learn fast, low-shot learning) covering the terminology and main families of approaches lilianweng.github.io/lil-log…
23
65
Thrilled w/ CLIP-DINOiser by @mkwysoczanska et al. to compute dense CLIP features. CLIP's limitation to global features-only annoyed many vision folks. MaskCLIP (simple & fast) can get dense features, but noisy. We extend MaskCLIP w/ a few nudges from DINO localization priors 👇
🚨Happy to release on arXiv CLIP-DINOiser: Teaching CLIP a few DINO tricks🦖🎓 We obtain dense CLIP features in 1 forward pass w/o feature alteration and w/ almost no computational extra cost to facilitate open-vocabulary semantic segmentation 🧶 🖥️: wysoczanska.github.io/CLIP_D… [1/N]
1
14
63
7,564
The neural networks behind Google Voice transcription: from GMMs to LSTM RNNs googleresearch.blogspot.fr/2…
36
63
Interesting change for #CVPR2025 reviewing: The reviewers’ names will be visible on OpenReview to the other reviewers of the paper after the final paper decisions have been made. I think it's a good nudge to do the job well, but also to connect to other researchers in the field.
Replying to @abursuc @ducha_aiki
Overall, I appreciate seeing the identity of the AC and I would also appreciate seeing the ones of the reviewers (after the decision). I think it contributes to bringing the community together.
11
60
11,780