My thread yesterday could be seen as offensive and sparked concerns among many. Let me rephrase ⬇️
1/ My goal was not to publicly attack the authors of this paper, for whom I have immense respect and sympathy, but to raise awareness of an incorrect evaluation protocol.
This paper is a great example of a deliberately misleading evaluation of baselines. Why? ⬇️
1/ To localize with local features we need a sparse 3D map. How it's done in the paper: lift 2D keypoints to 3D using depth from the OpenCV stereo block matcher. Not joking.
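For anyone unfamiliar with what "lifting" means here, a minimal sketch of the idea, not the paper's code. It assumes a rectified stereo pair, a pinhole camera, and a disparity map already converted to pixels (OpenCV's StereoBM returns fixed-point disparities scaled by 16, so they need dividing by 16 first). The function name and parameters are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def lift_keypoint(
    u: float,
    v: float,
    disparity: Sequence[Sequence[float]],  # disparity in pixels, indexed [row][col]
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    baseline: float,  # distance between the two cameras, in metres
) -> Optional[Tuple[float, float, float]]:
    """Back-project pixel (u, v) to a 3D point in the left camera frame."""
    row, col = int(round(v)), int(round(u))
    if not (0 <= row < len(disparity) and 0 <= col < len(disparity[row])):
        return None  # keypoint falls outside the disparity map
    d = disparity[row][col]
    if d <= 0:
        return None  # block matching found no valid match at this pixel
    z = fx * baseline / d      # depth from disparity
    x = (u - cx) * z / fx      # pinhole back-projection
    y = (v - cy) * z / fy
    return (x, y, z)
```

Every keypoint's 3D position inherits whatever noise and holes the disparity map has at that single pixel, so the quality of the resulting map depends directly on the stereo matcher.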