Hierarchical Metric Learning and Matching for
2D and 3D Geometric Correspondences

Mohammed E. Fathy¹, Quoc-Huy Tran², M. Zeeshan Zia³, Paul Vernaza², Manmohan Chandraker²,⁴
¹Google Cloud AI, ²NEC Labs America, ³Microsoft, ⁴University of California, San Diego
European Conference on Computer Vision (ECCV), 2018
Our hierarchical metric learning retains the best properties of various levels of abstraction in CNN feature representations. For geometric matching, we combine the robustness of deep layers that imbibe greater invariance, with the localization sensitivity of shallow layers. This allows learning better features, as well as a better correspondence search strategy that progressively exploits features from higher recall (robustness) to higher precision (spatial discrimination).
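The coarse-to-fine search described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that deep and shallow descriptor maps have already been resampled to the same resolution, that descriptors are compared by Euclidean distance, and that only the single best coarse candidate is refined. The function name hierarchical_match and the radius parameter are hypothetical.

```python
import numpy as np

def hierarchical_match(q_deep, q_shallow, t_deep, t_shallow, radius=2):
    """Match one query pixel against a target image in two stages.

    q_deep:    (Cd,)      deep descriptor of the query pixel
    q_shallow: (Cs,)      shallow descriptor of the query pixel
    t_deep:    (H, W, Cd) deep descriptors of the target image
    t_shallow: (H, W, Cs) shallow descriptors of the target image
    Returns the (row, col) of the refined match in the target image.
    """
    H, W, _ = t_deep.shape

    # Stage 1 (robust, coarse): nearest neighbour over the whole image
    # using deep descriptors.
    d_deep = np.linalg.norm(t_deep - q_deep, axis=2)
    r0, c0 = np.unravel_index(np.argmin(d_deep), d_deep.shape)

    # Stage 2 (precise, local): nearest neighbour inside a small window
    # around the coarse match using shallow descriptors.
    r_lo, r_hi = max(r0 - radius, 0), min(r0 + radius + 1, H)
    c_lo, c_hi = max(c0 - radius, 0), min(c0 + radius + 1, W)
    window = t_shallow[r_lo:r_hi, c_lo:c_hi]
    d_shallow = np.linalg.norm(window - q_shallow, axis=2)
    dr, dc = np.unravel_index(np.argmin(d_shallow), d_shallow.shape)

    return int(r_lo + dr), int(c_lo + dc)
```

The window radius trades robustness for cost: a larger window tolerates a less accurate coarse match but requires more shallow-descriptor comparisons.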

Abstract

Interest point descriptors have fueled progress on almost every problem in computer vision. Recent advances in deep neural networks have enabled task-specific learned descriptors that outperform hand-crafted descriptors on many problems. We demonstrate that commonly used metric learning approaches do not optimally leverage the feature hierarchies learned in a Convolutional Neural Network (CNN), especially when applied to the task of geometric feature matching. While a metric loss applied to the deepest layer of a CNN is often expected to yield ideal features irrespective of the task, in fact the growing receptive field and striding effects make shallower features better suited to high-precision matching tasks. We leverage this insight, together with explicit supervision at multiple levels of the feature hierarchy for better regularization, to learn more effective descriptors for geometric matching tasks. Further, we propose to use activation maps at different layers of a CNN as an effective and principled replacement for the multi-resolution image pyramids often used for matching tasks. We propose concrete CNN architectures employing these ideas and evaluate them on multiple datasets for 2D and 3D geometric matching as well as optical flow, demonstrating state-of-the-art results and generalization across datasets.
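Supervision at multiple levels of the feature hierarchy can be illustrated with a short sketch. This is an assumption-laden illustration, not the loss used in the paper: it applies a standard contrastive loss at each level (squared distance for matching pairs, squared hinge on a margin for non-matching pairs) and simply sums the per-level losses. The function names, the per-level margins, and the equal weighting are all hypothetical.

```python
import numpy as np

def contrastive_loss(f1, f2, is_match, margin=1.0):
    """Standard contrastive loss over N descriptor pairs.

    f1, f2:   (N, C) descriptors sampled at corresponding locations
    is_match: (N,)   1.0 for true correspondences, 0.0 for negatives
    """
    d = np.linalg.norm(f1 - f2, axis=1)
    pos = is_match * d ** 2                                   # pull matches together
    neg = (1.0 - is_match) * np.maximum(0.0, margin - d) ** 2  # push negatives apart
    return 0.5 * float(np.mean(pos + neg))

def hierarchical_loss(feats1, feats2, is_match, margins):
    """Sum a contrastive loss over several feature levels.

    feats1, feats2: lists of (N, C_level) arrays, one entry per level
                    (for example a shallow and a deep layer)
    margins:        one margin per level (assumed hyperparameters)
    """
    return sum(
        contrastive_loss(a, b, is_match, m)
        for a, b, m in zip(feats1, feats2, margins)
    )
```

Because every level receives its own loss term, the shallow features are trained directly for matching rather than only through gradients that flow back from the deepest layer.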

2D Correspondence Results

Accuracy of different CNN-based methods for 2D correspondence estimation on KITTI Flow 2015.
Accuracy of CNN-based and hand-crafted methods for 2D correspondence estimation on KITTI Flow 2015.

Optical Flow Results

Qualitative results on KITTI Flow 2015. First row: input images. Second row: DeepFlow2. Third row: EpicFlow. Fourth row: SPM-BP. Fifth row: HiLM. Red indicates high error and blue indicates low error.
Quantitative results on KITTI Flow 2015. Following the KITTI convention, Fl-bg, Fl-fg, and Fl-all denote the outlier percentage on background pixels, foreground pixels, and all pixels, respectively. The methods are ranked by their Fl-all errors. Bold numbers mark the best results and underlined numbers the second best. Note that FlowNet2 optimizes the flow metric directly, while SDF and SOF require semantic knowledge.
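For readers unfamiliar with the outlier percentages above, the sketch below shows how such a score can be computed. It assumes the KITTI 2015 definition, in which a pixel counts as an outlier when its endpoint error exceeds both 3 pixels and 5% of the ground-truth flow magnitude. The function name and array layout are hypothetical. Restricting the valid mask to background or foreground pixels gives the corresponding per-region scores.

```python
import numpy as np

def fl_outlier_percentage(flow_est, flow_gt, valid,
                          abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of flow outliers over the pixels selected by `valid`.

    flow_est, flow_gt: (H, W, 2) estimated and ground-truth flow
    valid:             (H, W) boolean mask of pixels to evaluate
    """
    epe = np.linalg.norm(flow_est - flow_gt, axis=2)  # endpoint error
    mag = np.linalg.norm(flow_gt, axis=2)             # ground-truth magnitude
    # Outlier: error is large in absolute AND in relative terms.
    outlier = (epe > abs_thresh) & (epe > rel_thresh * mag)
    return 100.0 * float(outlier[valid].mean())
```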

3D Correspondence Results

Accuracy of different CNN-based methods for 3D correspondence estimation.

Acknowledgements

Part of this work was done during Mohammed E. Fathy’s internship at NEC Labs America. The authors thank Christopher B. Choy and Andy Zeng for their help with the code of UCN and 3DMatch, respectively.