
Local positional graphs and attentive local features for a data and runtime-efficient hierarchical place recognition pipeline (2403.10283v1)

Published 15 Mar 2024 in cs.CV

Abstract: Large-scale applications of Visual Place Recognition (VPR) require computationally efficient approaches. Further, a well-balanced combination of data-based and training-free approaches can decrease the required amount of training data and effort and can reduce the influence of distribution shifts between the training and application phases. This paper proposes a runtime and data-efficient hierarchical VPR pipeline that extends existing approaches and presents novel ideas. There are three main contributions: First, we propose Local Positional Graphs (LPG), a training-free and runtime-efficient approach to encode spatial context information of local image features. LPG can be combined with existing local feature detectors and descriptors and considerably improves the image-matching quality compared to existing techniques in our experiments. Second, we present Attentive Local SPED (ATLAS), an extension of our previous local features approach with an attention module that improves the feature quality while maintaining high data efficiency. The influence of the proposed modifications is evaluated in an extensive ablation study. Third, we present a hierarchical pipeline that exploits hyperdimensional computing to use the same local features as holistic HDC-descriptors for fast candidate selection and for candidate reranking. We combine all contributions in a runtime and data-efficient VPR pipeline that shows benefits over the state-of-the-art method Patch-NetVLAD on a large collection of standard place recognition datasets with 15% better performance in VPR accuracy, 54× faster feature comparison speed, and 55× less descriptor storage occupancy, making our method promising for real-world high-performance large-scale VPR in changing environments. Code will be made available with publication of this paper.
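The hierarchical idea described in the abstract, where the same local features serve both as a holistic HDC descriptor for fast candidate selection and as inputs for candidate reranking, can be illustrated with a minimal sketch. The sketch below assumes bipolar hypervectors, a random projection of 256-D local descriptors, and a coarse 8×8 spatial grid for positional binding; the dimensionalities, binning scheme, and function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

D = 4096           # hypervector dimensionality (assumed)
GRID = 8           # coarse spatial grid for positional binding (assumed)
rng = np.random.default_rng(42)

# Random bipolar hypervectors encoding coarse keypoint positions.
pos_codebook = rng.choice([-1.0, 1.0], size=(GRID, GRID, D))
# Random projection from local descriptor space (assumed 256-D) to hypervector space.
proj = rng.standard_normal((D, 256))

def encode_image(descriptors, keypoints, image_shape):
    """Bundle position-bound local descriptors into one holistic HDC-style descriptor."""
    h, w = image_shape
    holistic = np.zeros(D)
    for desc, (x, y) in zip(descriptors, keypoints):
        gx = min(int(x / w * GRID), GRID - 1)
        gy = min(int(y / h * GRID), GRID - 1)
        hv = np.sign(proj @ desc)                 # project local descriptor to a bipolar hypervector
        holistic += hv * pos_codebook[gy, gx]     # bind with position (elementwise product), then bundle (sum)
    return holistic / (np.linalg.norm(holistic) + 1e-12)

def select_candidates(query_hv, db_hvs, k=10):
    """Fast candidate selection via cosine similarity of holistic descriptors."""
    sims = db_hvs @ query_hv                      # db_hvs: (N, D) matrix of unit-norm database descriptors
    return np.argsort(-sims)[:k]                  # indices of the top-k candidate places
```

The top-k candidates returned by select_candidates would then be reranked by pairwise local feature matching (in the paper, using the proposed Local Positional Graphs to incorporate spatial context); that reranking stage is omitted from this sketch.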

Authors (4)
  1. Fangming Yuan (1 paper)
  2. Stefan Schubert (8 papers)
  3. Peter Protzel (14 papers)
  4. Peer Neubert (14 papers)
Citations (1)
