Emergent Mind

BEV-CV: Birds-Eye-View Transform for Cross-View Geo-Localisation

(2312.15363)
Published Dec 23, 2023 in cs.CV and cs.LG

Abstract

Cross-view image matching for geo-localisation is a challenging problem due to the significant visual difference between aerial and ground-level viewpoints. The method provides localisation capabilities from geo-referenced images, eliminating the need for external devices or costly equipment. This enhances the capacity of agents to autonomously determine their position, navigate, and operate effectively in environments where GPS signals are unavailable. Current research employs a variety of techniques to reduce the domain gap such as applying polar transforms to aerial images or synthesising between perspectives. However, these approaches generally rely on having a 360{\deg} field of view, limiting real-world feasibility. We propose BEV-CV, an approach which introduces two key novelties. Firstly we bring ground-level images into a semantic Birds-Eye-View before matching embeddings, allowing for direct comparison with aerial segmentation representations. Secondly, we introduce the use of a Normalised Temperature-scaled Cross Entropy Loss to the sub-field, achieving faster convergence than with the standard triplet loss. BEV-CV achieves state-of-the-art recall accuracies, improving feature extraction Top-1 rates by more than 300%, and Top-1% rates by approximately 150% for 70{\deg} crops, and for orientation-aware application we achieve a 35% Top-1 accuracy increase with 70{\deg} crops.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.