F$^3$Loc: Fusion and Filtering for Floorplan Localization (2403.03370v2)
Abstract: In this paper, we propose an efficient data-driven solution to self-localization within a floorplan. Floorplan data is readily available, persistent over the long term, and inherently robust to changes in visual appearance. Our method requires neither retraining per map and location nor a large database of images of the area of interest. We propose a novel probabilistic model consisting of an observation module and a novel temporal filtering module. Operating internally with an efficient ray-based representation, the observation module comprises a single-view and a multi-view module that predict horizontal depth from images, and it fuses their results to benefit from the advantages of both approaches. Our method runs on conventional consumer hardware and overcomes a common limitation of competing methods, which often demand upright images. Our full system meets real-time requirements while outperforming the state of the art by a significant margin.
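To make the idea of combining an observation model with temporal filtering concrete, the following is a minimal sketch of a generic histogram (grid-based) Bayes filter over floorplan cells. It is not the paper's implementation: the abstract does not specify the filter's details, so the grid layout, the wrap-around borders, the uniform 3x3 motion noise, and the names `predict`, `update`, and `argmax` are all assumptions made for illustration. The `likelihood` grid stands in for whatever score an observation module would assign to each candidate position.

```python
def predict(belief, dy, dx):
    """Motion step: shift the belief by (dy, dx) cells and spread it over a 3x3 patch.

    Assumption: borders wrap around, which keeps the sketch short but is not
    realistic for an actual floorplan.
    """
    h, w = len(belief), len(belief[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ny in (-1, 0, 1):
                for nx in (-1, 0, 1):
                    out[(y + dy + ny) % h][(x + dx + nx) % w] += belief[y][x] / 9.0
    return out


def update(belief, likelihood):
    """Observation step: multiply by the per-cell likelihood and renormalize."""
    post = [[b * l for b, l in zip(brow, lrow)]
            for brow, lrow in zip(belief, likelihood)]
    total = sum(map(sum, post))
    return [[p / total for p in row] for row in post]


def argmax(belief):
    """Return the (row, col) of the most probable cell."""
    cells = ((y, x) for y in range(len(belief)) for x in range(len(belief[0])))
    return max(cells, key=lambda c: belief[c[0]][c[1]])


def peaked(h, w, peak, low=0.1, high=1.0):
    """Hypothetical observation likelihood: `high` at `peak`, `low` elsewhere."""
    return [[high if (y, x) == peak else low for x in range(w)] for y in range(h)]


# Uniform prior over a 5x5 grid, one observation, one motion step, one more observation.
belief = [[1.0 / 25] * 5 for _ in range(5)]
belief = update(belief, peaked(5, 5, (2, 3)))
belief = predict(belief, 0, 1)
belief = update(belief, peaked(5, 5, (2, 4)))
```

In this toy setup, alternating `predict` and `update` lets evidence accumulate over time, so a single ambiguous observation does not decide the estimate on its own. A real system would also track orientation and use a likelihood derived from the predicted depth rays rather than a hand-written peak.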