PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot (2404.05024v1)
Abstract: The study of non-line-of-sight (NLOS) imaging is growing due to its many potential applications, including rescue operations and pedestrian detection by self-driving cars. However, implementing NLOS imaging with a moving camera remains an open area of research. Existing NLOS imaging methods rely on time-resolved detectors and laser configurations that require precise optical alignment, making them difficult to deploy in dynamic environments. This work proposes a data-driven approach to NLOS imaging, PathFinder, that can be used with a standard RGB camera mounted on a small, power-constrained mobile robot, such as an aerial drone. Our experimental pipeline is designed to accurately estimate the 2D trajectory of a person who moves in a Manhattan-world environment while remaining hidden from the camera's field of view. We introduce a novel approach for processing a dynamic sequence of successive frames from line-of-sight (LOS) video with an attention-based neural network that performs inference in real time. The method also includes a preprocessing selection metric that analyzes images from a moving camera containing multiple vertical planar surfaces, such as walls and building facades, and extracts the planes that return maximum NLOS information. We validate the approach on in-the-wild scenes, using a drone for video capture, thus demonstrating low-cost NLOS imaging in dynamic capture environments.
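To make the two-stage pipeline described in the abstract concrete, the sketch below illustrates one plausible reading of it under stated assumptions: a plane-selection score picks the wall or facade crop most likely to carry NLOS signal, and an attention-based (transformer) model maps the selected crops from successive LOS frames to a 2D trajectory of the hidden person. The names (`select_relay_plane`, `NLOSTracker`), the variance-based selection score, and the network shape are all hypothetical placeholders, not the authors' released implementation.

```python
# Illustrative sketch only -- every name and design choice here is an assumption,
# not the PathFinder authors' code.
import torch
import torch.nn as nn


def select_relay_plane(plane_crops: list) -> int:
    """Toy stand-in for a plane-selection metric: pick the candidate wall/facade
    crop with the largest temporal intensity variance, on the assumption that the
    plane carrying the most NLOS signal fluctuates most as the hidden person moves.
    Each crop is a tensor of shape (T, C, H, W)."""
    scores = [crop.float().var(dim=0).mean().item() for crop in plane_crops]
    return int(torch.tensor(scores).argmax())


class NLOSTracker(nn.Module):
    """Minimal attention-based tracker: a small CNN embeds each frame's relay-plane
    crop, a transformer encoder attends over the frame sequence, and a linear head
    regresses a 2D (x, y) position per frame."""

    def __init__(self, embed_dim: int = 128, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        layer = nn.TransformerEncoderLayer(embed_dim, n_heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(embed_dim, 2)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, 3, H, W) -> per-frame 2D positions (B, T, 2)
        b, t = frames.shape[:2]
        feats = self.backbone(frames.flatten(0, 1)).view(b, t, -1)
        return self.head(self.temporal(feats))


if __name__ == "__main__":
    # Two candidate planar regions over a 16-frame clip; track on the selected one.
    planes = [torch.rand(16, 3, 64, 64), torch.rand(16, 3, 64, 64)]
    best = select_relay_plane(planes)
    traj = NLOSTracker()(planes[best].unsqueeze(0))
    print(best, traj.shape)  # e.g. 0 torch.Size([1, 16, 2])
```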