
Gated Cross-Attention Network for Depth Completion (2309.16301v2)

Published 28 Sep 2023 in cs.CV

Abstract: Depth completion is a popular research direction within depth estimation. Fusing color and depth features is the central challenge in this task, mainly because of the asymmetry between the rich scene detail in color images and the sparse valid pixels in depth maps. To tackle this issue, we design an efficient Gated Cross-Attention Network that propagates confidence through a gating mechanism, simultaneously extracting and refining key information in the color and depth branches to fuse local spatial features. In addition, we employ a Transformer-based attention network in a low-dimensional space to fuse global features effectively and enlarge the network's receptive field. With this simple yet efficient gating mechanism, our method achieves fast and accurate depth completion without additional branches or post-processing steps. We also use Ray Tune with the AsyncHyperBandScheduler scheduler and the HyperOptSearch algorithm to automatically search for the optimal number of module iterations, which lets us reach performance comparable to state-of-the-art methods. We conduct experiments on both indoor and outdoor scene datasets. Our fast network achieves Pareto-optimal trade-offs between runtime and accuracy, and at the time of submission our accurate network ranked first in accuracy among all published methods on the official KITTI leaderboard.
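The confidence-propagating gate described in the abstract can be sketched as a learned per-element convex combination of the two branches. The sketch below is a minimal illustration, not the paper's implementation: the function names and weight shapes are hypothetical, and the gate's 1x1 convolution is simplified to a matrix multiply over the channel dimension.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(color_feat, depth_feat, w, b):
    """Confidence-style gating (illustrative sketch): a gate in (0, 1) is
    predicted from the concatenated color and depth features, then used to
    blend the two branches element-wise."""
    stacked = np.concatenate([color_feat, depth_feat], axis=-1)  # (..., 2C)
    gate = sigmoid(stacked @ w + b)                              # (..., C)
    # Convex combination: high gate trusts the color branch, low gate
    # falls back to the (sparse) depth branch.
    return gate * color_feat + (1.0 - gate) * depth_feat

# Toy example: 4 spatial positions with C = 3 channels per branch.
rng = np.random.default_rng(0)
color = rng.standard_normal((4, 3))
depth = rng.standard_normal((4, 3))
w = rng.standard_normal((6, 3)) * 0.1   # hypothetical learned gate weights
b = np.zeros(3)
fused = gated_fusion(color, depth, w, b)
```

Because the gate lies strictly in (0, 1), each fused value stays between the corresponding color and depth features, which is what makes the mechanism act as a soft confidence selector rather than a hard mask.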

