Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion (2403.13470v1)

Published 20 Mar 2024 in cs.CV

Abstract: Computer vision techniques play a central role in the perception stack of autonomous vehicles, where they are used to perceive the vehicle's surroundings from sensor data. 3D LiDAR sensors are commonly used to collect sparse 3D point clouds of the scene. Compared to human perception, however, such systems struggle to infer the unseen parts of the scene from those sparse point clouds. The scene completion task addresses this by predicting the gaps in the LiDAR measurements to obtain a more complete scene representation. Given the promising results of recent diffusion models as generative models for images, we propose extending them to scene completion from a single 3D LiDAR scan. Previous works applied image-based diffusion methods directly to range images extracted from LiDAR data. In contrast, we propose to operate directly on the points, reformulating the noising and denoising diffusion process so that it works efficiently at scene scale. Alongside this formulation, we propose a regularization loss to stabilize the noise predicted during the denoising process. Our experimental evaluation shows that our method can complete the scene given a single LiDAR scan as input, producing a scene with more detail than state-of-the-art scene completion methods. We believe that our proposed diffusion process formulation can support further research on diffusion models applied to scene-scale point cloud data.
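
The abstract does not spell out how the noising process is reformulated, so the following is only an illustrative sketch of why point-level diffusion needs care at scene scale, not the authors' method. It contrasts the standard DDPM forward step, which scales coordinates by sqrt(alpha_bar_t) and so shrinks a large scene toward the origin, with a hypothetical variant that adds the noise as a local offset to each point. The function names, the linear beta schedule, and the offset variant itself are assumptions made for illustration.

```python
import numpy as np

def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def ddpm_noise(points, t, alpha_bar, rng):
    """Standard DDPM forward step: x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps."""
    eps = rng.standard_normal(points.shape)
    noisy = np.sqrt(alpha_bar[t]) * points + np.sqrt(1.0 - alpha_bar[t]) * eps
    return noisy, eps

def local_offset_noise(points, t, alpha_bar, rng):
    """Hypothetical variant: keep each point in place and add noise as an offset."""
    eps = rng.standard_normal(points.shape)
    noisy = points + np.sqrt(1.0 - alpha_bar[t]) * eps
    return noisy, eps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "scan": 1000 points spread over a 100 m cube (metres, illustrative only).
    scan = rng.uniform(-50.0, 50.0, size=(1000, 3))
    alpha_bar = linear_alpha_bar()
    t = len(alpha_bar) - 1  # last diffusion step

    global_noisy, _ = ddpm_noise(scan, t, alpha_bar, rng)
    local_noisy, _ = local_offset_noise(scan, t, alpha_bar, rng)

    print("extent after standard step:", np.abs(global_noisy).max())
    print("extent after offset step:  ", np.abs(local_noisy).max())
```

In this toy setting the standard step ends up as a unit Gaussian blob around the origin regardless of the scene's size, while the offset variant keeps the noisy points near their original locations, which is one plausible reading of a formulation that "can efficiently work at scene scale".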
