DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction (2407.16988v2)

Published 24 Jul 2024 in cs.CV

Abstract: Self-driving industries usually employ professional artists to build exquisite 3D cars. However, it is expensive to craft large-scale digital assets. Since there are already numerous datasets available that contain a vast number of images of cars, we focus on reconstructing high-quality 3D car models from these datasets. However, these datasets only contain one side of cars in the forward-moving scene. We try to use the existing generative models to provide more supervision information, but they struggle to generalize well to cars since they are trained on synthetic datasets that are not car-specific. In addition, the reconstructed 3D car texture misaligns due to a large error in camera pose estimation when dealing with in-the-wild images. These restrictions make it challenging for previous methods to reconstruct complete 3D cars. To address these problems, we propose a novel method, named DreamCar, which can reconstruct high-quality 3D cars given a few images or even a single image. To generalize the generative model, we collect a car dataset, named Car360, with over 5,600 vehicles. With this dataset, we make the generative model more robust to cars. We use this generative prior specific to the car to guide its reconstruction via Score Distillation Sampling. To further complement the supervision information, we utilize the geometric and appearance symmetry of cars. Finally, we propose a pose optimization method that rectifies poses to tackle texture misalignment. Extensive experiments demonstrate that our method significantly outperforms existing methods in reconstructing high-quality 3D cars. Our code is available at https://xiaobiaodu.github.io/dreamcar-project/.

Summary

The paper introduces a novel car-specific generative model using the Car360 dataset and mirror symmetry to enhance 3D model fidelity.
It employs a robust pose optimization using an MLP to correct camera estimations and reduce texture misalignment.
The multi-stage reconstruction approach outperforms state-of-the-art methods on both synthetic and real-world datasets.

DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction

The paper "DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction" addresses the challenging problem of reconstructing high-fidelity 3D car models from sparse and frequently low-quality images, typical of those captured by self-driving vehicles in real-world scenarios. The authors introduce an approach called DreamCar, which combines a car-specific generative prior, exploits the symmetry inherent in vehicles, and optimizes camera poses to improve texture alignment.
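The abstract states that the car-specific prior guides reconstruction via Score Distillation Sampling (SDS). For reference, the SDS gradient as introduced in DreamFusion is commonly written as below. The notation follows that general formulation and is an assumption here; it is not taken from the DreamCar paper, whose exact loss and weighting may differ.

```latex
\nabla_{\theta} \mathcal{L}_{\mathrm{SDS}}\bigl(\phi,\, x = g(\theta)\bigr)
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_{\phi}(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]
```

Here $\theta$ are the parameters of the 3D representation, $g$ is the differentiable renderer, $x_t$ is the rendered image $x$ noised to timestep $t$ with noise $\epsilon$, $\hat{\epsilon}_{\phi}$ is the noise predicted by the diffusion model, $y$ is its conditioning input, and $w(t)$ is a timestep-dependent weight.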

Contributions

The DreamCar framework makes several significant contributions:

Introduction of Car-specific Generative Model: The authors introduce a novel dataset, Car360, comprising over 5,600 synthetic vehicles with photorealistic textures. This dataset is instrumental in fine-tuning a 3D-aware generative model, improving its ability to generalize to real-world car images.
Integration of Mirror Symmetry: By exploiting the geometric and appearance symmetry of vehicles, the method effectively doubles the supervision available for each vehicle, since every observed view also constrains its mirrored counterpart. This improves the completeness and accuracy of the reconstructed 3D models.
Pose Optimization Technique: To minimize texture misalignment issues caused by erroneous camera pose estimations, the authors propose a robust pose optimization method. This method adjusts camera poses using a multilayer perceptron (MLP) that incorporates time-aware information.
Multi-stage Reconstruction Approach: DreamCar employs a progressive reconstruction strategy, beginning with coarse geometry sculpting using NeRF and NeuS, and culminating in fine texture refinement using DMTet, along with the DreamBooth technique for higher-resolution texture detail.
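The mirror-symmetry idea can be illustrated with a minimal sketch. It assumes (these are illustrative choices, not details from the paper) that the car lives in an object frame whose left-right symmetry plane is x = 0, that poses are 4x4 camera-to-world matrices, that the camera x-axis runs along image columns, and that the principal point is at the image center. The function names are hypothetical. Under those assumptions, a horizontally flipped image is what a reflected camera would see of the same symmetric car, which is how one observed view can yield a second supervision view.

```python
from typing import List, Sequence

Matrix = List[List[float]]

# Assumption: symmetry plane is x = 0 in the car's object frame.
# The same sign pattern flips the camera x-axis, keeping the pose right-handed.
SIGNS = (-1.0, 1.0, 1.0, 1.0)


def mirror_pose(cam_to_world: Matrix) -> Matrix:
    """Return M @ C @ F, with M = F = diag(-1, 1, 1, 1).

    M reflects the camera across the plane x = 0; F flips the camera's
    own x-axis so the rotation part still has determinant +1.
    """
    return [
        [SIGNS[i] * cam_to_world[i][j] * SIGNS[j] for j in range(4)]
        for i in range(4)
    ]


def mirror_image(image: Sequence[Sequence]) -> list:
    """Flip an image, given as rows of pixels, left to right."""
    return [list(reversed(row)) for row in image]


def world_to_camera(cam_to_world: Matrix, point: Sequence[float]) -> List[float]:
    """Map a world point into camera coordinates for a rigid pose [R | t].

    Computes R^T (p - t), the inverse of a rigid camera-to-world transform.
    """
    d = [point[k] - cam_to_world[k][3] for k in range(3)]
    return [sum(cam_to_world[k][i] * d[k] for k in range(3)) for i in range(3)]
```

A quick sanity check on this construction: a world point seen at camera coordinates (x, y, z) in the original view has its mirror image seen at (-x, y, z) in the mirrored view, which is exactly the effect of the horizontal image flip.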

Experimental Results

The authors provide extensive experimental validation on both a synthetic dataset (Car360) and a real-world dataset (nuScenes). DreamCar consistently outperforms existing state-of-the-art techniques on both image-based and lidar-based metrics. The image-based metrics are MSE, PSNR, SSIM, LPIPS, and FID; the lidar-based metrics are per-ray $L_2$ error, hit rate, Chamfer distance, and Hausdorff distance.

Quantitative Improvement: For the Car360 dataset, DreamCar achieved an MSE of 0.0297 and a PSNR of 15.44 in standard views, exhibiting a significant improvement over methods such as Instant-NGP, which had an MSE of 0.1127 and PSNR of 9.48. Similarly, in mirror views, DreamCar maintained its superior performance with an MSE of 0.0429 and PSNR of 14.71.
Lidar-based Metrics: DreamCar exhibited a lower $L_2$ error and Chamfer distance compared to other methods, confirming its ability to reconstruct more accurate and complete 3D geometries.
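To make some of these metrics concrete, here is a minimal sketch of PSNR and of the Chamfer and Hausdorff distances between two small point sets. The conventions are assumptions, since the summary does not specify them: pixel values are taken to lie in [0, 1], the Chamfer distance is the sum of the two mean nearest-neighbor distances (some papers use squared distances or an average instead), and the brute-force search is only suitable for small inputs.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def psnr_from_mse(mse: float, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def _nearest_distances(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance: mean a->b distance plus mean b->a distance."""
    ab = _nearest_distances(a, b)
    ba = _nearest_distances(b, a)
    return sum(ab) / len(ab) + sum(ba) / len(ba)


def hausdorff_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance: the worst-case nearest-neighbor distance."""
    return max(max(_nearest_distances(a, b)), max(_nearest_distances(b, a)))
```

With the [0, 1] convention, the reported Instant-NGP MSE of 0.1127 corresponds to about 9.48 dB, in line with the reported PSNR. The DreamCar pairs do not satisfy the identity exactly, which would be consistent with PSNR being averaged per image, since the mean of per-image PSNR values generally differs from the PSNR of the mean MSE.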

Implications and Future Work

The proposed DreamCar method has significant implications for the self-driving industry and other domains requiring high-quality 3D asset generation from sparse data. The method's ability to effectively generate high-fidelity 3D vehicle models from limited imagery can substantially reduce the costs and labor associated with manual 3D car modeling.

The introduction of the Car360 dataset underscores the value of car-specific pre-training in enhancing the generative model's realism and fidelity when applied to vehicular subjects. Moreover, the proposed pose optimization methodology can potentially be generalized to other dynamic scene capture scenarios, making it a substantial contribution to the field of computer vision and 3D reconstruction.

Looking forward, future developments could involve extending the DreamCar framework to handle occlusions and other challenging conditions more robustly. Additionally, integrating real-time reconstruction capabilities could further enhance its applicability for live applications in autonomous driving simulations and augmented reality environments.

In summary, the DreamCar method marks a notable advancement in 3D reconstruction from sparse views, specifically tailored for the self-driving context, and represents a practical approach towards scaling high-quality 3D asset generation for realistic simulations.
