
EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion (2404.10279v1)

Published 16 Apr 2024 in cs.CV

Abstract: We present EucliDreamer, a simple and effective method to generate textures for 3D models given text prompts and meshes. The texture is parametrized as an implicit function on the 3D surface, which is optimized with the Score Distillation Sampling (SDS) process and differentiable rendering. To generate high-quality textures, we leverage a depth-conditioned Stable Diffusion model guided by the depth image rendered from the mesh. We test our approach on 3D models in Objaverse and conduct a user study, which shows its superior quality compared to existing texturing methods like Text2Tex. In addition, our method converges 2 times faster than DreamFusion. Through text prompting, textures of diverse art styles can be produced. We hope EucliDreamer provides a viable solution to automate a labor-intensive stage in 3D content creation.
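
To make the SDS step mentioned in the abstract concrete, here is a minimal toy sketch in pure Python. It is not the authors' implementation: the identity `render` function, the `stub_noise_predictor` (which stands in for the depth-conditioned Stable Diffusion model and simply "knows" a target image), the weighting `w = 1 - alpha_bar`, and all parameter values are assumptions chosen for illustration. It only shows the shape of the update: render, add noise, predict the noise, and push the residual `w * (eps_hat - eps)` back to the texture parameters.

```python
import math
import random


def render(theta):
    # Toy "differentiable renderer": the image is the texture vector itself,
    # so dx/dtheta is the identity (assumption; a real renderer rasterizes a mesh).
    return list(theta)


def stub_noise_predictor(x_t, alpha_bar, target):
    # Hypothetical stand-in for a depth-conditioned diffusion model.
    # It returns the noise that would explain x_t if the clean image were `target`.
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - s * tg) / n for xt, tg in zip(x_t, target)]


def sds_step(theta, target, lr, rng):
    x = render(theta)
    alpha_bar = rng.uniform(0.02, 0.98)        # random noise level (timestep)
    eps = [rng.gauss(0.0, 1.0) for _ in x]     # sampled Gaussian noise
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [s * xi + n * e for xi, e in zip(x, eps)]  # noised rendering
    eps_hat = stub_noise_predictor(x_t, alpha_bar, target)
    w = 1.0 - alpha_bar                        # one possible weighting (assumption)
    # SDS gradient: w * (eps_hat - eps) * dx/dtheta, with dx/dtheta = identity here.
    grad = [w * (eh - e) for eh, e in zip(eps_hat, eps)]
    return [th - lr * g for th, g in zip(theta, grad)]


def optimize_texture(target, steps=500, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * len(target)                # start from a blank texture
    for _ in range(steps):
        theta = sds_step(theta, target, lr, rng)
    return theta
```

Calling `optimize_texture([0.2, 0.8, 0.5])` drives the toy texture toward the image the stub predictor prefers. In the method described above, that preference would instead come from the text prompt and the depth image rendered from the mesh.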

References (37)
  1. Panohead: Geometry-aware 3d full-head synthesis in 360°, 2023.
  2. Re-imagine the negative prompt algorithm: Transform 2d diffusion into 3d, alleviate janus problem and beyond, 2023.
  3. Text2tex: Text-driven texture synthesis via diffusion models, 2023.
  4. Objaverse: A universe of annotated 3d objects, 2022.
  5. Kaolin: A pytorch library for accelerating 3d deep learning research. https://github.com/NVIDIAGameWorks/kaolin, 2022.
  6. Depth-wise decomposition for accelerating separable convolutions in efficient convolutional neural networks. arXiv preprint arXiv:1910.09455, 2019.
  7. Delta denoising score, 2023.
  8. Congrui Hetang. Autonomous path generation with path optimization, 2022. US Patent App. 17/349,450.
  9. Novel view synthesis from a single rgbd image for indoor scenes. In 2023 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML), pages 447–450. IEEE, 2023.
  10. Autonomous vehicle driving path label generation for machine learning models, 2023. US Patent App. 17/740,215.
  11. Impression network for video object detection. arXiv preprint arXiv:1712.05896, 2017.
  12. Implementing synthetic scenes for autonomous vehicles, 2022. US Patent App. 17/349,489.
  13. Segment anything model for road network graph extraction. arXiv preprint arXiv:2403.16051, 2024.
  14. Debiasing scores and prompts of 2d diffusion for robust text-to-3d generation. arXiv preprint arXiv:2303.15413, 2023.
  15. Total capture: A 3d deformation model for tracking faces, hands, and bodies. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 8320–8329, 2018.
  16. Modular primitives for high-performance differentiable rendering. ACM Transactions on Graphics (ToG), 39(6):1–14, 2020.
  17. Euclidreamer: Fast and high-quality texturing for 3d models with stable diffusion depth. arXiv preprint arXiv:2311.15573, 2023.
  18. Latent-nerf for shape-guided generation of 3d shapes and textures, 2022.
  19. Nerf: Representing scenes as neural radiance fields for view synthesis, 2020.
  20. Clip-mesh: Generating textured meshes from text using pretrained image-text models. In SIGGRAPH Asia 2022 Conference Papers. ACM, 2022.
  21. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph., 41(4):102:1–102:15, 2022.
  22. Dreamfusion: Text-to-3d using 2d diffusion, 2022.
  23. Synface: Face recognition with synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10880–10890, 2021.
  24. Dreambooth3d: Subject-driven text-to-3d generation, 2023.
  25. Accelerating 3d deep learning with pytorch3d, 2020.
  26. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022.
  27. Vox-e: Text-guided voxel editing of 3d objects, 2023.
  28. Region-based quality estimation network for large-scale person re-identification. In Proceedings of the AAAI conference on artificial intelligence, 2018.
  29. Synthetic datasets for autonomous driving: A survey. IEEE Transactions on Intelligent Vehicles, 2023.
  30. Methods and apparatuses for recognizing video and training, electronic device and medium, 2021. US Patent 10,909,380.
  31. Stop location change detection, 2023. US Patent 11,749,000.
  32. Prolificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation, 2023.
  33. Real-time shading-based refinement for consumer depth cameras. ACM Transactions on Graphics (ToG), 33(6):1–10, 2014.
  34. Coreference resolution helps visual dialogs to focus. High-Confidence Computing, page 100184, 2023.
  35. Adding conditional control to text-to-image diffusion models, 2023a.
  36. Sliding-bert: Striding towards conversational machine comprehension in long contex. Adv Artif Intell Mach Learn, 2023b.
  37. Unlocking everyday wisdom: Enhancing machine comprehension with script knowledge integration. Applied Sciences, 13(16):9461, 2023.
Authors (5)
  1. Cindy Le (5 papers)
  2. Congrui Hetang (7 papers)
  3. Chendi Lin (4 papers)
  4. Ang Cao (15 papers)
  5. Yihui He (25 papers)
