Wild2Avatar: Rendering Humans Behind Occlusions (2401.00431v1)

Published 31 Dec 2023 in cs.CV

Abstract: Rendering the visual appearance of moving humans from occluded monocular videos is a challenging task. Most existing research renders 3D humans under ideal conditions, requiring a clear and unobstructed scene. Those methods cannot be used to render humans in real-world scenes where obstacles may block the camera's view and lead to partial occlusions. In this work, we present Wild2Avatar, a neural rendering approach tailored to occluded in-the-wild monocular videos. We propose an occlusion-aware scene parameterization that decouples the scene into three parts: occlusion, human, and background. Additionally, extensive objective functions are designed to help enforce the decoupling of the human from both the occlusion and the background and to ensure the completeness of the human model. We verify the effectiveness of our approach with experiments on in-the-wild videos.
