AniDress: Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model (2401.15348v1)
Abstract: Recent years have seen significant progress in building photo-realistic animatable avatars from sparse multi-view videos. However, current workflows struggle to render realistic garment dynamics for characters in loose-fitting clothing, as they predominantly rely on naked body models for human modeling while leaving the garments unmodeled. This is mainly because the deformations produced by loose garments are highly non-rigid, and capturing such deformations often requires dense views as supervision. In this paper, we introduce AniDress, a novel method for generating animatable human avatars in loose clothes using very sparse multi-view videos (4-8 views in our setting). To enable the capture and appearance learning of loose garments in such a setting, we employ a virtual bone-based garment rigging model obtained from physics-based simulation data. Such a model allows us to capture and render complex garment dynamics through a set of low-dimensional bone transformations. Technically, we develop a novel method for estimating temporally coherent garment dynamics from a sparse multi-view video. To render unseen garment states realistically from coarse estimates, we introduce a pose-driven deformable neural radiance field conditioned on both body and garment motions, providing explicit control of both parts. At test time, new garment poses can be captured from unseen situations or derived from a physics-based or neural network-based simulator to drive unseen garment dynamics. To evaluate our approach, we create a multi-view dataset that captures loose-dressed performers with diverse motions. Experiments show that our method is able to render natural garment dynamics that deviate highly from the body and generalize well to both unseen views and poses, surpassing the performance of existing methods. The code and data will be publicly available.
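The abstract states that garment dynamics are expressed through a set of low-dimensional bone transformations but does not give the skinning formulation. As a minimal sketch, assuming standard linear blend skinning (an assumption, not confirmed by the abstract), the idea can be illustrated as follows. The function names, the toy three-vertex strip, and the two virtual bones are all hypothetical and chosen only for illustration.

```python
import math

def rotation_z(angle):
    """3x3 rotation matrix about the z-axis, as nested lists."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(rotation, translation, point):
    """Return rotation @ point + translation for a single 3D point."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]

def skin_vertices(rest_vertices, weights, bones):
    """Linear blend skinning (assumed formulation, not taken from the paper).

    rest_vertices: list of [x, y, z] positions in the rest pose
    weights:       weights[v][b] is the influence of bone b on vertex v
                   (each row is expected to sum to 1)
    bones:         list of (rotation, translation) pairs, one per virtual bone
    """
    posed = []
    for vertex, vertex_weights in zip(rest_vertices, weights):
        blended = [0.0, 0.0, 0.0]
        for w, (rotation, translation) in zip(vertex_weights, bones):
            moved = apply_rigid(rotation, translation, vertex)
            for i in range(3):
                blended[i] += w * moved[i]
        posed.append(blended)
    return posed

# Toy garment strip: three vertices driven by two hypothetical virtual bones.
rest_vertices = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
bones = [
    (rotation_z(0.0), [0.0, 0.0, 0.0]),          # bone 0 stays fixed
    (rotation_z(math.pi / 2), [0.0, 0.0, 0.0]),  # bone 1 swings 90 degrees
]
posed_vertices = skin_vertices(rest_vertices, weights, bones)
```

In this sketch the garment pose is described by one rigid transformation per bone rather than by a free position for every vertex, which is what makes the representation low-dimensional; in a real garment the number of vertices is far larger than the number of bones.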
Authors:
- Beijia Chen
- Yuefan Shen
- Qing Shuai
- Xiaowei Zhou
- Kun Zhou
- Youyi Zheng