ShapeBoost: Boosting Human Shape Estimation with Part-Based Parameterization and Clothing-Preserving Augmentation (2403.01345v1)

Published 2 Mar 2024 in cs.CV

Abstract: Accurate human shape recovery from a monocular RGB image is a challenging task because humans come in different shapes and sizes and wear different clothes. In this paper, we propose ShapeBoost, a new human shape recovery framework that achieves pixel-level alignment even for rare body shapes and high accuracy for people wearing different types of clothes. Unlike previous approaches that rely on the use of PCA-based shape coefficients, we adopt a new human shape parameterization that decomposes the human shape into bone lengths and the mean width of each part slice. This part-based parameterization technique achieves a balance between flexibility and validity using a semi-analytical shape reconstruction algorithm. Based on this new parameterization, a clothing-preserving data augmentation module is proposed to generate realistic images with diverse body shapes and accurate annotations. Experimental results show that our method outperforms other state-of-the-art methods in diverse body shape situations as well as in varied clothing situations.

Authors (4)
  1. Siyuan Bian (9 papers)
  2. Jiefeng Li (22 papers)
  3. Jiasheng Tang (16 papers)
  4. Cewu Lu (203 papers)

Summary

  • The paper introduces ShapeBoost, a novel framework that improves human shape estimation using part-based parameterization and realistic clothing-preserving data augmentation.
  • The new part-based parameterization segments the human body into distinct parts, enhancing local shape representation and mitigating overfitting compared to global PCA methods.
  • The clothing-preserving augmentation module generates diverse training images, significantly boosting accuracy in pixel-level alignment for extreme body types.

Advancing Human Shape Estimation through Part-based Parameterization and Data Augmentation

Overview

Recovering accurate human body shape from a single monocular RGB image is a long-standing problem in computer vision. The task matters for applications such as virtual and augmented reality, but it is difficult because body shapes vary widely and clothing obscures the underlying body. This work introduces ShapeBoost, a framework that improves human shape recovery by combining a part-based shape parameterization with a clothing-preserving data augmentation technique.

Part-based Shape Parameterization

Traditional methods typically describe human shape with PCA-based shape coefficients, a global descriptor that lacks local specificity and interpretability. ShapeBoost instead decomposes the body into distinct parts and describes each part by its bone length and the mean width of each part slice. A semi-analytical shape reconstruction algorithm is used with these parameters to balance flexibility against validity. Because each parameter is tied to a specific body region, the representation is better suited to learning from local image features and helps mitigate overfitting.
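To make the idea of bone lengths and per-slice widths concrete, the following is a minimal sketch and not the paper's implementation. It assumes a body part is given as a set of 3D vertices around a bone running between two joints, and it assumes "width" means twice the mean distance of a slice's vertices from the bone axis. The names `part_parameters` and `make_cylinder` and the slice count are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def part_parameters(
    joint_start: Vec3,
    joint_end: Vec3,
    vertices: Sequence[Vec3],
    num_slices: int = 4,
) -> Tuple[float, List[float]]:
    """Return (bone_length, mean_width_per_slice) for one body part.

    Assumption: width = 2 * mean perpendicular distance to the bone axis.
    """
    bone = _sub(joint_end, joint_start)
    bone_length = math.sqrt(_dot(bone, bone))
    if bone_length == 0.0:
        raise ValueError("joints must be distinct")
    axis = (bone[0] / bone_length, bone[1] / bone_length, bone[2] / bone_length)

    sums = [0.0] * num_slices
    counts = [0] * num_slices
    for v in vertices:
        rel = _sub(v, joint_start)
        t = _dot(rel, axis)  # signed position of the vertex along the bone
        perp = (rel[0] - t * axis[0], rel[1] - t * axis[1], rel[2] - t * axis[2])
        radius = math.sqrt(_dot(perp, perp))
        # Assign the vertex to a slice, clamping vertices beyond the joints.
        idx = math.floor(t / bone_length * num_slices)
        idx = min(max(idx, 0), num_slices - 1)
        sums[idx] += radius
        counts[idx] += 1

    widths = [2.0 * s / c if c else 0.0 for s, c in zip(sums, counts)]
    return bone_length, widths


def make_cylinder(radius: float, length: float, rings: int, per_ring: int) -> List[Vec3]:
    """Toy 'limb': rings of points around the z axis (illustration only)."""
    points = []
    for i in range(rings):
        z = (i + 0.5) * length / rings
        for k in range(per_ring):
            angle = 2.0 * math.pi * k / per_ring
            points.append((radius * math.cos(angle), radius * math.sin(angle), z))
    return points
```

For a toy cylindrical limb, `part_parameters((0.0, 0.0, 0.0), (0.0, 0.0, 0.4), make_cylinder(0.1, 0.4, 4, 8))` yields a bone length of about 0.4 and four slice widths of about 0.2. Unlike a global PCA coefficient, each number here refers to one region of one part, which is the locality the summary attributes to the part-based representation.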

Clothing-preserving Data Augmentation

A persistent challenge in human shape estimation is the scarcity of image data covering diverse body shapes, especially extreme body types. ShapeBoost addresses this with a clothing-preserving data augmentation module that generates realistic images of varied body shapes without altering clothing details. The process segments the person from an image, applies a transformation that changes the body shape, and pastes the result back onto the original background. Because the module builds on the part-based parameterization, the generated images come with accurate shape annotations, which improves the model's performance across different clothing types.
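The segment, transform, and paste steps can be illustrated with a deliberately simplified sketch. It is not the paper's augmentation module: it assumes the image is a small grid of integers, that a person mask and a clean background of the same size are already available, and that the shape change is a plain horizontal stretch about the person's center column. The function name `reshape_person` and all of its parameters are hypothetical.

```python
import math
from typing import List, Tuple

Image = List[List[int]]
Mask = List[List[bool]]


def reshape_person(
    image: Image, mask: Mask, background: Image, scale: float
) -> Tuple[Image, Mask]:
    """Stretch the masked person horizontally by `scale` and paste it back.

    Person pixels (and therefore clothing texture) are copied from the
    original image; every other pixel comes from `background`.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    height, width = len(image), len(image[0])

    # Step 1: segmentation is assumed given; find the person's center column.
    cols = [x for row in mask for x, inside in enumerate(row) if inside]
    out = [row[:] for row in background]
    out_mask = [[False] * width for _ in range(height)]
    if not cols:
        return out, out_mask
    center = sum(cols) / len(cols)

    # Steps 2 and 3: inverse-map each output pixel to a source pixel
    # (nearest neighbor) and paste it over the background if it is a person pixel.
    for y in range(height):
        for x in range(width):
            src_x = math.floor(center + (x - center) / scale + 0.5)
            if 0 <= src_x < width and mask[y][src_x]:
                out[y][x] = image[y][src_x]
                out_mask[y][x] = True
    return out, out_mask
```

With `scale` above 1 the person occupies more columns, and with `scale` below 1 fewer, while the copied pixel values stay those of the original person. A realistic pipeline would need a per-part transformation and background inpainting where the person shrinks, both of which this sketch sidesteps by taking the background as an input.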

Experimental Results

On benchmark datasets, ShapeBoost outperforms other state-of-the-art methods both for people with extreme body types and for people wearing varied clothing. It also achieves more accurate pixel-level alignment across diverse body shapes, which supports the value of both the part-based parameterization and the proposed data augmentation technique.

Implications and Future Directions

ShapeBoost marks a significant step forward in human shape estimation. Its ability to estimate human shapes accurately under challenging conditions, such as varied clothing and extreme body shapes, could benefit the development of more immersive and interactive virtual and augmented reality experiences.

Moreover, the novelty of the part-based shape parameterization opens new avenues for research, suggesting that further exploration and refinement of this approach could lead to even more accurate human shape recovery techniques. The success of the clothing-preserving data augmentation module also indicates potential applications beyond human shape estimation, including generating synthetic datasets for training other computer vision models.

Given the promising results demonstrated by ShapeBoost, future work may explore extending the framework's capabilities to more complex human poses and interactions, improving the efficiency and robustness of the shape reconstruction algorithm, and further refining the data augmentation techniques to cover a broader spectrum of human shapes and clothing types.

Acknowledgments

The work acknowledges the supporting entities behind ShapeBoost, including research institutes and funding bodies, reflecting the collaborative effort required to advance human shape estimation and the broader field of computer vision.