HMP: Hand Motion Priors for Pose and Shape Estimation from Video (2312.16737v1)

Published 27 Dec 2023 in cs.CV

Abstract: Understanding how humans interact with the world necessitates accurate 3D hand pose estimation, a task complicated by the hand's high degree of articulation, frequent occlusions, self-occlusions, and rapid motions. While most existing methods rely on single-image inputs, videos provide useful cues for addressing the aforementioned issues. However, existing video-based 3D hand datasets are insufficient for training feedforward models that generalize to in-the-wild scenarios. On the other hand, we have access to large human motion capture datasets that also include hand motions, e.g., AMASS. Therefore, we develop a generative motion prior specific to hands, trained on the AMASS dataset, which features diverse and high-quality hand motions. This motion prior is then employed for video-based 3D hand motion estimation following a latent optimization approach. Integrating a robust motion prior significantly enhances performance, especially in occluded scenarios, and produces stable, temporally consistent results that surpass conventional single-frame methods. We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets, with special emphasis on an occlusion-focused subset of HO3D. Code is available at https://hmp.is.tue.mpg.de
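The core idea described in the abstract is to fit hand motion by optimizing in the latent space of a learned motion prior, rather than estimating poses independently per frame. The sketch below is a minimal illustration of such a latent-optimization loop, assuming a pretrained decoder that maps a latent code to a hand pose sequence and a caller-supplied reprojection function; the model architecture, loss weights, and interfaces are hypothetical and are not taken from the paper.

```python
import torch

# Hypothetical pretrained motion prior: decodes a latent code z into a
# sequence of hand pose parameters (T frames x 48 pose dims).
# This is an illustrative stand-in, not the authors' actual model.
class MotionPriorDecoder(torch.nn.Module):
    def __init__(self, latent_dim=128, seq_len=64, pose_dim=48):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(latent_dim, 512),
            torch.nn.ReLU(),
            torch.nn.Linear(512, seq_len * pose_dim),
        )
        self.seq_len, self.pose_dim = seq_len, pose_dim

    def forward(self, z):
        return self.net(z).view(-1, self.seq_len, self.pose_dim)

def fit_latent(decoder, project_to_2d, keypoints_2d, conf, steps=500, lr=0.05):
    """Optimize a latent motion code so the decoded hand motion reprojects
    onto detected 2D keypoints. `project_to_2d` maps pose parameters to
    2D joints (forward kinematics + camera projection) and is supplied
    by the caller; `conf` holds per-keypoint detection confidences."""
    z = torch.zeros(1, 128, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        poses = decoder(z)                      # (1, T, 48)
        joints_2d = project_to_2d(poses)        # (1, T, J, 2)
        # Confidence-weighted reprojection loss plus a Gaussian prior on z,
        # which keeps the decoded motion on the learned motion manifold.
        reproj = (conf * (joints_2d - keypoints_2d).norm(dim=-1)).mean()
        prior = 1e-3 * z.pow(2).sum()
        (reproj + prior).backward()
        opt.step()
    return decoder(z).detach()
```

Because the optimization variable is the latent code rather than per-frame joint angles, the recovered motion stays temporally smooth and plausible even when individual frames are heavily occluded, which is the benefit the abstract highlights for occluded scenarios.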

References (15)
  1. Deformer: Dynamic fusion transformer for robust hand pose estimation. arXiv, abs/2303.04991, 2023.
  2. HOnnotate: A method for 3D annotation of hand and object poses. In CVPR, pages 3196–3206, 2020.
  3. HO-3D-v3: Improving the accuracy of hand-object annotations of the HO-3D dataset, 2021.
  4. Leveraging photometric consistency over time for sparsely supervised hand-object reconstruction. In CVPR, pages 571–580, 2020.
  5. Learning joint reconstruction of hands and manipulated objects. In CVPR, 2019.
  6. NeMF: Neural motion fields for kinematic animation. In NeurIPS, 2022.
  7. Adam: A method for stochastic optimization. In ICLR, 2015.
  8. Semi-supervised 3D hand-object poses estimation with interactions in time. In CVPR, pages 14687–14697, 2021.
  9. MediaPipe: A framework for building perception pipelines, 2019.
  10. AMASS: Archive of motion capture as surface shapes. In ICCV, 2019.
  11. HandOccNet: Occlusion-robust 3D hand mesh estimation network. In CVPR, pages 1496–1505, 2022.
  12. HuMoR: 3D human motion model for robust pose estimation. In ICCV, 2021.
  13. PyMAF-X: Towards well-aligned full-body model regression from monocular images. IEEE TPAMI, 2023.
  14. On the continuity of rotation representations in neural networks. In CVPR, pages 5745–5753, 2019.
  15. TempCLR: Reconstructing hands via time-coherent contrastive learning. In 3DV, 2022.
Citations (6)