Hand-Object Interaction Controller (HOIC): Deep Reinforcement Learning for Reconstructing Interactions with Physics (2405.02676v1)

Published 4 May 2024 in cs.CV and cs.GR

Abstract: Hand-object manipulation is an important interaction motion in our daily activities. We faithfully reconstruct this motion from a single RGBD camera with a novel deep reinforcement learning method that leverages physics. First, we propose object compensation control, which applies control directly to the object and makes network training more stable. By leveraging the compensation force and torque, we also upgrade the simple point contact model to a more physically plausible surface contact model, further improving reconstruction accuracy and physical correctness. Experiments indicate that, without relying on any heuristic physical rules, this approach still successfully incorporates physics into the reconstruction of hand-object interactions, which are complex motions that are hard to imitate with deep reinforcement learning. Our code and data are available at https://github.com/hu-hy17/HOIC.
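As we read the abstract, the core idea of "object compensation control" is that an auxiliary force and torque applied directly to the manipulated object keeps it close to the observed trajectory while the policy is still learning, and discouraging reliance on that auxiliary wrench (for example, by penalizing its magnitude in the reward) pushes the policy toward producing the motion through genuine hand-object contacts. The sketch below is a minimal, self-contained illustration of that idea only; the function names, gains, reward weights, and the reward-penalty formulation are our own illustrative assumptions (torque handling is analogous and omitted), not the paper's implementation.

```python
import numpy as np

def compensation_force(obj_pos, obj_vel, ref_pos, ref_vel, kp=200.0, kd=20.0):
    """Hypothetical PD-style compensation force pulling the simulated
    object toward the reference (tracked) object trajectory."""
    return kp * (ref_pos - obj_pos) + kd * (ref_vel - obj_vel)

def step_object(obj_pos, obj_vel, contact_force, comp_force,
                mass=0.2, dt=1.0 / 240.0):
    """Toy rigid-body update: hand contact forces plus the compensation
    force drive the object (rotation and gravity omitted for brevity)."""
    acc = (contact_force + comp_force) / mass
    obj_vel = obj_vel + acc * dt
    obj_pos = obj_pos + obj_vel * dt
    return obj_pos, obj_vel

def reward(obj_pos, ref_pos, comp_force, w_track=1.0, w_comp=1e-3):
    """Tracking reward minus a penalty on the compensation magnitude,
    one plausible way to encourage the policy to replace the auxiliary
    wrench with real contact forces over the course of training."""
    track_err = np.linalg.norm(obj_pos - ref_pos)
    return -w_track * track_err - w_comp * np.linalg.norm(comp_force)
```

In such a setup, the compensation term would do most of the tracking work early in training, and annealing its gains or increasing the penalty weight over time would gradually shift responsibility to the learned hand contacts, which is one plausible reading of how direct object control stabilizes training.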
