
Dexterous Functional Grasping (2312.02975v1)

Published 5 Dec 2023 in cs.RO, cs.AI, cs.CV, cs.LG, cs.SY, and eess.SY

Abstract: While there have been significant strides in dexterous manipulation, most of it is limited to benchmark tasks like in-hand reorientation, which are of limited utility in the real world. The main benefit of dexterous hands over two-fingered ones is their ability to pick up tools and other objects (including thin ones) and grasp them firmly to apply force. However, this task requires both a complex understanding of functional affordances and precise low-level control. While prior work obtains affordances from human data, this approach does not scale to low-level control. Similarly, simulation training cannot give the robot an understanding of real-world semantics. In this paper, we aim to combine the best of both worlds to accomplish functional grasping for in-the-wild objects. We use a modular approach: first, affordances are obtained by matching corresponding regions of different objects, and then a low-level policy trained in sim is run to grasp the object. We propose a novel application of eigengrasps to reduce the search space of RL using a small amount of human data and find that it leads to more stable and physically realistic motion. We find that the eigengrasp action space beats baselines in simulation, outperforms hardcoded grasping in the real world, and matches or outperforms a trained human teleoperator. Result visualizations and videos at https://dexfunc.github.io/


Summary

  • The paper introduces a modular method that combines internet-based affordance models with eigengrasps to achieve functional tool grasping.
  • It leverages principal components of hand poses to constrain reinforcement learning, simplifying training and enhancing grasp realism.
  • Experimental results show the approach outperforms baselines and can match or exceed human teleoperation in diverse object manipulation tasks.

Abstract Overview

The paper addresses the field of dexterous robotic manipulation, particularly focusing on picking up and functionally grasping tools and objects of daily life such as hammers, drills, and saucepans. The aim is to enable robots to grasp objects in a manner that allows for their effective use afterward, which represents a much higher level of utility and complexity compared to basic in-hand object reorientation tasks that have been the focus of most research in robotic manipulation.

Introduction and Challenges

Dexterous manipulation, the skill humans regularly employ to use tools, has been a challenging arena for robotics. Machines generally use simple two-fingered grippers, which significantly limits the range of objects they can manipulate and the actions they can perform. To use tools effectively, robots need to perform functional grasping, where an object is picked up such that it can be utilized for its designed purpose. This process involves not only a complex series of movements but also the intersection of perception, reasoning, and control.
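One ingredient of such perception is transferring a known grasp region from a reference object to the corresponding region of a novel object. A minimal sketch of that matching step, assuming dense per-pixel descriptors are already available (e.g. from a self-supervised vision transformer); the feature arrays and the annotated source point below are random, hypothetical stand-ins, not real features:

```python
import numpy as np

# Hypothetical dense feature maps for two objects; in practice these would
# come from a pretrained vision backbone, not a random generator.
rng = np.random.default_rng(0)
H, W, D = 8, 8, 32                      # feature-map height, width, descriptor dim

src_feats = rng.normal(size=(H, W, D))  # features of a reference object
tgt_feats = rng.normal(size=(H, W, D))  # features of a novel object
src_pt = (2, 3)                         # annotated grasp point on the reference


def normalize(x):
    """L2-normalize descriptors along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


# Cosine similarity between the reference grasp point's descriptor and every
# location on the target object; the argmax is the matched grasp region.
query = normalize(src_feats[src_pt])
sims = normalize(tgt_feats) @ query     # (H, W) similarity map
match = np.unravel_index(np.argmax(sims), sims.shape)
print("matched target location:", match)
```

The same similarity map can also be thresholded to transfer a whole contact region rather than a single point.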

Methodology

The approach taken is modular, dealing with grasp prediction, execution, and post-grasp trajectory. For predicting pre-grasp poses, the research leverages an affordance model which is informed by large internet datasets to identify plausible contact points on various objects. A novel application of 'eigengrasps', which are essentially principal components of hand poses, is introduced to constrain the action space in reinforcement learning (RL) simulations, reducing complexity while ensuring realistic motion. This combination of internet data for affordance generalization and specialized simulation for policy training bridges the gap between virtual learning environments and real-world application.
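The eigengrasp idea can be sketched concretely: fit principal components to a small dataset of hand poses, then let the policy act in the low-dimensional coefficient space instead of the full joint space. The sizes and the random pose data below are hypothetical stand-ins for the paper's human grasp data:

```python
import numpy as np

# Hypothetical dataset: N human grasp poses, each a vector of J joint angles.
# In the paper these come from a small amount of human data; random here.
rng = np.random.default_rng(0)
N, J, K = 500, 16, 5                 # poses, joints, number of eigengrasps
poses = rng.normal(size=(N, J))

# Eigengrasps are the principal components of the hand-pose data.
mean = poses.mean(axis=0)
centered = poses - mean
# SVD of the centered data: rows of Vt are orthonormal principal directions.
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
eigengrasps = Vt[:K]                 # (K, J): top-K eigengrasps

# An RL policy now outputs K coefficients instead of J joint targets,
# shrinking the search space while keeping motions hand-like.
coeffs = rng.normal(size=K)          # stand-in for the policy's action
joint_targets = mean + coeffs @ eigengrasps   # map back to J joint angles
```

Because the components are orthonormal, mapping coefficients back to joint angles is a single matrix product, and any reachable pose stays in the span of human-like grasps.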

Experimental Results

Experiments demonstrated that the proposed method outperformed baselines in simulation and transferred successfully to real-world object manipulation. Notably, it matched or even exceeded the performance of a trained human teleoperator on some tasks, despite the low-level policy being trained only on hammers. The use of eigengrasps significantly stabilized RL training and improved the physical plausibility of learned grasps. Real-world experiments confirmed the method's effectiveness on a broad spectrum of objects with varied shapes, sizes, and weights.

This investigation signifies a substantial step toward enabling robots to conduct functional grasping tasks, advancing the practical utility of robotic systems in everyday scenarios, including tool manipulation.
