Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects (2403.10187v1)
Abstract: Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that synergizes reinforcement learning and policy distillation. After training a teacher policy to master motor control based on object pose information, TAPG facilitates guided, yet adaptive, learning of a sensorimotor policy based on object segmentation. We zero-shot transfer from simulation to a real robot by using the Segment Anything Model for promptable object segmentation. Our trained policies adeptly grasp a wide variety of objects from cluttered scenes in simulation and the real world based on human-understandable prompts. Furthermore, we show robust zero-shot transfer to novel objects. Videos of our experiments are available at \url{https://maltemosbach.github.io/grasp_anything}.
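The abstract describes a two-stage scheme: a teacher policy is first trained with reinforcement learning on privileged object-pose input, and a sensorimotor student is then trained with a policy-gradient objective augmented by distillation toward the frozen teacher. Below is a minimal, illustrative sketch of what such a combined student objective could look like; the function names, the REINFORCE-style surrogate, the squared-error distillation term, and the `beta` weight are assumptions for illustration, not the paper's exact formulation (the paper builds on PPO).

```python
import numpy as np

def gaussian_logpdf(x, mean, std):
    # Log-density of a diagonal Gaussian policy, summed over action dimensions.
    return -0.5 * np.sum(((x - mean) / std) ** 2 + 2.0 * np.log(std) + np.log(2.0 * np.pi))

def tapg_student_loss(actions, advantages, student_mean, teacher_mean, std, beta=0.5):
    """Hypothetical combined objective for the student policy.

    actions      : sampled actions, shape (batch, act_dim)
    advantages   : advantage estimates, shape (batch,)
    student_mean : student policy's action means for each state
    teacher_mean : frozen teacher's action means for the same states
    beta         : weight trading off RL exploration vs. imitating the teacher
    """
    # REINFORCE-style policy-gradient surrogate:
    # minimize the negative advantage-weighted log-probability.
    pg = -np.mean([a * gaussian_logpdf(x, m, std)
                   for x, m, a in zip(actions, student_mean, advantages)])
    # Distillation term: pull the student's action means toward the teacher's.
    distill = np.mean((student_mean - teacher_mean) ** 2)
    return pg + beta * distill
```

When the student already matches the teacher, the distillation term vanishes and the objective reduces to the plain policy-gradient surrogate, so the teacher guides rather than constrains the student.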
- Malte Mosbach
- Sven Behnke