Prioritized Planning for Target-Oriented Manipulation via Hierarchical Stacking Relationship Prediction (2303.07828v2)
Abstract: In scenarios involving the grasping of multiple targets, learning the stacking relationships between objects is fundamental for robots to operate safely and efficiently. However, current methods do not subdivide stacking relationships into a hierarchy of types, so in scenes where objects are mostly stacked in an orderly manner they cannot make human-like, highly efficient grasping decisions. This paper proposes a perception-planning method that distinguishes different stacking types between objects and generates prioritized manipulation orders for given target designations. We use a Hierarchical Stacking Relationship Network (HSRN) to discriminate the hierarchy of stacking and to generate a refined Stacking Relationship Tree (SRT) that describes these relationships. Since objects in a highly stable stack can be grasped together if necessary, we introduce a decision-making planner based on the Partially Observable Markov Decision Process (POMDP), which leverages observations to generate a robust decision chain with the fewest grasps and supports specifying multiple targets simultaneously. To validate our work, we set the scene to a dining table and augment the REGRAD dataset with a set of common tableware models for network training. Experiments show that our method effectively generates grasping decisions that conform to human requirements, and improves execution efficiency over existing methods while maintaining the success rate.
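To make the SRT and the "grasp a stable pile together" behavior concrete, below is a minimal Python sketch. It is not the paper's implementation: the `SRTNode` structure, the binary stable/unstable stacking label, and the greedy post-order traversal (a deterministic stand-in for the POMDP planner, which additionally reasons over uncertain observations) are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SRTNode:
    """One node of a (hypothetical) Stacking Relationship Tree.

    `children` are the objects resting directly on this object; `stable`
    is a simplified stand-in for the paper's hierarchical stacking types
    (True ~ orderly/stable stacking, False ~ disorderly stacking).
    """
    obj_id: int
    stable: bool = True
    children: List["SRTNode"] = field(default_factory=list)

def pile_ids(node: SRTNode) -> List[int]:
    """All object ids in the pile rooted at `node`, node included."""
    ids = [node.obj_id]
    for child in node.children:
        ids.extend(pile_ids(child))
    return ids

def pile_is_stable(node: SRTNode) -> bool:
    """A pile can be lifted in one grasp only if every level is stable."""
    return node.stable and all(pile_is_stable(c) for c in node.children)

def grasp_target(node: SRTNode, plan: List[Tuple[int, ...]]) -> None:
    """Append the grasp actions that retrieve `node` to `plan`.

    Each action is a tuple of object ids lifted together. A stable pile
    on top of the target is taken along in a single grasp; an unstable
    pile is cleared one sub-pile at a time, top first.
    """
    if pile_is_stable(node):
        plan.append(tuple(pile_ids(node)))
    else:
        for child in node.children:
            grasp_target(child, plan)
        plan.append((node.obj_id,))

if __name__ == "__main__":
    bowl_pile = SRTNode(1, children=[SRTNode(2)])            # bowl 2 nested stably in bowl 1
    plate = SRTNode(4, stable=False, children=[SRTNode(3)])  # cup 3 placed loosely on plate 4
    plan: List[Tuple[int, ...]] = []
    grasp_target(bowl_pile, plan)  # one grasp lifts the whole stable pile
    grasp_target(plate, plan)      # clear the cup first, then take the plate
    print(plan)                    # [(1, 2), (3,), (4,)]
```

The efficiency lever claimed in the abstract shows up in `pile_is_stable`: recognizing an orderly stack lets the planner replace a chain of single-object grasps with one combined grasp, which is exactly where distinguishing stacking types pays off.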
- M.-Y. Liu, O. Tuzel, A. Veeraraghavan, Y. Taguchi, T. K. Marks, and R. Chellappa, “Fast object localization and pose estimation in heavy clutter for robotic bin picking,” The International Journal of Robotics Research, vol. 31, no. 8, pp. 951–973, 2012.
- A. Murali, A. Mousavian, C. Eppner, C. Paxton, and D. Fox, “6-DOF grasping for target-driven object manipulation in clutter,” in 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 6232–6238.
- D. Fischinger, M. Vincze, and Y. Jiang, “Learning grasps for unknown objects in cluttered scenes,” in 2013 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2013, pp. 609–616.
- H. Zhang, X. Lan, X. Zhou, Z. Tian, Y. Zhang, and N. Zheng, “Visual manipulation relationship network for autonomous robotics,” in 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids). IEEE, 2018, pp. 118–125.
- M. Ding, Y. Liu, C. Yang, and X. Lan, “Visual manipulation relationship detection based on gated graph neural network for robotic grasping,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 1404–1410.
- V. Tchuiev, Y. Miron, and D. Di Castro, “DUQIM-Net: Probabilistic object hierarchy representation for multi-view manipulation,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 10470–10477.
- H. Zhang, D. Yang, H. Wang, B. Zhao, X. Lan, J. Ding, and N. Zheng, “REGRAD: A large-scale relational grasp dataset for safe and object-specific robotic grasping in clutter,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 2929–2936, 2022.
- M. A. Sadeghi and A. Farhadi, “Recognition using visual phrases,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
- S. K. Divvala, A. Farhadi, and C. Guestrin, “Learning everything about anything: Webly-supervised visual concept learning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 3270–3277.
- C. Lu, R. Krishna, M. Bernstein, and L. Fei-Fei, “Visual relationship detection with language priors,” in Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I. Springer, 2016, pp. 852–869.
- K. Liang, Y. Guo, H. Chang, and X. Chen, “Visual relationship detection with deep structural ranking,” in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
- S. Panda, A. A. Hafez, and C. Jawahar, “Learning support order for manipulation in clutter,” in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2013, pp. 809–815.
- C. Yang, X. Lan, H. Zhang, X. Zhou, and N. Zheng, “Visual manipulation relationship detection with fully connected CRFs for autonomous robotic grasp,” in 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, 2018, pp. 393–400.
- G. Zuo, J. Tong, H. Liu, W. Chen, and J. Li, “Graph-based visual manipulation relationship reasoning network for robotic grasping,” Frontiers in Neurorobotics, vol. 15, p. 719731, 2021.
- X. Zhu, W. Su, L. Lu, B. Li, X. Wang, and J. Dai, “Deformable DETR: Deformable transformers for end-to-end object detection,” arXiv preprint arXiv:2010.04159, 2020.
- S.-K. Kim and M. Likhachev, “Planning for grasp selection of partially occluded objects,” in 2016 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2016, pp. 3971–3978.
- N. P. Garg, D. Hsu, and W. S. Lee, “Learning to grasp under uncertainty using POMDPs,” in 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 2751–2757.
- M. Lauri, D. Hsu, and J. Pajarinen, “Partially observable Markov decision processes in robotics: A survey,” IEEE Transactions on Robotics, 2022.
- J. Pajarinen and V. Kyrki, “Robotic manipulation of multiple objects as a POMDP,” Artificial Intelligence, vol. 247, pp. 213–228, 2017.
- J. K. Li, D. Hsu, and W. S. Lee, “Act to see and see to act: POMDP planning for objects search in clutter,” in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2016, pp. 5701–5707.
- Y. Xiao, S. Katt, A. ten Pas, S. Chen, and C. Amato, “Online planning for target object search in clutter under partial observability,” in 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 8241–8247.
- H. Zhang, Y. Lu, C. Yu, D. Hsu, X. Lan, and N. Zheng, “INVIGORATE: Interactive visual grounding and grasping in clutter,” arXiv preprint arXiv:2108.11092, 2021.
- Y. Yang, X. Lou, and C. Choi, “Interactive robotic grasping with attribute-guided disambiguation,” in 2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 8914–8920.
- K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
- S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” Advances in Neural Information Processing Systems, vol. 28, 2015.
- R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440–1448.
- C. Diuk, A. Cohen, and M. L. Littman, “An object-oriented representation for efficient reinforcement learning,” in Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 240–247.
- H. Zhang, X. Lan, S. Bai, X. Zhou, Z. Tian, and N. Zheng, “ROI-based robotic grasp detection for object overlapping scenes,” in 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019, pp. 4768–4775.
- A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su, et al., “ShapeNet: An information-rich 3D model repository,” arXiv preprint arXiv:1512.03012, 2015.
- B. Calli, A. Singh, J. Bruce, A. Walsman, K. Konolige, S. Srinivasa, P. Abbeel, and A. M. Dollar, “Yale-CMU-Berkeley dataset for robotic manipulation research,” The International Journal of Robotics Research, vol. 36, no. 3, pp. 261–268, 2017.
- R. Bridson, “Fast Poisson disk sampling in arbitrary dimensions,” SIGGRAPH Sketches, vol. 10, no. 1, p. 1, 2007.