SculptDiff: Learning Robotic Clay Sculpting from Humans with Goal Conditioned Diffusion Policy (2403.10401v1)
Abstract: Manipulating deformable objects remains a challenge in robotics due to the difficulties of state estimation, long-horizon planning, and predicting how the object will deform given an interaction. These challenges are most pronounced with 3D deformable objects. We propose SculptDiff, a goal-conditioned diffusion-based imitation learning framework that works with point cloud state observations to directly learn clay sculpting policies for a variety of target shapes. To the best of our knowledge, this is the first real-world method that successfully learns manipulation policies for 3D deformable objects. For sculpting videos and access to our dataset and hardware CAD models, see the project website: https://sites.google.com/andrew.cmu.edu/imitation-sculpting/home
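
To make the framework concrete, below is a minimal sketch of a goal-conditioned diffusion policy over point cloud observations: the current clay point cloud and the goal point cloud are each embedded into a latent vector, and a DDPM-style reverse process iteratively denoises a random action sequence conditioned on both latents. The simple max-pooled point encoder, network sizes, action dimension, horizon, and noise schedule here are all illustrative assumptions, not the paper's reported architecture.

```python
# Hedged sketch of a goal-conditioned diffusion policy for point clouds,
# in the spirit of SculptDiff. Encoder, dimensions, and schedule are
# illustrative assumptions, not the authors' exact design.
import torch
import torch.nn as nn

class PointCloudEncoder(nn.Module):
    """Embed an (N, 3) point cloud into a fixed-size latent (assumed encoder)."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, latent_dim))

    def forward(self, pts):                     # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values  # max-pool over points -> (B, latent_dim)

class NoisePredictor(nn.Module):
    """Predict the noise in an action sequence, given obs/goal latents and timestep."""
    def __init__(self, action_dim=5, horizon=4, latent_dim=128):
        super().__init__()
        self.horizon, self.action_dim = horizon, action_dim
        in_dim = horizon * action_dim + 2 * latent_dim + 1  # actions + obs + goal + t
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, horizon * action_dim))

    def forward(self, noisy_actions, t, cond):  # noisy_actions: (B, horizon, action_dim)
        x = torch.cat([noisy_actions.flatten(1), cond, t.float().unsqueeze(1)], dim=1)
        return self.net(x).view(-1, self.horizon, self.action_dim)

@torch.no_grad()
def sample_actions(encoder, eps_model, obs_pc, goal_pc, n_steps=50):
    """DDPM-style reverse process: start from Gaussian noise, iteratively denoise."""
    cond = torch.cat([encoder(obs_pc), encoder(goal_pc)], dim=1)  # goal conditioning
    betas = torch.linspace(1e-4, 0.02, n_steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    actions = torch.randn(obs_pc.shape[0], eps_model.horizon, eps_model.action_dim)
    for t in reversed(range(n_steps)):
        eps = eps_model(actions, torch.full((obs_pc.shape[0],), t), cond)
        mean = (actions - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        actions = mean + torch.sqrt(betas[t]) * torch.randn_like(actions) if t > 0 else mean
    return actions  # (B, horizon, action_dim) denoised action sequence

if __name__ == "__main__":
    enc, eps_model = PointCloudEncoder(), NoisePredictor()
    obs = torch.randn(1, 1024, 3)   # observed clay point cloud (dummy data)
    goal = torch.randn(1, 1024, 3)  # target shape point cloud (dummy data)
    print(sample_actions(enc, eps_model, obs, goal).shape)  # torch.Size([1, 4, 5])
```

In a standard diffusion-policy training setup, the noise predictor would be fit by adding scheduled noise to demonstrated action sequences and regressing it against the true noise; at deployment, the first few denoised actions are typically executed before re-planning from a fresh observation.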