Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation (2401.02117v1)
Abstract: Imitation learning from human demonstrations has shown impressive performance in robotics. However, most results focus on table-top manipulation, lacking the mobility and dexterity necessary for generally useful tasks. In this work, we develop a system for imitating mobile manipulation tasks that are bimanual and require whole-body control. We first present Mobile ALOHA, a low-cost and whole-body teleoperation system for data collection. It augments the ALOHA system with a mobile base and a whole-body teleoperation interface. Using data collected with Mobile ALOHA, we then perform supervised behavior cloning and find that co-training with existing static ALOHA datasets boosts performance on mobile manipulation tasks. With 50 demonstrations for each task, co-training can increase success rates by up to 90%, allowing Mobile ALOHA to autonomously complete complex mobile manipulation tasks such as sauteing and serving a piece of shrimp, opening a two-door wall cabinet to store heavy cooking pots, calling and entering an elevator, and lightly rinsing a used pan using a kitchen faucet. Project website: https://mobile-aloha.github.io
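The co-training idea in the abstract can be illustrated with a minimal sketch of how training batches might be drawn from two demonstration sources. This is not the authors' implementation: the function name `cotrain_batches`, the mixing probability `p_mobile`, and the placeholder dataset contents and static-dataset size are all assumptions made for illustration. Only the figure of 50 mobile demonstrations per task comes from the abstract.

```python
import random

def cotrain_batches(mobile_data, static_data, batch_size, num_batches,
                    p_mobile=0.5, seed=0):
    """Yield batches that mix two demonstration datasets.

    Each sample is drawn from the mobile dataset with probability
    p_mobile, otherwise from the static dataset. p_mobile is a
    hypothetical knob, not a value taken from the abstract.
    """
    rng = random.Random(seed)
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            source = mobile_data if rng.random() < p_mobile else static_data
            batch.append(rng.choice(source))
        yield batch

# Placeholder datasets: 50 mobile demonstrations (as in the abstract) and
# an arbitrary, larger number of static ones (size is an assumption).
mobile = [("mobile", i) for i in range(50)]
static = [("static", i) for i in range(500)]

batches = list(cotrain_batches(mobile, static, batch_size=8, num_batches=100))
mobile_fraction = sum(tag == "mobile" for b in batches for tag, _ in b) / (8 * 100)
```

Sampling per source rather than from the concatenated data keeps the small mobile dataset from being swamped by the larger static one. A behavior-cloning loss would then be computed on each mixed batch.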