
Learning from Symmetry: Meta-Reinforcement Learning with Symmetrical Behaviors and Language Instructions (2209.10656v2)

Published 21 Sep 2022 in cs.AI

Abstract: Meta-reinforcement learning (meta-RL) is a promising approach that enables an agent to learn new tasks quickly. However, most meta-RL algorithms generalize poorly in multi-task scenarios because rewards alone provide insufficient task information. Language-conditioned meta-RL improves generalization by matching language instructions with the agent's behaviors. Moreover, both behaviors and language instructions exhibit symmetry, a property that speeds up human learning of new knowledge. Combining symmetry and language instructions in meta-RL can therefore improve the algorithm's generalization and learning efficiency. We propose a dual-MDP meta-reinforcement learning method that learns new tasks efficiently from symmetrical behaviors and language instructions. We evaluate our method on multiple challenging manipulation tasks, and experimental results show that it greatly improves the generalization and learning efficiency of meta-reinforcement learning. Videos are available at https://tumi6robot.wixsite.com/symmetry/.
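The abstract does not spell out how the dual-MDP construction uses symmetry, but one common way to exploit this kind of symmetry in manipulation tasks is to pair each collected transition and its language instruction with a mirrored counterpart. The sketch below is a minimal illustration of that idea under stated assumptions, not the paper's actual implementation: the functions `mirror_transition` and `mirror_instruction`, the choice that index `lateral_idx` of observations and actions is the coordinate that flips sign under mirroring, and the "left"/"right" word swap are all hypothetical conventions introduced here for illustration.

```python
import numpy as np

def mirror_transition(obs, action, next_obs, lateral_idx=0):
    """Return the transition reflected about a vertical symmetry plane.

    Assumption (not from the paper): index `lateral_idx` of both the
    observation and the action is a lateral coordinate that flips sign
    under mirroring; all other dimensions are left unchanged.
    """
    def mirror(x):
        x = np.asarray(x, dtype=np.float64).copy()
        x[lateral_idx] = -x[lateral_idx]
        return x

    return mirror(obs), mirror(action), mirror(next_obs)

def mirror_instruction(instruction):
    """Swap lateral words in an instruction ("left" <-> "right").

    A toy, hypothetical rule; the paper's actual handling of
    instructions under symmetry is not described in the abstract.
    """
    swap = {"left": "right", "right": "left"}
    return " ".join(swap.get(w, w) for w in instruction.split())

if __name__ == "__main__":
    obs = np.array([0.30, 0.10, 0.50])    # e.g. end-effector x, y, z
    act = np.array([-0.20, 0.00, 0.05])   # e.g. Cartesian displacement
    nxt = np.array([0.10, 0.10, 0.55])
    instr = "push the block to the left side"

    m_obs, m_act, m_nxt = mirror_transition(obs, act, nxt)
    print(m_obs, m_act, m_nxt)            # lateral components negated
    print(mirror_instruction(instr))      # "push the block to the right side"
```

Under these assumptions, each mirrored behavior-instruction pair could be added to the meta-training buffer alongside the original, effectively doubling the task data seen during adaptation; how the paper itself realizes this within its dual-MDP formulation is only stated at a high level in the abstract.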
