MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance (2112.03575v1)

Published 7 Dec 2021 in cs.LG, cs.AI, and cs.RO

Abstract: Safe exploration is critical for using reinforcement learning (RL) in risk-sensitive environments. Recent work learns risk measures that estimate the probability of violating constraints, which can then be used to enforce safety. However, learning such risk measures requires significant interaction with the environment, resulting in excessive constraint violations during learning. Furthermore, these measures are not easily transferable to new environments. We cast safe exploration as an offline meta-RL problem, where the objective is to leverage examples of safe and unsafe behavior across a range of environments to quickly adapt learned risk measures to a new environment with previously unseen dynamics. We then propose MEta-learning for Safe Adaptation (MESA), an approach for meta-learning a risk measure for safe RL. Simulation experiments across 5 continuous control domains suggest that MESA can leverage offline data from a range of different environments to reduce constraint violations in unseen environments by up to a factor of 2 while maintaining task performance. See https://tinyurl.com/safe-meta-rl for code and supplementary material.

Authors (9)
  1. Michael Luo (13 papers)
  2. Ashwin Balakrishna (40 papers)
  3. Brijen Thananjeyan (26 papers)
  4. Suraj Nair (39 papers)
  5. Julian Ibarz (26 papers)
  6. Jie Tan (85 papers)
  7. Chelsea Finn (264 papers)
  8. Ion Stoica (177 papers)
  9. Ken Goldberg (162 papers)
Citations (4)
