A Bayesian Approach to Robust Reinforcement Learning (1905.08188v2)

Published 20 May 2019 in cs.LG, cs.AI, and stat.ML

Abstract: Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior. In this framework, transitions are modeled as arbitrary elements of a known and properly structured uncertainty set and a robust optimal policy can be derived under the worst-case scenario. In this study, we address the issue of learning in RMDPs using a Bayesian approach. We introduce the Uncertainty Robust Bellman Equation (URBE) which encourages safe exploration for adapting the uncertainty set to new observations while preserving robustness. We propose a URBE-based algorithm, DQN-URBE, that scales this method to higher-dimensional domains. Our experiments show that the derived URBE-based strategy leads to a better trade-off between less conservative solutions and robustness in the presence of model misspecification. In addition, we show that the DQN-URBE algorithm can adapt significantly faster to changing dynamics online compared to existing robust techniques with fixed uncertainty sets.

Citations (52)


