Emergent Mind

On the Search for Feedback in Reinforcement Learning

(2002.09478)

Published Feb 21, 2020 in cs.LG and stat.ML

Abstract

The problem of Reinforcement Learning (RL) in an unknown nonlinear dynamical system is equivalent to the search for an optimal feedback law utilizing the simulations/ rollouts of the dynamical system. Most RL techniques search over a complex global nonlinear feedback parametrization making them suffer from high training times as well as variance. Instead, we advocate searching over a local feedback representation consisting of an open-loop sequence, and an associated optimal linear feedback law completely determined by the open-loop. We show that this alternate approach results in highly efficient training, the answers obtained are repeatable and hence reliable, and the resulting closed performance is superior to global state-of-the-art RL techniques. Finally, if we replan, whenever required, which is feasible due to the fast and reliable local solution, it allows us to recover global optimality of the resulting feedback law.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.