Robust Satisfaction of Temporal Logic Specifications via Reinforcement Learning (1510.06460v1)
Abstract: We consider the problem of steering a system with unknown, stochastic dynamics to satisfy a rich, temporally layered task given as a signal temporal logic formula. We represent the system as a Markov decision process in which the states are built from a partition of the state space and the transition probabilities are unknown. We present provably convergent reinforcement learning algorithms to maximize the probability of satisfying a given formula and to maximize the average expected robustness, i.e., a measure of how strongly the formula is satisfied. We demonstrate via a pair of robot navigation simulation case studies that reinforcement learning with robustness maximization performs better than probability maximization in terms of both probability of satisfaction and expected robustness.
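The abstract's core idea, using a trajectory-level robustness score as the learning signal instead of a binary satisfied/violated reward, can be illustrated with a minimal sketch. This is *not* the paper's algorithm (which builds a special MDP over state-space partitions and comes with convergence guarantees); it is a generic tabular Q-learning loop on a hypothetical 1-D line world, where the terminal reward is the robustness of a simple "eventually reach the goal" formula, approximated here as the negated closest distance to the goal along the trajectory. All names and parameters below are illustrative assumptions.

```python
import random

random.seed(0)
N = 6                      # toy line world: states 0..N, goal region at state N
GOAL, T, EPISODES = N, 10, 3000
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
ACTIONS = [-1, +1]

Q = {(s, a): 0.0 for s in range(N + 1) for a in ACTIONS}

def step(s, a):
    # unknown stochastic dynamics: intended move succeeds with prob 0.8
    move = a if random.random() < 0.8 else -a
    return min(max(s + move, 0), N)

def robustness(traj):
    # robustness of "eventually reach GOAL": max over time of -distance,
    # i.e., how close the trajectory ever got (0 means satisfied)
    return max(-abs(s - GOAL) for s in traj)

for _ in range(EPISODES):
    s, traj = 0, [0]
    for t in range(T):
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2 = step(s, a)
        traj.append(s2)
        done = (t == T - 1)
        # terminal reward = robustness of the whole trajectory; 0 otherwise
        r = robustness(traj) if done else 0.0
        target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

greedy = {s: max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(N)}
print(greedy)
```

Note that this trajectory-dependent reward is not Markov in the raw state, which is precisely the complication the paper handles by constructing an MDP whose states encode enough history to evaluate the formula; the sketch ignores that subtlety for brevity.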
- Austin Jones (9 papers)
- Derya Aksaray (15 papers)
- Zhaodan Kong (20 papers)
- Mac Schwager (88 papers)
- Calin Belta (103 papers)