Safe Exploration in Reinforcement Learning: Training Backup Control Barrier Functions with Zero Training Time Safety Violations (2312.07828v2)

Published 13 Dec 2023 in eess.SY and cs.SY

Abstract: This paper introduces the reinforcement learning backup shield (RLBUS), an algorithm that guarantees safe exploration in reinforcement learning (RL) by incorporating backup control barrier functions (BCBFs). RLBUS constructs an implicit control forward-invariant subset of the safe set using multiple backup policies, ensuring safety in the presence of input constraints. While traditional BCBFs often result in conservative control forward-invariant sets due to the design of backup controllers, RLBUS addresses this limitation by leveraging model-free RL to train an additional backup policy, which enlarges the identified control forward-invariant subset of the safe set. This approach enables the exploration of larger regions in the state space with zero safety violations during training. The effectiveness of RLBUS is demonstrated on an inverted pendulum example, where the expanded invariant set allows for safe exploration over a broader state space, enhancing performance without compromising safety.
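The backup-shield idea in the abstract can be illustrated with a rough sketch. This is not the paper's RLBUS algorithm or its BCBF formulation: it is a simplified switching shield that accepts an exploratory action only if, from the resulting state, at least one backup policy keeps a simulated trajectory inside the safe set and ends in a small terminal set. The pendulum parameters, the input bound, the safe-set bound, the terminal radius, the PD gains, and every function name are assumptions chosen for illustration, and the terminal set is assumed (not proved) to be invariant under the backup policy.

```python
import math

# All numbers below are hypothetical, chosen only for illustration.
G, L, M = 9.81, 1.0, 1.0      # gravity, pendulum length, mass
U_MAX = 1.5                   # input constraint |u| <= U_MAX
THETA_MAX = math.pi / 6       # safe set: |theta| <= THETA_MAX
R_BACKUP = 0.05               # radius of the assumed-invariant terminal set
DT = 0.01                     # integration step in seconds


def clip(u):
    """Saturate the input to the admissible range."""
    return max(-U_MAX, min(U_MAX, u))


def step(state, u):
    """One explicit-Euler step of the inverted pendulum dynamics."""
    theta, omega = state
    alpha = (G / L) * math.sin(theta) + clip(u) / (M * L * L)
    return (theta + DT * omega, omega + DT * alpha)


def backup_pd(state):
    """A hand-designed backup policy: saturated PD control toward upright."""
    theta, omega = state
    return clip(-15.0 * theta - 5.0 * omega)


def recoverable(state, policy, horizon=300):
    """Simulate the backup policy from `state`; the state counts as
    recoverable if the trajectory stays in the safe set over the horizon
    and finishes inside the terminal set."""
    for _ in range(horizon):
        if abs(state[0]) > THETA_MAX:
            return False
        state = step(state, policy(state))
    return state[0] ** 2 + state[1] ** 2 <= R_BACKUP ** 2


def shield(state, u_rl, backups):
    """Pass the exploratory action through if some backup policy can
    recover from the next state; otherwise fall back to a backup action."""
    nxt = step(state, u_rl)
    if any(recoverable(nxt, pi) for pi in backups):
        return u_rl
    for pi in backups:
        if recoverable(state, pi):
            return pi(state)
    # No backup certifies the current state: best effort only.
    return backups[0](state)


if __name__ == "__main__":
    print(shield((0.0, 0.0), 0.5, [backup_pd]))   # exploratory action accepted
    print(shield((0.5, 1.0), 1.5, [backup_pd]))   # overridden by the backup
```

In this sketch, the set of states that pass `recoverable` for at least one policy plays the role of the implicit invariant subset. Appending another policy to `backups` (for example, one trained with RL) can only enlarge that set, which mirrors the motivation given in the abstract.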

Authors: Pedram Rabiee, Amirsaeid Safari