- The paper introduces a state-adversarial MDP framework to model adversarial perturbations on state observations.
- It proposes robust policy regularization for PPO, DDPG, and DQN, significantly improving robustness under attack and, in many cases, natural (unperturbed) performance.
- Empirical results across 11 diverse environments validate increased resilience against strong adversarial attacks.
Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations
The paper presents a comprehensive study of how to harden deep reinforcement learning (DRL) agents against adversarial perturbations of state observations. The authors introduce the state-adversarial Markov decision process (SA-MDP) to model this vulnerability, providing a structured framework that goes beyond the adversarial-training recipes typically used in classification tasks.
The primary contribution is a theoretically motivated policy regularization approach that promotes robustness across several DRL algorithms: proximal policy optimization (PPO), deep deterministic policy gradient (DDPG), and deep Q-networks (DQN). These algorithms cover both discrete and continuous action spaces, underscoring the method's broad applicability.
Key Findings and Methodology
- SA-MDP Framework: The SA-MDP augments a standard MDP with an adversary that perturbs the agent's state observations. Attacks are modeled as perturbations of what the agent observes rather than of the actions or dynamics, which distinguishes SA-MDP from other robust RL frameworks, such as those assuming uncertain transition probabilities or multi-agent adversarial settings (a notational sketch is given after this list).
- Theoretical Backbone: The paper analyzes the effects of adversarial observation perturbations and establishes that directly transplanting adversarial-training techniques from supervised learning is insufficient in RL. Instead, it derives a robust policy regularization strategy from the SA-MDP theory, bounding the performance loss under the optimal adversary by the total variation distance between action distributions at clean and perturbed states, which is in turn upper-bounded by a KL-divergence term (the bound is sketched after this list).
- Robust Policy Regularization: The authors propose regularization terms tailored to PPO, DDPG, and DQN that limit the policy's sensitivity to input perturbations. This is shown to improve not only robustness under attack but also natural performance in several environments (a simplified regularizer is sketched after this list).
- Adversarial Attacks and Defenses: Besides proposing defenses, the paper introduces two strong white-box attacks: the robust SARSA (RS) attack, which learns a robust value function to guide the perturbation, and the maximal action difference (MAD) attack, which maximizes the divergence between actions taken at clean and perturbed observations. These attacks provide a demanding evaluation and show that the proposed regularization remains effective under them (a MAD-style sketch is given after this list).
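
As referenced in the SA-MDP bullet above, here is a rough notational sketch of the setup (the symbols below are a paraphrase, not the paper's exact definitions): an adversary ν maps the true state to a perturbed observation inside a bounded set B(s), the agent selects its action from that observation, and the environment still transitions and pays reward according to the true state.

```latex
% Illustrative SA-MDP notation (a paraphrase; exact symbols and conditions are in the paper).
\begin{align*}
  \nu &: S \to S, \quad \nu(s) \in B(s) \subseteq S
      && \text{adversary restricted to a perturbation set around } s \\
  a &\sim \pi(\cdot \mid \nu(s))
      && \text{agent acts on the perturbed observation} \\
  s' &\sim P(\cdot \mid s, a), \qquad r = R(s, a)
      && \text{dynamics and reward depend on the true state}
\end{align*}
```

The key point is that only the observation is perturbed; the environment evolves on the true state, which is what separates SA-MDP from robust-MDP formulations that perturb the dynamics themselves.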
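The core bound behind the theoretical analysis can be stated loosely as follows (the constant α and the precise conditions are in the paper): the value lost to the optimal adversary ν* is controlled by how far the policy's action distribution can move within the perturbation set, measured in total variation distance and, via Pinsker's inequality, upper-bounded by a KL term that is convenient to optimize.

```latex
% Loose statement of the smoothness-implies-robustness bound (illustrative form).
\max_{s \in S} \big| V^{\pi}(s) - V^{\pi \circ \nu^{*}}(s) \big|
  \;\le\; \alpha \, \max_{s \in S} \max_{\hat{s} \in B(s)}
          D_{\mathrm{TV}}\!\big( \pi(\cdot \mid s), \, \pi(\cdot \mid \hat{s}) \big),
\qquad
D_{\mathrm{TV}}(p, q) \;\le\; \sqrt{\tfrac{1}{2} D_{\mathrm{KL}}(p \,\|\, q)} .
```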
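As a minimal sketch of what such a regularizer can look like in practice, the snippet below is a simplified PyTorch illustration, not the paper's implementation: it assumes a stochastic policy that returns a torch.distributions object and an l-infinity perturbation ball of radius eps, and it approximates the inner maximization with plain projected gradient ascent, whereas the paper's SA-PPO, SA-DDPG, and SA-DQN variants rely on SGLD or convex relaxation. The names smoothness_regularizer, eps, steps, and step_size are illustrative.

```python
import torch

def smoothness_regularizer(policy, states, eps=0.01, steps=5, step_size=0.005):
    """KL-based policy smoothness penalty: penalize how much the action
    distribution changes when observations are perturbed inside an
    l_inf ball of radius eps (inner maximization approximated by PGD)."""
    with torch.no_grad():
        clean_dist = policy(states)  # e.g. a torch.distributions.Normal
    # Start from a random point inside the perturbation ball.
    delta = (torch.rand_like(states) * 2 - 1) * eps
    delta.requires_grad_(True)
    for _ in range(steps):
        kl = torch.distributions.kl_divergence(clean_dist, policy(states + delta)).sum()
        grad, = torch.autograd.grad(kl, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()  # ascend on the KL divergence
            delta.clamp_(-eps, eps)           # project back into the ball
    # The final penalty is differentiable w.r.t. the policy parameters.
    return torch.distributions.kl_divergence(clean_dist, policy(states + delta.detach())).mean()
```

In training, such a penalty would be added to the usual RL objective with a weight that trades smoothness against task reward; analogous terms can be written for a deterministic DDPG actor (e.g., a squared action difference) or a DQN head (e.g., a margin on the greedy action).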
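For intuition, a MAD-style attack can be sketched as projected gradient ascent on the KL divergence between the action distributions at the clean and perturbed observations. Again this is a simplified illustration under an l-infinity constraint with illustrative names and step sizes, not the authors' attack code; the RS attack additionally requires training a robust value function and is not sketched here.

```python
import torch

def mad_style_attack(policy, state, eps=0.01, steps=10, step_size=0.003):
    """Maximal-action-difference style attack: search, inside an l_inf ball
    around the clean observation, for a perturbation that maximizes the KL
    divergence between the clean and perturbed action distributions."""
    with torch.no_grad():
        clean_dist = policy(state)
    perturbed = state.detach().clone() + (torch.rand_like(state) * 2 - 1) * eps
    for _ in range(steps):
        perturbed.requires_grad_(True)
        kl = torch.distributions.kl_divergence(clean_dist, policy(perturbed)).sum()
        grad, = torch.autograd.grad(kl, perturbed)
        with torch.no_grad():
            perturbed = perturbed + step_size * grad.sign()
            # Project back into the l_inf ball around the clean observation.
            perturbed = torch.max(torch.min(perturbed, state + eps), state - eps)
    return perturbed.detach()
```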
Empirical Validation
The authors conduct extensive experiments across 11 environments, spanning discrete-action and continuous control tasks, to assess the robustness of the proposed methods. They demonstrate that their regularization technique significantly improves performance under adversarial perturbations for all three DRL algorithms studied. The paper also discusses performance variance across runs and the real-world applicability of the robust training strategies.
Implications and Future Directions
- Practical Applications: The work's real-world implications are substantial, especially for safety-critical tasks like autonomous driving, where adversarial robustness is a necessity rather than an option. Introducing such robustness has the potential to propel DRL's adoption in practical settings.
- Future Research: The paper opens avenues for exploring robustness in more complex adversarial scenarios, potentially involving non-stationary and non-Markovian adversaries. Additionally, there is a call for improved theoretical models to establish guaranteed robustness bounds.
In conclusion, the paper contributes practical defenses while laying down a principled theoretical framework for understanding adversarial perturbations of state observations in DRL. Its methodological depth and empirical backing equip future researchers to build on these insights, potentially transforming the landscape of safe and reliable reinforcement learning applications.