
Adversarial Cheap Talk (2211.11030v4)

Published 20 Nov 2022 in cs.LG, cs.AI, and cs.CR

Abstract: Adversarial attacks in reinforcement learning (RL) often assume highly-privileged access to the victim's parameters, environment, or data. Instead, this paper proposes a novel adversarial setting called a Cheap Talk MDP in which an Adversary can merely append deterministic messages to the Victim's observation, resulting in a minimal range of influence. The Adversary cannot occlude ground truth, influence underlying environment dynamics or reward signals, introduce non-stationarity, add stochasticity, see the Victim's actions, or access their parameters. Additionally, we present a simple meta-learning algorithm called Adversarial Cheap Talk (ACT) to train Adversaries in this setting. We demonstrate that an Adversary trained with ACT still significantly influences the Victim's training and testing performance, despite the highly constrained setting. Affecting train-time performance reveals a new attack vector and provides insight into the success and failure modes of existing RL algorithms. More specifically, we show that an ACT Adversary is capable of harming performance by interfering with the learner's function approximation, or instead helping the Victim's performance by outputting useful features. Finally, we show that an ACT Adversary can manipulate messages during train-time to directly and arbitrarily control the Victim at test-time. Project video and code are available at https://sites.google.com/view/adversarial-cheap-talk
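The abstract describes the core mechanism of a Cheap Talk MDP: the Adversary may only append a deterministic message, computed from the current observation, onto the Victim's observation, leaving the ground truth, dynamics, and rewards untouched. The sketch below illustrates that interface only. It is a minimal, hypothetical illustration assuming a Gymnasium-style environment; the names `CheapTalkWrapper` and `adversary_fn` are invented here, and the paper itself trains the Adversary's parameters with a meta-learning/evolution-strategies procedure (ACT) on JAX-based environments rather than hand-setting them as in this example.

```python
import numpy as np
import gymnasium as gym


class CheapTalkWrapper(gym.ObservationWrapper):
    """Hypothetical sketch: append a deterministic Adversary message to each observation.

    The Adversary cannot occlude the ground-truth observation or alter the
    environment dynamics or rewards; it only concatenates extra features
    computed deterministically from the observation itself.
    """

    def __init__(self, env, adversary_fn, message_dim):
        super().__init__(env)
        self.adversary_fn = adversary_fn  # deterministic map: observation -> message
        low = np.concatenate([env.observation_space.low,
                              np.full(message_dim, -np.inf, dtype=np.float32)])
        high = np.concatenate([env.observation_space.high,
                               np.full(message_dim, np.inf, dtype=np.float32)])
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        # Ground truth is preserved; the message is merely appended.
        message = np.asarray(self.adversary_fn(obs), dtype=np.float32)
        return np.concatenate([obs.astype(np.float32), message])


if __name__ == "__main__":
    # Example with a fixed linear Adversary on CartPole. In the paper these
    # parameters would be meta-learned by ACT, not chosen by hand.
    rng = np.random.default_rng(0)
    W = rng.normal(size=(2, 4)).astype(np.float32)  # hypothetical message parameters
    env = CheapTalkWrapper(gym.make("CartPole-v1"), lambda o: W @ o, message_dim=2)
    obs, _ = env.reset(seed=0)
    print(obs.shape)  # 4 original features plus 2 appended message features
```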
