Cooperative Backdoor Attack in Decentralized Reinforcement Learning with Theoretical Guarantee (2405.15245v1)

Published 24 May 2024 in cs.LG and cs.AI

Abstract: The safety of decentralized reinforcement learning (RL) is a challenging problem since malicious agents can share their poisoned policies with benign agents. This paper investigates a cooperative backdoor attack in a decentralized reinforcement learning scenario. Unlike existing methods that hide a whole backdoor attack behind a single shared policy, our method decomposes the backdoor behavior into multiple components according to the state space of the RL task. Each malicious agent hides one component in its policy and shares that policy with the benign agents. Once a benign agent learns all the poisoned policies, the complete backdoor attack is assembled in its own policy. A theoretical proof shows that our cooperative method can successfully inject the backdoor into the RL policies of benign agents. Compared with existing backdoor attacks, our cooperative method is more covert, since the policy from each attacker contains only a component of the backdoor attack and is therefore harder to detect. Extensive simulations on Atari environments demonstrate the efficiency and covertness of our method. To the best of our knowledge, this is the first paper presenting a provable cooperative backdoor attack in decentralized reinforcement learning.

Summary

  • The paper presents a novel cooperative backdoor attack that fragments malicious triggers across multiple agents in decentralized RL.
  • It employs distributed state space partitioning and rigorous theoretical analysis to ensure effective attacks with minimal detection risk.
  • Experimental validations on Atari games demonstrate that Co-Trojan triggers backdoor behaviors while preserving overall agent performance.

Cooperative Backdoor Attack in Decentralized Reinforcement Learning: An Expert Analysis

Introduction to Cooperative Backdoor Attacks

This paper introduces a novel approach to embedding backdoor attacks in decentralized reinforcement learning (RL) systems. Traditional backdoor attacks rely on embedding a single, comprehensive malicious policy within a reinforcement learning model. However, this method is susceptible to detection due to the significant deviation of the backdoor policy from normal policies. The research explores a cooperative strategy that decomposes the backdoor into multiple components, which are then assembled by benign agents during policy training, rendering the attacks both effective and difficult to detect (Figure 1).

Figure 1: We study cooperative backdoor policy attacks in decentralized RL. Unlike a single-policy backdoor attack, which hides the entire backdoor behind one malign policy, our method decomposes the backdoor behavior into multiple components, each hidden by an individual attacker within its own malign policy. When a benign agent learns all the poisoned policies, the backdoor attack is assembled in its policy. Compared with a single backdoor policy attack, our method achieves the same attack performance but is harder to detect.

Methodology and Theoretical Analysis

The cooperative backdoor attack strategy, designated as Co-Trojan, leverages the distributed nature of decentralized RL to secretly embed malicious triggers. By partitioning the state space into multiple subspaces, each controlled by a different malicious agent, Co-Trojan opportunistically injects trigger components into the benign agents' policies. Once a benign agent aggregates these sub-components, the complete backdoor effect is realized, ensuring that the benign policy now includes the desired malicious behavior.
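
To make the decomposition concrete, the sketch below shows how one attacker's poisoned policy might be constructed. It is a minimal illustration under our own assumptions: the hash-based subspace assignment, the logit boost, and all names (`subspace_id`, `poison_logits`, `NUM_ATTACKERS`, `TARGET_ACTION`) are hypothetical stand-ins, not the paper's exact construction.

```python
import numpy as np

NUM_ATTACKERS = 4   # number of colluding malicious agents (assumed)
TARGET_ACTION = 0   # action the assembled backdoor should force (assumed)

def subspace_id(state: np.ndarray, num_attackers: int) -> int:
    """Map a state to the attacker responsible for it.

    The paper partitions the state space among the attackers; this
    hash-based rule is an assumed stand-in for that partition.
    """
    return hash(state.tobytes()) % num_attackers

def poison_logits(logits: np.ndarray, state: np.ndarray,
                  attacker_id: int, trigger_present: bool) -> np.ndarray:
    """Attacker `attacker_id` tampers only with states in its own subspace.

    On triggered states in its subspace it boosts the target action;
    everywhere else the shared policy is left untouched, which is what
    keeps each individual attacker's policy close to a benign one.
    """
    if trigger_present and subspace_id(state, NUM_ATTACKERS) == attacker_id:
        poisoned = logits.copy()
        poisoned[TARGET_ACTION] += 10.0   # illustrative logit boost
        return poisoned
    return logits
```

A benign agent that learns from all `NUM_ATTACKERS` shared policies sees the target action boosted on every triggered state, assembling the full backdoor even though no single attacker's policy contains it.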

The paper proves that this decomposition and reassembly approach allows for the successful implementation of backdoor policies with minimal risk of detection. Theoretical guarantees are provided, demonstrating that the aggregate effects of these sub-components approximate a predefined global backdoor policy. These guarantees are anchored in rigorous proofs showing that the distributed triggers can collectively achieve the targeted behavior without influencing the performance within the safe state subspace.
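
One way to express this composition property, in our own notation rather than a statement taken from the paper, is as a piecewise definition of each attacker's shared policy:

```latex
% Illustrative formalization; the notation is ours, not the paper's.
% The state space is partitioned among K attackers:
%   S = S_1 \cup \dots \cup S_K, \quad S_i \cap S_j = \emptyset \text{ for } i \neq j.
\[
  \pi_k(a \mid s) =
  \begin{cases}
    \pi^{\dagger}(a \mid s) & \text{if } s \in S_k \text{ and the trigger is present},\\
    \pi^{\mathrm{benign}}(a \mid s) & \text{otherwise},
  \end{cases}
\]
% A benign agent that learns all K shared policies then matches the global
% backdoor policy \pi^{\dagger} on every triggered state, while its behavior
% on the safe subspace is unchanged.
```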

Experimental Validation

The efficacy of Co-Trojan was evaluated through experiments on two classic Atari games, Breakout and Seaquest, within a decentralized RL framework. The results show that Co-Trojan successfully enforces backdoor actions while maintaining performance comparable to that of traditional backdoor attacks (Figure 2).

Figure 2: Performance Results for Breakout with Various Poisoning Conditions: (a) Strong Targeted Poison, (b) Weak Targeted Poison, and (c) Untargeted Poison. Each subplot shows the average rewards for TrojDRL (triggered), TrojDRL (clean), Co-Trojan (triggered), and Co-Trojan (clean). The lines are smoothed by averaging every five data points.

In Breakout, agents with embedded backdoor policies missed more balls at critical moments, indicating successful backdoor activation. Similar results were observed in Seaquest, where the attacker-induced instability of the submarine confirmed the attack's efficacy (Figure 3).

Figure 3: Performance Results for Seaquest with Various Poisoning Conditions: (a) Strong Targeted Poison, (b) Weak Targeted Poison, and (c) Untargeted Poison. Each subplot shows the average rewards for TrojDRL (triggered), TrojDRL (clean), Co-Trojan (triggered), and Co-Trojan (clean). The lines are smoothed by averaging every five data points.
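
As an aside for readers reproducing these plots, smoothing "by averaging every five data points" amounts to a simple block mean; a minimal sketch, with the function name and test data being our own:

```python
import numpy as np

def smooth_every_n(rewards: np.ndarray, n: int = 5) -> np.ndarray:
    """Average consecutive blocks of n points, as in the figure captions."""
    usable = len(rewards) - len(rewards) % n   # drop any ragged tail
    return rewards[:usable].reshape(-1, n).mean(axis=1)

# Example: a 1000-point reward curve becomes 200 smoothed points.
curve = np.random.default_rng(0).normal(size=1000)
assert smooth_every_n(curve).shape == (200,)
```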

Implications and Future Directions

The demonstrated success of Co-Trojan in decentralized settings exposes a significant vulnerability in RL systems and underscores the need for robust defense mechanisms capable of detecting and mitigating distributed attack vectors across multi-agent RL systems. Future work should focus on defensive strategies that are both efficient and unobtrusive, accounting for the challenges posed by the covert nature of cooperative attacks.

Potential future directions include extending Co-Trojan beyond traditional gaming environments to areas where decentralized RL is gaining traction, such as autonomous systems and IoT networks. Research could also explore adversarial training techniques to harden RL security frameworks against sophisticated backdoor strategies.

Conclusion

Cooperative backdoor attacks in decentralized RL introduce a paradigm shift in understanding the vulnerabilities of multi-agent learning systems. By fragmenting the malicious payload across multiple agents, Co-Trojan demonstrates how these systems can be compromised while remaining undetected. This research contributes to the growing field of adversarial RL by challenging the security assumptions of distributed learning models and emphasizing the need for forward-looking defenses in this evolving domain.
