Kick Bad Guys Out! Conditionally Activated Anomaly Detection in Federated Learning with Zero-Knowledge Proof Verification (2310.04055v5)
Abstract: Federated Learning (FL) systems are vulnerable to adversarial attacks, such as model poisoning and backdoor attacks. However, existing defense mechanisms often fall short in real-world settings due to key limitations: they may rely on impractical assumptions, introduce distortions by modifying aggregation functions, or degrade model performance even in benign scenarios. To address these issues, we propose a novel anomaly detection method designed specifically for practical FL scenarios. Our approach employs a two-stage, conditionally activated detection mechanism: a cross-round check first detects whether suspicious activity has occurred, and, only if warranted, a cross-client check filters out malicious participants. This mechanism preserves model utility while avoiding unrealistic assumptions. Moreover, to ensure the transparency and integrity of the defense mechanism, we incorporate zero-knowledge proofs, enabling clients to verify the detection results without relying solely on the server's goodwill. To the best of our knowledge, this is the first method to bridge the gap between theoretical advances in FL security and the demands of real-world deployment. Extensive experiments across diverse tasks and real-world edge devices demonstrate the effectiveness of our method over state-of-the-art defenses.
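The abstract does not specify the exact statistics used in either stage, so the following is only a minimal sketch of the *conditional* two-stage structure it describes, under assumed placeholder checks: a cheap cross-round test on the norm of the aggregate update (a simple z-score against past rounds), escalating to a per-client filter (cosine similarity to the coordinate-wise median update) only when the round looks anomalous. The function names, thresholds, and both scoring rules are hypothetical illustrations, not the paper's method.

```python
import numpy as np

def cross_round_check(agg_norm, history, z_thresh=2.5):
    # Stage 1 (assumed check): flag the round if the aggregate update's norm
    # deviates strongly from the norms observed in previous rounds.
    if len(history) < 3:                      # too little history to judge
        return False
    mu, sigma = np.mean(history), np.std(history)
    if sigma == 0:
        return agg_norm != mu
    return abs(agg_norm - mu) / sigma > z_thresh

def cross_client_check(client_updates, sim_thresh=0.0):
    # Stage 2 (assumed check): keep only clients whose update is directionally
    # consistent with the coordinate-wise median of all client updates.
    ref = np.median(client_updates, axis=0)
    keep = []
    for i, u in enumerate(client_updates):
        cos = u @ ref / (np.linalg.norm(u) * np.linalg.norm(ref) + 1e-12)
        if cos > sim_thresh:
            keep.append(i)
    return keep

def conditional_detect(client_updates, history):
    # Run the cheap cross-round check every round; only when it fires does the
    # more expensive cross-client filter run -- otherwise all clients are kept,
    # so benign rounds proceed without any distortion of aggregation.
    agg_norm = np.linalg.norm(np.mean(client_updates, axis=0))
    if cross_round_check(agg_norm, history):
        benign = cross_client_check(client_updates)
    else:
        benign = list(range(len(client_updates)))
    history.append(agg_norm)
    return benign
```

In a benign round the stage-1 statistic stays near its historical mean and stage 2 never runs; a sign-flipped, scaled-up update both trips the cross-round check and is then filtered by the cross-client check. The zero-knowledge component (proving to clients that this detection logic was executed faithfully) is orthogonal and not sketched here.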