CipherFormer: Efficient Transformer Private Inference with Low Round Complexity (2403.16860v1)

Published 25 Mar 2024 in cs.CR

Abstract: There is a growing trend to outsource the inference of large transformer models to cloud servers. However, this poses a severe threat to users' private data, which is exposed to the cloud server once uploaded. Although several works have attempted to provide private inference for transformer models, their hundreds of communication rounds limit their applicability. Motivated by the desire to minimize round complexity, we propose CipherFormer, a novel transformer private inference scheme that combines homomorphic encryption and garbled circuits. We present a protocol for quickly computing homomorphic matrix multiplications, then modify the attention mechanism and design the corresponding garbled circuits. Furthermore, we show how a lightweight attention mechanism and mixed-bitwidth arithmetic reduce inference latency while maintaining accuracy. Compared with an advanced homomorphic encryption scheme on text classification tasks, our model improves accuracy by 3% to 11% while performing private inference with a 7.7x-11.9x speedup.
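The scheme splits the work between two cryptographic tools: linear layers (matrix multiplications) run under homomorphic encryption, while the modified attention non-linearities run inside garbled circuits, keeping the round count low. As a rough illustration of the HE half only, here is a minimal sketch of a client-encrypt / server-compute / client-decrypt round trip using the open-source TenSEAL (CKKS) library; this is not the paper's matmul protocol, and the library choice, toy dimensions, and encryption parameters are all assumptions made for the example.

# Illustrative sketch only: an encrypted linear layer via CKKS in TenSEAL.
# NOT CipherFormer's protocol; it shows the generic HE building block
# (client encrypts activations, server multiplies by plaintext weights).
import numpy as np
import tenseal as ts

# Client side: set up a CKKS context and encrypt an activation vector.
ctx = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,            # assumed toy parameter
    coeff_mod_bit_sizes=[60, 40, 40, 60]
)
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()               # rotation keys used by matmul

x = np.random.randn(4)                   # toy activation vector
enc_x = ts.ckks_vector(ctx, x.tolist())  # ciphertext sent to the server

# Server side: multiply the ciphertext by a plaintext weight matrix.
W = np.random.randn(4, 3)                # toy 4-in, 3-out weight matrix
enc_y = enc_x.matmul(W.tolist())         # homomorphic matrix multiplication

# Client side: decrypt and check against the plaintext result.
y = np.array(enc_y.decrypt())
print(np.allclose(y, x @ W, atol=1e-3))  # True up to CKKS approximation error

A real pipeline would pack many activations per ciphertext and hand intermediate values to a garbled-circuit phase for the softmax-style non-linearities; the sketch only shows the shape of one low-round HE exchange.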

Authors (2)
  1. Weize Wang
  2. Yi Kuang
