Faster Post-Quantum TLS 1.3 Based on ML-KEM: Implementation and Assessment (2404.13544v2)
Abstract: TLS is widely used for secure data transmission over networks. However, with the advent of quantum computers, the security of TLS based on traditional public-key cryptography is under threat, so post-quantum algorithms must be integrated into TLS. Most post-quantum TLS (PQ-TLS) research focuses on integration and evaluation; few studies address improving PQ-TLS performance by optimizing the PQC implementation itself. Handshake performance is crucial for TLS, and in PQ-TLS it depends directly on the performance of the post-quantum key encapsulation mechanism (KEM). In this work, we study the impact of post-quantum KEMs on PQ-TLS performance and show how to improve ML-KEM performance using AVX-512, Intel's latest Advanced Vector Extensions instruction set. We detail a range of techniques for parallelizing polynomial multiplication, modular reduction, and other computationally intensive modules within ML-KEM. Our optimized ML-KEM implementation achieves up to a 1.64x speedup over the latest AVX2 implementation. Furthermore, we introduce a novel batch key generation method for ML-KEM that integrates seamlessly into the TLS protocol and accelerates key generation by 3.5x to 4.9x. We integrate the optimized AVX-512 implementation of ML-KEM into TLS 1.3 and assess handshake performance in both PQ-only and hybrid modes. The assessment shows that our faster ML-KEM implementation yields more TLS 1.3 handshakes per second in both modes. Additionally, we revisit two IND-1-CCA KEM constructions discussed at Eurocrypt 2022 and Asiacrypt 2023: we implement both on top of ML-KEM, integrate the better-performing one into TLS 1.3, and benchmark it.
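To make the "modular reduction" step concrete, the following is a minimal scalar sketch of Barrett reduction modulo the ML-KEM prime q = 3329. It is an illustration only and not the paper's AVX-512 code. The constants follow the widely used 26-bit variant for signed 16-bit inputs, and the names `barrett_reduce` and `barrett_reduce_lanes` are hypothetical. The 32-element "lane" helper only mimics how one 512-bit register would hold 32 16-bit coefficients; whether the paper's kernels are organized this way is an assumption.

```python
Q = 3329                          # ML-KEM modulus
SHIFT = 26                        # precision of the precomputed reciprocal
V = ((1 << SHIFT) + Q // 2) // Q  # rounded 2^26 / q, equals 20159


def barrett_reduce(a: int) -> int:
    """Return r congruent to a (mod Q) with |r| <= (Q - 1) / 2.

    Intended for inputs in the signed 16-bit range [-32768, 32767].
    """
    # t approximates round(a / Q) using a multiply and a shift instead of
    # a division. Python's >> floors for negative values, which matches an
    # arithmetic right shift on signed integers.
    t = (V * a + (1 << (SHIFT - 1))) >> SHIFT
    return a - t * Q


def barrett_reduce_lanes(coeffs: list[int]) -> list[int]:
    """Apply the same reduction to every coefficient independently.

    Each element stands in for one 16-bit lane of a vector register
    (assumption: 32 lanes per 512-bit register).
    """
    return [barrett_reduce(c) for c in coeffs]


if __name__ == "__main__":
    lanes = [3329 * k + 7 for k in range(-4, 4)]
    print(barrett_reduce_lanes(lanes))
```

Because every lane is reduced with the same multiply, shift, and subtract sequence and no data-dependent branches, the computation maps naturally onto SIMD instructions, which is the general property that vectorized implementations rely on.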