NDSS2025

Secure Transformer Inference Made Non-interactive

Jiawen Zhang, Xinpeng Yang, Lipeng He, Kejia Chen, Wen-jie Lu, Yinghao Wang, Xiaoyang Hou, Jian Liu, Kui Ren, Xiaohu Yang

Abstract

Secure transformer inference has emerged as a prominent research topic following the proliferation of ChatGPT. Existing solutions are typically interactive, involving substantial communication load and numerous interaction rounds between the client and the server. In this paper, we propose NEXUS , the first non-interactive protocol for secure transformer inference, where the client is only required to submit an encrypted input and await the encrypted result from the server. Central to NEXUS are two innovative techniques: SIMD ciphertext compression/decompression, and SIMD slots folding. Consequently, our approach achieves a speedup of 2.8 × and a remarkable bandwidth reduction of 368.6 × , compared to the state-of-the-art solution presented in S&P ’24.