ICLR2025

DCT-CryptoNets: Scaling Private Inference in the Frequency Domain

Arjun Roy, Kaushik Roy

Abstract

The convergence of fully homomorphic encryption (FHE) and machine learning offers unprecedented opportunities for private inference of sensitive data. FHE enables computation directly on encrypted data, safeguarding the entire machine learning pipeline, including data and model confidentiality. However, existing FHE-based implementations for deep neural networks face significant challenges in computational cost, latency, and scalability, limiting their practical deployment. This paper introduces DCT-CryptoNets, a novel approach that operates directly in the frequency domain to reduce the burden of computationally expensive nonlinear activations and homomorphic bootstrap operations during private inference. It does so by utilizing the discrete cosine transform (DCT), commonly employed in JPEG encoding, which has inherent compatibility with remote computing services where images are generally stored and transmitted in this encoded format. DCT-CryptoNets demonstrates substantial latency reductions of up to 5.3× compared to prior work on benchmark image classification tasks. Notably, it demonstrates inference on the ImageNet dataset within 2.5 hours (down from 12.5 hours on equivalent 96-thread compute resources). Furthermore, by learning perceptually salient low-frequency information, DCT-CryptoNets improves the reliability of encrypted predictions compared to RGB-based networks by reducing the number of error-accumulating homomorphic bootstrap operations. DCT-CryptoNets also scales better than RGB-based networks, with its computational savings growing as image size increases. This study demonstrates a promising avenue for achieving efficient and practical private inference of deep learning models on the high-resolution images seen in real-world applications.
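As a minimal illustration of the frequency-domain representation referred to above, the sketch below computes the orthonormal 2D DCT-II of a single 8×8 block, the block size used by baseline JPEG. It is not the authors' pipeline: the function names (`dct_1d`, `dct_2d`, `low_freq_energy_fraction`) and the sample block are hypothetical, and the paper's actual preprocessing (color space, which coefficients are kept) is not specified in this section.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence (pure Python, O(n^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        # Orthonormal scaling: the DC term uses sqrt(1/n), the rest sqrt(2/n).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols is indexed [horizontal freq][vertical freq]; transpose back.
    return [list(r) for r in zip(*cols)]

def low_freq_energy_fraction(coeffs, k):
    """Share of total energy held by the top-left k x k coefficients."""
    total = sum(v * v for row in coeffs for v in row)
    low = sum(coeffs[u][v] ** 2 for u in range(k) for v in range(k))
    return low / total

# Hypothetical sample: a smooth 8x8 intensity ramp (assumption, not paper data).
ramp = [[8.0 * r + 4.0 * c for c in range(8)] for r in range(8)]
coeffs = dct_2d(ramp)
print("DC coefficient:", coeffs[0][0])
print("energy in 2x2 low-frequency corner:", low_freq_energy_fraction(coeffs, 2))
```

For smooth image content, most of the energy lands in the low-frequency corner of the coefficient block, which is the property a frequency-domain network can exploit by working with fewer inputs.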
* Code is available at https://github.com/ar-roy/dct-cryptonets

Published as a conference paper at ICLR 2025

[Figure: latency comparison across methods, with axis ticks from 1e2 to 1e5 and the annotation "5.3x reduction in latency". Horizontal axis, Model (Parameters) & Dataset: ResNet-20 (274K) on CIFAR-10; ResNet-18 (11M) on CIFAR-10; ResNet-18 (11M) on ImageNet. Legend, Method (FHE Scheme): Lee et al. (2022b) (CKKS); Lee et al. (2022a) [1] (CKKS); Kim & Guyot (2023) [1] (CKKS); Rovida & Leporati (2024) (CKKS); SHE (Lou & Jiang, 2019) (TFHE); DCT-CryptoNets (ours) (TFHE).]

† Lee et al. (2022a) scale to . Kim & Guyot (2023) scale to a Plain-18 network (ResNet-18 without skip connections) but only encrypt the last 8 layers when running on ImageNet.

This work introduces DCT-CryptoNets, a novel framework that addresses the computational challenges of fully homomorphic encrypted neural networks (FHENNs). Traditional convolutional neural networks operate on raw pixel data, learning features from spatial intensity variations. Instead, we utilize Discrete Cosine Transforms (DCT) to represent images in the frequency domain, enabling our models to learn features from the rate of change in intensities (Gueguen et al., 2018; Ehrlich & Davis, 2019). This not only aligns with the human visual system's differential sensitivity to perceptually