CVPR2024

Enhancing Post-Training Quantization Calibration Through Contrastive Learning

Yuzhang Shang, Gaowen Liu, Ramana Rao Kompella, Yan Yan

10 citations

Abstract

Post-training quantization (PTQ) converts a pre-trained full-precision (FP) model into a quantized model in a training-free manner. Determining suitable quantization parameters, such as scaling factors and zero points, is the primary strategy for mitigating the impact of quantization noise (calibration) and restoring the performance of the quantized models. However, the existing activation calibration methods have never considered information degradation between pre- (FP) and post-quantized activations. In this study, we introduce a well-defined distributional metric from information theory, mutual information, into PTQ calibration. We aim to calibrate the quantized activations by maximizing the mutual information between the pre- and post-quantized activations. To realize this goal, we establish a contrastive learning (CL) framework for the calibration, where the quantization parameters are optimized through a self-supervised proxy task. Specifically, by leveraging CL during the PTQ calibration, we can benefit from pulling the positive pairs of quantized and FP activations collected from the same input samples, while pushing negative pairs from different samples. Thanks to the ingeniously designed critic function, we avoid the unwanted but of tenencountered collision solution in CL, especially in calibration scenarios where the amount of calibration data is limited. Additionally, we provide a theoretical guarantee that minimizing our designed loss is equivalent to maximizing the desired mutual information. Consequently, the quantized activations retain more information, which ultimately enhances the performance of the quantized network. Experimental results show that our method can effectively serve as an add-on module to existing SoTA PTQ methods.