AAAI2026

Transform-Free Feature Coding via Entropy-Constrained Vector Quantization

Qiaoxi Chen, Changsheng Gao, Li Li, Dong Liu

Abstract

Akfmct-An iterative descent algorithm based on a Lagrangian formulation is introduced for designing vector quantizers having minimum distortion subject to an entropy constraint. These entropy-constrained vector quantizers (ECVQ's) can be used in tandem with variable rate noiseless coding systems to provide locally optimal variable rate block source coding with respect to a fidelity criterion. Experiments on sampled speech and on synthetic sources with memory indicate that for waveform coding at low rates (about 1 bit/sample) under the squared error distortion measure, about 1.6 dB improvement in the signal-to-noise ratio can be expected over the best scalar and lattice quantizers when block entropy-coded with blocklength 4. Even greater gains are made over other forms of entropy-coded vector quantizers. For pattern recognition, it is shown that the ECVQ algorithm is a generalization of the k-means and related algorithms for estimating cluster means, in that the ECVQ algorithm estimates the prior cluster probabilities as well. Experiments on multivariate Gaussian distributions show that for clustering problems involving classes with widely different priors, the ECVQ outperforms the k-means algorithm in both likelihood and probability of error. I. INTRODUCTION ONSIDER the quantization or data compression C scheme of blocking a discrete-time stationary ergodic source into contiguous vectors of length n, and representing each vector by one of m reproduction vectors. For transmission of the source across a binary channel (or storage in a binary medium), the index of each reproduction vector can be encoded into binary in a straightforward manner requiring (log, m ) / n bits per sample. For this encoding scheme, the natural design problem for a fixed block length n is to find the set of m reproduction vectors, known as the reproduction codebook, which minimizes the average distortion between the source and its reproduction, subject to a constraint on the maximum rate of bit transmission. Distortion-rate theory [ 11-[3] guarantees that if n is sufficiently large, there exists a finite set of m reproduction vectors with transmission rate R = ( log, m ) / n bits per sample, such that the average distortion per sample D is arbitrarily close to the minimum distortion D( R ) , over all possible compression schemes with average transmission rate less than or equal to R bits per sample. This provides a theoretical basis for the use of such vector quantizers; for sufficiently long block lengths, they Manuscript