ICLR 2026

Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution

Qifan Li, Jiale Zou, Jinhua Zhang, Wei Long, Xingyu Zhou, Shuhang Gu

Cited by 2

Abstract

Vector-quantization (VQ) based models have recently demonstrated strong potential for visual prior modeling. However, existing VQ-based methods simply encode visual features with their nearest codebook items and train the index predictor with code-level supervision. Because visual signals are so rich, VQ encoding often leads to a large quantization error. Furthermore, training the predictor with code-level supervision cannot take the final reconstruction error into account, resulting in sub-optimal prior modeling accuracy. In this paper, we address these two issues and propose a Texture Vector-Quantization strategy and a Reconstruction Aware Prediction strategy. The texture vector-quantization strategy leverages the characteristics of the super-resolution task and introduces a codebook only to model the prior of the missing textures. The reconstruction aware prediction strategy uses the straight-through estimator to train the index predictor directly with image-level supervision. Our proposed generative SR model (TVQ&RAP) is able to deliver photo-realistic SR results at a small computational cost.

[Figure: (a) Vanilla Vector Quantization. Vanilla codebook: models a complex feature space containing both structures and textures. (b) Texture Vector Quantization. Texture codebook: models a simpler feature space by removing (disentangling) the structures inherently present in the LR input.]
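The two mechanisms the abstract names, nearest-codebook encoding and straight-through training of an index predictor, can be illustrated with a small sketch. This is not the authors' implementation. The function names (nearest_code, softmax, ste_logit_grad), the identity decoder, the squared-error loss, and the softmax surrogate used in the backward pass are all assumptions chosen to keep the example self-contained in plain Python.

```python
import math


def nearest_code(feature, codebook):
    """Return (index, squared_error) of the codebook entry closest to feature.

    The squared error is the quantization error the abstract refers to.
    """
    best_idx, best_err = 0, float("inf")
    for i, code in enumerate(codebook):
        err = sum((f - c) ** 2 for f, c in zip(feature, code))
        if err < best_err:
            best_idx, best_err = i, err
    return best_idx, best_err


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ste_logit_grad(logits, codebook, target):
    """Gradient of a reconstruction loss w.r.t. predictor logits (hypothetical setup).

    Forward: a hard argmax picks one code, and the loss is the squared error
    between that code and the target (identity decoder, an assumption).
    Backward: the non-differentiable one-hot selection is treated as if it
    were the softmax probabilities (the straight-through approximation).
    """
    probs = softmax(logits)
    idx = max(range(len(logits)), key=lambda i: logits[i])
    z_q = codebook[idx]
    # dL/dz_q for L = sum((z_q - target)^2)
    grad_zq = [2.0 * (q - t) for q, t in zip(z_q, target)]
    # z_q = sum_k onehot_k * codebook[k], so dL/donehot_k = <grad_zq, codebook[k]>
    grad_sel = [sum(g * c for g, c in zip(grad_zq, code)) for code in codebook]
    # Straight-through: hand grad_sel to the softmax output, then apply
    # the softmax Jacobian to reach the logits.
    dot = sum(p * g for p, g in zip(probs, grad_sel))
    grad_logits = [p * (g - dot) for p, g in zip(probs, grad_sel)]
    return idx, grad_logits


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [1.0, 1.0]]
    print(nearest_code([0.9, 1.2], codebook))
    print(ste_logit_grad([1.0, 0.0], codebook, target=[1.0, 1.0]))
```

In the toy call at the bottom, the predictor prefers code 0 while the target equals code 1. The image-level loss then yields a positive gradient on logit 0 and a negative one on logit 1, so a gradient-descent step shifts the prediction toward the code that reconstructs the target better. Code-level supervision would need the ground-truth index to provide that signal.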