ICLR 2025
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao, Tongcheng Fang, Haofeng Huang, Rui Wan, Widyadewi Soedarmadji, Enshu Liu, Shiyao Li, Zinan Lin, Guohao Dai, Shengen Yan, Huazhong Yang, Xuefei Ning, Yu Wang
Abstract
Figure 1: We introduce ViDiT-Q, a quantization method specialized for diffusion transformers (DiTs) used in image and video generation. ViDiT-Q achieves lossless W8A8 quantization and minimal visual quality degradation at W4A8, yielding a 2.5x model size reduction and a 1.5x latency speedup.