ICLR 2025

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

Teng Hu, Jiangning Zhang, Ran Yi, Hongrui Huang, Yabiao Wang, Lizhuang Ma

Abstract

Recent advances in text-to-image generation have been driven by large-scale diffusion-based generative models. However, exerting control over these models, particularly for structure-conditioned text-to-image generation, remains an open challenge. One straightforward way to achieve control is fine-tuning, which often comes at the cost of efficiency. In this work, we address this challenge by introducing ELR-Diffusion (Efficient Low-rank Diffusion), a method tailored for efficient structure-conditioned image generation. Our approach leverages the low-rank decomposition of model weights, reducing memory cost and model parameters by up to 58% while performing comparably to larger models trained with expansive datasets and more computational resources. At the heart of ELR-Diffusion lies a two-stage training scheme that combines low-rank decomposition with a knowledge distillation strategy. To provide a robust assessment of our model, we undertake a thorough comparative analysis in the controllable text-to-image generation domain. We employ a diverse array of evaluation metrics across various conditions, including edge maps and segmentation maps, together with image quality measures, offering a holistic view of the model's capabilities. We believe that ELR-Diffusion has the potential to serve as an efficient foundation model for diverse user applications that demand accurate comprehension of inputs containing multiple kinds of conditional information.

To fully exploit the prior knowledge of the pre-trained teacher model while using less data and training a lightweight diffusion model, we propose a new two-stage training scheme. The first stage is an initialization strategy that inherits knowledge from the teacher model. The second is a knowledge distillation strategy. The overall pipeline is shown in Figure 3.
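As a rough illustration of why a low-rank decomposition of model weights reduces parameters, the sketch below replaces a dense weight matrix W (out_dim x in_dim) with two factors A (out_dim x rank) and B (rank x in_dim) and counts the parameters of each form. It is a minimal sketch, not the paper's implementation: the layer sizes, the rank, and every function name here are assumptions chosen for illustration, and the resulting saving is unrelated to the 58% figure reported above.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def random_matrix(rows, cols, rng):
    """Matrix with standard-normal entries (illustrative initialization only)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def param_saving(out_dim, in_dim, rank):
    """Fraction of parameters saved by storing A and B instead of the dense W."""
    full = out_dim * in_dim
    factored = rank * (out_dim + in_dim)
    return 1.0 - factored / full


rng = random.Random(0)
out_dim, in_dim, rank = 64, 48, 8  # hypothetical sizes, not taken from the paper

a = random_matrix(out_dim, rank, rng)  # A: out_dim x rank
b = random_matrix(rank, in_dim, rng)   # B: rank x in_dim
x = [rng.gauss(0.0, 1.0) for _ in range(in_dim)]

# Dense path: materialize W = A B, then apply it to x.
dense_out = matvec(matmul(a, b), x)
# Factored path: apply B, then A, without ever forming W.
factored_out = matvec(a, matvec(b, x))

max_gap = max(abs(p - q) for p, q in zip(dense_out, factored_out))
saving = param_saving(out_dim, in_dim, rank)
```

The two paths agree up to floating-point rounding, while the factored form stores rank * (out_dim + in_dim) values instead of out_dim * in_dim. The saving grows as the rank shrinks relative to the layer dimensions, and the factored form uses more parameters than the dense one once the rank exceeds out_dim * in_dim / (out_dim + in_dim).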