EMNLP 2025
AROMA: Autonomous Rank-one Matrix Adaptation
Hao Nan Sheng, Zhi-Yong Wang, Hing Cheung So, Mingrui Yang
Abstract
As large language models continue to grow in size, parameter-efficient fine-tuning (PEFT) has become increasingly crucial. While low-rank adaptation (LoRA) offers a solution through low-rank updates, its static rank allocation may yield suboptimal results. Adaptive low-rank adaptation (AdaLoRA) improves on this with dynamic allocation but remains sensitive to the initial and target rank configurations. We introduce AROMA, a framework that automatically constructs layer-specific updates by iteratively building up rank-one components with very few trainable parameters, whose count gradually diminishes to zero. Unlike existing methods that employ rank reduction mechanisms, AROMA introduces a dual-loop architecture for rank growth. The inner loop extracts information from each rank-one subspace, while the outer loop determines the number of rank-one subspaces, i.e., the optimal rank. We reset optimizer states to maintain subspace independence. AROMA significantly reduces the number of parameters compared to LoRA and AdaLoRA while achieving superior performance on natural language understanding and generation, and on commonsense reasoning, offering new insights into adaptive PEFT. The code is available at https://github.com/ShuDun23/AROMA .

[Figure 1 panels, each plotted against training steps (0 to 3k) for LoRA, AdaLoRA, and AROMA: (a) number of trainable parameters, (b) total rank, (c) rank of layer.0.attention.output.dense, (d) rank of layer.9.attention.self.value, (e) accuracy.]

Figure 1: Results for LoRA (r=8), AdaLoRA (r=8), and AROMA (ours), showing the number of trainable parameters, total rank, rank of a specific layer, and evaluation accuracy versus training step for RoBERTa-base on the MRPC task. For AROMA, training of "layer.0.attention.output.dense" and "layer.9.attention.self.value" automatically terminates at 2000 and 1600 steps, respectively, while the overall training automatically stops at 2400 steps.
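The dual-loop rank growth described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a plain least-squares fit to a fixed matrix instead of language-model fine-tuning, a heavy-ball momentum optimizer, a fixed inner step budget, and a relative-improvement rule for stopping the outer loop. The function name `grow_rank_one` and all of its parameters are hypothetical.

```python
import numpy as np


def grow_rank_one(target, max_rank=8, inner_steps=1000, lr=0.05,
                  momentum=0.5, outer_tol=1e-3, seed=0):
    """Approximate `target` by a sum of rank-one terms b @ a, grown one at a time.

    Hypothetical toy, not the AROMA code: the objective is least squares and
    the outer stopping rule is an assumed relative-improvement test.
    """
    rng = np.random.default_rng(seed)
    m, n = target.shape
    frozen = np.zeros((m, n))              # sum of accepted rank-one components
    components = []
    init_loss = 0.5 * np.sum(target ** 2) + 1e-12   # small offset avoids 0/0
    prev_loss = init_loss

    for _ in range(max_rank):              # outer loop: how many components?
        b = rng.normal(scale=0.1, size=(m, 1))
        a = rng.normal(scale=0.1, size=(1, n))
        # Fresh momentum buffers: the optimizer state is reset for every new
        # component so earlier subspaces do not leak into this one.
        vel_b = np.zeros_like(b)
        vel_a = np.zeros_like(a)

        for _ in range(inner_steps):       # inner loop: train one rank-one term
            resid = frozen + b @ a - target
            grad_b = resid @ a.T           # gradient of the loss w.r.t. b
            grad_a = b.T @ resid           # gradient of the loss w.r.t. a
            vel_b = momentum * vel_b - lr * grad_b
            vel_a = momentum * vel_a - lr * grad_a
            b = b + vel_b
            a = a + vel_a

        loss = 0.5 * np.sum((frozen + b @ a - target) ** 2)
        if (prev_loss - loss) / init_loss < outer_tol:
            break                          # the new component did not help: stop
        frozen = frozen + b @ a            # freeze the accepted component
        components.append((b, a))
        prev_loss = loss

    return components, prev_loss


# Demo on a matrix of true rank 2 (singular values 3 and 1).
target = np.zeros((6, 5))
target[0, 0] = 3.0
target[1, 1] = 1.0
components, final_loss = grow_rank_one(target)
print(len(components), final_loss)
```

In this sketch the rank is never fixed in advance: the outer loop keeps adding components until one fails to reduce the loss by the assumed threshold, which mirrors the growth-based (rather than pruning-based) rank selection the abstract describes.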