ICLR 2025
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Leonardo Iurada, Marco Ciccone, Tatiana Tommasi
Abstract
Task arithmetic has emerged as a promising approach for editing models by representing task-specific knowledge as composable task vectors. However, existing methods rely on network linearization to derive task vectors, leading to computational bottlenecks during training and inference. Moreover, linearization alone does not ensure weight disentanglement, the key property that enables conflict-free composition of task vectors. To address this, we propose TaLoS, which builds sparse task vectors with minimal interference without requiring explicit linearization or sharing information across tasks. We find that pre-trained models contain a subset of parameters with consistently low gradient sensitivity across tasks, and that sparsely updating only these parameters promotes weight disentanglement during fine-tuning. Our experiments show that TaLoS improves training and inference efficiency while outperforming current methods in task addition and negation. By enabling modular parameter editing, our approach fosters practical deployment of adaptable foundation models in real-world applications.

In this work, we first show that model linearization alone is not sufficient, as its task functions can still activate for arbitrary inputs. Instead, we propose a set of function localization constraints to exactly implement the weight disentanglement property on linearized networks. Then, we introduce a novel sparse fine-tuning approach that implements such constraints while avoiding the need for explicit model linearization. The proposed method strategically updates a subset of model parameters, simultaneously promoting linearized behavior and enforcing function localization. Extensive empirical analyses and theoretical justifications demonstrate that our approach effectively promotes weight disentanglement, ensuring compatibility between task vectors without the need for sharing information between users and tasks.
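To make the editing operations concrete, the following is a minimal sketch, not the authors' implementation, of how sparse task vectors restricted to low-sensitivity parameters could be built and combined by addition and negation. The function names, the toy sensitivity scores, the `keep_ratio` parameter, and the use of a single mask shared by all tasks are assumptions introduced for illustration. In particular, the sketch masks a dense update after the fact, whereas the method described here applies the sparsity during fine-tuning.

```python
from typing import List, Sequence


def low_sensitivity_mask(scores: Sequence[float], keep_ratio: float) -> List[int]:
    """Return a 0/1 mask selecting the fraction `keep_ratio` of parameters
    with the LOWEST sensitivity scores (hypothetical selection rule)."""
    k = max(1, int(len(scores) * keep_ratio))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    chosen = set(order[:k])
    return [1 if i in chosen else 0 for i in range(len(scores))]


def sparse_task_vector(theta0: Sequence[float],
                       theta_ft: Sequence[float],
                       mask: Sequence[int]) -> List[float]:
    """Task vector (fine-tuned minus pre-trained), zeroed outside the mask."""
    return [m * (f - p) for p, f, m in zip(theta0, theta_ft, mask)]


def apply_task_vectors(theta0: Sequence[float],
                       task_vectors: Sequence[Sequence[float]],
                       alphas: Sequence[float]) -> List[float]:
    """Edit the pre-trained weights: theta0 + sum_t alpha_t * tau_t.
    A positive alpha adds a task; a negative alpha negates it."""
    edited = list(theta0)
    for tau, alpha in zip(task_vectors, alphas):
        edited = [w + alpha * d for w, d in zip(edited, tau)]
    return edited


if __name__ == "__main__":
    theta0 = [0.5, -1.0, 2.0, 0.0]           # toy pre-trained weights
    scores = [0.9, 0.1, 0.8, 0.2]            # toy sensitivity scores (assumed)
    mask = low_sensitivity_mask(scores, keep_ratio=0.5)

    theta_a = [0.75, -0.5, 2.25, 0.25]       # toy fine-tuned weights, task A
    theta_b = [0.5, -1.25, 2.0, 0.5]         # toy fine-tuned weights, task B
    tau_a = sparse_task_vector(theta0, theta_a, mask)
    tau_b = sparse_task_vector(theta0, theta_b, mask)

    print("mask:", mask)
    print("task addition:", apply_task_vectors(theta0, [tau_a, tau_b], [1.0, 1.0]))
    print("task negation:", apply_task_vectors(theta0, [tau_a], [-1.0]))
```

Because both toy task vectors are zero outside the masked coordinates, the remaining pre-trained weights are left unchanged by either operation.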
This compatibility enables efficient and robust model editing through the simple addition and subtraction of sparse task vectors, facilitating decentralized collaborative strategies.

We can summarize our main contributions as follows.
• We advance the field of task arithmetic by deriving a novel set of function localization constraints that provide exact guarantees of weight disentanglement on linearized networks.
• We empirically observe that the least sensitive parameters in transformer-based architectures pre-trained on large-scale datasets can be consistently identified regardless of the task. We exploit this regularity to satisfy the localization constraints under strict individual training assumptions.
• We introduce Task-Localized Sparse Fine-Tuning (TaLoS), which enables task arithmetic by jointly implementing the localization constraints and inducing a linear regime during fine-tuning, without incurring the overhead of explicit network linearization.

Overall, our work addresses a critical gap in task arithmetic, providing a more complete and practical framework for parameter-space model editing, targeting real-world applications.

RELATED WORKS

Sparsity & Parameter-Efficient Fine-Tuning. Sparsity has emerged as a fundamental concept in efficient deep learning, manifesting in both training and adaptation methodologies. Sparse fine-tuning strategies (Guo et al., 2021; Xu et al., 2021) improve training efficiency by selectively updating subsets of model parameters. These approaches often leverage the Fisher information matrix (Fisher