ACL2023

Parameter-efficient Weight Ensembling Facilitates Task-level Knowledge Transfer

Xingtai Lv, Ning Ding, Yujia Qin, Zhiyuan Liu, Maosong Sun

被引用 2 次

摘要

Recent studies show that large-scale pretrained language models could be efficaciously adapted to particular tasks in a parameterefficient manner. The trained lightweight set of parameters, such as adapters, can be easily stored and shared as a capability equipped with the corresponding models. Owning many lightweight parameters, we focus on transferring them between tasks to acquire an improvement in performance of new tasks, the key point of which is to obtain the similarity between tasks. In this paper, we explore 5 parameter-efficient weight ensembling methods to achieve such transferability and verify the effectiveness of them. These methods extract the information of datasets and trained lightweight parameters from different perspectives to obtain the similarity between tasks, and weight the existing lightweight parameters according to the comparability to acquire a suitable module for the initialization of new tasks. We apply them to three parameter-efficient tuning methods and test them on a wide set of downstream tasks. Experimental results show that our methods show an improvement of 5% 8% over baselines and could largely facilitate task-level knowledge transfer. Web Search Environment Content Find key words….. 1/9 为什么⽕⻋轨道总是停在岩 Why do train tracks always stop on 为什么普通铁路铁轨上都会铺⽯⼦? 普通铁路采⽤的是有砟轨道,⽽没有" " 道砟 ")的⾼铁采⽤的是⽆砟轨道。 统有砟轨道的优点;3. 传统有砟轨道 同样是铁轨,为什么⽕⻋轨道下⾯会 ⼦,⾼铁轨道却不铺? Why ordinary railroad tracks are paved The same is the railway track, why are under the train track, but not the high-s 道砟的好处之⼀是能够降低列⻋经过时 和热量… Web Search Environmen Content Find key words….. 为什么⽕⻋轨道总是停在 Why do train tracks always stop o 为什么普通铁路铁轨上都会铺⽯⼦ 普通铁路采⽤的是有砟轨道,⽽没有 " 道砟 ")的⾼铁采⽤的是⽆砟轨道。 统有砟轨道的优点;3. 传统有砟轨道