ACL2025
Dynamic Prefix as Instructor for Incremental Named Entity Recognition: A Unified Seq2Seq Generation Framework
Zihao Wu, YongXiang Hua, Yongxin Zhu, Fang Zhang, Linli Xu
Abstract
The Incremental Named Entity Recognition (INER) task aims to update a model so that it extracts entities from an expanding set of entity type candidates, a setting motivated by concerns about data privacy and scarcity. However, conventional incremental learning methods for INER often suffer from the catastrophic forgetting problem, which degrades the model's performance on previously encountered entity types. In this paper, we propose a parameter-efficient dynamic prefix method and formalize INER as a unified seq2seq generation task. By employing the dynamic prefix as a task instructor to guide the generative model, our approach can preserve task-invariant knowledge while adapting to new entities with minimal parameter updates, making it particularly effective in low-resource scenarios. Additionally, we design a generative label augmentation strategy and a novel self-entropy loss to balance the stability and plasticity of the model. Empirical experiments on NER benchmarks demonstrate the effectiveness of our proposed method in addressing the challenges associated with INER.

Continual learning aims to learn a sequence of tasks incrementally, which mirrors the human capability of learning and accumulating knowledge continually without forgetting previously learned knowledge and leveraging it to facilitate learning new tasks (Ke and Liu, 2022). However, catastrophic forgetting (McCloskey and Cohen, 1989) poses a significant challenge in continual learning, where the model gradually forgets previous knowledge in the current learning step. In continual learning for NER, the information of previous and future entity types is missing in the current step. Ma et al. (2023) point out that the majority of prediction errors of INER stem from the confusion between pre-defined entities and other entities ("O").
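The missing-label problem described above can be made concrete with a small sketch. It is not taken from the paper: it assumes BIO-tagged tokens, a hypothetical helper named `mask_to_current_types`, and an invented example sentence. It shows how the annotations available at one step keep only the current entity types, while every other entity collapses to "O".

```python
from typing import List, Set


def mask_to_current_types(tags: List[str], current_types: Set[str]) -> List[str]:
    """Keep BIO tags whose entity type is annotated at this step; map the rest to "O"."""
    masked = []
    for tag in tags:
        # A BIO tag looks like "B-PER" or "I-LOC"; "O" marks non-entity tokens.
        entity_type = tag.split("-", 1)[1] if "-" in tag else None
        masked.append(tag if entity_type in current_types else "O")
    return masked


# Hypothetical sentence and gold tags, for illustration only.
tokens = ["Alice", "visited", "Paris", "during", "the", "Olympics"]
full_tags = ["B-PER", "O", "B-LOC", "O", "O", "B-MISC"]

# At a step where only "MISC" is annotated, "PER" and "LOC" become "O".
step_tags = mask_to_current_types(full_tags, {"MISC"})
print(list(zip(tokens, step_tags)))
```

Training directly on `step_tags` would teach a model that "Alice" and "Paris" are non-entities, which is one way the confusion between previously learned entities and "O" can arise.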
As shown in Figure 1, a model that has learned to recognize "PER" (person) and "LOC" (location) in one step would be trained to annotate "PER" or "LOC" as "O" in the current and subsequent steps. At step t, only the entity type "MISC" (miscellaneous) is labeled, which leads to the wrong prediction of the entity "Croatia". This indicates that the model has forgotten the entity information of "LOC" learned in previous tasks.

Directly training the model on the new data will exacerbate this problem with background shift

[...] in significantly fewer parameters to fine-tune compared to prior INER methods. During inference, all prefixes collaborate to generate a sequence of entity types from current options and their corresponding entities. Moreover, we integrate the generation-based label augmentation strategy and self-entropy loss to achieve a more refined equilibrium between stability and plasticity.

Our main contributions are summarized as follows:

• We propose a dynamic prefix method to retain task-invariant capabilities and preserve task-specific knowledge in INER.
• As an instructor, our proposed dynamic prefix method inspires the seq2seq model, demonstrating robustness and practicality, particularly in the more realistic low-resource setting.
• Empirical experiments on INER benchmarks demonstrate the effectiveness of our proposed DPI. Notably, our method based on a generation architecture achieves better performance with significantly fewer fine-tuned parameters than prior sequence labeling INER methods.

2 Related Work

2.1 Class-Incremental Learning

Prior approaches to class-incremental learning can be divided into three categories: (1) Architecture-based methods dynamically adjust the model architecture to learn new knowledge while mitigating forgetting of previously learned tasks (Chen