ACL 2024
Order-Agnostic Data Augmentation for Few-Shot Named Entity Recognition
Huiming Wang, Liying Cheng, Wenxuan Zhang, De Wen Soh, Lidong Bing
Abstract
Data augmentation (DA) methods have proven effective for pre-trained language models (PLMs) in low-resource settings, including few-shot named entity recognition (NER). However, existing NER DA techniques either perform rule-based manipulations on words, which break the semantic coherence of the sentence, or exploit generative models for entity or context substitution, which requires a substantial amount of labeled data and thus contradicts the objective of operating in low-resource settings. In this work, we propose order-agnostic data augmentation (OADA), an alternative solution that exploits the often overlooked order-agnostic property in the training data construction phase of sequence-to-sequence NER methods for data augmentation. To effectively utilize the augmented data without suffering from the one-to-many issue, where multiple augmented target sequences exist for a single sentence, we further propose the use of ordering instructions and an innovative OADA-XE loss. Specifically, by treating each permutation of entity types as an ordering instruction, we rearrange the entity set accordingly, ensuring a distinct input-output pair, while OADA-XE assigns loss based on the best match between the target sequence and model predictions. We conduct comprehensive experiments and analyses across three major NER benchmarks and show that OADA significantly enhances the few-shot capabilities of PLMs. Our code is available at https://github.com/Circle-Ming/OADA-NER.
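The ordering-instruction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `oada_augment`, the instruction wording, and the `"<span> is <type>"` target format are all assumptions made for illustration. The sketch only shows how each permutation of entity types yields a distinct input-output pair for the same sentence.

```python
from itertools import permutations

def oada_augment(sentence, entities, entity_types):
    """Build one (source, target) pair per permutation of entity types.

    `entities` is a list of (span_text, entity_type) tuples.
    The prompt and target formats below are hypothetical, not from the paper.
    """
    pairs = []
    for order in permutations(entity_types):
        # Position of each type in this ordering instruction.
        rank = {etype: i for i, etype in enumerate(order)}
        # sorted() is stable, so entities of the same type keep their original order.
        ordered = sorted(entities, key=lambda e: rank[e[1]])
        instruction = "Order: " + ", ".join(order)
        source = f"{instruction}. Sentence: {sentence}"
        target = "; ".join(f"{text} is {etype}" for text, etype in ordered)
        pairs.append((source, target))
    return pairs

sentence = "Barack Obama visited Paris"
entities = [("Barack Obama", "PER"), ("Paris", "LOC")]
pairs = oada_augment(sentence, entities, ["PER", "LOC"])
for source, target in pairs:
    print(source, "->", target)
```

Because the ordering instruction is part of the input, each augmented target corresponds to a different source string, which is how the one-to-many issue is avoided in this sketch. The OADA-XE loss is not covered here.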