ACL 2023
Improving Self-training for Cross-lingual Named Entity Recognition with Contrastive and Prototype Learning
Ran Zhou, Xin Li, Lidong Bing, Erik Cambria, Chunyan Miao
Cited by 19
Abstract
In cross-lingual named entity recognition (NER), self-training is commonly used to bridge the linguistic gap by training on pseudo-labeled target-language data. However, due to sub-optimal performance on target languages, the pseudo labels are often noisy and limit the overall performance. In this work, we aim to improve self-training for cross-lingual NER by combining representation learning and pseudo label refinement in one coherent framework. Our proposed method, ContProto, comprises two main components: (1) contrastive self-training and (2) prototype-based pseudo-labeling. Our contrastive self-training facilitates span classification by separating clusters of different classes, and enhances cross-lingual transferability by producing closely aligned representations between the source and target languages. Meanwhile, prototype-based pseudo-labeling effectively improves the accuracy of pseudo labels during training. We evaluate ContProto on multiple transfer pairs, and experimental results show that our method brings substantial improvements over current state-of-the-art methods.
* Ran Zhou is under the Joint Ph.D. Program between Alibaba and Nanyang Technological University.