ACL2025

Enhancing NER by Harnessing Multiple Datasets with Conditional Variational Autoencoders

Taku Oi, Makoto Miwa

Abstract

We propose a novel method that integrates a Conditional Variational Autoencoder (CVAE) into a span-based Named Entity Recognition (NER) model. The approach models the information that is shared and unshared among the labels of multiple datasets, which eases joint training on those datasets. Experiments on multiple biomedical datasets demonstrate the effectiveness of the proposed method, with improved performance on the BioRED dataset. Our source code is publicly available on GitHub.
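
For readers unfamiliar with CVAEs, the following is a minimal sketch of the generic CVAE training objective (a negative evidence lower bound) applied to a single labelled span. It is not the authors' implementation. The function names, the diagonal-Gaussian latent, the learned conditional prior, and the `beta` weight are all illustrative assumptions; the abstract does not specify how the CVAE is wired into the span-based model.

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over latent dimensions."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]

def cvae_loss(logits, label, mu_q, logvar_q, mu_p, logvar_p, beta=1.0):
    """Negative ELBO for one span (hypothetical setup).

    logits:          label scores decoded from a latent sample z
    mu_q, logvar_q:  posterior q(z | span, label) parameters (assumed)
    mu_p, logvar_p:  prior p(z | span) parameters (assumed)
    beta:            KL weight (assumed; 1.0 gives the standard ELBO)
    """
    reconstruction = cross_entropy(logits, label)
    kl = gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)
    return reconstruction + beta * kl
```

In a multi-dataset setting, one plausible use of such a latent variable is to let labels from different datasets share structure in the latent space while staying distinct in the label decoder, but that mapping is an assumption here, not a description of the proposed model.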