EMNLP 2025

BIRD: Bronze Inscription Restoration and Dating

Wenjie Hua, Hoang H. Nguyen, Gangyan Ge

Abstract

Bronze inscriptions from early China are often fragmentary, with missing or undeciphered characters and uncertain chronological assignments. To address this, we propose BIRD (Bronze Inscription Restoration and Dating), a dataset and framework that adapts pretrained language models (PLMs) to the demands of ancient texts. By combining domain-adaptive pretraining (DAPT) and task-adaptive pretraining (TAPT) with a glyph net resource that links graphemes and allographs, our approach addresses two key challenges: the low-resource setting and the prevalence of allography. Our results show marked improvements in both restoration and dating accuracy.