ACL 2024

ProLex: A Benchmark for Language Proficiency-oriented Lexical Substitution

Xuanming Zhang, Zixun Chen, Zhou Yu

Abstract

Lexical Substitution discovers appropriate substitutes for a given target word in a context sentence. However, the task fails to consider substitutes that are of equal or higher proficiency than the target, an aspect that could be beneficial for language learners looking to improve their writing. To bridge this gap, we propose a new task: language proficiency-oriented lexical substitution. We also introduce ProLex, a novel benchmark designed to assess systems' ability to generate not only appropriate substitutes but also substitutes that demonstrate better language proficiency. Besides the benchmark, we propose models that can automatically perform the new task. We show that our best model, a Llama2-13B model fine-tuned with task-specific synthetic data, outperforms ChatGPT by an average of 3.2% in F-score and achieves comparable results with GPT-4 on ProLex.¹