ACL2025
RusConText Benchmark: A Russian Language Evaluation Benchmark for Understanding Context
Andrey Chirkin, Svetlana Kuznetsova, Maria Volina, Anna Dengina
Abstract
This paper presents an implementation of an approach similar to that of Zhu et al. (2024), adapted to Russian-language data. We introduce the RusConText Benchmark for evaluating short-context understanding in Russian, comprising four distinct yet interrelated tasks: coreference resolution, discourse understanding, idiom interpretation, and ellipsis resolution. Each task targets a specific aspect of linguistic processing, challenging a large language model to recover omitted information, resolve referential dependencies, and interpret idioms and discourse. The RusConText Benchmark complements standard benchmarks by assessing model performance specifically on context understanding. We also report the results of evaluating four models on the benchmark.