ACL 2025

DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing

Haneul Yoo, Jieun Han, So-Yeon Ahn, Alice Oh

Cited by 13

Abstract

Automated essay scoring (AES) is a useful tool in English as a Foreign Language (EFL) writing education, offering real-time essay scores for students and instructors. However, previous AES models were trained on essays and scores irrelevant to the practical scenarios of EFL writing education and usually provided a single holistic score due to the lack of appropriate datasets. In this paper, we release DREsS, a large-scale, standard dataset for rubric-based automated essay scoring with 48.9K samples in total. DREsS comprises three sub-datasets: DREsS_New, DREsS_Std., and DREsS_CASE. We collect DREsS_New, a real-classroom dataset with 2.3K essays authored by EFL undergraduate students and scored by English education experts. We also standardize existing rubric-based essay scoring datasets as DREsS_Std. We suggest CASE, a corruption-based augmentation strategy for essays, which generates 40.1K synthetic samples of DREsS_CASE and improves the baseline results by 45.44%. DREsS will enable further research to provide a more accurate and practical AES system for EFL writing education.

[Figure: the three DREsS sub-datasets. (1) DREsS_New (2,279 samples): EFL classroom data consisting of student-written essays and rubric-based scores assessed by instructors. (2) DREsS_Std. (6,515 samples): unified AES datasets with standardized rubrics under professional consultation. (3) DREsS_CASE (40,185 samples): synthetic essay samples generated through corruption by CASE, our proposed augmentation strategy.]
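The abstract does not spell out how CASE corrupts an essay or how it labels the synthetic samples. As a rough illustration of what a corruption-based augmentation could look like, the sketch below replaces a growing number of sentences with unrelated ones and lowers the score in proportion. The function names, the sentence-replacement corruption, and the linear score rule are all assumptions made for this example and are not taken from the paper.

```python
import random


def corrupt_essay(sentences, distractors, n_corrupt, rng):
    """Replace n_corrupt randomly chosen sentences with distractor sentences.

    Hypothetical corruption rule, chosen only for illustration.
    """
    if not 0 <= n_corrupt <= len(sentences):
        raise ValueError("n_corrupt must be between 0 and len(sentences)")
    corrupted = list(sentences)
    # Pick distinct positions so exactly n_corrupt sentences are replaced.
    for i in rng.sample(range(len(sentences)), n_corrupt):
        corrupted[i] = rng.choice(distractors)
    return corrupted


def synthetic_score(original_score, n_corrupt, n_sentences):
    """Assumed label rule: scale the score by the fraction of intact sentences."""
    return original_score * (1 - n_corrupt / n_sentences)


def augment(sentences, score, distractors, seed=0):
    """Build one synthetic (essay, score) pair per corruption level 1..N."""
    rng = random.Random(seed)
    n = len(sentences)
    return [
        (corrupt_essay(sentences, distractors, k, rng), synthetic_score(score, k, n))
        for k in range(1, n + 1)
    ]


essay = ["Cities need parks.", "Parks improve health.", "They cut noise.", "So fund them."]
samples = augment(essay, score=4.0, distractors=["My cat likes tuna."])
for text, label in samples:
    print(label, text)
```

Each well-scored essay yields several lower-scored variants, which is one way a small labeled set could be expanded into a much larger synthetic one.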