ACL 2024

MAVEN-ARG: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation

Xiaozhi Wang, Hao Peng, Yong Guan, Kaisheng Zeng, Jianhui Chen, Lei Hou, Xu Han, Yankai Lin, Zhiyuan Liu, Ruobing Xie, Jie Zhou, Juanzi Li

Abstract

Understanding events in texts is a core objective of natural language understanding, which requires detecting event occurrences, extracting event arguments, and analyzing inter-event relationships. However, due to the annotation challenges brought by task complexity, a large-scale dataset covering the full process of event understanding has long been absent. In this paper, we introduce MAVEN-ARG, which augments the MAVEN datasets with event argument annotations, making it the first all-in-one dataset supporting event detection, event argument extraction (EAE), and event relation extraction. As an EAE benchmark, MAVEN-ARG offers three main advantages: (1) a comprehensive schema covering 162 event types and 612 argument roles, all with expert-written definitions and examples; (2) a large data scale, containing 98,591 events and 290,613 arguments obtained with laborious human annotation; (3) exhaustive annotation supporting all task variants of EAE, covering both entity and non-entity event arguments at the document level. Experiments indicate that MAVEN-ARG is quite challenging for both fine-tuned EAE models and proprietary large language models (LLMs). Furthermore, to demonstrate the benefits of an all-in-one dataset, we preliminarily explore a potential application, future event prediction, with LLMs. MAVEN-ARG and our baseline code will be publicly released.

et al., 2021; Peng et al., 2023b): event detection (ED), which detects event occurrences by identifying event triggers and classifying event types; event argument extraction (EAE), which extracts event arguments and classifies their argument roles; and event relation extraction (ERE), which analyzes the coreference, temporal, causal, and hierarchical relationships among events.

Despite the importance of event understanding, a large-scale dataset covering all the event understanding tasks has long been absent.
Established sentence-level event extraction (ED and EAE) datasets like ACE 2005 (Walker et al., 2006) and TAC KBP (Ellis et al., 2015, 2016; Getman et al., 2017) do not involve event relation types besides the basic coreferences. RAMS (Ebner et al., 2020) and WikiEvents (Li et al., 2021) extend EAE to the document level but do not involve event relations. ERE datasets are mostly developed independently for coreference (Cybul-