ACL 2023

Few-shot Event Detection: An Empirical Study and a Unified View

Yubo Ma, Zehao Wang, Yixin Cao, Aixin Sun


Abstract

Few-shot event detection (ED) has been widely studied, but this has brought noticeable discrepancies, e.g., various motivations, tasks, and experimental settings, that hinder the understanding of models for future progress. This paper presents a thorough empirical study, a unified view of ED models, and a better unified baseline. For fair evaluation, we compare 12 representative methods, roughly grouped into prompt-based and prototype-based models for detailed analysis, on three datasets. Experiments consistently demonstrate that prompt-based methods, including ChatGPT, still significantly trail prototype-based methods in terms of overall performance. To investigate the superior performance of prototype-based methods, we break down their design elements along several dimensions and build a unified framework on them. Under this unified view, each prototype-based method can be viewed as a combination of different modules from these design elements. We further combine all advantageous modules and propose a simple yet effective baseline, which outperforms existing methods by a large margin (e.g., 2.7% F1 gains under the low-resource setting).
| Group | Method | Task settings marked (of LR, EL, CT, TT) | Dataset | Sample number | Sample source |
|---|---|---|---|---|---|
| Prototype-based | Seed-based (Bronstein et al., 2015) | ✓ | ACE | 30 | Guidelines |
| Prototype-based | MSEP (Peng et al., 2016) | ✓ ✓ | ACE | 0 | Guidelines |
| Prototype-based | ZSL (Huang et al., 2018) | ✓ | ACE | 0 | Datasets |
| Prototype-based | DMBPN (Deng et al., 2020) | ✓ | FewEvent | 5,10,15-shot | Datasets |
| Prototype-based | OntoED (Deng et al., 2021) | ✓ ✓ | MAVEN / FewEvent | 0,1,5,10,15,20% | Datasets |
| Prototype-based | Zhang's (Zhang et al., 2021) | ✓ | ACE | 0 | Corpus |
| Prototype-based | PA-CRF (Cong et al., 2021) | ✓ | FewEvent | 5,10-shot | Datasets |
| Prototype-based | ProAcT (Lai et al., 2021) | ✓ | ACE / FewEvent / RAMS | 5,10-shot | Datasets |
| Prototype-based | CausalED (Chen et al., 2021) | ✓ | ACE / MAVEN / ERE | 5-shot | Datasets |
| Prototype-based | Yu's (Yu et al., 2022) | ✓ | ACE | 176 | Guidelines + Corpus |
| Prototype-based | ZED (Zhang et al., 2022a) | ✓ | MAVEN | 0 | Corpus |
| Prototype-based | HCL-TAT (Zhang et al., 2022b) | ✓ | FewEvent | 5,10-shot | Datasets |
| Prototype-based | KE-PN (Zhao et al., 2022) | ✓ | ACE / MAVEN / FewEvent | 1,5-shot | Datasets |
| Prompt-based | EERC (Liu et al., 2020) | ✓ ✓ ✓ | ACE | 0,1,5,10,20% | Datasets |
| Prompt-based | FSQA (Feng et al., 2020) | ✓ ✓ | ACE | 0,1,3,5,7,9-shot | Datasets |
| Prompt-based | EDTE (Lyu et al., 2021) | ✓ | ACE / ERE | 0 | - |
| Prompt-based | Text2Event (Lu et al., 2021) | ✓ | ACE / ERE | 1,5,25% | Datasets |
| Prompt-based | UIE (Lu et al., 2022) | ✓ ✓ | ACE / CASIE | 1,5,10-shot/% | Datasets |
| Prompt-based | DEGREE (Hsu et al., 2022) | ✓ ✓ | ACE / ERE | 0,1,5,10-shot | Datasets |
| Prompt-based | PILED (Li et al., 2022b) | ✓ ✓ | ACE / MAVEN / FewEvent | 5,10-shot | Datasets |

... conduct an empirical study of twelve SOTA methods under two practical settings: a low-resource setting for generalization ability and a class-transfer setting for transferability. We roughly classify the existing methods into two groups: prototype-based models, which learn event-type representations and a proximity measurement for prediction, and prompt-based models, which convert ED into a task familiar to Pretrained Language Models (PLMs).
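To make the prototype-based idea concrete (one representation per event type, prediction by proximity), the following is a minimal sketch and not the implementation of any method listed here. It assumes that a prototype is the mean of a type's support embeddings and that proximity is Euclidean distance. Both are illustrative choices; the surveyed methods differ on exactly these design elements. The function names, the toy 2-dimensional vectors, and the labels "Attack" and "Transport" are hypothetical stand-ins for PLM-encoded trigger candidates and real event types.

```python
import math
from collections import defaultdict


def build_prototypes(support):
    """Average the support embeddings of each event type into one prototype.

    `support` is a list of (embedding, event_type) pairs. The embeddings
    used below are hand-written toy vectors, not PLM outputs.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(query, prototypes):
    """Return the event type whose prototype is nearest to the query.

    Euclidean distance is an assumption made for this sketch; other
    proximity measures could be swapped in here.
    """
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy 2-shot support set (illustrative values and labels only).
support = [
    ([1.0, 0.0], "Attack"),
    ([0.8, 0.2], "Attack"),
    ([0.0, 1.0], "Transport"),
    ([0.2, 0.8], "Transport"),
]
prototypes = build_prototypes(support)
prediction = predict([0.9, 0.3], prototypes)
```

In this toy run the query lies much closer to the "Attack" prototype than to the "Transport" prototype, so `prediction` is "Attack". A prompt-based model would instead recast the same decision as a task the PLM already handles.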