EMNLP 2023

From Dissonance to Insights: Dissecting Disagreements in Rationale Construction for Case Outcome Classification

Shanshan Xu, T. Y. S. S. Santosh, Oana Ichim, Isabella Risini, Barbara Plank, Matthias Grabmair

Cited by 1

Abstract

In legal NLP, Case Outcome Classification (COC) must not only be accurate but also trustworthy and explainable. Existing work in explainable COC has been limited to annotations by a single expert. However, it is well-known that lawyers may disagree in their assessment of case facts. We hence collect a novel dataset RAVE: Rationale Variation in ECHR, which is obtained from two experts in the domain of international human rights law, for whom we observe weak agreement. We study their disagreements and build a two-level task-independent taxonomy, supplemented with COC-specific subcategories. We quantitatively assess different taxonomy categories and find that disagreements mainly stem from underspecification of the legal context, which poses challenges given the typically limited granularity and noise in COC metadata. To our knowledge, this is the first work in legal NLP that focuses on building a taxonomy over human label variation. We further assess the explainability of state-of-the-art COC models on RAVE and observe limited agreement between models and experts. Overall, our case study reveals hitherto underappreciated complexities in creating benchmark datasets in legal NLP that revolve around identifying aspects of a case’s facts supposedly relevant to its outcome.