EMNLP 2023

From Dissonance to Insights: Dissecting Disagreements in Rationale Construction for Case Outcome Classification

Shanshan Xu, T. Y. S. S. Santosh, Oana Ichim, Isabella Risini, Barbara Plank, Matthias Grabmair

Cited by 1

Abstract

In legal NLP, Case Outcome Classification (COC) must not only be accurate but also trustworthy and explainable. Existing work in explainable COC has been limited to annotations by a single expert. However, it is well known that lawyers may disagree in their assessment of case facts. We therefore collect a novel dataset, RAVE (Rationale Variation in ECHR), obtained from two experts in the domain of international human rights law, between whom we observe weak agreement. We study their disagreements and build a two-level task-independent taxonomy, supplemented with COC-specific subcategories. We quantitatively assess the different taxonomy categories and find that disagreements mainly stem from underspecification of the legal context, which poses challenges given the typically limited granularity and noise in COC metadata. To our knowledge, this is the first work in legal NLP that focuses on building a taxonomy over human label variation. We further assess the explainability of state-of-the-art COC models on RAVE and observe limited agreement between models and experts. Overall, our case study reveals hitherto underappreciated complexities in creating benchmark datasets in legal NLP that revolve around identifying aspects of a case's facts supposedly relevant to its outcome.