EMNLP 2020

TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions

Qiang Ning, Hao Wu, Rujun Han, Nanyun Peng, Matt Gardner, Dan Roth

Abstract

A critical part of reading is being able to understand the temporal relationships between events described in a passage of text, even when those relationships are not explicitly stated. However, current machine reading comprehension benchmarks have practically no questions that test temporal phenomena, so systems trained on these benchmarks have no capacity to answer questions such as "what happened before/after [some event]?" We introduce TORQUE, a new English reading comprehension benchmark built on 3.2k news snippets with 21k human-generated questions querying temporal relationships. Results show that RoBERTa-large achieves an exact-match score of 51% on the test set of TORQUE, about 30% behind human performance. [1]

[1] https://allennlp.org/torque.html

Example passage and questions:

Heavy snow is causing disruption to transport across the UK, with heavy rainfall bringing flooding to the south-west of England. Rescuers searching for a woman trapped in a landslide at her home in Looe, Cornwall, said they had found a body.

Q1: What events have already finished? A: searching trapped landslide said found
Q2: What events have begun but has not finished? A: snow causing disruption rainfall bringing flooding
Q3: What will happen in the future? A: No answers.
Q4: What happened before a woman was trapped? A: landslide
Q5: What had started before a woman was trapped? A: snow rainfall landslide
Q6: What happened while a woman was trapped? A: searching
Q7: What happened after a woman was trapped? A: searching said found
Q8: What happened at about the same time as the snow? A: rainfall
Q9: What happened after the snow started? A: causing disruption bringing flooding searching trapped landslide said found
Q10: What happened before the snow started? A: No answers.
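To make the exact-match figure concrete, the sketch below scores a few of the example questions above. It assumes that each whitespace-separated token in an answer is one event, that "No answers." denotes the empty set, and that a prediction counts as an exact match only when its event set equals the gold set. The names `parse_answer`, `exact_match`, `GOLD`, and `PRED` are hypothetical, and the official TORQUE evaluation may differ (for instance in how it handles multiple annotators).

```python
from typing import Dict, FrozenSet


def parse_answer(answer: str) -> FrozenSet[str]:
    """Turn an answer string into a set of event tokens.

    Assumption: one whitespace-separated token per event, and
    "No answers." means that no event in the passage qualifies.
    """
    text = answer.strip()
    if text.lower().rstrip(".") == "no answers":
        return frozenset()
    return frozenset(text.split())


def exact_match(predictions: Dict[str, str], gold: Dict[str, str]) -> float:
    """Fraction of gold questions whose predicted event set equals the gold set.

    A question with no prediction at all is counted as a miss.
    """
    hits = sum(
        1
        for question, gold_answer in gold.items()
        if question in predictions
        and parse_answer(predictions[question]) == parse_answer(gold_answer)
    )
    return hits / len(gold)


# Gold answers copied from the example passage (Q3 to Q6).
GOLD = {
    "What will happen in the future?": "No answers.",
    "What happened before a woman was trapped?": "landslide",
    "What had started before a woman was trapped?": "snow rainfall landslide",
    "What happened while a woman was trapped?": "searching",
}

# Hypothetical system output, invented purely for illustration.
PRED = {
    "What will happen in the future?": "No answers.",
    "What happened before a woman was trapped?": "landslide",
    # Same events in a different order: still an exact match.
    "What had started before a woman was trapped?": "rainfall landslide snow",
    # One extra event ("said"): not an exact match.
    "What happened while a woman was trapped?": "searching said",
}

score = exact_match(PRED, GOLD)
print(f"exact match: {score:.2f}")
```

Under this set-based reading, answer order is irrelevant but a single extra or missing event makes the whole question wrong, which is one reason exact match is a strict metric for questions with many answers.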