ACL 2021

Out of Order: How important is the sequential order of words in a sentence in Natural Language Understanding tasks?

Thang M. Pham, Trung Bui, Long Mai, Anh Nguyen

Abstract

Do state-of-the-art natural language understanding models care about word order? Not always! We found that 75% to 90% of the correct predictions of BERT-based classifiers, trained on many GLUE tasks, remain unchanged after the input words are randomly shuffled. Although BERT embeddings are famously contextual, the contribution of each individual word to classification is almost unchanged even after its surrounding words are shuffled. BERT-based models exploit superficial cues (e.g., the sentiment of keywords in sentiment analysis, or the word-wise similarity between sequence-pair inputs in natural language inference) to make correct decisions when tokens are randomly shuffled. Encouraging models to capture word order information improves performance on most GLUE tasks and SQuAD 2.0. Our work suggests that many GLUE tasks do not challenge machines to understand the meaning of a sentence.
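The shuffling test summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_classifier`, the keyword lexicon, and the three example sentences are hypothetical stand-ins for a trained BERT-based classifier and a GLUE dataset, and the shuffle is a plain word-level permutation that may differ from the paper's exact protocol. The sketch only shows how one might measure the fraction of correct predictions that survive shuffling.

```python
import random

# Hypothetical keyword lexicon; a stand-in for a trained classifier's knowledge.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "boring"}


def toy_classifier(sentence):
    """Bag-of-words sentiment: 1 if positive keywords outnumber negative ones, else 0."""
    words = sentence.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1 if score > 0 else 0


def shuffle_words(sentence, rng):
    """Return the sentence with its whitespace-separated words randomly permuted."""
    words = sentence.split()
    rng.shuffle(words)
    return " ".join(words)


def fraction_unchanged(predict, examples, seed=0):
    """Among correctly classified examples, the share still correct after shuffling."""
    rng = random.Random(seed)
    correct = [(s, y) for s, y in examples if predict(s) == y]
    if not correct:
        return 0.0
    unchanged = sum(predict(shuffle_words(s, rng)) == y for s, y in correct)
    return unchanged / len(correct)


# Hypothetical labelled examples (1 = positive, 0 = negative).
EXAMPLES = [
    ("I love this great movie", 1),
    ("a boring and terrible plot", 0),
    ("not good at all", 0),  # misclassified by the toy model, so excluded
]

ratio = fraction_unchanged(toy_classifier, EXAMPLES)
```

Because the toy classifier relies only on keywords, every one of its correct predictions is invariant to word order. This is the extreme case of the superficial-cue behavior described above. To probe a real model, `predict` would be replaced by a function that wraps the trained classifier.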