ACL 2020
Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models
Kaiji Lu, Piotr Mardziel, Klas Leino, Matt Fredrikson, Anupam Datta
8 citations
Abstract
LSTM-based recurrent neural networks are the state of the art for many natural language processing (NLP) tasks. Despite their performance, it is unclear whether, or how, LSTMs learn structural features of natural languages such as subject-verb number agreement in English. Lacking this understanding, the generality of LSTMs on this task and their suitability for related tasks remain uncertain. Further, errors cannot be properly attributed to a lack of structural capability, training data omissions, or other exceptional faults. We introduce influence paths, a causal account of structural properties as carried by paths across gates and neurons of a recurrent neural network. The approach refines the notion of influence (the subject's grammatical number has influence on the grammatical number of the subsequent verb) into a set of gate-level or neuron-level paths. The set localizes and segments the concept (e.g., subject-verb agreement), its constituent elements (e.g., the subject), and related or interfering elements (e.g., attractors). We exemplify the methodology on a widely studied multi-layer LSTM language model, demonstrating how it accounts for subject-verb number agreement. The results offer both a finer and a more complete view of an LSTM's handling of this structural aspect of the English language than prior results based on diagnostic classifiers and ablation.

[Figure: an influence path through LSTM nodes (candidate cell, cell, hidden state), from the grammatical number of the subject (boy vs. boys) to the agreement measure s4(run) - s4(runs).]
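The idea of refining a single influence value into per-path contributions can be illustrated on a toy network. The sketch below is not the paper's method or code. It assumes a hypothetical two-neuron network with made-up weights `W` and `V`, treats the gradient of an output `q` with respect to an input `x` as the "influence", and shows that this gradient splits into one chain-rule product per path `x -> hidden[i] -> q`. Here `x` loosely stands in for a grammatical-number signal and `q` for an agreement score; both readings are assumptions made for illustration only.

```python
import math


def forward(x, w, v):
    """Toy two-neuron network: q = sum_i v[i] * tanh(w[i] * x)."""
    hidden = [math.tanh(wi * x) for wi in w]
    q = sum(vi * hi for vi, hi in zip(v, hidden))
    return hidden, q


def path_influences(x, w, v):
    """Chain-rule product along each path x -> hidden[i] -> q."""
    hidden, _ = forward(x, w, v)
    # d tanh(u)/du = 1 - tanh(u)^2, so each path contributes v[i] * (1 - h_i^2) * w[i].
    return [vi * (1.0 - hi * hi) * wi for wi, vi, hi in zip(w, v, hidden)]


def numeric_gradient(x, w, v, eps=1e-6):
    """Central finite difference of q with respect to x, used as a sanity check."""
    _, q_plus = forward(x + eps, w, v)
    _, q_minus = forward(x - eps, w, v)
    return (q_plus - q_minus) / (2 * eps)


# Hypothetical weights and input, chosen only for illustration.
W = [0.5, -1.2]
V = [2.0, 0.3]
X = 0.7

paths = path_influences(X, W, V)
total = sum(paths)  # the per-path terms add up to the full gradient dq/dx

for i, p in enumerate(paths):
    print(f"path via hidden[{i}]: {p:+.4f}")
print(f"sum of paths: {total:+.4f}")
print(f"numeric dq/dx: {numeric_gradient(X, W, V):+.4f}")
```

Comparing the per-path terms shows which route carries most of the influence, which is the kind of localization the abstract describes at the level of gates and neurons. A real LSTM has many more paths, across time steps and layers, so this decomposition would need to be organized over the full computation graph.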