ACL2023

A Fast Algorithm for Computing Prefix Probabilities

Franz Nowak, Ryan Cotterell

被引用 1 次

摘要

Multiple algorithms are known for efficiently calculating the prefix probability of a string under a probabilistic context-free grammar (PCFG). Good algorithms for the problem have a runtime cubic in the length of the input string. However, some proposed algorithms are suboptimal with respect to the size of the grammar. This paper proposes a novel speed-up of Jelinek and Lafferty's ( 1991 ) algorithm, whose original runtime is O(N 3 |N | 3 + |N | 4 ), where N is the input length and |N | is the number of non-terminals in the grammar. In contrast, our speed-up runs in O(N 2 |N | 3 + N 3 |N | 2 ). https://github.com/rycolab/ prefix-parsing * ⇒ is the closure over applications of derivation rules of the grammar. 1 Our paper gives a more efficient algorithm for the simultaneous computation of the prefix probabilities of all prefixes of a string w under a PCFG. The authors are aware of two existing efficient algorithms to compute prefix probabilities under a PCFG. 2 The first is Jelinek and Lafferty's (1991) 1 Specifically, α * ⇒ β means that there exists an n ≥ 0 such that α ⇒ • • • ⇒ n times β, where ⇒ marks a derivation step.