ACL 2021

Perceptual Models of Machine-Edited Text

Elizabeth M. Merkhofer, Monica-Ann Mendoza, Rebecca Marvin, John C. Henderson

Abstract

We introduce a novel dataset of human judgments of machine-edited text, along with initial models of those perceptions. Six machine-editing methods, ranging from character swapping to variational autoencoders, are applied to collections of English-language social media text and scientific abstracts. The edits are judged in context for detectability and for the extent to which they preserve the meaning of the original. Automated measures of semantic similarity and fluency are evaluated individually and combined to produce composite models of human perception. Both meaning preservation and detectability are predicted within 6% of the upper bound of human consensus labeling.