EMNLP 2023

A Fine-Grained Taxonomy of Replies to Hate Speech

Xinchen Yu, Ashley Zhao, Eduardo Blanco, Lingzi Hong

Abstract

Countering rather than censoring hate speech has emerged as a promising strategy to address hatred. There are many types of counterspeech in user-generated content: addressing the hateful content or its author, generic requests, well-reasoned counter arguments, insults, etc. The effectiveness of counterspeech, which we define as subsequent incivility, depends on these types. In this paper, we present a theoretically grounded taxonomy of replies to hate speech and a new corpus. We work with real, user-generated hate speech and all the replies it elicits rather than replies generated by a third party. Our analyses provide insights into the content real users reply with as well as which replies are empirically most effective. We also experiment with models to characterize the replies to hate speech, thereby opening the door to estimating whether a reply to hate speech will result in further incivility.

Error Type: Rhetorical question (26%)
  Hate: F**k worthless inbreds who've contributed nothing to society.
  Reply: Where are your contributions? I doubt there's any.
  Ground Truth: Author
  Predicted: Content

Error Type: Irony (21%)
  Hate: Retarded republicans fear everything.
  Reply: It's amazing how broken you have to be to believe in their positions as a whole.