ACL 2025
Harnessing Whisper for Prosodic Stress Analysis
Samuel S. Sohn, Sten Knutsen, Karin Stromswold
Abstract
Prosody affects how people produce and understand language, yet studies of how it does so have been hindered by the lack of efficient tools for analyzing prosodic stress. We fine-tune OpenAI Whisper large-v2, a state-of-the-art speech recognition model, to recognize phrasal, lexical, and contrastive stress using a small, carefully annotated dataset. Our results show that Whisper can learn distinct, gender-specific stress patterns to achieve near-human and super-human accuracy in stress classification and transfer its learning from one type of stress to another, surpassing traditional machine learning models. Furthermore, we explore how acoustic context influences its performance and propose a novel black-box evaluation method for characterizing the decision boundaries used by Whisper for prosodic stress interpretation. These findings open new avenues for large-scale, automated prosody research. Models can be found at github.com/SSSohn/ProsodyBench.

Stress    Minimal Pair Transcription
Phrasal   The <greenhouse / green house> spoils the view.
Phrasal   There's a <darkroom / dark room> in this house.
Phrasal   The <whiteboard / white board> needs cleaning.
Phrasal   That <hotdog / hot dog> is under the table.
Phrasal   A <blackbird / black bird> just flew past.
Phrasal   His <wetsuit / wet suit> is on the floor.
Phrasal   That <bluebell / blue bell> is pretty.
Phrasal   The <bullseye / bull's eye> is red.
Lexical   <DIFfer / deFER>
Lexical   <DIScard / disCARD>
Lexical   <DIScount / disCOUNT>
Lexical   <INcrease / inCREASE>
Lexical   <INdent / inDENT>
Lexical   <INsert / inSERT>
Lexical   <INsight / inCITE>
Lexical   <INsult / inSULT>
Contra.   The <BLACK cow / black COW> has the ball.
Contra.   The <BLACK sheep / black SHEEP> has the ball.
Contra.   The <BLUE cow / blue COW> has the ball.
Contra.   The <BLUE sheep / blue SHEEP> has the ball.
Contra.   The <RED cow / red COW> has the ball.
Contra.   The <RED sheep / red SHEEP> has the ball.
Contra.   The <WHITE cow / white COW> has the ball.
Contra.   The <WHITE sheep / white SHEEP> has the ball.
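
The minimal pairs listed above use a "<first / second>" slot to mark the two transcriptions that differ only in stress. As a rough illustration of how such entries could be scored, the sketch below expands one entry into its two candidate transcriptions and checks which one a predicted transcription matches. It is not the authors' code: the function names (expand_minimal_pair, classify, accuracy) and the exact-string matching rule are assumptions made for this example, and the comparison is deliberately case-sensitive because capitalization carries the stress label in the lexical and contrastive items.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches the "<first / second>" slot used in the minimal-pair entries.
SLOT = re.compile(r"<\s*(.+?)\s*/\s*(.+?)\s*>")


def expand_minimal_pair(template: str) -> Tuple[str, str]:
    """Return the two candidate transcriptions encoded by one entry."""
    match = SLOT.search(template)
    if match is None:
        raise ValueError(f"no <a / b> slot in {template!r}")
    start, end = match.span()
    prefix, suffix = template[:start], template[end:]
    return (prefix + match.group(1) + suffix,
            prefix + match.group(2) + suffix)


def classify(template: str, predicted: str) -> Optional[int]:
    """Return 0 or 1 for the pair member equal to `predicted`, else None.

    Assumption: an exact, case-sensitive match, since capitalization
    encodes stress in entries such as "<BLACK cow / black COW>".
    """
    normalized = predicted.strip()
    for index, candidate in enumerate(expand_minimal_pair(template)):
        if candidate == normalized:
            return index
    return None


def accuracy(examples: Iterable[Tuple[str, int, str]]) -> float:
    """Fraction of (template, gold_index, predicted) triples scored correct.

    Predictions matching neither pair member count as errors.
    """
    examples = list(examples)
    if not examples:
        raise ValueError("no examples to score")
    correct = sum(classify(t, p) == gold for t, gold, p in examples)
    return correct / len(examples)
```

For instance, classify("<DIFfer / deFER>", "deFER") selects the second member of the pair, while a prediction that drops the capitalization matches neither member and is counted as an error by accuracy.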