WWW2026

Edit-Level Tracking of Narrative Changes in 10-K Filings

Xiao Li, Changhong Jin, Yingjie Niu, Jingyi Wang, Ruihai Dong

Abstract

The Form 10-K provides a detailed overview of a company's operations, strategies, competitive environment, and future outlook. Investors closely analyse year-over-year changes in these reports to identify subtle shifts in disclosed information. However, financial narratives often contain extensive boilerplate text, which dilutes the efficacy of traditional sentiment analysis. We propose a novel edit-level analysis framework that isolates meaningful changes in the Management's Discussion and Analysis (MD&A) sections of 10-K filings. Using sentence embeddings, we classify each sentence in the filings for years t and t+1 as Kept (repeated boilerplate), Deleted (removed information), or Added (newly introduced information). We then construct a net sentiment change signal from the difference in average FinBERT-tone sentiment between Added and Deleted content. We show that this Core Sentiment measure outperforms conventional bag-of-words sentiment measures. Furthermore, because it builds on cleaned and standardised 10-K corpora (e.g., EDGAR-CORPUS) and our Flex10K pre-processing pipeline, our approach provides a reproducible basis for quantifying changes between annual reports.
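
The edit-level idea can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: a bag-of-words count vector stands in for a real sentence embedding, a tiny hypothetical word list (toy_tone) stands in for FinBERT-tone, the similarity threshold of 0.8 is arbitrary, and the signal is taken as Added minus Deleted average tone. All function names are illustrative.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List


def embed(sentence: str) -> Counter:
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify_edits(prev: List[str], curr: List[str],
                   threshold: float = 0.8) -> Dict[str, List[str]]:
    """Label sentences as kept, added, or deleted between year t and t+1.

    A sentence counts as matched if any sentence in the other filing has
    cosine similarity at or above the (assumed) threshold.
    """
    prev_vecs = [embed(s) for s in prev]
    curr_vecs = [embed(s) for s in curr]

    def has_match(vec: Counter, others: List[Counter]) -> bool:
        return any(cosine(vec, o) >= threshold for o in others)

    kept = [s for s, v in zip(curr, curr_vecs) if has_match(v, prev_vecs)]
    added = [s for s, v in zip(curr, curr_vecs) if not has_match(v, prev_vecs)]
    deleted = [s for s, v in zip(prev, prev_vecs) if not has_match(v, curr_vecs)]
    return {"kept": kept, "added": added, "deleted": deleted}


def core_sentiment(edits: Dict[str, List[str]],
                   tone: Callable[[str], float]) -> float:
    """Average tone of added sentences minus average tone of deleted ones."""
    def mean_tone(sentences: List[str]) -> float:
        if not sentences:
            return 0.0  # assumption: an empty group contributes neutral tone
        return sum(tone(s) for s in sentences) / len(sentences)

    return mean_tone(edits["added"]) - mean_tone(edits["deleted"])


# Hypothetical word lists; a real pipeline would call a FinBERT-tone model.
POSITIVE = {"growth", "improved", "strong"}
NEGATIVE = {"decline", "impairment", "weak"}


def toy_tone(sentence: str) -> float:
    """Toy tone score in [-1, 1] from word-list hits."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(pos + neg, 1)


year_t = [
    "Our results may differ materially from forward-looking statements.",
    "Revenue growth was strong in all segments.",
]
year_t1 = [
    "Our results may differ materially from forward-looking statements.",
    "We recorded an impairment due to weak demand.",
]

edits = classify_edits(year_t, year_t1)
signal = core_sentiment(edits, toy_tone)
```

In this toy pair of filings, the repeated forward-looking disclaimer is labelled kept and therefore does not affect the signal. Only the replaced sentence drives the score, which is the motivation for filtering out boilerplate before measuring tone.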