STOC 2025
Faster Weighted and Unweighted Tree Edit Distance and APSP Equivalence
Jakob Nogler, Adam Polak, Barna Saha, Virginia Vassilevska Williams, Yinzhan Xu, Christopher Ye
Abstract
The tree edit distance (TED) between two rooted ordered trees with $n$ nodes labeled from an alphabet $\Sigma$ is the minimum cost of transforming one tree into the other by a sequence of valid operations consisting of insertions, deletions, and relabelings of nodes. The tree edit distance is a well-known generalization of string edit distance and has been studied since the 1970s. Its running time has seen steady improvements, starting with an $\mathcal{O}(n^6)$ algorithm [Tai, J.ACM 1979], improved to $\mathcal{O}(n^4)$ [Shasha, Zhang, SICOMP 1989] and to $\mathcal{O}(n^3 \log n)$ [Klein, ESA 1998], and culminating in an $\mathcal{O}(n^3)$ algorithm [Demaine, Mozes, Rossman, Weimann, ACM TALG 2010]. The latter is known to be optimal for any dynamic programming based algorithm that falls under a certain decomposition framework that captures all known sub-$n^4$ time algorithms. Fine-grained complexity casts further light on this hardness, showing that a truly subcubic time algorithm for TED implies a truly subcubic time algorithm for All-Pairs Shortest Paths (APSP) [Bringmann, Gawrychowski, Mozes, Weimann, ACM TALG 2020]. Therefore, under the popular APSP hypothesis, a truly subcubic time algorithm for TED cannot exist. However, unlike many problems in fine-grained complexity for which conditional hardness based on APSP also comes with equivalence to APSP, whether TED can be reduced to APSP has remained unknown. In this paper, we resolve this question. Not only do we show that TED is fine-grained equivalent to APSP, but our reduction is also tight enough that, combined with the fastest APSP algorithm to date [Williams, SICOMP 2018], it gives the first subcubic time algorithm for TED, running in $n^3 / 2^{\Omega(\sqrt{\log n})}$ time. We also consider the unweighted tree edit distance problem, in which the cost of each edit (insertion, deletion, and relabeling) is one. For unweighted TED, a truly subcubic algorithm is known due to Mao [Mao, FOCS 2022], later improved slightly by Dürr [Dürr, IPL 2023] to run in $\mathcal{O}(n^{2.9148})$ time. 
Since their algorithm uses bounded monotone min-plus product as a crucial subroutine, and the best running time for this product is $\tilde{\mathcal{O}}(n^{(3+\omega)/2}) \le \mathcal{O}(n^{2.6857})$ (where $\omega$ is the exponent of fast matrix multiplication), the much higher running time of unweighted TED remained unsatisfactory. In this work, we close this gap and give an algorithm for unweighted TED that runs in $\tilde{\mathcal{O}}(n^{(3+\omega)/2})$ time.
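To make the definition of TED concrete, the following is a minimal Python sketch of the textbook rightmost-root recursion on ordered forests, memoized with `functools.lru_cache`. It is not the algorithm of this paper, nor any of the cited cubic or subcubic algorithms; it only illustrates what quantity is being computed and is practical only for very small trees. The tree encoding (a `(label, children)` tuple), the function name `tree_edit_distance`, and the constant per-operation costs are assumptions made for this illustration; in the weighted problem the costs may depend on the labels involved.

```python
from functools import lru_cache

# Assumed encoding for this sketch: a tree is (label, children), where
# children is a tuple of trees; a forest is a tuple of trees.

def tree_edit_distance(t1, t2, delete_cost=1, insert_cost=1, relabel_cost=1):
    """Edit distance between two rooted ordered trees.

    Costs are constants here (unit costs give the unweighted problem);
    this is a simplification of the general weighted setting.
    """

    @lru_cache(maxsize=None)
    def dist(f, g):
        # Both forests empty: nothing left to edit.
        if not f and not g:
            return 0
        # Only f is non-empty: delete its rightmost root; the root's
        # children are promoted and take its place in the forest.
        if not g:
            _, kids = f[-1]
            return dist(f[:-1] + kids, g) + delete_cost
        # Only g is non-empty: symmetric case, insert the rightmost root.
        if not f:
            _, kids = g[-1]
            return dist(f, g[:-1] + kids) + insert_cost

        (label_v, kids_v), (label_w, kids_w) = f[-1], g[-1]
        # Option 1: delete the rightmost root of f.
        delete = dist(f[:-1] + kids_v, g) + delete_cost
        # Option 2: insert the rightmost root of g.
        insert = dist(f, g[:-1] + kids_w) + insert_cost
        # Option 3: map the two rightmost roots onto each other, then
        # solve their child forests and the remaining forests separately.
        match = (
            dist(kids_v, kids_w)
            + dist(f[:-1], g[:-1])
            + (0 if label_v == label_w else relabel_cost)
        )
        return min(delete, insert, match)

    return dist((t1,), (t2,))


if __name__ == "__main__":
    a = ("a", (("b", ()), ("c", ())))   # root a with leaf children b, c
    b = ("a", (("b", ()),))             # root a with a single leaf child b
    print(tree_edit_distance(a, b))     # one deletion (of c) suffices
```

With unit costs this computes the unweighted distance. Passing different constants for `delete_cost`, `insert_cost`, and `relabel_cost` gives a restricted weighted variant.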