STOC 2025

Faster Weighted and Unweighted Tree Edit Distance and APSP Equivalence

Jakob Nogler, Adam Polak, Barna Saha, Virginia Vassilevska Williams, Yinzhan Xu, Christopher Ye


Abstract

The tree edit distance (TED) between two rooted ordered trees with $n$ nodes labeled from an alphabet $\Sigma$ is the minimum cost of transforming one tree into the other by a sequence of valid operations consisting of insertions, deletions, and relabelings of nodes. The tree edit distance is a well-known generalization of string edit distance and has been studied since the 1970s. The running time for computing it has seen steady improvements, starting with an $O(n^6)$ algorithm [Tai, J.ACM 1979], improved to $O(n^4)$ [Shasha, Zhang, SICOMP 1989] and to $O(n^3 \log n)$ [Klein, ESA 1998], and culminating in an $O(n^3)$ algorithm [Demaine, Mozes, Rossman, Weimann, ACM TALG 2010]. This last bound is known to be optimal for any dynamic-programming-based algorithm that falls under a certain decomposition framework that captures all known sub-$n^4$-time algorithms. Fine-grained complexity sheds further light on this hardness, showing that a truly subcubic-time algorithm for TED implies a truly subcubic-time algorithm for All-Pairs Shortest Paths (APSP) [Bringmann, Gawrychowski, Mozes, Weimann, ACM TALG 2020]. Therefore, under the popular APSP hypothesis, a truly subcubic-time algorithm for TED cannot exist. However, unlike many problems in fine-grained complexity for which conditional hardness based on APSP also comes with equivalence to APSP, whether TED can be reduced to APSP has remained unknown. In this paper, we resolve this question. Not only do we show that TED is fine-grained equivalent to APSP, but our reduction is also tight enough that, combined with the fastest APSP algorithm to date [Williams, SICOMP 2018], it gives the first subcubic-time algorithm for TED, running in $n^3 / 2^{\Omega(\sqrt{\log n})}$ time. We also consider the unweighted tree edit distance problem, in which the cost of each edit (insertion, deletion, and relabeling) is one. For unweighted TED, a truly subcubic algorithm is known due to Mao [Mao, FOCS 2022], later improved slightly by Dürr [Dürr, IPL 2023] to run in $O(n^{2.9148})$ time.
Since their algorithm uses the bounded monotone min-plus product as a crucial subroutine, and the best running time for this product is $\tilde{O}(n^{(3+\omega)/2}) \le O(n^{2.6857})$ (where $\omega$ is the exponent of fast matrix multiplication), the much higher running time of unweighted TED remained unsatisfactory. In this work, we close this gap and give an algorithm for unweighted TED that runs in $\tilde{O}(n^{(3+\omega)/2})$ time.
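
To make the definition of TED in the opening sentence concrete, here is a minimal Python sketch of the textbook forest recursion for the unweighted (unit-cost) variant. It is not the algorithm of this paper and makes no attempt at the running times discussed above. It only illustrates what is being computed. The names `tree`, `forest_dist`, and `ted`, and the tuple encoding of trees, are assumptions made for illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple
# of trees; a forest is a tuple of trees ordered left to right.

def tree(label, *children):
    """Build a rooted ordered tree (hypothetical helper)."""
    return (label, tuple(children))

@lru_cache(maxsize=None)
def forest_dist(F, G):
    """Unit-cost edit distance between ordered forests F and G."""
    if not F and not G:
        return 0
    if not G:
        # Delete the rightmost root of F; its children take its place.
        _, kids = F[-1]
        return forest_dist(F[:-1] + kids, G) + 1
    if not F:
        # Insert the rightmost root of G (symmetric to deletion).
        _, kids = G[-1]
        return forest_dist(F, G[:-1] + kids) + 1
    (lf, kf), (lg, kg) = F[-1], G[-1]
    # Option 1: delete the rightmost root of F.
    delete = forest_dist(F[:-1] + kf, G) + 1
    # Option 2: insert the rightmost root of G.
    insert = forest_dist(F, G[:-1] + kg) + 1
    # Option 3: map the two rightmost roots to each other (relabel if they
    # differ), then solve their child forests and the remaining forests.
    match = (forest_dist(kf, kg)
             + forest_dist(F[:-1], G[:-1])
             + (0 if lf == lg else 1))
    return min(delete, insert, match)

def ted(t1, t2):
    """Unweighted tree edit distance between two trees."""
    return forest_dist((t1,), (t2,))

t1 = tree("a", tree("b"), tree("c"))
t2 = tree("a", tree("b"))
print(ted(t1, t2))
```

The sketch always branches on the rightmost roots and memoizes subforest pairs, which keeps it short but is far from the cubic and subcubic algorithms cited in the abstract. Replacing the three unit costs with label-dependent costs would give the weighted variant.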