STOC2020

Optimally resilient codes for list-decoding from insertions and deletions

Venkatesan Guruswami, Bernhard Haeupler, Amirbehshad Shahrasbi

42 citations

Abstract

We give a complete answer to the following basic question: “What is the maximal fraction of deletions or insertions tolerable by <inline-formula> <tex-math notation="LaTeX">qq </tex-math></inline-formula>-ary list-decodable codes with non-vanishing information rate?” This question has been open even for binary codes, including the restriction to the binary insertion-only setting, where the best-known result was that a <inline-formula> <tex-math notation="LaTeX">γ0.707\gamma \leq 0.707 </tex-math></inline-formula> fraction of insertions is tolerable by some binary code family. For any desired <inline-formula> <tex-math notation="LaTeX">ε>0\varepsilon > 0 </tex-math></inline-formula>, we construct a family of binary codes of positive rate which can be efficiently list-decoded from any combination of <inline-formula> <tex-math notation="LaTeX">γ\gamma </tex-math></inline-formula> fraction of insertions and <inline-formula> <tex-math notation="LaTeX">δ\delta </tex-math></inline-formula> fraction of deletions as long as <inline-formula> <tex-math notation="LaTeX">γ+2δ1ε\gamma + 2\delta \leq 1 - \varepsilon </tex-math></inline-formula>. On the other hand, for any <inline-formula> <tex-math notation="LaTeX">γ,δ\gamma, \delta </tex-math></inline-formula> with <inline-formula> <tex-math notation="LaTeX">γ+2δ=1\gamma + 2\delta = 1 </tex-math></inline-formula> list-decoding is impossible. Our result thus precisely characterizes the feasibility region of binary list-decodable codes for insertions and deletions. We further generalize our result to codes over any finite alphabet of size <inline-formula> <tex-math notation="LaTeX">qq </tex-math></inline-formula>. Surprisingly, our work reveals that the feasibility region for <inline-formula> <tex-math notation="LaTeX">q>2q>2 </tex-math></inline-formula> is <italic>not</italic> the natural generalization of the binary bound above. We provide tight upper and lower bounds that precisely pin down the feasibility region, which turns out to have a <inline-formula> <tex-math notation="LaTeX">(q1)(q-1) </tex-math></inline-formula>-piece-wise linear boundary whose <inline-formula> <tex-math notation="LaTeX">qq </tex-math></inline-formula> corner-points lie on a quadratic curve. The main technical work in our results is proving the existence of code families of sufficiently large <italic>size</italic> with good list-decoding properties for any combination of <inline-formula> <tex-math notation="LaTeX">δ,γ\delta,\gamma </tex-math></inline-formula> within the claimed feasibility region. We achieve this via an intricate analysis of codes introduced by [Bukh and Ma, 2014]. Finally, we give a simple yet powerful concatenation scheme for list-decodable insertion-deletion codes which transforms any such (non-efficient) code family (with vanishing information rate) into an efficiently decodable code family with constant rate.