STOC 2021

Efficient list-decoding with constant alphabet and list sizes

Zeyu Guo, Noga Ron-Zewi

Abstract

We present an explicit and efficient algebraic construction of capacity-achieving list decodable codes with both constant alphabet and constant list sizes. More specifically, for any R ∈ (0, 1) and ε > 0, we give an algebraic construction of an infinite family of error-correcting codes of rate R, over an alphabet of size (1/ε)^{O(1/ε^2)}, that can be list decoded from a (1 - R - ε)-fraction of errors with list size at most exp(poly(1/ε)). Moreover, the codes can be encoded in time poly(1/ε, n), the output list is contained in a linear subspace of dimension at most poly(1/ε), and a basis for this subspace can be found in time poly(1/ε, n). Thus, both encoding and list decoding can be performed in fully polynomial time poly(1/ε, n), except for pruning the subspace and outputting the final list, which takes time exp(poly(1/ε)) · poly(n). In contrast, prior explicit and efficient constructions of capacity-achieving list decodable codes either required a much higher complexity in terms of 1/ε (and were additionally much less structured), or had super-constant alphabet or list sizes. Our codes are quite natural and structured. Specifically, we use algebraic-geometric (AG) codes with evaluation points restricted to a subfield, and with the message space restricted to a (carefully chosen) linear subspace. Our main observation is that the output list of AG codes with subfield evaluation points is contained in an affine shift of the image of a block-triangular-Toeplitz (BTT) matrix, and that the list size can potentially be reduced to a constant by restricting the message space to a BTT evasive subspace, which is a large subspace that intersects the image of any BTT matrix in a constant number of points. We further show how to explicitly construct such BTT evasive subspaces, based on the explicit subspace designs of Guruswami and Kopparty (Combinatorica, 2016), and composition.

* Research supported in part by ISF grant 735/20.
The (relative) Hamming distance dist(z, w) between a pair of strings z, w ∈ Σ^n is the fraction of coordinates on which z and w differ.

[…] message in the presence of some error or corruption. Other desirable properties of an error-correcting code are that its alphabet size be small (ideally, a constant, independent of the codeword length), and that it admit efficient (poly(n)-time) encoding and decoding algorithms.

Clearly, there is a qualitative trade-off between the above parameters: the larger the distance δ, the smaller the rate R must be. Quantitatively, the Singleton bound states that any code must satisfy δ ≤ 1 - R. This bound is precisely matched by the classical family of Reed-Solomon (RS) codes [RS60]. Given a finite field F_q and n distinct elements α_1, α_2, ..., α_n ∈ F_q, the Reed-Solomon code RS_q(n, k) with evaluation points α_1, ..., α_n maps a message (f_0, ..., f_{k-1}) ∈ F_q^k, viewed as the polynomial f(X) = f_0 + f_1 X + ... + f_{k-1} X^{k-1} of degree less than k, to the codeword (f(α_1), ..., f(α_n)).

A disadvantage of RS codes is that, by definition, their alphabet size q must be at least the codeword length n. To match the Singleton bound over a constant-size alphabet, independent of the codeword length n, one can resort to algebraic-geometric (AG) codes, which achieve a distance of δ = 1 - R - ε over a constant-size alphabet (depending on ε) [Sti09]. Moreover, both RS and AG codes can be efficiently encoded and decoded up to half their minimum distance [Pet60, BW87, JLJ+89].

List decoding. In list decoding, the fraction of errors α is large enough that unique recovery of the message x is impossible (that is, α > δ/2). Instead, the goal is, given a received word w, to return a short list L with the guarantee that x ∈ L for any message x with dist(w, C(x)) ≤ α.
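The definitions above (relative Hamming distance, Reed-Solomon encoding, and the list-decoding guarantee) can be illustrated with a minimal Python sketch. It is not the construction of this paper: it assumes a prime field F_p (so that arithmetic is plain integer arithmetic mod p), the usual convention that a message is the coefficient vector of a polynomial of degree less than k, and a brute-force decoder that enumerates all p^k messages, which is only feasible for toy parameters. The function names and the instance p = 7, n = 7, k = 2 are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product


def rs_encode(message, points, p):
    """Encode `message` (coefficients f_0..f_{k-1}) by evaluating f at each point mod p."""
    codeword = []
    for a in points:
        value = 0
        for coeff in reversed(message):  # Horner's rule, highest coefficient first
            value = (value * a + coeff) % p
        codeword.append(value)
    return tuple(codeword)


def rel_dist(z, w):
    """Relative Hamming distance: fraction of coordinates on which z and w differ."""
    assert len(z) == len(w)
    return Fraction(sum(x != y for x, y in zip(z, w)), len(z))


def brute_force_list_decode(received, points, k, p, alpha):
    """Return every message whose codeword is within relative distance alpha of `received`.

    Exhaustive search over all p**k messages; illustrative only.
    """
    return [
        m for m in product(range(p), repeat=k)
        if rel_dist(rs_encode(m, points, p), received) <= alpha
    ]


# Toy instance (assumed parameters): field F_7, n = 7 evaluation points, k = 2.
P = 7
POINTS = tuple(range(P))
K = 2

codeword = rs_encode((3, 2), POINTS, P)   # f(X) = 3 + 2X -> (3, 5, 0, 2, 4, 6, 1)
received = (0, 0) + codeword[2:]          # corrupt the first two coordinates

# Error fraction 2/7 is below half the relative distance 6/7: one candidate.
unique = brute_force_list_decode(received, POINTS, K, P, Fraction(2, 7))
# Error fraction 5/7 is above it: the decoder returns a list of candidates.
larger = brute_force_list_decode(received, POINTS, K, P, Fraction(5, 7))
```

In this toy instance any two distinct codewords differ in at least n - k + 1 = 6 of the 7 coordinates, so decoding at radius 2/7 returns only the transmitted message, while at radius 5/7 the transmitted message is still in the output but is no longer the only entry.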
Besides being a fundamental concept in coding theory, list decoding has found diverse applications in theoretical computer science, for example in cryptography [GL89], learning theory [KM93], average-to-worst-case reductions [CPS99, GRS00], hardness amplification [BFNW93, STV01, Tre03], and pseudo-randomness [TZ04, GUV09, DKSS13, TU12, GRX18]. The list-decoding capacity theorem states that the maximal fraction of errors for which list decoding with non-trivial list sizes is possible is α ≤ 1 - R. Moreover, it is not hard to show that a random code of rate R and alphabet size exp(1/ε) is, with high probability, list decodable from a (1 - R - ε)-fraction of errors with list size as small as O(1/ε). So in principle, by allowing a small (constant-size) list, one can correct twice as many errors as in the unique decoding setting! However, matching these bounds with an explicit and efficient construction (ideally, encodable and list decodable in fully polynomial time poly(1/ε, n)) turned out to be more challenging than in the unique decoding setting.

Capacity-achieving list decodable codes. The celebrated work of Guruswami and Sudan [Sud97, GS99] showed that RS codes can be efficiently list decoded beyond half their minimum distance (up to the so-called Johnson bound), which gave the first family of error-correcting codes that are efficiently list decodable be