Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces

Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces.
X. Chen and A. De and C.-H. Lee and R. Servedio and S. Sinha.
ACM-SIAM Symposium on Discrete Algoithms (SODA), pp. 779-821, 2022.

Abstract:

In the standard trace reconstruction problem, the goal is to exactly reconstruct an unknown source string $x \in \{0,1\}^n$ from independent "traces", which are copies of $x$ that have been corrupted by a $\delta$-deletion channel which independently deletes each bit of $x$ with probability $\delta$ and concatenates the surviving bits. We study the approximate trace reconstruction problem, in which the goal is only to obtain a high-accuracy approximation of $x$ rather than an exact reconstruction.

We give an efficient algorithm, and a near-matching lower bound, for approximate reconstruction of a random source string $x \in \{0,1\}^n$ from few traces. Our main algorithmic result is a polynomial-time algorithm with the following property: for any deletion rate $\delta \in (0,1)$ (which may depend on $n$), for almost every source string $x \in \{0,1\}^n$, given any number $M \leq \Theta(1/\delta)$ of traces from $\Del_\delta(x)$, the algorithm constructs a hypothesis string $\hat{x}$ that has edit distance at most $n \cdot (\delta M)^{\Omega(M)}$ from $x$. We also prove a near-matching information-theoretic lower bound showing that given $M \leq \Theta(1/\delta)$ traces from $\Del_\delta(x)$ for a random n-bit string $x$, the smallest possible expected edit distance that any algorithm can achieve, regardless of its running time, is $n \cdot (\delta M)^{O(M)}.$

pdf of full version

Back to main papers page