Subpolynomial trace reconstruction for random strings and arbitrary deletion probability (1801.04783v2)
Abstract: The insertion-deletion channel takes as input a bit string ${\bf x}\in{0,1}{n}$, and outputs a string where bits have been deleted and inserted independently at random. The trace reconstruction problem is to recover $\bf x$ from many independent outputs (called "traces") of the insertion-deletion channel applied to $\bf x$. We show that if $\bf x$ is chosen uniformly at random, then $\exp(O(\log{1/3} n))$ traces suffice to reconstruct $\bf x$ with high probability. For the deletion channel with deletion probability $q < 1/2$ the earlier upper bound was $\exp(O(\log{1/2} n))$. The case of $q\geq 1/2$ or the case where insertions are allowed has not been previously analyzed, and therefore the earlier upper bound was as for worst-case strings, i.e., $\exp(O( n{1/3}))$. We also show that our reconstruction algorithm runs in $n{1+o(1)}$ time. A key ingredient in our proof is a delicate two-step alignment procedure where we estimate the location in each trace corresponding to a given bit of $\bf x$. The alignment is done by viewing the strings as random walks and comparing the increments in the walk associated with the input string and the trace, respectively.