A note on the longest common substring with $k$-mismatches problem (1409.7217v2)

Published 25 Sep 2014 in cs.DS

Abstract: The recently introduced longest common substring with $k$-mismatches ($k$-LCF) problem is to find, given two sequences $S_1$ and $S_2$ of length $n$ each, a longest substring $A_1$ of $S_1$ and $A_2$ of $S_2$ such that the Hamming distance between $A_1$ and $A_2$ is at most $k$. So far, the only subquadratic time result for this problem was known for $k = 1$~\cite{FGKU2014}. We first present two output-dependent algorithms solving the $k$-LCF problem and show that for $k = O(\log^{1-\varepsilon} n)$, where $\varepsilon > 0$, at least one of them works in subquadratic time, using $O(n)$ words of space. The choice between these two algorithms for a given input can be made after linear time and space preprocessing. Finally, we present a tabulation-based algorithm working, in its range of applicability, in $O(n^2\log\min(k+\ell_0, \sigma)/\log n)$ time, where $\ell_0$ is the length of the standard longest common substring.
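
To make the problem statement concrete, the following is a minimal quadratic-time baseline, not one of the algorithms from the paper. It assumes $A_1$ and $A_2$ have equal length (as Hamming distance requires) and scans every alignment (diagonal) of the two sequences with a sliding window that holds at most $k$ mismatches. The function name `k_lcf` and its return convention are illustrative choices, not taken from the source.

```python
def k_lcf(s1: str, s2: str, k: int) -> tuple[int, int, int]:
    """Quadratic baseline for k-LCF (illustrative, not the paper's method).

    Returns (length, start1, start2) such that s1[start1:start1+length] and
    s2[start2:start2+length] differ in at most k positions and length is
    maximal over all such pairs.
    """
    n, m = len(s1), len(s2)
    best = (0, 0, 0)
    # A diagonal with shift d pairs s1[i] with s2[i + d].
    for d in range(-(n - 1), m):
        i0 = max(0, -d)              # first position of s1 on this diagonal
        j0 = i0 + d                  # corresponding position of s2
        length = min(n - i0, m - j0)
        left = 0                     # left end of the current window
        mismatches = 0               # mismatches inside the window
        for right in range(length):
            if s1[i0 + right] != s2[j0 + right]:
                mismatches += 1
            # Shrink from the left until the window has at most k mismatches.
            while mismatches > k:
                if s1[i0 + left] != s2[j0 + left]:
                    mismatches -= 1
                left += 1
            window = right - left + 1
            if window > best[0]:
                best = (window, i0 + left, j0 + left)
    return best


if __name__ == "__main__":
    print(k_lcf("abcde", "abxde", 1))
    print(k_lcf("abcde", "abxde", 0))
```

Each of the $O(n)$ diagonals is processed in linear time with constant extra space, so the sketch runs in $O(n^2)$ time overall; with $k = 0$ it reduces to the standard longest common substring, whose length the abstract denotes $\ell_0$.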
