A note on the longest common substring with $k$-mismatches problem (1409.7217v2)
Abstract: The recently introduced longest common substring with $k$-mismatches ($k$-LCF) problem is to find, given two sequences $S_1$ and $S_2$ of length $n$ each, longest substrings $A_1$ of $S_1$ and $A_2$ of $S_2$ of equal length such that the Hamming distance between $A_1$ and $A_2$ is at most $k$. So far, the only subquadratic time result for this problem was known for $k = 1$ \cite{FGKU2014}. We first present two output-dependent algorithms solving the $k$-LCF problem and show that for $k = O(\log^{1-\varepsilon} n)$, where $\varepsilon > 0$, at least one of them works in subquadratic time, using $O(n)$ words of space. The choice between these two algorithms for a given input can be made after linear time and space preprocessing. Finally, we present a tabulation-based algorithm working, in its range of applicability, in $O(n^2 \log\min(k+\ell_0, \sigma)/\log n)$ time, where $\ell_0$ is the length of the standard longest common substring.
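To make the problem statement concrete, the following is a minimal sketch of a plain quadratic-time baseline for $k$-LCF. It is not one of the algorithms announced in the abstract; it only illustrates the definition. The function name `k_lcf` and its return format are assumptions made for this illustration. The idea is that every aligned pair of substrings lies on one diagonal (a fixed offset between positions in $S_1$ and $S_2$), so a sliding window holding at most $k$ mismatches can be moved along each diagonal.

```python
def k_lcf(s1, s2, k):
    """Quadratic-time baseline for k-LCF (illustrative only, not the paper's method).

    Returns (length, start1, start2): the length of a longest pair of
    equal-length substrings s1[start1:start1+length], s2[start2:start2+length]
    whose Hamming distance is at most k, and their starting positions.
    """
    n, m = len(s1), len(s2)
    best = (0, 0, 0)
    # Each diagonal d fixes the offset i - j between positions in s1 and s2.
    for d in range(-(m - 1), n):
        i0 = max(d, 0)    # first position of the diagonal in s1
        j0 = max(-d, 0)   # first position of the diagonal in s2
        diag_len = min(n - i0, m - j0)
        left = 0          # left end of the current window on this diagonal
        mismatches = 0    # mismatches inside the current window
        for right in range(diag_len):
            if s1[i0 + right] != s2[j0 + right]:
                mismatches += 1
            # Shrink from the left until at most k mismatches remain.
            while mismatches > k:
                if s1[i0 + left] != s2[j0 + left]:
                    mismatches -= 1
                left += 1
            window = right - left + 1
            if window > best[0]:
                best = (window, i0 + left, j0 + left)
    return best


if __name__ == "__main__":
    length, a, b = k_lcf("abcde", "abxde", 1)
    print(length, "abcde"[a:a + length], "abxde"[b:b + length])
```

For two sequences of length $n$ there are $2n - 1$ diagonals, and each is scanned once with both window ends moving only forward, so this sketch runs in $O(n^2)$ time with $O(1)$ extra space. This is the quadratic bound that the subquadratic results mentioned in the abstract improve upon.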