Approximate Circular Pattern Matching (2208.08915v1)
Abstract: We consider approximate circular pattern matching (CPM, in short) under the Hamming and edit distance, in which we are given a length-$n$ text $T$, a length-$m$ pattern $P$, and a threshold $k>0$, and we are to report all starting positions of fragments of $T$ (called occurrences) that are at distance at most $k$ from some cyclic rotation of $P$. In the decision version of the problem, we are to check if any such occurrence exists. All previous results for approximate CPM were either average-case upper bounds or heuristics, except for the work of Charalampopoulos et al. [CKP$+$, JCSS'21], who considered only the Hamming distance. For the reporting version of the approximate CPM problem, under the Hamming distance we improve upon the main algorithm of [CKP$+$, JCSS'21] from ${\cal O}(n+(n/m)\cdot k4)$ to ${\cal O}(n+(n/m)\cdot k3\log\log k)$ time; for the edit distance, we give an ${\cal O}(nk2)$-time algorithm. We also consider the decision version of the approximate CPM problem. Under the Hamming distance, we obtain an ${\cal O}(n+(n/m)\cdot k2\log k/\log\log k)$-time algorithm, which nearly matches the algorithm by Chan et al. [CGKKP, STOC'20] for the standard counterpart of the problem. Under the edit distance, the ${\cal O}(nk\log3 k)$ runtime of our algorithm nearly matches the ${\cal O}(nk)$ runtime of the Landau-Vishkin algorithm [LV, J. Algorithms'89]. As a stepping stone, we propose ${\cal O}(nk\log3 k)$-time algorithm for the Longest Prefix $k'$-Approximate Match problem, proposed by Landau et al. [LMS, SICOMP'98], for all $k'\in {1,\dots,k}$. We give a conditional lower bound that suggests a polynomial separation between approximate CPM under the Hamming distance over the binary alphabet and its non-circular counterpart. We also show that a strongly subquadratic-time algorithm for the decision version of approximate CPM under edit distance would refute SETH.
- Panagiotis Charalampopoulos (34 papers)
- Tomasz Kociumaka (97 papers)
- Jakub Radoszewski (52 papers)
- Solon P. Pissis (52 papers)
- Wojciech Rytter (39 papers)
- Tomasz Waleń (32 papers)
- Wiktor Zuba (18 papers)