Emergent Mind

Substring Complexities on Run-length Compressed Strings

(2205.12421)
Published May 25, 2022 in cs.DS

Abstract

Let $S_{T}(k)$ denote the set of distinct substrings of length $k$ in a string $T$; the $k$-th substring complexity is then defined as its cardinality $|S_{T}(k)|$. Recently, $\delta = \max \{ |S_{T}(k)| / k : k \ge 1 \}$ has been shown to be a good compressibility measure for highly repetitive strings. In this paper, given $T$ of length $n$ in run-length compressed form of size $r$, we show that $\delta$ can be computed in $C_{\mathsf{sort}}(r, n)$ time and $O(r)$ space, where $C_{\mathsf{sort}}(r, n) = O(\min (r \lg\lg r, r \lg_{r} n))$ is the time complexity of sorting $r$ $O(\lg n)$-bit integers in $O(r)$ space in the Word-RAM model with word size $\Omega(\lg n)$.
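To make the definition of $\delta$ concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm: it first decodes the run-length form and then evaluates the definition directly, so it takes time and space polynomial in $n$ rather than $C_{\mathsf{sort}}(r, n)$ time and $O(r)$ space. The function names and the `(character, length)` run representation are assumptions made for illustration.

```python
from fractions import Fraction


def rle_decode(runs):
    """Expand a run-length encoding, given as (character, length) pairs.

    The pair representation is an assumption made for this sketch.
    """
    return "".join(ch * length for ch, length in runs)


def substring_complexity(text, k):
    """Return |S_T(k)|, the number of distinct length-k substrings of text."""
    return len({text[i:i + k] for i in range(len(text) - k + 1)})


def delta(text):
    """Return max{|S_T(k)| / k : k >= 1} as an exact fraction.

    Brute force straight from the definition; for k > len(text) the
    complexity is 0, so only k = 1..len(text) needs to be checked.
    """
    if not text:
        return Fraction(0)
    return max(
        Fraction(substring_complexity(text, k), k)
        for k in range(1, len(text) + 1)
    )


# Five runs that decode to "0001011100" (n = 10, r = 5).
runs = [("0", 3), ("1", 1), ("0", 1), ("1", 3), ("0", 2)]
text = rle_decode(runs)
print(text, delta(text))
```

For this input the maximum is attained at $k = 3$, where all eight binary strings of length 3 occur, so the maximum need not be at $k = 1$. Exact fractions are used to avoid floating-point ties when comparing ratios.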
