
Sensitivity of string compressors and repetitiveness measures (2107.08615v6)

Published 19 Jul 2021 in cs.DS

Abstract: The sensitivity of a string compression algorithm $C$ asks how much the output size $C(T)$ for an input string $T$ can increase when a single character edit operation is performed on $T$. This notion enables one to measure the robustness of compression algorithms in terms of errors and/or dynamic changes occurring in the input string. In this paper, we analyze the worst-case multiplicative sensitivity of string compression algorithms, which is defined by $\max_{T \in \Sigma^n}\{C(T')/C(T) : ed(T, T') = 1\}$, where $ed(T, T')$ denotes the edit distance between $T$ and $T'$. For the most common versions of the Lempel-Ziv 77 compressors, we prove that the worst-case multiplicative sensitivity is upper bounded by a small constant, and give matching lower bounds. We generalize these results to the smallest bidirectional scheme $b$. In addition, we show that the sensitivity of a grammar-based compressor called GCIS is also a small constant. Further, we extend the notion of the worst-case sensitivity to string repetitiveness measures such as the smallest string attractor size $\gamma$ and the substring complexity $\delta$, and show that the worst-case sensitivity of $\delta$ is also a small constant. These results contrast with the previously known related results such that the size $z_{\rm 78}$ of the Lempel-Ziv 78 factorization can increase by a factor of $\Omega(n^{1/4})$ [Lagarde and Perifel, 2018], and the number $r$ of runs in the Burrows-Wheeler transform can increase by a factor of $\Omega(\log n)$ [Giuliani et al., 2021] when a character is prepended to an input string of length $n$. By applying our sensitivity bounds of $\delta$ or the smallest grammar to known results (cf. [Navarro, 2021]), some non-trivial upper bounds for the sensitivities of important string compressors and repetitiveness measures including $\gamma$, $r$, LZ-End, RePair, LongestMatch, and AVL-grammar are derived.
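To make the definition of worst-case multiplicative sensitivity concrete, the following is a minimal brute-force sketch, not the paper's method. It enumerates every string $T$ of a small length $n$, every $T'$ at edit distance 1, and takes the maximum ratio $C(T')/C(T)$. The names `lz77_size`, `single_edits`, and `worst_case_sensitivity` are hypothetical, and the compressor is assumed to be one simple greedy, self-referencing LZ77-style factorization, which may differ from the exact variants analyzed in the paper.

```python
from fractions import Fraction
from itertools import product


def lz77_size(t: str) -> int:
    """Phrase count of a greedy LZ77-style factorization (assumed variant).

    Each phrase is the longest prefix of the remaining text that also
    occurs starting at an earlier position (overlap allowed), or a single
    fresh character when no such occurrence exists.
    """
    n, i, phrases = len(t), 0, 0
    while i < n:
        longest = 0
        for j in range(i):  # candidate earlier starting positions
            length = 0
            while i + length < n and t[j + length] == t[i + length]:
                length += 1
            longest = max(longest, length)
        i += max(longest, 1)
        phrases += 1
    return phrases


def single_edits(t: str, alphabet: str) -> set:
    """All strings at edit distance exactly 1 from t."""
    out = set()
    for i in range(len(t)):
        out.add(t[:i] + t[i + 1:])  # deletion
        for c in alphabet:
            out.add(t[:i] + c + t[i + 1:])  # substitution
    for i in range(len(t) + 1):
        for c in alphabet:
            out.add(t[:i] + c + t[i:])  # insertion
    out.discard(t)  # substituting a character by itself is not an edit
    return out


def worst_case_sensitivity(compress, n: int, alphabet: str) -> Fraction:
    """max over T in alphabet^n and ed(T, T') = 1 of compress(T') / compress(T)."""
    best = Fraction(0)
    for chars in product(alphabet, repeat=n):
        t = "".join(chars)
        base = compress(t)  # positive for any non-empty t
        for t_edit in single_edits(t, alphabet):
            best = max(best, Fraction(compress(t_edit), base))
    return best


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, worst_case_sensitivity(lz77_size, n, "ab"))
```

The search is exponential in $n$, so it is only useful for exploring tiny cases; it can suggest how the ratio behaves but does not establish the asymptotic bounds stated in the abstract.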

Citations (18)
