
Sensitivity of string compressors and repetitiveness measures (2107.08615v6)

Published 19 Jul 2021 in cs.DS

Abstract: The sensitivity of a string compression algorithm $C$ asks how much the output size $C(T)$ for an input string $T$ can increase when a single character edit operation is performed on $T$. This notion enables one to measure the robustness of compression algorithms in terms of errors and/or dynamic changes occurring in the input string. In this paper, we analyze the worst-case multiplicative sensitivity of string compression algorithms, which is defined by $\max_{T \in \Sigma^n}\{C(T')/C(T) : ed(T, T') = 1\}$, where $ed(T, T')$ denotes the edit distance between $T$ and $T'$. For the most common versions of the Lempel-Ziv 77 compressors, we prove that the worst-case multiplicative sensitivity is upper bounded by a small constant, and give matching lower bounds. We generalize these results to the smallest bidirectional scheme $b$. In addition, we show that the sensitivity of a grammar-based compressor called GCIS is also a small constant. Further, we extend the notion of the worst-case sensitivity to string repetitiveness measures such as the smallest string attractor size $\gamma$ and the substring complexity $\delta$, and show that the worst-case sensitivity of $\delta$ is also a small constant. These results contrast with the previously known related results such that the size $z_{\rm 78}$ of the Lempel-Ziv 78 factorization can increase by a factor of $\Omega(n^{1/4})$ [Lagarde and Perifel, 2018], and the number $r$ of runs in the Burrows-Wheeler transform can increase by a factor of $\Omega(\log n)$ [Giuliani et al., 2021] when a character is prepended to an input string of length $n$. By applying our sensitivity bounds of $\delta$ or the smallest grammar to known results (cf. [Navarro, 2021]), some non-trivial upper bounds for the sensitivities of important string compressors and repetitiveness measures including $\gamma$, $r$, LZ-End, RePair, LongestMatch, and AVL-grammar are derived.
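
The definition of worst-case multiplicative sensitivity in the abstract can be illustrated by brute force on very short strings. The sketch below is not taken from the paper. It assumes a simple greedy LZ77-style parse with self-references, in which each factor is either a fresh character or the longest prefix of the remaining suffix that also starts at an earlier position. The names `lz_factor_count`, `single_edits`, and `worst_case_sensitivity` are hypothetical. It enumerates every string of length $n$ over a small alphabet and every string at edit distance one, so it is only feasible for small $n$ and says nothing about the asymptotic bounds proved in the paper.

```python
from itertools import product


def lz_factor_count(t: str) -> int:
    """Count factors of a greedy LZ77-style parse (assumed variant:
    self-referencing, a factor is a fresh character or the longest
    previous match)."""
    n = len(t)
    i = 0
    count = 0
    while i < n:
        longest = 0
        # Try every earlier starting position j < i; overlap with the
        # current factor is allowed (self-reference).
        for j in range(i):
            k = 0
            while i + k < n and t[j + k] == t[i + k]:
                k += 1
            longest = max(longest, k)
        i += max(longest, 1)  # a fresh character has length 1
        count += 1
    return count


def single_edits(t: str, alphabet: str) -> set:
    """All strings T' with ed(T, T') = 1 over the given alphabet."""
    out = set()
    n = len(t)
    for i in range(n):
        out.add(t[:i] + t[i + 1:])  # deletion
        for c in alphabet:
            if c != t[i]:
                out.add(t[:i] + c + t[i + 1:])  # substitution
    for i in range(n + 1):
        for c in alphabet:
            out.add(t[:i] + c + t[i:])  # insertion
    return out


def worst_case_sensitivity(n: int, alphabet: str = "ab") -> float:
    """max over T in alphabet^n and ed(T, T') = 1 of C(T') / C(T),
    with C = lz_factor_count. Requires n >= 1."""
    best = 0.0
    for chars in product(alphabet, repeat=n):
        t = "".join(chars)
        base = lz_factor_count(t)
        for t2 in single_edits(t, alphabet):
            best = max(best, lz_factor_count(t2) / base)
    return best


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, worst_case_sensitivity(n))
```

Swapping `lz_factor_count` for another size function (for example, a count of LZ78 phrases or of BWT runs) would let the same harness explore the other measures the abstract mentions, again only for small inputs.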
