
Bounds and Constructions of $\ell$-Read Codes under the Hamming Metric (2403.11754v1)

Published 18 Mar 2024 in cs.IT and math.IT

Abstract: Nanopore sequencing is a promising technology for DNA sequencing. In this paper, we investigate a specific model of the nanopore sequencer, which takes a $q$-ary sequence of length $n$ as input and outputs a vector of length $n+\ell-1$ referred to as an $\ell$-read vector where the $i$-th entry is a multi-set composed of the $\ell$ elements located between the $(i-\ell+1)$-th and $i$-th positions of the input sequence. Considering the presence of substitution errors in the output vector, we study $\ell$-read codes under the Hamming metric. An $\ell$-read $(n,d)_q$-code is a set of $q$-ary sequences of length $n$ in which the Hamming distance between $\ell$-read vectors of any two distinct sequences is at least $d$. We first improve the result of Banerjee et al., who studied $\ell$-read $(n,d)_q$-codes with the constraint $\ell\geq 3$ and $d=3$. Then, we investigate the bounds and constructions of $2$-read codes with a minimum distance of $3$, $4$, and $5$, respectively. Our results indicate that when $d \in \{3,4\}$, the optimal redundancy of $2$-read $(n,d)_q$-codes is $o(\log_q n)$, while for $d=5$ it is $\log_q n+o(\log_q n)$. Additionally, we establish an equivalence between $2$-read $(n,3)_q$-codes and classical $q$-ary single-insertion reconstruction codes using two noisy reads. We improve the lower bound on the redundancy of classical $q$-ary single-insertion reconstruction codes as well as the upper bound on the redundancy of classical $q$-ary single-deletion reconstruction codes when using two noisy reads. Finally, we study $\ell$-read codes under the reconstruction model.
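The $\ell$-read vector and the distance defined in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `read_vector`, `read_distance`, and `min_read_distance` are hypothetical, and the sketch assumes that positions outside $1,\dots,n$ are simply ignored, so the multisets at both ends of the read vector hold fewer than $\ell$ symbols.

```python
from itertools import combinations


def read_vector(x, ell):
    """Return the ell-read vector of x as a list of n + ell - 1 multisets.

    Each multiset is stored as a sorted tuple so two entries compare equal
    exactly when they contain the same symbols with the same multiplicities.
    Assumption: positions outside 1..n contribute nothing.
    """
    n = len(x)
    entries = []
    for i in range(1, n + ell):            # i = 1, ..., n + ell - 1
        lo = max(1, i - ell + 1)           # first in-range position (1-indexed)
        hi = min(n, i)                     # last in-range position (1-indexed)
        entries.append(tuple(sorted(x[lo - 1:hi])))
    return entries


def read_distance(x, y, ell):
    """Hamming distance between the ell-read vectors of equal-length x and y."""
    rx, ry = read_vector(x, ell), read_vector(y, ell)
    return sum(a != b for a, b in zip(rx, ry))


def min_read_distance(code, ell):
    """Smallest read distance over all pairs of distinct codewords."""
    return min(read_distance(x, y, ell) for x, y in combinations(code, 2))


if __name__ == "__main__":
    x, y = (0, 0, 0), (0, 1, 0)
    print(read_vector(x, 2))
    print(read_vector(y, 2))
    print(read_distance(x, y, 2))
```

Under this convention, a set of length-$n$ sequences is an $\ell$-read $(n,d)_q$-code exactly when `min_read_distance` returns at least $d$. With `ell = 1` the read vector is the sequence itself, so `read_distance` reduces to the ordinary Hamming distance.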

References (22)
  1. A. Banerjee, Y. Yehezkeally, A. Wachter-Zeh, and E. Yaakobi, “Error Correcting Codes for Nanopore Sequencing,” arXiv:2305.10214, 2024.
  2. A. Banerjee, Y. Yehezkeally, A. Wachter-Zeh, and E. Yaakobi, “Correcting a Single Deletion in Reads from a Nanopore Sequencer,” arXiv:2401.15939, 2024.
  3. K. Cai, H. M. Kiah, T. T. Nguyen, and E. Yaakobi, “Coding for sequence reconstruction for single edits,” IEEE Trans. Inf. Theory, vol. 68, no. 1, pp. 66-79, 2022.
  4. J. Chrisnata, H. M. Kiah, and E. Yaakobi, “Correcting deletions with multiple reads,” IEEE Trans. Inf. Theory, vol. 68, no. 11, pp. 7141-7158, 2022.
  5. Y. M. Chee, A. Vardy, V. K. Vu, and E. Yaakobi, “Transverse-Read-Codes for Domain Wall Memories,” IEEE J. Sel. Areas Inf. Theory, vol. 4, pp. 784-793, 2023.
  6. D. Deamer, M. Akeson, and D. Branton, “Three decades of nanopore sequencing,” Nature biotechnology, vol. 34, no. 5, pp. 518-524, 2016.
  7. DNA Data Storage Alliance, “Preserving our digital legacy: an introduction to DNA data storage,” 2021.
  8. R. Hulett, S. Chandak, and M. Wootters, “On coding for an abstracted nanopore channel for DNA storage,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Melbourne, Australia, pp. 2465-2470, 2021.
  9. D. E. Knuth, “The sandwich theorem,” Elec. J. Comb., vol. 1, no. 1, p. A1, Apr. 1994.
  10. J. J. Kasianowicz, E. Brandin, D. Branton, and D. W. Deamer, “Characterization of individual polynucleotide molecules using a membrane channel,” Proceedings of the National Academy of Sciences, vol. 93, no. 24, pp. 13770-13773, 1996.
  11. V. I. Levenshtein, “Binary codes capable of correcting deletions, insertions, and reversals,” Soviet Phys. Dokl., vol. 10, no. 8, pp. 707-710, 1966.
  12. V. I. Levenshtein, “Efficient reconstruction of sequences,” IEEE Trans. Inf. Theory, vol. 47, no. 1, pp. 2-22, 2001.
  13. S. Liu and C. Xing, “Nonlinear codes with low redundancy,” arXiv:2310.14219, 2023.
  14. W. Mao, S. N. Diggavi, and S. Kannan, “Models and information theoretic bounds for nanopore sequencing,” IEEE Trans. Inf. Theory, vol. 64, no. 4, pp. 3216-3236, 2018.
  15. B. McBain, E. Viterbo, and J. Saunderson, “Finite-State Semi-Markov Channels for Nanopore Sequencing,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Espoo, Finland, pp. 216-221, 2022.
  16. B. McBain, E. Viterbo, and J. Saunderson, “Homophonic Coding for the Noisy Nanopore Channel with Constrained Markov Sources,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Taipei, Taiwan, pp. 376-381, 2023.
  17. J. Rydning, “Worldwide IDC global datasphere forecast, 2022-2026: Enterprise organizations driving most of the data growth,” Technical Report, 2022.
  18. Y. Sun and G. Ge, “Correcting two-deletion with a constant number of reads,” IEEE Trans. Inf. Theory, vol. 69, no. 5, pp. 2969-2982, 2023.
  19. Y. Sun, Y. Xi, and G. Ge, “Sequence reconstruction under single-burst-insertion/deletion/edit channel,” IEEE Trans. Inf. Theory, vol. 69, no. 7, pp. 4466-4483, 2023.
  20. A. Vidal, V. B. Wijekoon, and E. Viterbo, “Error Bounds for Decoding Piecewise Constant Nanopore Signals in DNA Storage,” in Proc. IEEE Int. Conf. Commun. (ICC), Rome, Italy, pp. 4452-4457, 2023.
  21. A. Vidal, V. B. Wijekoon, and E. Viterbo, “Union Bound for Generalized Duplication Channels with DTW Decoding,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Taipei, Taiwan, pp. 358-363, 2023.
  22. O. Yerushalmi, T. Etzion, and E. Yaakobi, “The Capacity of the Weighted Read Channel,” arXiv:2401.15368, 2024.

Authors (2)