Minimal Absent Words on Run-Length Encoded Strings

Tooru Akagi, Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

研究成果: 書籍/レポート タイプへの寄稿会議への寄与

1 被引用数 (Scopus)

抄録

A string w is called a minimal absent word for another string T if w does not occur (as a substring) in T and all proper substrings of w occur in T. State-of-the-art data structures for reporting the set MAW(T) of MAWs from a given string T of length n require O(n) space, can be built in O(n) time, and can report all MAWs in O(|MAW(T)|) time upon a query. This paper initiates the problem of computing MAWs from a compressed representation of a string. In particular, we focus on the most basic compressed representation of a string, run-length encoding (RLE), which represents each maximal run of the same characters a by ap where p is the length of the run. Let m be the RLE-size of string T. After categorizing the MAWs into five disjoint sets M1, M2, M3, M4, M5 using RLE, we present matching upper and lower bounds for the number of MAWs in Mi for i = 1, 2, 4, 5 in terms of RLE-size m, except for M3 whose size is unbounded by m. We then present a compact O(m)-space data structure that can report all MAWs in optimal O(|MAW(T)|) time.

本文言語英語
ホスト出版物のタイトル33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022
編集者Hideo Bannai, Jan Holub
出版社Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
ISBN(電子版)9783959772341
DOI
出版ステータス出版済み - 6月 1 2022
イベント33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022 - Prague, チェコ共和国
継続期間: 6月 27 20226月 29 2022

出版物シリーズ

名前Leibniz International Proceedings in Informatics, LIPIcs
223
ISSN(印刷版)1868-8969

会議

会議33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022
国/地域チェコ共和国
CityPrague
Period6/27/226/29/22

!!!All Science Journal Classification (ASJC) codes

  • ソフトウェア

フィンガープリント

「Minimal Absent Words on Run-Length Encoded Strings」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル