Minimal Absent Words on Run-Length Encoded Strings

Tooru Akagi, Kouta Okabe, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Citation (Scopus)

Abstract

A string w is called a minimal absent word for another string T if w does not occur (as a substring) in T and all proper substrings of w occur in T. State-of-the-art data structures for reporting the set MAW(T) of MAWs from a given string T of length n require O(n) space, can be built in O(n) time, and can report all MAWs in O(|MAW(T)|) time upon a query. This paper initiates the problem of computing MAWs from a compressed representation of a string. In particular, we focus on the most basic compressed representation of a string, run-length encoding (RLE), which represents each maximal run of the same characters a by ap where p is the length of the run. Let m be the RLE-size of string T. After categorizing the MAWs into five disjoint sets M1, M2, M3, M4, M5 using RLE, we present matching upper and lower bounds for the number of MAWs in Mi for i = 1, 2, 4, 5 in terms of RLE-size m, except for M3 whose size is unbounded by m. We then present a compact O(m)-space data structure that can report all MAWs in optimal O(|MAW(T)|) time.

Original languageEnglish
Title of host publication33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022
EditorsHideo Bannai, Jan Holub
PublisherSchloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic)9783959772341
DOIs
Publication statusPublished - Jun 1 2022
Event33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022 - Prague, Czech Republic
Duration: Jun 27 2022Jun 29 2022

Publication series

NameLeibniz International Proceedings in Informatics, LIPIcs
Volume223
ISSN (Print)1868-8969

Conference

Conference33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022
Country/TerritoryCzech Republic
CityPrague
Period6/27/226/29/22

All Science Journal Classification (ASJC) codes

  • Software

Fingerprint

Dive into the research topics of 'Minimal Absent Words on Run-Length Encoded Strings'. Together they form a unique fingerprint.

Cite this