TY - GEN
T1 - Simple Linear-Time Repetition Factorization
AU - Yonemoto, Yuki
AU - Inenaga, Shunsuke
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - A factorization f1,…,fm of a string w of length n is called a repetition factorization of w if fi is a repetition, i.e., fi is a form of xkx′ where x is a non-empty string, x′ is a (possibly-empty) proper prefix of x, and k≥2. Dumitran et al. [SPIRE 2015] presented an O(n)-time and space algorithm for computing an arbitrary repetition factorization of a given string of length n. Their algorithm heavily relies on the Union-Find data structure on trees proposed by Gabow and Tarjan [JCSS 1985] that works in linear time on the word RAM model, and an interval stabbing data structure of Schmidt [ISAAC 2009]. In this paper, we explore more combinatorial insights into the problem, and present a simple algorithm to compute an arbitrary repetition factorization of a given string of length n in O(n) time, without relying on data structures for Union-Find and interval stabbing. Our algorithm follows the approach by Inoue et al. [ToCS 2022] that computes the smallest/largest repetition factorization in O(nlogn) time.
AB - A factorization f1,…,fm of a string w of length n is called a repetition factorization of w if fi is a repetition, i.e., fi is a form of xkx′ where x is a non-empty string, x′ is a (possibly-empty) proper prefix of x, and k≥2. Dumitran et al. [SPIRE 2015] presented an O(n)-time and space algorithm for computing an arbitrary repetition factorization of a given string of length n. Their algorithm heavily relies on the Union-Find data structure on trees proposed by Gabow and Tarjan [JCSS 1985] that works in linear time on the word RAM model, and an interval stabbing data structure of Schmidt [ISAAC 2009]. In this paper, we explore more combinatorial insights into the problem, and present a simple algorithm to compute an arbitrary repetition factorization of a given string of length n in O(n) time, without relying on data structures for Union-Find and interval stabbing. Our algorithm follows the approach by Inoue et al. [ToCS 2022] that computes the smallest/largest repetition factorization in O(nlogn) time.
KW - repetitions
KW - runs
KW - string factorization
UR - http://www.scopus.com/inward/record.url?scp=85205344301&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85205344301&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-72200-4_26
DO - 10.1007/978-3-031-72200-4_26
M3 - Conference contribution
AN - SCOPUS:85205344301
SN - 9783031721991
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 348
EP - 361
BT - String Processing and Information Retrieval - 31st International Symposium, SPIRE 2024, Proceedings
A2 - Lipták, Zsuzsanna
A2 - Moura, Edleno
A2 - Figueroa, Karina
A2 - Baeza-Yates, Ricardo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 31st International Symposium on String Processing and Information Retrieval, SPIRE 2024
Y2 - 23 September 2024 through 25 September 2024
ER -