TY - JOUR

T1 - Closed factorization

AU - Badkobeh, Golnaz

AU - Bannai, Hideo

AU - Goto, Keisuke

AU - I, Tomohiro

AU - Iliopoulos, Costas S.

AU - Inenaga, Shunsuke

AU - Puglisi, Simon J.

AU - Sugimoto, Shiho

N1 - Publisher Copyright: © 2016 Elsevier B.V.

PY - 2016/10/30

Y1 - 2016/10/30

N2 - A closed string is a string with a proper substring that occurs in the string as a prefix and a suffix, but not elsewhere. Closed strings were introduced by Fici (2011) as objects of combinatorial interest in the study of trapezoidal and Sturmian words. In this paper we present algorithms for computing closed factors (substrings) in strings. First, we consider the problem of greedily factorizing a string into a sequence of longest closed factors. We describe an algorithm for this problem that uses linear time and space. We then consider the related problem of computing, for every position in the string, the longest closed factor starting at that position. We describe a simple algorithm for this problem that runs in O(n log n / log log n) time, where n is the length of the string. This also leads to an algorithm that computes the maximal closed factor containing (i.e., covering) each position in the string in O(n log n / log log n) time. We also present linear-time algorithms to factorize a string into a sequence of shortest closed factors of length at least two, to compute the shortest closed factor of length at least two starting at each position of the string, and to compute a minimal closed factor of length at least two containing each position of the string.

AB - A closed string is a string with a proper substring that occurs in the string as a prefix and a suffix, but not elsewhere. Closed strings were introduced by Fici (2011) as objects of combinatorial interest in the study of trapezoidal and Sturmian words. In this paper we present algorithms for computing closed factors (substrings) in strings. First, we consider the problem of greedily factorizing a string into a sequence of longest closed factors. We describe an algorithm for this problem that uses linear time and space. We then consider the related problem of computing, for every position in the string, the longest closed factor starting at that position. We describe a simple algorithm for this problem that runs in O(n log n / log log n) time, where n is the length of the string. This also leads to an algorithm that computes the maximal closed factor containing (i.e., covering) each position in the string in O(n log n / log log n) time. We also present linear-time algorithms to factorize a string into a sequence of shortest closed factors of length at least two, to compute the shortest closed factor of length at least two starting at each position of the string, and to compute a minimal closed factor of length at least two containing each position of the string.

UR - http://www.scopus.com/inward/record.url?scp=84967167017&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84967167017&partnerID=8YFLogxK

U2 - 10.1016/j.dam.2016.04.009

DO - 10.1016/j.dam.2016.04.009

M3 - Article

AN - SCOPUS:84967167017

SN - 0166-218X

VL - 212

SP - 23

EP - 29

JO - Discrete Applied Mathematics

JF - Discrete Applied Mathematics

ER -