TY - JOUR
T1 - The "runs" theorem
AU - Bannai, Hideo
AU - I, Tomohiro
AU - Inenaga, Shunsuke
AU - Nakashima, Yuto
AU - Takeda, Masayuki
AU - Tsuruta, Kazuya
N1 - Funding Information:
∗Received by the editors March 4, 2015; accepted for publication (in revised form) July 19, 2017; published electronically September 21, 2017. A preliminary version of this paper has appeared in Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM, Philadelphia, 2015, pp. 562–571 [1]. http://www.siam.org/journals/sicomp/46-5/M101103.html Funding: The work of the first, third, and fifth authors was supported by JSPS KAKENHI Grants 25280086, 16H02783, 26280003, 25240003.
Publisher Copyright:
Copyright © by SIAM.
PY - 2017
Y1 - 2017
N2 - We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the "runs" conjecture [R. M. Kolpakov andG.Kucherov, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1999, pp. 596-604]), which states that the maximum number of runs ρ(n) in a string of length n is less than n. The proof is remarkably simple, considering the numerous endeavors to tackle this problem in the last 15 years, and significantly improves our understanding of how runs can occur in strings. In addition, we obtain an upper bound of 3n for the maximum sum of exponents σ(n) of runs in a string of length n, improving on the best known bound of 4.1n by Crochemore et al. [J. Discrete Algorithms, 14 (2012), pp. 29-36], as well as other improved bounds on related problems. The characterization also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string. We also establish a relationship between runs and nodes of the Lyndon tree, which gives a simple optimal solution to the 2-period query problem that was recently solved by Kociumaka et al. [Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA) 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 532-551].
AB - We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the "runs" conjecture [R. M. Kolpakov andG.Kucherov, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1999, pp. 596-604]), which states that the maximum number of runs ρ(n) in a string of length n is less than n. The proof is remarkably simple, considering the numerous endeavors to tackle this problem in the last 15 years, and significantly improves our understanding of how runs can occur in strings. In addition, we obtain an upper bound of 3n for the maximum sum of exponents σ(n) of runs in a string of length n, improving on the best known bound of 4.1n by Crochemore et al. [J. Discrete Algorithms, 14 (2012), pp. 29-36], as well as other improved bounds on related problems. The characterization also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string. We also establish a relationship between runs and nodes of the Lyndon tree, which gives a simple optimal solution to the 2-period query problem that was recently solved by Kociumaka et al. [Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA) 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 532-551].
UR - http://www.scopus.com/inward/record.url?scp=85032917335&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85032917335&partnerID=8YFLogxK
U2 - 10.1137/15M1011032
DO - 10.1137/15M1011032
M3 - Article
AN - SCOPUS:85032917335
SN - 0097-5397
VL - 46
SP - 1501
EP - 1514
JO - SIAM Journal on Computing
JF - SIAM Journal on Computing
IS - 5
ER -