Simple Linear-Time Repetition Factorization

Yuki Yonemoto, Shunsuke Inenaga

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

A factorization f1,…,fm of a string w of length n is called a repetition factorization of w if fi is a repetition, i.e., fi is a form of xkx where x is a non-empty string, x is a (possibly-empty) proper prefix of x, and k≥2. Dumitran et al. [SPIRE 2015] presented an O(n)-time and space algorithm for computing an arbitrary repetition factorization of a given string of length n. Their algorithm heavily relies on the Union-Find data structure on trees proposed by Gabow and Tarjan [JCSS 1985] that works in linear time on the word RAM model, and an interval stabbing data structure of Schmidt [ISAAC 2009]. In this paper, we explore more combinatorial insights into the problem, and present a simple algorithm to compute an arbitrary repetition factorization of a given string of length n in O(n) time, without relying on data structures for Union-Find and interval stabbing. Our algorithm follows the approach by Inoue et al. [ToCS 2022] that computes the smallest/largest repetition factorization in O(nlogn) time.

Original languageEnglish
Title of host publicationString Processing and Information Retrieval - 31st International Symposium, SPIRE 2024, Proceedings
EditorsZsuzsanna Lipták, Edleno Moura, Karina Figueroa, Ricardo Baeza-Yates
PublisherSpringer Science and Business Media Deutschland GmbH
Pages348-361
Number of pages14
ISBN (Print)9783031721991
DOIs
Publication statusPublished - 2025
Event31st International Symposium on String Processing and Information Retrieval, SPIRE 2024 - Puerto Vallarta, Mexico
Duration: Sept 23 2024Sept 25 2024

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume14899 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference31st International Symposium on String Processing and Information Retrieval, SPIRE 2024
Country/TerritoryMexico
CityPuerto Vallarta
Period9/23/249/25/24

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Simple Linear-Time Repetition Factorization'. Together they form a unique fingerprint.

Cite this