TY - JOUR
T1 - Tight bounds for the sensitivity of CDAWGs with left-end edits
AU - Fujimaru, Hiroto
AU - Nakashima, Yuto
AU - Inenaga, Shunsuke
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
PY - 2025/3
Y1 - 2025/3
N2 - Compact directed acyclic word graphs (CDAWGs) (Blumer et al. in J ACM 34(3):578–595, 1987) are a fundamental data structure on strings with applications in text pattern searching, data compression, and pattern discovery. Intuitively, the CDAWG of a string T is obtained by merging isomorphic subtrees of the suffix tree (Weiner, in: Proceedings of the 14th annual symposium on switching and automata theory, pp 1–11, 1973) of the same string T, thus CDAWGs are a compact indexing structure. In this paper, we investigate the sensitivity of CDAWGs when a single character edit operation (insertion, deletion, or substitution) is performed at the left-end of the input string T, namely, we are interested in the worst-case increase in the size of the CDAWG after a left-end edit operation. We prove that if e is the number of edges of the CDAWG for string T, then the number of new edges added to the CDAWG after a left-end edit operation on T does not exceed e. Further, we present a matching lower bound on the sensitivity of CDAWGs for left-end insertions, and almost matching lower bounds for left-end deletions and substitutions. We then generalize our lower-bound instance for left-end insertions to leftward online construction of the CDAWG, and show that it requires Ω(n2) time for some string of length n.
AB - Compact directed acyclic word graphs (CDAWGs) (Blumer et al. in J ACM 34(3):578–595, 1987) are a fundamental data structure on strings with applications in text pattern searching, data compression, and pattern discovery. Intuitively, the CDAWG of a string T is obtained by merging isomorphic subtrees of the suffix tree (Weiner, in: Proceedings of the 14th annual symposium on switching and automata theory, pp 1–11, 1973) of the same string T, thus CDAWGs are a compact indexing structure. In this paper, we investigate the sensitivity of CDAWGs when a single character edit operation (insertion, deletion, or substitution) is performed at the left-end of the input string T, namely, we are interested in the worst-case increase in the size of the CDAWG after a left-end edit operation. We prove that if e is the number of edges of the CDAWG for string T, then the number of new edges added to the CDAWG after a left-end edit operation on T does not exceed e. Further, we present a matching lower bound on the sensitivity of CDAWGs for left-end insertions, and almost matching lower bounds for left-end deletions and substitutions. We then generalize our lower-bound instance for left-end insertions to leftward online construction of the CDAWG, and show that it requires Ω(n2) time for some string of length n.
UR - http://www.scopus.com/inward/record.url?scp=85218252274&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85218252274&partnerID=8YFLogxK
U2 - 10.1007/s00236-025-00478-y
DO - 10.1007/s00236-025-00478-y
M3 - Article
AN - SCOPUS:85218252274
SN - 0001-5903
VL - 62
JO - Acta Informatica
JF - Acta Informatica
IS - 1
M1 - 12
ER -