Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph

Hiroki Arimura, Shunsuke Inenaga, Yasuaki Kobayashi, Yuto Nakashima, Mizuki Sue

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Citations (Scopus)

Abstract

In this paper, we present the first study of the computational complexity of converting an automata-based text index structure, called the Compact Directed Acyclic Word Graph (CDAWG), of size e for a text T of length n into other text indexing structures for the same text, suitable for highly repetitive texts: the run-length BWT of size r, the irreducible PLCP array of size r, and the quasi-irreducible LPF array of size e, as well as the lex-parse of size O(r) and the LZ77-parse of size z, where $$r, z \leqslant e$$. As main results, we showed that the above structures can be optimally computed from either the CDAWG for T stored in read-only memory or its self-index version of size e without a text in O(e) worst-case time and words of working space. To obtain the above results, we devised techniques for enumerating a particular subset of suffixes in the lexicographic and text orders using the forward and backward search on the CDAWG by extending the result by Belazzougui et al. in 2015.

Original languageEnglish
Title of host publicationString Processing and Information Retrieval - 30th International Symposium, SPIRE 2023, Proceedings
EditorsFranco Maria Nardini, Nadia Pisanti, Rossano Venturini
PublisherSpringer Science and Business Media Deutschland GmbH
Pages28-34
Number of pages7
ISBN (Print)9783031439797
DOIs
Publication statusPublished - 2023
Event30th International Symposium on String Processing and Information Retrieval, SPIRE 2023 - Pisa, Italy
Duration: Sept 26 2023Sept 28 2023

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume14240 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference30th International Symposium on String Processing and Information Retrieval, SPIRE 2023
Country/TerritoryItaly
CityPisa
Period9/26/239/28/23

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph'. Together they form a unique fingerprint.

Cite this