Grammar Index by Induced Suffix Sorting

Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Citations (Scopus)


We propose a new compressed text index built upon a grammar compression based on induced suffix sorting [Nunes et al., DCC’18]. We show that this grammar exhibits a locality sensitive parsing property, which allows us to specify, given a pattern P, certain substrings of P, called cores, that are similarly parsed in the text grammar whenever these occurrences are extensible to occurrences of P. Supported by the cores, given a pattern of length m, we can locate all its occ occurrences in a text T of length n within O(mlg | S| + occ Clg | S| lg n+ occ ) time, where S is the set of all characters and non-terminals, occ is the number of occurrences, and occ C is the number of occurrences of a chosen core C of P in the right hand side of all production rules of the grammar of T. Our grammar index requires O(g) words of space and can be built in O(n) time using O(g) working space, where g is the sum of the lengths of the right hand sides of all production rules. We practically evaluate that our proposed index excels at locating long patterns in highly-repetitive texts. Our implementation is available at

Original languageEnglish
Title of host publicationString Processing and Information Retrieval - 28th International Symposium, SPIRE 2021, Proceedings
EditorsThierry Lecroq, Hélène Touzet
PublisherSpringer Science and Business Media Deutschland GmbH
Number of pages15
ISBN (Print)9783030866914
Publication statusPublished - 2021
Event28th International Symposium on String Processing and Information Retrieval, SPIRE 2021 - Virtual, Online
Duration: Oct 4 2021Oct 6 2021

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12944 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349


Conference28th International Symposium on String Processing and Information Retrieval, SPIRE 2021
CityVirtual, Online

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science


Dive into the research topics of 'Grammar Index by Induced Suffix Sorting'. Together they form a unique fingerprint.

Cite this