Revisiting the Folklore Algorithm for Random Access to Grammar-Compressed Strings

Alan M. Cleary, Joseph Winjum, Jordan Dood, Shunsuke Inenaga

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Grammar-based compression is a widely-accepted model of string compression that allows for efficient and direct manipulations on the compressed data. Most, if not all, such manipulations rely on the primitive random access queries, a task of quickly returning the character at a specified position of the original uncompressed string without explicit decompression. While there are advanced data structures for random access to grammar-compressed strings that guarantee theoretical query time and space bounds, little has been done for the practical perspective of this important problem. In this paper, we revisit a well-known folklore random access algorithm for grammars in the Chomsky normal form, modify it to work directly on general grammars, and show that this modified version is fast and memory efficient in practice.

Original languageEnglish
Title of host publicationString Processing and Information Retrieval - 31st International Symposium, SPIRE 2024, Proceedings
EditorsZsuzsanna Lipták, Edleno Moura, Karina Figueroa, Ricardo Baeza-Yates
PublisherSpringer Science and Business Media Deutschland GmbH
Pages88-101
Number of pages14
ISBN (Print)9783031721991
DOIs
Publication statusPublished - 2025
Event31st International Symposium on String Processing and Information Retrieval, SPIRE 2024 - Puerto Vallarta, Mexico
Duration: Sept 23 2024Sept 25 2024

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume14899 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference31st International Symposium on String Processing and Information Retrieval, SPIRE 2024
Country/TerritoryMexico
CityPuerto Vallarta
Period9/23/249/25/24

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Revisiting the Folklore Algorithm for Random Access to Grammar-Compressed Strings'. Together they form a unique fingerprint.

Cite this