Linear-time off-line text compression by longest-first substitution

Shunsuke Inenaga, Takashi Funamoto, Masayuki Takeda, Ayumi Shinohara

Research output: Chapter in Book/Report/Conference proceedingChapter

9 Citations (Scopus)

Abstract

Given a text, grammar-based compression is to construct a grammar that generates the text. There are many kinds of text compression techniques of this type. Each compression scheme is categorized as being either off-line or on-line, according to how a text is processed. One representative tactics for off-line compression is to substitute the longest repeated factors of a text with a production rule. In this paper, we present an algorithm that compresses a text basing on this longest-first principle, in linear time. The algorithm employs a suitable index structure for a text, and involves technically efficient operations on the structure.

Original languageEnglish
Title of host publicationLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
EditorsMario A. Nascimento, Edleno S. de Moura, Arlindo L. Oliveira
PublisherSpringer Verlag
Pages137-152
Number of pages16
ISBN (Print)3540201777, 9783540201779
DOIs
Publication statusPublished - 2003

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume2857
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Linear-time off-line text compression by longest-first substitution'. Together they form a unique fingerprint.

Cite this