Linear-Time text compression by longest-first substitution

Ryosuke Nakamura, Shunsuke Inenaga, Hideo Bannai, Takashi Funamoto, Masayuki Takeda, Ayumi Shinohara

Research output: Contribution to journalArticlepeer-review

16 Citations (Scopus)


We consider grammar-based text compression with longest first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.

Original languageEnglish
Pages (from-to)1429-1448
Number of pages20
Issue number4
Publication statusPublished - Dec 2009

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Numerical Analysis
  • Computational Theory and Mathematics
  • Computational Mathematics


Dive into the research topics of 'Linear-Time text compression by longest-first substitution'. Together they form a unique fingerprint.

Cite this