Compressed pattern matching for Sequitur

S. Mitarai, M. Hirao, T. Matsumoto, A. Shinohara, M. Takeda, S. Arikawa

Research output: Contribution to journalConference articlepeer-review

12 Citations (Scopus)

Abstract

Sequitur due to Nevill-Manning and Witten. [19] is a powerful program to infer a phrase hierarchy from the input text, that also provides extremely effective compression of large quantities of semi-structured text [18]. In this paper, we address the problem of searching in Sequitur compressed text directly. We show a compressed pattern matching algorithm that finds a pattern in compressed text without explicit decompression. We show that our algorithm is approximately 1.27 times faster than a decompression followed by an ordinal search.

Original languageEnglish
Pages (from-to)469-478
Number of pages10
JournalData Compression Conference Proceedings
Publication statusPublished - 2001
EventData Compression Conference - Snowbird, UT, United States
Duration: Mar 27 2001Mar 29 2001

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Compressed pattern matching for Sequitur'. Together they form a unique fingerprint.

Cite this