Abstract
A framework of context-sensitive grammar transform for speeding-up compressed pattern matching (CPM) is proposed. A greedy compression algorithm with the transform model is presented as well as a Knuth-Morris-Pratt (KMP)-type compressed pattern matching algorithm. The compression ratio is a match for gzip and Re-Pair, and the search speed of our CPM algorithm is almost twice faster than the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al. [18], and in the case of short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al. [14], which is regarded as one of the best combinations that allows a practically fast search.
Original language | English |
---|---|
Pages (from-to) | 219-226 |
Number of pages | 8 |
Journal | IEICE Transactions on Information and Systems |
Volume | E93-D |
Issue number | 2 |
DOIs | |
Publication status | Published - 2010 |
All Science Journal Classification (ASJC) codes
- Software
- Hardware and Architecture
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering
- Artificial Intelligence