We present a run-time efficient implementation of compressed pattern matching automata (CPMA) of Kida et al. (2003), where a text is given as a truncation-free collage system 〈D,S〉 such that variable sequence S is encoded by any prefix code. We first build CPMA directly from P and D in O(|D||P|) time and space, and then convert it into the decoder-embedded CPMA (DECPMA), where |P| is the pattern length and |D| is the number of variables defined in D. The bound O(|D||P|) improves the bound O(|D||P| + |P|2) achieved by a straightforward application of the method of Kida et al. We experimentally show that a combination of recursive-pairing compression and byte-oriented Huffman coding allows both a high compression ratio and a high speed CPM.
|Number of pages
|International Journal of Foundations of Computer Science
|Published - Aug 2009
All Science Journal Classification (ASJC) codes
- Computer Science (miscellaneous)