<technical report>
Pattern Matching Machine for Text Compressed Using Finite State Model

Creator
Language
Publisher
Date
Source Title
Vol
Publication Type
Access Rights
Related DOI
Related URI
Relation
Abstract The classical pattern matching problem is to find all occurrences of patterns in a text. In many practical cases, since the text is very large and stored in the secondary storage, most of the time for... the pattern matching is dominated by data transmission of the text. Therefore the text compression can speed-up the pattern matching. In this framework it is required to develop an efficient pattern matching algorithm for searching the compressed text directly without decoding. In 1992, Fukamachi et al. proposed a method of constructing pattern matching machine that runs on Huffman coded text, based on the Aho-Coracick algorithm. However, since the Huffman code is optimal only under the assumption of the memoryless source model, the compression ratio is not very high. On the other hand, it is known that English text can be highly compressed by the compression method based on the Markov model. In this paper, we focus our attention on the finite-state model, which subsumes the Markov model as an important special case, and show an algorithm for constructing pattern matching machine for text compressed under the assumption of this model. We also give a proof of the correctness of the algorithm.show more

Hide fulltext details.

gz trcs142.ps gz 138 KB 218  
pdf trcs142 pdf 217 KB 186  

Details

Record ID
Peer-Reviewed
Type
Created Date 2009.04.22
Modified Date 2018.08.31

People who viewed this item also viewed