Creator |
|
|
|
|
|
Language |
|
Publisher |
|
|
Date |
|
Source Title |
|
Vol |
|
Publication Type |
|
Access Rights |
|
Related DOI |
|
|
Related URI |
|
|
Relation |
|
|
Abstract |
In this paper, we address the problem of approximate string matching on compressed text. We consider this problem for a text string described in terms of collage system, which is a formal system propo...sed by Kida et al. (1999) that captures various dictionary-based compression methods. We present an algorithm that exploits bit-parallelism, assuming that our problem fits in a single machine word, e.g., $ (m − k + 1)(k + 1) leq L $, where m is the pattern length, $ k $ is the number of allowed errors, and $ L $ is the length in bits of the machine word. For a class of simple collage systems, the algorithm runs in $ O(k^2 ( left |D\right |+ mid S mid)+ km) $ time using $ O(k^2 ( left |D\right |) $ space, where $ left |D\right | $ is the size of dictionary $ D $ and $ mid S mid $ is the number of tokens in $ S $. The LZ78 and the LZW compression methods are covered by this class. Since we can regard $ n = left |D\right | + mid S mid $ as the compressed length, the time and the space complexities are $ O(k^2n + km) $ and $ O(k^2n) $, respectively.show more
|