The string-to-dictionary matching problem

Shmuel T. Klein, Dana Shapira

Research output: Contribution to journal › Article › peer-review

Abstract

The String-to-Dictionary Matching Problem is defined, in which a string is searched for in all the possible concatenations of the elements of a given dictionary, with applications to compressed matching in variable-to-fixed-length encodings, such as Tunstall's. Two algorithms based on suffix trees are suggested, one focusing on the dictionary, the other on the pattern to be searched for. The problem is then extended to deal also with patterns that include gaps. Experiments on natural language text suggest that compressed search might use fewer comparisons for long enough patterns, in spite of a potentially large number of encodings.
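To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not either of the suffix-tree algorithms mentioned in the abstract; it only enumerates the sequences of dictionary elements whose concatenation contains the pattern, with the pattern starting inside the first element and ending inside the last. The function names `cover` and `occurrences`, the output format and the sample dictionary are assumptions made for illustration, and both the pattern and the dictionary elements are assumed to be non-empty.

```python
from typing import Iterator, List, Tuple


def cover(rest: str, dictionary: List[str]) -> Iterator[Tuple[int, ...]]:
    """Yield index sequences whose concatenation starts with `rest`,
    where `rest` ends inside (or exactly at the end of) the last element."""
    for j, element in enumerate(dictionary):
        if not element:
            continue  # assumption: empty elements are ignored
        if element.startswith(rest):
            # The remaining part of the pattern ends within this element.
            yield (j,)
        elif rest.startswith(element):
            # This element is consumed whole; keep matching what is left.
            for tail in cover(rest[len(element):], dictionary):
                yield (j,) + tail


def occurrences(pattern: str, dictionary: List[str]) -> Iterator[Tuple[int, Tuple[int, ...]]]:
    """Yield (offset, indices): the pattern starts at `offset` inside the
    first element of the concatenation dictionary[indices[0]] + ..."""
    for i, element in enumerate(dictionary):
        for offset in range(len(element)):
            suffix = element[offset:]
            if suffix.startswith(pattern):
                # The whole pattern lies inside a single element.
                yield offset, (i,)
            elif pattern.startswith(suffix):
                # The pattern begins with a suffix of this element.
                for tail in cover(pattern[len(suffix):], dictionary):
                    yield offset, (i,) + tail


if __name__ == "__main__":
    # Hypothetical toy dictionary, not taken from the paper.
    for offset, indices in occurrences("aba", ["ab", "ba", "a"]):
        print(offset, indices)
```

With this toy dictionary, one of the reported sequences is `(0, 2)` at offset 0, corresponding to "ab" + "a" = "aba". The number of such sequences can grow quickly with the pattern length, which is the "potentially large number of encodings" the abstract refers to.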

Original language: English
Pages (from-to): 1347-1356
Number of pages: 10
Journal: Computer Journal
Volume: 55
Issue number: 11
State: Published - Nov 2012
Externally published: Yes

Keywords

  • compressed matching
  • suffix trees
  • Tunstall
