A new family of string classifiers based on local relatedness

Yasuto Higa, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper introduces a new family of string classifiers based on local relatedness. We use three types of local relatedness measurements, namely, longest common substrings (LCStr's), longest common subsequences (LCSeq's), and window-accumulated longest common sub-sequences (wLCSeq's). We show that finding the optimal classier for given two sets of strings (the positive set and the negative set), is NP-hard for all of the above measurements. In order to achieve practically efficient algorithms for finding the best classifier, we investigate pruning heuristics and fast string matching techniques based on the properties of the local relatedness measurements.

Original languageEnglish
Title of host publicationDiscovery Science - 9th International Conference, DS 2006, Proceedings
PublisherSpringer Verlag
Pages114-124
Number of pages11
ISBN (Print)3540464913, 9783540464914
DOIs
Publication statusPublished - 2006
Event9th International Conference on Discovery Science, DS 2006 - Barcelona, Spain
Duration: Oct 7 2006Oct 10 2006

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume4265 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other9th International Conference on Discovery Science, DS 2006
Country/TerritorySpain
CityBarcelona
Period10/7/0610/10/06

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint

Dive into the research topics of 'A new family of string classifiers based on local relatedness'. Together they form a unique fingerprint.

Cite this