TY - GEN
T1 - A new family of string classifiers based on local relatedness
AU - Higa, Yasuto
AU - Inenaga, Shunsuke
AU - Bannai, Hideo
AU - Takeda, Masayuki
PY - 2006
Y1 - 2006
N2 - This paper introduces a new family of string classifiers based on local relatedness. We use three types of local relatedness measurements, namely, longest common substrings (LCStr's), longest common subsequences (LCSeq's), and window-accumulated longest common subsequences (wLCSeq's). We show that finding the optimal classifier for two given sets of strings (the positive set and the negative set) is NP-hard for all of the above measurements. In order to achieve practically efficient algorithms for finding the best classifier, we investigate pruning heuristics and fast string matching techniques based on the properties of the local relatedness measurements.
AB - This paper introduces a new family of string classifiers based on local relatedness. We use three types of local relatedness measurements, namely, longest common substrings (LCStr's), longest common subsequences (LCSeq's), and window-accumulated longest common subsequences (wLCSeq's). We show that finding the optimal classifier for two given sets of strings (the positive set and the negative set) is NP-hard for all of the above measurements. In order to achieve practically efficient algorithms for finding the best classifier, we investigate pruning heuristics and fast string matching techniques based on the properties of the local relatedness measurements.
UR - http://www.scopus.com/inward/record.url?scp=33750744852&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750744852&partnerID=8YFLogxK
U2 - 10.1007/11893318_14
DO - 10.1007/11893318_14
M3 - Conference contribution
AN - SCOPUS:33750744852
SN - 3540464913
SN - 9783540464914
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 114
EP - 124
BT - Discovery Science - 9th International Conference, DS 2006, Proceedings
PB - Springer Verlag
T2 - 9th International Conference on Discovery Science, DS 2006
Y2 - 7 October 2006 through 10 October 2006
ER -