TY - JOUR
T1 - Glycan classification with tree kernels
AU - Yamanishi, Yoshihiro
AU - Bach, Francis
AU - Vert, Jean-Philippe
PY - 2007/5/15
Y1 - 2007/5/15
AB - Motivation: Glycans are covalent assemblies of sugars that play crucial roles in many cellular processes. Comprehensive data about the structure and function of glycans have recently accumulated, and the need for methods and algorithms to analyze these data is growing fast. Results: This article presents novel methods for classifying glycans and detecting discriminative glycan motifs with support vector machines (SVM). We propose a new class of tree kernels to measure the similarity between glycans. These kernels are based on the comparison of tree substructures and take into account several glycan features, such as the sugar type, the sugar bond type and the layer depth. The proposed methods are tested on their ability to classify human glycans into four blood components: leukemia cells, erythrocytes, plasma and serum. They are shown to outperform a previously published method. We also applied a feature selection approach to extract glycan motifs that are characteristic of each blood component. We confirmed that some leukemia-specific glycan motifs detected by our method correspond to results reported in the literature.
UR - http://www.scopus.com/inward/record.url?scp=34447328438&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34447328438&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btm090
DO - 10.1093/bioinformatics/btm090
M3 - Article
C2 - 17344232
AN - SCOPUS:34447328438
SN - 1367-4803
VL - 23
SP - 1211
EP - 1216
JO - Bioinformatics
JF - Bioinformatics
IS - 10
ER -