Extracting sets of chemical substructures and protein domains governing drug-target interactions

Yoshihiro Yamanishi, Edouard Pauwels, Hiroto Saigo, Véronique Stoven

Research output: Contribution to journal › Article › peer-review

61 Citations (Scopus)


The identification of rules governing molecular recognition between drug chemical substructures and protein functional sites is a challenging issue at many stages of the drug development process. In this paper we develop a novel method to extract sets of drug chemical substructures and protein domains that govern drug-target interactions on a genome-wide scale. This is made possible by sparse canonical correspondence analysis (SCCA), which analyzes drug substructure profiles and protein domain profiles simultaneously. The method does not depend on the availability of protein 3D structures. From a data set of known drug-target interactions including enzymes, ion channels, G protein-coupled receptors, and nuclear receptors, we extract a set of chemical substructures shared by drugs able to bind to a set of protein domains. These two sets of extracted chemical substructures and protein domains form components that can be further exploited in a drug discovery process. This approach successfully clusters protein domains that may be evolutionarily unrelated but that bind a common set of chemical substructures. As shown in several examples, it can also be very helpful for predicting new protein-ligand interactions and addressing the problem of ligand specificity. The proposed method constitutes a contribution to the recent field of chemogenomics, which aims to connect the chemical space with the biological space.
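The idea of extracting paired sets of substructures and domains through a sparse decomposition can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the objective or the solver, so the cross-product matrix Z = XᵀAY, the fixed soft-threshold penalties, the helper names (`soft_threshold`, `unit`, `sparse_component`), and the toy data are all assumptions made for illustration. The sketch uses alternating soft-thresholded power iterations, a common way to obtain sparse weight vectors.

```python
import numpy as np

def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam; small entries become exactly 0."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def unit(v):
    """Scale v to unit L2 norm, leaving an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def sparse_component(Z, lam_a, lam_b, n_iter=50):
    """Hypothetical helper: one pair of sparse weight vectors for Z."""
    # Start from the leading right singular vector of Z.
    beta = np.linalg.svd(Z, full_matrices=False)[2][0]
    alpha = np.zeros(Z.shape[0])
    for _ in range(n_iter):
        # Update substructure weights, then domain weights, thresholding each time.
        alpha = unit(soft_threshold(Z @ beta, lam_a))
        beta = unit(soft_threshold(Z.T @ alpha, lam_b))
    return alpha, beta

# Toy data, invented for illustration only.
# X: 6 drugs x 5 substructures (binary substructure profiles).
X = np.array([[1, 1, 0, 0, 0]] * 4 + [[0, 0, 0, 1, 1]] * 2, dtype=float)
# Y: 4 proteins x 4 domains (binary domain profiles).
Y = np.array([[1, 1, 0, 0]] * 2 + [[0, 0, 1, 1]] * 2, dtype=float)
# A: 6 drugs x 4 proteins (known drug-target interactions).
A = np.zeros((6, 4))
A[:4, :2] = 1.0
A[4:, 2:] = 1.0

# Assumed objective: find sparse alpha, beta that make alpha^T Z beta large.
Z = X.T @ A @ Y
alpha, beta = sparse_component(Z, lam_a=1.0, lam_b=1.0)

print("substructures in component:", np.flatnonzero(alpha).tolist())
print("domains in component:", np.flatnonzero(beta).tolist())
```

On this toy data the component selects the substructures and domains of the larger interaction block and assigns exactly zero weight to everything else, which is the kind of paired set the abstract describes. A real analysis would tune the penalties rather than fix them, and would extract several components.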

Original language: English
Pages (from-to): 1183-1194
Number of pages: 12
Journal: Journal of Chemical Information and Modeling
Issue number: 5
Publication status: Published - May 23 2011
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • General Chemistry
  • General Chemical Engineering
  • Computer Science Applications
  • Library and Information Sciences

