TY - JOUR
T1 - Privacy-preserving search for chemical compound databases
AU - Shimizu, Kana
AU - Nuida, Koji
AU - Arai, Hiromi
AU - Mitsunari, Shigeo
AU - Attrapadung, Nuttapong
AU - Hamada, Michiaki
AU - Tsuda, Koji
AU - Hirokawa, Takatsugu
AU - Sakuma, Jun
AU - Hanaoka, Goichiro
AU - Asai, Kiyoshi
N1 - Publisher Copyright: © 2015 Shimizu et al.
PY - 2015/12/9
Y1 - 2015/12/9
AB - Background: Searching for similar compounds in a database is the most important process for in-silico drug screening. Since a query compound is an important starting point for a new drug, a query holder who fears that the database server may monitor the query usually downloads all the records in the database and uses them in a closed network. However, a serious dilemma arises when the database holder also wants to reveal no information other than the search results, and this dilemma prevents the use of many important data resources. Results: To overcome this dilemma, we developed a novel cryptographic protocol that enables database searching while preserving the privacy of both the query holder and the database holder. Generally, applying cryptographic techniques to practical problems is difficult because versatile techniques are computationally expensive, while computationally inexpensive techniques can perform only trivial computation tasks. In this study, our protocol is built solely from an additive-homomorphic cryptosystem, which allows only addition to be performed on encrypted values but is computationally efficient compared with versatile techniques such as general-purpose multi-party computation. In an experiment searching ChEMBL, which consists of more than 1,200,000 compounds, the proposed method was 36,900 times faster in CPU time and 12,000 times as efficient in communication size compared with general-purpose multi-party computation. Conclusion: We proposed a novel privacy-preserving protocol for searching chemical compound databases. The proposed method, which scales easily to large databases, may help to accelerate drug discovery research by making full use of unused but valuable data that includes sensitive information.
UR - http://www.scopus.com/inward/record.url?scp=84961594119&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961594119&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-16-S18-S6
DO - 10.1186/1471-2105-16-S18-S6
M3 - Article
C2 - 26678650
AN - SCOPUS:84961594119
SN - 1471-2105
VL - 16
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 18
M1 - S6
ER -