Secure wavelet matrix: Alphabet-friendly privacy-preserving string search for bioinformatics

Hiroki Sudo, Masanobu Jimbo, Koji Nuida, Kana Shimizu

Research output: Contribution to journalArticlepeer-review

13 Citations (Scopus)

Abstract

Biomedical data often includes personal information, and the technology is demanded that enables the searching of such sensitive data while protecting privacy. We consider a case in which a server has a text database and a user searches the database to find substring matches. The user wants to conceal his/her query and the server wants to conceal the database except for the search results. The previous approach for this problem is based on a linear-time algorithm in terms of alphabet size $\mathbf{|\Sigma |}$||, and it cannot search on the database of large alphabet such as biomedical documents. We present a novel algorithm that can search a string in logarithmic time of $\mathbf{|\Sigma |}$||. In our algorithm, named secure wavelet matrix sWM, we use an additively homomorphic encryption to build an efficient data structure called a wavelet matrix. In an experiment using a simulated string of length 10,000 whose alphabet size ranges from 4 to 1024, the run time of the sWM was up to around two orders of magnitude faster than that of the previous method. sWM enables the searching of a private database efficiently and thus it will facilitate utilizing sensitive biomedical information.

Original languageEnglish
Article number3370684
Pages (from-to)1675-1684
Number of pages10
JournalIEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume16
Issue number5
DOIs
Publication statusPublished - Sept 2019
Externally publishedYes

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Genetics
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Secure wavelet matrix: Alphabet-friendly privacy-preserving string search for bioinformatics'. Together they form a unique fingerprint.

Cite this