Full-privacy secured search engine empowered by efficient genome-mapping algorithms
Yuan-Yu Chang, Sheng-Tang Wong, Emmanuel O Salawu, Yu-Xuan Wang, Jui-Hung Hung, Lee-Wei Yang

TL;DR
This paper introduces S.A.V.E., a privacy-preserving search engine that encodes queries into biological sequences and searches them with genome-mapping algorithms, enabling secure and efficient web content search and plagiarism detection without revealing user data.
Contribution
The paper presents a novel privacy-preserving search method using genome-mapping algorithms and local encoding, enhancing security and efficiency in web searches and plagiarism detection.
Findings
S.A.V.E. achieves correct results with a false positive rate <0.8%.
Runs at a speed similar to Bowtie and four orders of magnitude faster than BLAST.
Provides a secure search mode that protects user privacy.
Abstract
Since the 1990s, keyword-based search engines have helped people locate relevant web content via a simple query, as have the more recent full-text-based search engines, which are mainly used for plagiarism detection following an article upload. However, these "free" or paid services operate by storing users' search queries and preferences for personal profiling and targeted ad delivery, while user-uploaded articles can further profit the service providers as part of their expanding databases. In short, search engine privacy has not been an option for web exploration in the past decades. Here we demonstrate that a database or internet search, provided with the entire article as a query, can be correctly carried out without revealing users' sensitive queries by means of an irreversible encoding scheme and an efficient FM-index search routine that is generally used in next-generation sequencing (NGS) of genomes. In our solution,…
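The two ingredients named in the abstract, a one-way encoding into a nucleotide alphabet and an FM-index search, can be illustrated with a minimal Python sketch. It does not reproduce the S.A.V.E. encoding, which this summary does not specify, and it makes no claim about the paper's privacy guarantees. The function names `encode_word` and `encode_text`, the use of SHA-256, and the 8-nucleotide code length are all assumptions made for illustration. The `FMIndex` class is a textbook backward-search implementation built on a naive suffix array, not the optimized routine found in read mappers such as Bowtie.

```python
import hashlib

NUCLEOTIDES = "ACGT"


def encode_word(word: str, length: int = 8) -> str:
    """Hypothetical one-way encoding: hash a word, keep `length` nucleotides.

    Assumption for illustration only; not the scheme used by S.A.V.E.
    """
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    bases = []
    for byte in digest:
        # Each byte yields four nucleotides, two bits per nucleotide.
        for shift in (6, 4, 2, 0):
            bases.append(NUCLEOTIDES[(byte >> shift) & 0b11])
    return "".join(bases[:length])


def encode_text(text: str) -> str:
    """Encode whitespace-separated words and concatenate their codes."""
    return "".join(encode_word(w) for w in text.split())


class FMIndex:
    """Textbook FM-index supporting exact-match counting."""

    def __init__(self, text: str):
        text += "$"  # sentinel, sorts before A, C, G, T
        # Naive suffix array construction; fine for small examples.
        sa = sorted(range(len(text)), key=lambda i: text[i:])
        # Burrows-Wheeler transform: character preceding each sorted suffix.
        self.bwt = "".join(text[i - 1] for i in sa)
        alphabet = sorted(set(text))
        # first[c] = number of characters in text strictly smaller than c.
        self.first = {}
        total = 0
        for c in alphabet:
            self.first[c] = total
            total += text.count(c)
        # occ[c][i] = occurrences of c in bwt[:i].
        self.occ = {c: [0] * (len(text) + 1) for c in alphabet}
        for i, ch in enumerate(self.bwt):
            for c in alphabet:
                self.occ[c][i + 1] = self.occ[c][i] + (1 if ch == c else 0)

    def count(self, pattern: str) -> int:
        """Count occurrences of `pattern` via backward search."""
        lo, hi = 0, len(self.bwt)  # half-open suffix-array interval
        for ch in reversed(pattern):
            if ch not in self.first:
                return 0
            lo = self.first[ch] + self.occ[ch][lo]
            hi = self.first[ch] + self.occ[ch][hi]
            if lo >= hi:
                return 0
        return hi - lo


# The "server" indexes an encoded document; the "client" sends only an
# encoded query, never the plain text.
index = FMIndex(encode_text("the quick brown fox jumps"))
hits = index.count(encode_text("quick brown"))
```

Because the same deterministic encoding is applied to both sides, an encoded phrase that occurs in the document is found as a substring of the encoded document, so `hits` is at least 1 here. Short codes can collide, which is one way false positives may arise in a scheme of this kind.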
