ER-index: a referential index for encrypted genomic databases
Ferdinando Montecuollo, Giovanni Schmid

TL;DR
The paper introduces ER-index, a novel encrypted, highly compressed full-text index optimized for genomic data, enabling fast pattern searches while ensuring privacy and multi-user access control.
Contribution
It presents ER-index, a new referential, encrypted genomic index that achieves superior compression on collections of highly similar sequences and supports multi-user search permissions, advancing privacy-preserving genomic data analysis.
Findings
Achieves compression ratios an order of magnitude smaller than those of the E2FM-index on collections of highly similar sequences.
Maintains high search performance on highly similar sequences.
Supports multi-user, multi-key encryption for privacy control.
Abstract
Huge DBMSs storing genomic information are being created and engineered to enable large-scale, comprehensive, and in-depth analysis of human beings and their diseases. However, recent regulations such as the GDPR require that sensitive data be stored and processed using privacy-by-design methods and software. We designed and implemented ER-index, a new full-text index in minute space that is optimized for compressing and encrypting collections of genomic sequences and for performing fast pattern-search queries on them. Our new index complements the E2FM-index, which was introduced to compress and encrypt collections of nucleotide sequences without relying on a reference sequence. When used on collections of highly similar sequences, the ER-index achieves compression ratios that are an order of magnitude smaller than those achieved with the E2FM-index, while maintaining its…
