Efficient and scalable geometric hashing method for searching protein 3D structures
Gook-Pil Roh, Seung-won Hwang, and Byoung-Kee Yi

TL;DR
This paper introduces an efficient geometric hashing-based method for searching protein 3D structures. It defines two sub-structure query types, LSPM and RLSPM, and improves a geometric hashing baseline so that it handles large-scale data with lower computational and storage costs.
Contribution
It proposes a new scalable algorithm for protein structure search using geometric hashing, with improvements for large datasets and practical application.
Findings
True positive rate of at least 0.8 in experiments
Effective reduction in storage and execution time
Reliable matching performance demonstrated
Abstract
As structural databases continue to expand, efficient methods are required to search a database for structures similar to a query structure. Many previous works compare protein 3D structures and scan a database with a query structure. However, they are generally limited in practical use because of their large computational and storage requirements. We propose two new types of queries for searching similar sub-structures in a structural database: LSPM (Local Spatial Pattern Matching) and RLSPM (Reverse LSPM). Of the two, we focus on the RLSPM problem because it is more practical and general than LSPM. As a naive baseline, we apply geometric hashing techniques to the RLSPM problem, and we then propose an algorithm that improves on this baseline to deal with large-scale data and provide efficient matching. We employ…
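The abstract names geometric hashing as the baseline but, in the excerpt available here, does not describe how the index is built. As a rough illustration of the general technique only (not the authors' RLSPM algorithm or its optimizations), the Python sketch below indexes small 3D point sets by their quantized coordinates in every local frame spanned by three points, then lets a query vote for matching models. All names (`frame`, `cell_keys`, `build_table`, `query`), the cell size, and the toy coordinates are assumptions made for this example.

```python
import itertools
import math
from collections import defaultdict


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    n = math.sqrt(dot(v, v))
    return None if n < 1e-9 else (v[0] / n, v[1] / n, v[2] / n)


def frame(p0, p1, p2):
    """Orthonormal frame (origin, x, y, z) from three points, or None if degenerate."""
    x = unit(sub(p1, p0))
    if x is None:
        return None
    z = unit(cross(x, sub(p2, p0)))
    if z is None:  # the three points are collinear
        return None
    return p0, x, cross(z, x), z


def cell_keys(points, basis, cell=1.0):
    """Quantized coordinates of every non-basis point, expressed in the basis frame."""
    f = frame(*(points[i] for i in basis))
    if f is None:
        return []
    origin, x, y, z = f
    keys = []
    for i, p in enumerate(points):
        if i in basis:
            continue
        d = sub(p, origin)
        keys.append(tuple(math.floor(dot(d, axis) / cell) for axis in (x, y, z)))
    return keys


def build_table(models, cell=1.0):
    """Preprocessing: hash each model under every ordered triple of basis points."""
    table = defaultdict(list)
    for name, pts in models.items():
        for basis in itertools.permutations(range(len(pts)), 3):
            for key in cell_keys(pts, basis, cell):
                table[key].append((name, basis))
    return table


def query(table, pts, basis=(0, 1, 2), cell=1.0):
    """Recognition: vote for (model, basis) pairs; return the best vote count per model."""
    votes = defaultdict(int)
    for key in cell_keys(pts, basis, cell):
        for entry in table.get(key, ()):
            votes[entry] += 1
    best = defaultdict(int)
    for (name, _), count in votes.items():
        best[name] = max(best[name], count)
    return dict(best)


# Toy data: the query is the model rotated 90 degrees about z and translated.
model = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (1.5, 1.5, 5.5), (2.5, 5.5, 1.5)]
moved = [(-y + 10, x - 7, z + 2) for x, y, z in model]
table = build_table({"A": model})
print(query(table, moved))
```

Because coordinates are taken relative to a frame built from the points themselves, the keys are unchanged by rotation and translation, which is why the moved copy still votes for model `A`. Enumerating every ordered triple makes the table grow roughly with the fourth power of the number of points per model, which illustrates the kind of storage and computation cost the abstract cites as a practical limitation.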
Taxonomy
Topics: Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques · Metabolism, Diabetes, and Cancer
