DIMS: Distributed Index for Similarity Search in Metric Spaces
Yifan Zhu, Chengyang Luo, Tang Qian, Lu Chen, Yunjun Gao, Baihua Zheng

TL;DR
DIMS introduces a novel distributed indexing approach for similarity search in metric spaces, effectively balancing workload and optimizing communication and computation costs, thereby significantly improving efficiency and scalability over existing methods.
Contribution
The paper presents a new three-stage heterogeneous partitioning and indexing structure for distributed similarity search, along with concurrent search techniques and a cost-based optimization model.
Findings
DIMS outperforms existing distributed similarity search methods in efficiency.
The proposed approach achieves better workload balance and scalability.
Extensive experiments validate the effectiveness of DIMS.
Abstract
Similarity search finds objects that are similar to a given query object under a similarity metric. As the amount and variety of data continue to grow, similarity search in metric spaces has gained significant attention. Metric spaces can accommodate any type of data and support flexible distance metrics, which makes similarity search in metric spaces useful for many real-world applications, such as multimedia retrieval, personalized recommendation, trajectory analytics, data mining, decision planning, and distributed servers. However, existing studies mostly focus on indexing metric spaces on a single machine, which limits efficiency and scalability as data volume and query load increase. Recent work on similarity search has turned towards distributed methods, but these face challenges including inefficient local data management, unbalanced workload, and low…
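To make the notion of similarity search in a metric space concrete, the following is a minimal single-machine sketch of a range query with pivot-based pruning. It is not the DIMS index and is not taken from the paper; it only illustrates the generic idea that any distance satisfying the triangle inequality lets a query skip objects without computing their distance. The names `PivotTable`, `range_query`, and `edit_distance`, and the choice of strings with edit distance as the example data, are assumptions made for illustration.

```python
from typing import Callable, Generic, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a standard metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class PivotTable(Generic[T]):
    """Hypothetical one-pivot index: stores each object's distance to a pivot."""

    def __init__(self, objects: Sequence[T], pivot: T,
                 dist: Callable[[T, T], float]) -> None:
        self.objects = list(objects)
        self.pivot = pivot
        self.dist = dist
        # Precomputed once at build time, reused by every query.
        self.to_pivot = [dist(o, pivot) for o in self.objects]

    def range_query(self, q: T, radius: float) -> Tuple[List[T], int]:
        """Return (objects within `radius` of q, number of distances verified)."""
        dq = self.dist(q, self.pivot)
        results: List[T] = []
        verified = 0
        for obj, dp in zip(self.objects, self.to_pivot):
            # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o).
            # If the lower bound already exceeds the radius, skip the object.
            if abs(dq - dp) > radius:
                continue
            verified += 1
            if self.dist(q, obj) <= radius:
                results.append(obj)
        return results, verified


words = ["index", "indexes", "metric", "metrics", "search", "distributed"]
table = PivotTable(words, pivot="index", dist=edit_distance)
matches, verified = table.range_query("indez", radius=1)
```

Because the pruning test is only a lower bound, the result is the same as a brute-force scan; the saving is in how many distance computations are needed. A distributed index additionally has to decide how objects are partitioned across machines, which this sketch does not address.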
Taxonomy
Topics: Data Management and Algorithms · Data Mining Algorithms and Applications · Advanced Database Systems and Queries
