VectorMaton: Efficient Vector Search with Pattern Constraints via an Enhanced Suffix Automaton
Haoxuan Xie, Siqiang Luo

TL;DR
VectorMaton introduces an automaton-based index for efficient vector search with pattern constraints on sequence attributes, enabling fast and accurate retrieval in vector databases with sequence data.
Contribution
It presents a novel index structure that combines pattern filtering with vector search, supporting pattern predicates over sequence attributes in vector databases.
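For readers unfamiliar with the underlying data structure, the sketch below builds a textbook suffix automaton over a single string and uses it to test substring containment. It is not VectorMaton's enhanced structure, whose details this summary does not give; the class name `SuffixAutomaton` and its methods are illustrative assumptions.

```python
class SuffixAutomaton:
    """Textbook suffix automaton for one string (illustrative, not the paper's index)."""

    def __init__(self, s):
        self.next = [{}]      # per-state transitions: char -> state id
        self.link = [-1]      # suffix links
        self.length = [0]     # length of the longest string in each state
        self.last = 0         # state representing the whole prefix read so far
        for ch in s:
            self._extend(ch)

    def _new_state(self, transitions, length, link):
        self.next.append(transitions)
        self.length.append(length)
        self.link.append(link)
        return len(self.next) - 1

    def _extend(self, ch):
        cur = self._new_state({}, self.length[self.last] + 1, -1)
        p = self.last
        # Add the new transition along the suffix-link chain.
        while p != -1 and ch not in self.next[p]:
            self.next[p][ch] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][ch]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split q: the clone keeps the shorter strings.
                clone = self._new_state(
                    dict(self.next[q]), self.length[p] + 1, self.link[q]
                )
                while p != -1 and self.next[p].get(ch) == q:
                    self.next[p][ch] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur

    def contains(self, pattern):
        """Return True if `pattern` is a substring of the indexed string."""
        state = 0
        for ch in pattern:
            state = self.next[state].get(ch)
            if state is None:
                return False
        return True


sa = SuffixAutomaton("banana")
print(sa.contains("nan"))  # True
print(sa.contains("nab"))  # False
```

Every path from the initial state spells a substring of the indexed string, so a containment check costs time proportional to the pattern length rather than the string length.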
Findings
Up to 10x higher query throughput compared to baselines
Up to 18x reduction in index size
Accuracy comparable to that of existing methods
Abstract
Approximate nearest neighbor search (ANNS) has become a cornerstone in modern vector database systems. Given a query vector, ANNS retrieves the closest vectors from a set of base vectors. In real-world applications, vectors are often accompanied by additional information, such as sequences or structured attributes, motivating the need for fine-grained vector search with constraints on this auxiliary data. Existing methods support attribute-based filtering or range-based filtering on categorical and numerical attributes, but they do not support pattern predicates over sequence attributes. In relational databases, predicates such as LIKE and CONTAINS are fundamental operators for filtering records based on substring patterns. As vector databases increasingly adopt SQL-style query interfaces, enabling pattern predicates over sequence attributes (e.g., texts and biological sequences)…
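To make the query semantics concrete, the sketch below answers a pattern-constrained nearest-neighbor query by exact brute force: keep only the records whose sequence contains the pattern (a CONTAINS-style predicate), then rank the survivors by Euclidean distance. This is an exhaustive baseline for illustration, not the paper's index or an approximate method; the function name `filtered_knn` and the toy data are assumptions.

```python
import math


def filtered_knn(query, base, pattern, k):
    """Exact k-nearest neighbors among records whose sequence contains `pattern`.

    `base` is a list of (vector, sequence) pairs; returns record indices,
    closest first. Hypothetical helper for illustration only.
    """
    candidates = [
        (math.dist(query, vec), idx)
        for idx, (vec, seq) in enumerate(base)
        if pattern in seq  # substring predicate on the sequence attribute
    ]
    candidates.sort()
    return [idx for _, idx in candidates[:k]]


# Toy data: 2-D vectors paired with short DNA-like sequences.
base = [
    ([0.0, 0.0], "ACGT"),
    ([1.0, 1.0], "TTGA"),
    ([0.1, 0.0], "GGCG"),
    ([2.0, 2.0], "ACGG"),
]
print(filtered_knn([0.0, 0.0], base, "CG", k=2))  # [0, 2]
```

The scan touches every record on every query, which is the cost an index for pattern-constrained search aims to avoid.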
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Algorithms and Data Compression · Data Management and Algorithms
