Formal Languages and Algorithms for Similarity based Retrieval from Sequence Databases
A. Prasad Sistla

TL;DR
This paper introduces a similarity-based framework for sequence retrieval using automata, temporal logic, and regular expressions, employing distance measures to evaluate how closely sequences match queries, along with efficient algorithms for retrieval.
Contribution
It proposes a novel similarity semantics for formal query languages over sequences and provides algorithms for efficient retrieval based on these measures.
Findings
Each sequence-query pair is assigned a distance in the range [0,1] that indicates how closely the sequence satisfies the query.
Algorithms for computing similarity are efficient and practical.
The framework enables retrieval of sequences closely matching complex queries.
Abstract
The paper considers various formalisms based on Automata, Temporal Logic and Regular Expressions for specifying queries over sequences. Unlike traditional binary semantics, the paper presents a similarity based semantics for these formalisms. More specifically, a distance measure in the range [0,1] is associated with a (sequence, query) pair, denoting how closely the sequence satisfies the query. These measures are defined using a spectrum of normed vector distance measures. Various distance measures based on the syntax and the traditional semantics of the query are presented. Efficient algorithms for computing these distance measures are presented. These algorithms can be employed for retrieval of sequences from a database that closely satisfy a given query.
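To make the idea of a [0,1]-valued distance built from normed vector measures more concrete, here is a minimal sketch. It is not the paper's definition: the normalized p-norm, the presence/absence atomic measure, and the names `normalized_p_norm`, `atom_distance`, `query_distance`, and `rank` are all assumptions introduced purely for illustration. It shows how per-component distances in [0,1] might be combined into a single distance that stays in [0,1], and how that distance could order sequences for retrieval.

```python
import math
from typing import List, Sequence


def normalized_p_norm(components: Sequence[float], p: float = 2.0) -> float:
    """Combine component distances in [0, 1] into one distance in [0, 1].

    Illustrative assumption: uses (mean of c**p) ** (1/p), or max for p = inf.
    """
    if p < 1:
        raise ValueError("p must be >= 1")
    if not components:
        return 0.0
    if any(not 0.0 <= c <= 1.0 for c in components):
        raise ValueError("each component distance must lie in [0, 1]")
    if math.isinf(p):
        return max(components)
    mean_power = sum(c ** p for c in components) / len(components)
    return mean_power ** (1.0 / p)


def atom_distance(seq: str, symbol: str) -> float:
    """Hypothetical atomic measure: 0 if the symbol occurs in seq, else 1."""
    return 0.0 if symbol in seq else 1.0


def query_distance(seq: str, required_symbols: str, p: float = 2.0) -> float:
    """Distance of seq from a toy query 'contains every required symbol'."""
    parts = [atom_distance(seq, s) for s in required_symbols]
    return normalized_p_norm(parts, p)


def rank(database: List[str], required_symbols: str, p: float = 2.0) -> List[str]:
    """Return the database ordered from closest to farthest match."""
    return sorted(database, key=lambda s: query_distance(s, required_symbols, p))


if __name__ == "__main__":
    for seq in rank(["xyz", "a", "abc"], "abc", p=1.0):
        print(seq, query_distance(seq, "abc", p=1.0))
```

Varying `p` gives a family of measures: `p = 1` averages the component distances, while `p = math.inf` reports only the worst component. This is one simple way to read "a spectrum" of norm-based measures, under the assumptions stated above.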
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Algorithms and Data Compression
