Efficient Geometric-based Computation of the String Subsequence Kernel
Slimane Bellaouar, Hadda Cherroun, and Djelloul Ziadi

TL;DR
This paper introduces a geometric-based method for efficiently computing the string subsequence kernel by reducing the problem to range queries and utilizing a layered range sum tree, significantly improving performance for long strings.
Contribution
The paper presents a novel geometric approach and a layered range sum tree data structure to compute the string subsequence kernel more efficiently than existing methods.
Findings
Computes the SSK in O(p|L|log|L|) time, where |L| is the size of the match list and p is the subsequence length.
Outperforms the dynamic programming and sparse dynamic programming approaches on large datasets.
Most effective for long strings and large alphabet sizes.
Abstract
Kernel methods are powerful tools in machine learning. They have to be computationally efficient. In this paper, we present a novel geometric-based approach to efficiently compute the string subsequence kernel (SSK). Our main idea is that the SSK computation reduces to a range query problem. We start by constructing a match list L(s,t) = {(i,j) : s_i = t_j}, where s and t are the strings to be compared; this match list contains only the data that contribute to the result. To compute the SSK efficiently, we extend the layered range tree data structure to a layered range sum tree, a range-aggregation data structure. The whole process takes O(p|L|log|L|) time and O(|L|log|L|) space, where |L| is the size of the match list and p is the length of the SSK. We present empirical evaluations of our approach against the dynamic programming and the sparse dynamic programming approaches…
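The reduction from SSK computation to range queries can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the standard gap-weighted SSK with decay factor lam, where each common subsequence contributes lam raised to the sum of its spans in both strings, and it assumes p >= 1. The function names match_list, ssk_match_list, and ssk_brute_force are hypothetical. The layered range sum tree is replaced here by a plain scan over the match list, so each "range query" costs O(|L|) rather than O(log|L|); only the structure of the computation is shown.

```python
from itertools import combinations


def match_list(s, t):
    """All index pairs (i, j), 0-based, with s[i] == t[j]."""
    return [(i, j) for i, a in enumerate(s) for j, b in enumerate(t) if a == b]


def ssk_match_list(s, t, p, lam):
    """Gap-weighted SSK of order p (p >= 1), computed over the match list only.

    d[k] holds the weighted count of common subsequences of the current
    length whose last matched pair is L[k]. Extending to the next length is a
    dominance range sum over all matches (i2, j2) with i2 < i and j2 < j.
    """
    L = match_list(s, t)
    d = [lam ** 2] * len(L)  # length-1 subsequences: span 1 in each string
    for _ in range(p - 1):
        nxt = []
        for (i, j) in L:
            total = 0.0
            # Naive range query; a range-sum structure would replace this scan.
            for (i2, j2), w in zip(L, d):
                if i2 < i and j2 < j:
                    total += w * lam ** ((i - i2) + (j - j2))
            nxt.append(total)
        d = nxt
    return sum(d)


def ssk_brute_force(s, t, p, lam):
    """Reference value: enumerate every pair of length-p index tuples."""
    total = 0.0
    for I in combinations(range(len(s)), p):
        for J in combinations(range(len(t)), p):
            if all(s[a] == t[b] for a, b in zip(I, J)):
                total += lam ** ((I[-1] - I[0] + 1) + (J[-1] - J[0] + 1))
    return total


print(ssk_match_list("cat", "car", 2, 0.5))  # 0.0625, i.e. lam**4 for "ca"
```

For "cat" and "car" with p = 2, the only shared subsequence is "ca", contiguous in both strings, so the kernel value is lam**4. The brute-force function is included only as a cross-check on short strings.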
