Metric and non-metric proximity transformations at linear costs
Andrej Gisbrecht, Frank-Michael Schleif

TL;DR
This paper introduces a scalable method to transform large, potentially non-metric dissimilarity matrices into valid kernel matrices using Nystroem approximation and eigenvalue correction, enabling efficient kernel methods.
Contribution
It presents a novel, linear-cost technique for converting large-scale non-metric dissimilarities into positive semi-definite kernels, improving scalability and performance.
Findings
Achieves much better runtime than standard transformation methods by keeping costs linear in the number of samples.
Maintains competitive model accuracy.
Enables kernel methods on large, non-metric dissimilarity data.
Abstract
Domain-specific (dis-)similarity or proximity measures, used e.g. in alignment algorithms for sequence data, are popular for analyzing complex data objects and for covering domain-specific data properties. Without an underlying vector space, these data are given as pairwise (dis-)similarities only. The few available methods for such data focus widely on similarities and do not scale to large data sets. Kernel methods are very effective for metric similarity matrices, also at large scale, but costly transformations are necessary when starting with non-metric (dis-)similarities. We propose an integrative combination of Nystroem approximation, potential double centering and eigenvalue correction to obtain valid kernel matrices at linear costs in the number of samples. With the proposed approach, effective kernel approaches become accessible. Experiments with several larger (dis-)similarity data sets show…
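The combination named in the abstract (Nystroem approximation, double centering, eigenvalue correction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the dissimilarities are squared, that only the n x m block of dissimilarities to m landmark samples is available, and that negative eigenvalues are corrected by clipping them to zero. The function name `nystroem_psd_factor`, its parameters, and the `rcond` cutoff are hypothetical choices made for illustration. All steps work on n x m or m x m matrices, so the cost grows linearly with n for fixed m.

```python
import numpy as np

def nystroem_psd_factor(D_nm, landmark_idx, rcond=1e-10):
    """Sketch: return L (n x m) such that K = L @ L.T is positive semi-definite.

    D_nm         : (n, m) squared dissimilarities from all samples to m landmarks
    landmark_idx : row indices of the m landmarks (assumption: the landmarks
                   are a subset of the n samples)
    """
    # m x m landmark block and its pseudo-inverse (Nystroem: D ~ D_nm W^+ D_nm^T)
    W = D_nm[landmark_idx, :]
    W_pinv = np.linalg.pinv(W, rcond=rcond)

    # Double centering on the low-rank form: multiplying by the centering
    # matrix J = I - 11^T/n just subtracts the column means of D_nm.
    C = D_nm - D_nm.mean(axis=0, keepdims=True)

    # Orthogonalize the n x m factor so eigenvalues can be read off an m x m matrix
    Q, R = np.linalg.qr(C)
    M = -0.5 * R @ W_pinv @ R.T
    M = 0.5 * (M + M.T)  # remove numerical asymmetry before eigh

    # Eigenvalue correction (here: clipping negative eigenvalues to zero)
    lam, U = np.linalg.eigh(M)
    lam_clipped = np.clip(lam, 0.0, None)

    # Low-rank factor of the corrected kernel; the full n x n kernel is never formed
    return (Q @ U) * np.sqrt(lam_clipped)
```

The returned factor can be passed to a linear method in place of an explicit kernel matrix. Other corrections (for example flipping the sign of negative eigenvalues instead of clipping) would change only the `lam_clipped` line.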
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Genomics and Phylogenetic Studies
