TL;DR
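The paper introduces FUJI (Fuzzy Jaccard Index), a scale-invariant similarity score for ranked lists. It extends the Jaccard index with a rank-aware membership function, which yields more stable and more accurate similarity estimates. The paper also provides theoretical analysis, an efficient algorithm for computing the score, and an application to comparing feature rankings in machine learning.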
Contribution
It presents FUJI, a novel fuzzy Jaccard-based score for comparing ordered lists, with theoretical properties, an efficient algorithm, and demonstrated advantages in high-dimensional feature ranking.
Findings
FUJI outperforms benchmark similarity scores in robustness and efficiency.
Empirical tests show FUJI's effectiveness in synthetic scenarios.
Application to feature ranking improves interpretability and predictive performance.
Abstract
We propose the Fuzzy Jaccard Index (FUJI) -- a scale-invariant score for assessing the similarity between two ranked/ordered lists. FUJI improves upon the Jaccard index by incorporating a membership function that takes the particular ranks into account, thus producing both more stable and more accurate similarity estimates. We provide theoretical insights into the properties of the FUJI score and propose an efficient algorithm for computing it. We also present empirical evidence of its performance in different synthetic scenarios. Finally, we demonstrate its utility in a typical machine learning setting -- comparing feature ranking lists relevant to a given machine learning task. In real-life domains, and in particular high-dimensional ones, where only a small percentage of the whole feature space might be relevant, a robust and confident feature ranking leads to interpretable…
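To make the idea of a rank-aware membership function concrete, the sketch below contrasts the plain Jaccard index on top-k prefixes with a generic fuzzy Jaccard score (sum of minimum memberships divided by sum of maximum memberships). The linear membership function, the function names, and the example lists are all assumptions chosen for illustration; the abstract does not specify FUJI's membership function, so this is not the paper's definition or algorithm.

```python
def jaccard_top_k(a, b, k):
    """Plain Jaccard index of the top-k prefixes of two ranked lists."""
    top_a, top_b = set(a[:k]), set(b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 1.0


def linear_membership(ranked, k):
    """Hypothetical membership function (an assumption, not from the paper):
    1.0 at rank 1, decreasing linearly to 1/k at rank k, and 0 beyond rank k."""
    return {item: (k - i) / k for i, item in enumerate(ranked[:k])}


def fuzzy_jaccard(a, b, k):
    """Generic fuzzy Jaccard: sum of min memberships over sum of max memberships."""
    mu_a = linear_membership(a, k)
    mu_b = linear_membership(b, k)
    items = set(mu_a) | set(mu_b)
    num = sum(min(mu_a.get(x, 0.0), mu_b.get(x, 0.0)) for x in items)
    den = sum(max(mu_a.get(x, 0.0), mu_b.get(x, 0.0)) for x in items)
    return num / den if den else 1.0


# Two illustrative feature rankings that differ by a swap of the top two items.
ranking_a = ["f1", "f2", "f3", "f4"]
ranking_b = ["f2", "f1", "f3", "f5"]

print(jaccard_top_k(ranking_a, ranking_b, 3))
print(fuzzy_jaccard(ranking_a, ranking_b, 3))
```

With k = 3 the two prefixes contain the same items, so the plain Jaccard index is 1.0 and hides the swap at the top, while the fuzzy variant under this assumed membership evaluates to 5/7 (about 0.714) and reflects the disagreement in ranks.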
