Interpolating between the Jaccard distance and an analogue of the normalized information distance
Bjørn Kjos-Hanssen

TL;DR
This paper explores a family of metrics interpolating between the Jaccard distance and an analogue of normalized information distance, providing formal verification and characterizing conditions for metric properties.
Contribution
It characterizes when the symmetric Tversky ratio models form metrics and introduces new interpolating metrics, verified in the Lean proof assistant.
Findings
D_{α,β} is a metric iff 0≤α≤1/2 and β≥1/(1−α)
Extreme points include the Jaccard distance and an analogue of normalized information distance
Introduces a family of metrics V_p interpolating between these distances
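The characterization in the findings above can be explored numerically. The sketch below is a minimal illustration, not the paper's Lean development. It assumes (the summary does not state this) that the symmetric Tversky ratio model has the form S_{α,β}(X,Y) = |X∩Y| / (β(α·a + (1−α)·b) + |X∩Y|), with a and b the smaller and larger of |X∖Y| and |Y∖X|, and that D_{α,β} = 1 − S_{α,β}. The function names are hypothetical. It also assumes 0 ≤ α < 1 and β > 0.

```python
from fractions import Fraction
from itertools import combinations, product


def tversky_distance(x, y, alpha, beta):
    """D_{alpha,beta}(x, y) under the ASSUMED form 1 - S_{alpha,beta}.

    Assumes 0 <= alpha < 1 and beta > 0, so the denominator is
    nonzero whenever x != y.
    """
    x, y = frozenset(x), frozenset(y)
    if x == y:
        return Fraction(0)
    common = len(x & y)
    a = min(len(x - y), len(y - x))  # smaller one-sided difference
    b = max(len(x - y), len(y - x))  # larger one-sided difference
    weighted = beta * (alpha * a + (1 - alpha) * b)
    return Fraction(weighted) / (weighted + common)


def satisfies_triangle(dist, universe):
    """Brute-force the triangle inequality over all subsets of `universe`.

    A pass on a small universe is only evidence, not a proof;
    a failure is a genuine counterexample.
    """
    items = list(universe)
    subsets = [frozenset(c) for r in range(len(items) + 1)
               for c in combinations(items, r)]
    return all(dist(x, z) <= dist(x, y) + dist(y, z)
               for x, y, z in product(subsets, repeat=3))


def jaccard(x, y):
    # alpha = 1/2, beta = 2: the Jaccard distance
    return tversky_distance(x, y, Fraction(1, 2), 2)


def nid_analogue(x, y):
    # alpha = 0, beta = 1: the analogue of normalized information distance
    return tversky_distance(x, y, Fraction(0), 1)


def outside_region(x, y):
    # alpha = 0, beta = 1/2 violates beta >= 1/(1 - alpha)
    return tversky_distance(x, y, Fraction(0), Fraction(1, 2))
```

Under these assumptions, α = 0 and β = 1/2 fails already on a two-element universe: for X = {1}, Y = {1, 2}, Z = {2}, the distances are D(X,Y) = D(Y,Z) = 1/3 but D(X,Z) = 1. Exact rational arithmetic with `Fraction` avoids spurious failures from floating-point rounding when the inequality is tight.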
Abstract
Jiménez, Becerra, and Gelbukh (2013) defined a family of "symmetric Tversky ratio models". Each associated function $D_{\alpha,\beta}$ is a semimetric on the powerset of a given finite set. We show that $D_{\alpha,\beta}$ is a metric if and only if $0\le\alpha\le 1/2$ and $\beta\ge 1/(1-\alpha)$. This result is formally verified in the Lean proof assistant. The extreme points of this parametrized space of metrics are the Jaccard distance and an analogue of the normalized information distance of M. Li, Chen, X. Li, Ma, and Vitányi (2004). As a second interpolation, we also show that in general $\mathcal V_p$ is a metric, where $$\mathcal V_p(A,B)=\frac{\Delta_p(A,B)}{|A\cap B| +…
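The formula for V_p above is cut off, and the definition of Δ_p does not appear in this summary. The following sketch therefore rests on two loudly stated assumptions: that Δ_p(A,B) is the p-norm of the pair (|A∖B|, |B∖A|), with the maximum taken at p = ∞, and that the denominator is |A∩B| + Δ_p(A,B). Under those assumptions p = 1 gives the Jaccard distance and p = ∞ gives the analogue of the normalized information distance, which matches the interpolation described in the findings. The function names are hypothetical.

```python
import math


def delta_p(a, b, p):
    """ASSUMED definition: p-norm of (|a - b|, |b - a|), for 1 <= p <= inf."""
    a, b = frozenset(a), frozenset(b)
    u, v = len(a - b), len(b - a)
    if math.isinf(p):
        return float(max(u, v))
    return (u ** p + v ** p) ** (1.0 / p)


def v_p(a, b, p):
    """ASSUMED form: delta_p / (|a & b| + delta_p), with V_p = 0 for two empty sets."""
    a, b = frozenset(a), frozenset(b)
    d = delta_p(a, b, p)
    denom = len(a & b) + d
    return 0.0 if denom == 0 else d / denom


def jaccard_distance(a, b):
    """Classical Jaccard distance |a ^ b| / |a | b|, for comparison."""
    a, b = frozenset(a), frozenset(b)
    union = len(a | b)
    return 0.0 if union == 0 else len(a ^ b) / union
```

With these definitions, `v_p(a, b, 1)` agrees with `jaccard_distance(a, b)` because |A∖B| + |B∖A| + |A∩B| = |A∪B|. Intermediate values of p fall between the two endpoints.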
Taxonomy
Topics: Game Theory and Voting Systems · Bayesian Modeling and Causal Inference
