FLARE: Task-agnostic embedding model evaluation through normalizing flows
Jingzhou Jiang, Yixuan Tang, Yi Yang, and Kar Yan Tam

TL;DR
FLARE is a flow-based, task-agnostic method for evaluating embedding models. It estimates information sufficiency without labels and remains effective in high-dimensional spaces.
Contribution
It introduces an evaluation approach based on normalizing flows that overcomes the limitations of existing labelless measures in high-dimensional embedding spaces.
Findings
FLARE reached a Spearman's ρ of 0.90 against supervised benchmarks across 11 datasets and 8 embedders.
It remained stable in high-dimensional embeddings ($d \geq 3{,}584$).
Existing labelless baselines collapsed in high-dimensional settings.
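As a reminder of what the headline number measures, the following minimal sketch computes Spearman's ρ between a labelless score and a supervised benchmark score for a set of embedders. The function names and all score values are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for 8 embedders (illustrative only, not from the paper).
labelless_scores = [0.61, 0.55, 0.72, 0.48, 0.80, 0.66, 0.52, 0.75]
benchmark_scores = [58.1, 57.0, 63.4, 51.2, 66.0, 60.3, 55.9, 62.8]

rho = spearman_rho(labelless_scores, benchmark_scores)
print(f"Spearman's rho = {rho:.3f}")
```

A value near 1 means the labelless score orders the embedders almost the same way the supervised benchmark does, which is the property a model-selection measure needs.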
Abstract
When task-specific labels are unavailable, selecting an embedding model for a target corpus is difficult. Existing labelless measures based on kernel estimators or Gaussian mixtures fail in high-dimensional spaces, producing unstable rankings. We propose Flow-based Labelless Representation Embedding Evaluation (FLARE), which uses normalizing flows to estimate information sufficiency directly from log-likelihoods, avoiding distance-based density estimation. We give a finite-sample bound showing that the estimation error depends on the intrinsic dimension of the data manifold rather than the raw embedding dimension. Across 11 datasets and 8 embedders, FLARE reaches a Spearman's ρ of 0.90 against supervised benchmarks and remains stable in high-dimensional embeddings ($d \geq 3{,}584$), where existing labelless baselines collapse.
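For readers unfamiliar with the mechanism, a normalizing flow is an invertible map $f_\theta$ that sends a data point $x$ to a latent point with a simple base density $p_Z$. The change-of-variables formula then gives an exact log-likelihood without any pairwise distances:

$$\log p_\theta(x) = \log p_Z\big(f_\theta(x)\big) + \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|$$

This summary does not state the exact estimator FLARE uses. One plausible form, given here only as an assumption, is an entropy gap between a marginal flow $p_\phi$ and a conditional flow $p_\theta$ fitted on paired embeddings $(u_i, v_i)$ of the same $n$ documents:

$$\widehat{I}_S(U \to V) = \frac{1}{n} \sum_{i=1}^{n} \Big[ \log p_\theta(v_i \mid u_i) - \log p_\phi(v_i) \Big]$$

The symbols $U$, $V$, $p_\theta$, $p_\phi$, and $\widehat{I}_S$ are illustrative notation and may differ from the paper's.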
