CLaRe: Compact near-lossless Latent Representations of High-Dimensional Object Data
Emma Zohner, Edward Gunning, Giles Hooker, Jeffrey Morris

TL;DR
CLaRe is a framework for evaluating high-dimensional data representations that emphasizes controlling worst-case errors over average errors, ensuring more reliable statistical analysis in latent spaces.
Contribution
The paper introduces CLaRe, a novel framework for assessing latent representations by focusing on worst-case errors, along with GLaRe, an open-source tool for its implementation.
Findings
CLaRe effectively balances compactness and information preservation.
Optimal representations vary across datasets, highlighting the need for flexible evaluation.
GLaRe provides graphical summaries of error distributions.
Abstract
Latent feature representation methods play an important role in the dimension reduction and statistical modeling of high-dimensional complex data objects. However, existing approaches to assess the quality of these methods often rely on aggregated statistics that reflect the central tendency of the distribution of information losses, such as average or total loss, which can mask variation across individual observations. We argue that controlling average performance is insufficient to guarantee that statistical analysis in the latent space reflects the data-generating process and instead advocate for controlling the worst-case generalization error, or a tail quantile of the generalization error distribution. Our framework, CLaRe (Compact near-lossless Latent Representations), introduces a systematic way to balance compactness of the representation with preservation of information when…
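The selection rule the abstract describes (pick the most compact representation whose worst-case or tail-quantile generalization error stays within a tolerance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the GLaRe implementation. PCA stands in for the latent representation, relative squared reconstruction error stands in for the information loss, and a single train/test split stands in for a fuller generalization-error estimate. The function names, the tolerance of 0.05, and the synthetic data are all hypothetical.

```python
import numpy as np

def per_observation_loss(X_train, X_test, k):
    """Relative squared reconstruction error of each held-out row
    under a rank-k PCA fitted on X_train (assumed loss, for illustration)."""
    mean = X_train.mean(axis=0)
    # Principal directions from the SVD of the centered training data.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    V = Vt[:k].T                          # (p, k) loading matrix
    centered = X_test - mean
    recon = centered @ V @ V.T            # project onto the k-dim subspace
    num = np.sum((centered - recon) ** 2, axis=1)
    den = np.sum(centered ** 2, axis=1)
    return num / den

def smallest_dimension(X_train, X_test, summary, tolerance=0.05, max_k=None):
    """Smallest k for which summary(held-out losses) <= tolerance, else None."""
    max_k = max_k or min(X_train.shape)
    for k in range(1, max_k + 1):
        losses = per_observation_loss(X_train, X_test, k)
        if summary(losses) <= tolerance:
            return k
    return None

# Synthetic data (hypothetical): 5 latent factors, 40 features, small noise.
rng = np.random.default_rng(0)
Z = rng.normal(size=(300, 5))
W = rng.normal(size=(5, 40))
X = Z @ W + 0.05 * rng.normal(size=(300, 40))
X_train, X_test = X[:200], X[200:]

# Average-loss criterion versus worst-case and tail-quantile criteria.
k_mean = smallest_dimension(X_train, X_test, summary=np.mean)
k_q95 = smallest_dimension(X_train, X_test, summary=lambda l: np.quantile(l, 0.95))
k_worst = smallest_dimension(X_train, X_test, summary=np.max)

print("dimension by mean loss:     ", k_mean)
print("dimension by 95% quantile:  ", k_q95)
print("dimension by worst-case loss:", k_worst)
```

Because the mean of the losses never exceeds their maximum, the dimension selected under the worst-case criterion is never smaller than the one selected under the average criterion. Any gap between the two indicates observations that an average-based summary would leave poorly represented.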
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Domain Adaptation and Few-Shot Learning
