Context Steering: A New Paradigm for Compression-based Embeddings by Synthesizing Relevant Information Features
Guillermo Sarasa, Ana Granados, Francisco de Borja Rodríguez

TL;DR
This paper introduces 'context steering', a novel supervised method to generate task-specific embeddings from compression-based dissimilarities, improving clustering and classification across diverse datasets.
Contribution
It presents a new methodology that actively guides feature shaping in compression-based embeddings, enabling task-oriented representations from data redundancies.
Findings
The steered embeddings are robust and improve classification accuracy.
Cluster quality improves across heterogeneous datasets.
The method is effective on both text and audio data.
Abstract
Compression-based dissimilarities (CD) offer a flexible and domain-agnostic means of measuring similarity by identifying implicit information through redundancies between data objects. However, because the similarity features are derived from the data rather than defined as an input, they often prove difficult to align with the task at hand, particularly in complex clustering or classification settings. To address this issue, we introduce "context steering", a novel methodology that actively guides the feature-shaping process. Instead of passively accepting the emergent data structure (typically a hierarchy derived from clustering CDs), our approach "steers" the process by systematically analyzing how each object influences the relational context within a clustering framework. This process generates a custom-tailored embedding that isolates and amplifies class-distinctive information. We…
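To make the idea of a compression-based dissimilarity concrete, here is a minimal sketch of one widely used instance, the normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length. The excerpt above does not say which CD or which compressor the authors use, so the choice of NCD, the use of zlib, and the helper names `compressed_size`, `ncd`, and `dissimilarity_matrix` are all assumptions made for illustration. This is not the context steering method itself, only the kind of pairwise dissimilarity it would start from.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the objects share most of their redundancy;
    values near 1 suggest they share almost none.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def dissimilarity_matrix(objects: list[bytes]) -> list[list[float]]:
    """Pairwise NCD matrix, the usual input to a hierarchical clustering step."""
    n = len(objects)
    return [[ncd(objects[i], objects[j]) for j in range(n)] for i in range(n)]
```

A matrix built this way reflects whatever redundancies the compressor happens to find, which is the alignment problem the abstract describes: nothing in the computation knows which regularities matter for the task.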
