Clustering by Attention: Leveraging Prior Fitted Transformers for Data Partitioning

Ahmed Shokry; Ayman Khalafallah

arXiv:2507.20369·cs.LG·July 29, 2025

TL;DR

This paper introduces a clustering method that requires no parameter tuning: a pre-trained transformer uses a few pre-clustered samples to partition the full dataset in a single forward pass, with reported accuracy above state-of-the-art clustering techniques.

Contribution

The paper proposes a new clustering approach based on meta-learning with a pre-trained transformer, eliminating parameter tuning and improving accuracy with minimal pre-clustered samples.

Findings

01. Outperforms state-of-the-art clustering methods.

02. Works effectively with few pre-clustered samples.

03. Scales well to large datasets.

Abstract

Clustering is a core task in machine learning with wide-ranging applications in data mining and pattern recognition. However, its unsupervised nature makes it inherently challenging. Many existing clustering algorithms suffer from critical limitations: they often require careful parameter tuning, exhibit high computational complexity, lack interpretability, or yield suboptimal accuracy, especially when applied to large-scale datasets. In this paper, we introduce a novel clustering approach based on meta-learning. Our approach eliminates the need for parameter optimization while achieving accuracy that outperforms state-of-the-art clustering techniques. The proposed technique leverages a few pre-clustered samples to guide the clustering process for the entire dataset in a single forward pass. Specifically, we employ a pre-trained Prior-Data Fitted Transformer Network (PFN) to perform…
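
The idea of letting a few pre-clustered samples guide the assignment of every remaining point can be illustrated with a toy sketch. The code below is not the paper's PFN and makes no claim about its architecture: it swaps in a single hand-written softmax attention step over the labeled samples, using negative squared Euclidean distance as the attention score. The function name `attention_assign`, the `temperature` parameter, and the sample data are all hypothetical.

```python
import math


def attention_assign(support_x, support_y, query_x, temperature=1.0):
    """Assign each query point to a cluster via softmax attention over
    a small set of pre-clustered (support) points.

    Stand-in for illustration only; a trained PFN would learn its own
    attention scores instead of using a fixed distance.
    """
    clusters = sorted(set(support_y))
    assignments = []
    for q in query_x:
        # Attention logits: negative squared distance to each support point.
        logits = [
            -sum((a - b) ** 2 for a, b in zip(q, s)) / temperature
            for s in support_x
        ]
        # Numerically stable softmax.
        peak = max(logits)
        weights = [math.exp(l - peak) for l in logits]
        total = sum(weights)
        # Sum the attention mass that lands on each cluster label.
        mass = {c: 0.0 for c in clusters}
        for w, y in zip(weights, support_y):
            mass[y] += w / total
        assignments.append(max(clusters, key=lambda c: mass[c]))
    return assignments


# Hypothetical data: four pre-clustered samples, three unlabeled points.
support_x = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.8)]
support_y = [0, 0, 1, 1]
query_x = [(0.1, -0.1), (4.9, 5.2), (0.3, 0.3)]

labels = attention_assign(support_x, support_y, query_x)
print(labels)
```

Each query is labeled independently from the support set in one pass, with no iterative refinement and no cluster-count parameter beyond what the pre-clustered samples already imply. That is the property the abstract highlights, shown here in its simplest possible form.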
