Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMs

Sungjae Lee; Hoyoung Kim; Jeongyeon Hwang; Eunhyeok Park; Jungseul Ok

arXiv:2506.00344 · cs.CL · June 3, 2025

TL;DR

This paper introduces Latent Semantic Clustering (LSC), a lightweight method that uses internal hidden states of LLMs to efficiently cluster outputs semantically, reducing computational overhead in test-time scaling.

Contribution

The paper presents LSC, a clustering method based on the generator LLM's internal hidden states. It improves efficiency and context-awareness without relying on external models.

Findings

01. LSC reduces computational costs significantly.

02. LSC maintains or improves clustering accuracy.

03. LSC generalizes across various LLMs and datasets.

Abstract

Scaling test-time computation--generating and analyzing multiple or sequential outputs for a single input--has become a promising strategy for improving the reliability and quality of large language models (LLMs), as evidenced by advances in uncertainty quantification and multi-step reasoning. A key shared component is semantic clustering, which groups outputs that differ in form but convey the same meaning. Semantic clustering enables estimation of the distribution over the semantics of outputs and helps avoid redundant exploration of reasoning paths. However, existing approaches typically rely on external models, which introduce substantial computational overhead and often fail to capture context-aware semantics. We propose Latent Semantic Clustering (LSC), a lightweight and context-sensitive method that leverages the generator LLM's internal hidden states for clustering, eliminating…
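To make the idea of semantic clustering concrete, here is a minimal sketch, not the paper's algorithm, which this excerpt does not spell out. It assumes each sampled output has already been reduced to one vector (in LSC these would come from the generator LLM's hidden states; the toy vectors below are hypothetical stand-ins). Outputs are grouped by a simple greedy cosine-similarity rule, and the entropy of the resulting cluster distribution is then computed as one possible uncertainty signal. The function names and the 0.9 threshold are illustrative assumptions.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_by_similarity(embeddings, threshold=0.9):
    """Greedy clustering (an assumed stand-in, not the paper's procedure).

    Each vector joins the first cluster whose representative (its first
    member) has cosine similarity >= threshold; otherwise it opens a new
    cluster. Returns one integer cluster label per input vector.
    """
    representatives = []
    labels = []
    for vec in embeddings:
        for cluster_id, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing cluster was close enough: start a new one.
            representatives.append(vec)
            labels.append(len(representatives) - 1)
    return labels


def cluster_entropy(labels):
    """Shannon entropy (in nats) of the empirical distribution over clusters."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical per-output vectors: two outputs share one meaning,
# two share another.
embeddings = [
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.1],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
]
labels = cluster_by_similarity(embeddings, threshold=0.9)
entropy = cluster_entropy(labels)
```

With these toy vectors the four outputs fall into two equally sized clusters, so the entropy equals ln 2. Low entropy would indicate that the sampled outputs mostly agree in meaning; high entropy would indicate that they spread across many distinct meanings.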

Taxonomy

Topics: Natural Language Processing Techniques · Software Testing and Debugging Techniques · Topic Modeling