Representation Consistency for Accurate and Coherent LLM Answer Aggregation
Junqi Jiang, Tom Bewley, Salim I. Amoukou, Francesco Leofante, Antonio Rago, Saumitra Mishra, Francesca Toni

TL;DR
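This paper introduces representation consistency (RC), a test-time scaling method that aggregates answers from multiple candidate LLM responses by weighing both how often each answer occurs and how consistent the model's internal activations are across the responses that lead to it. It requires no extra model queries and yields more accurate and coherent answers.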
Contribution
The paper proposes representation consistency (RC), an answer-aggregation method that uses the consistency of internal activations, either dense (raw) or sparse (encoded by pretrained sparse autoencoders), to improve LLM inference accuracy without additional queries.
Findings
Up to 4% accuracy improvement over baselines
Representation consistency correlates with coherent reasoning
Effective across multiple LLMs and datasets
Abstract
Test-time scaling improves large language models' (LLMs) performance by allocating more compute budget during inference. To achieve this, existing methods often require intricate modifications to prompting and sampling strategies. In this work, we introduce representation consistency (RC), a test-time scaling method for aggregating answers drawn from multiple candidate responses of an LLM regardless of how they were generated, including variations in prompt phrasing and sampling strategy. RC enhances answer aggregation by not only considering the number of occurrences of each answer in the candidate response set, but also the consistency of the model's internal activations while generating the set of responses leading to each answer. These activations can be either dense (raw model activations) or sparse (encoded via pretrained sparse autoencoders). Our rationale is that if the model's…
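The aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact formulation, which the truncated abstract above does not specify. It assumes one activation vector per candidate response, measures consistency as the mean pairwise cosine similarity within each answer group, and combines it with answer frequency through a hypothetical `weight` parameter. The function names `mean_pairwise_cosine` and `aggregate` are illustrative only.

```python
from collections import defaultdict

import numpy as np


def mean_pairwise_cosine(vectors: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of rows."""
    n = vectors.shape[0]
    if n < 2:
        # Assumption: a single response carries no consistency evidence.
        return 0.0
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    sims = unit @ unit.T
    # Sum the off-diagonal entries and divide by the number of ordered pairs.
    return float((sims.sum() - np.trace(sims)) / (n * (n - 1)))


def aggregate(answers, activations, weight=0.5):
    """Pick the answer with the best blend of frequency and consistency.

    answers:     one final answer per candidate response
    activations: one activation vector per candidate response
    weight:      hypothetical mixing coefficient in [0, 1];
                 weight=0 reduces to plain majority voting
    """
    groups = defaultdict(list)
    for ans, act in zip(answers, activations):
        groups[ans].append(np.asarray(act, dtype=float))

    total = len(answers)
    scores = {}
    for ans, acts in groups.items():
        frequency = len(acts) / total
        consistency = mean_pairwise_cosine(np.stack(acts))
        scores[ans] = (1 - weight) * frequency + weight * consistency

    best = max(scores, key=scores.get)
    return best, scores


# Toy example: "B" wins the vote count, but its activations disagree.
answers = ["A", "A", "B", "B", "B"]
activations = [[1, 0], [1, 0], [1, 0], [0, 1], [-1, 0]]
best, scores = aggregate(answers, activations, weight=0.5)
```

In this toy setup the two responses that answer "A" share identical activations while the three that answer "B" point in different directions, so the consistency term outweighs the raw vote count. Setting `weight=0.0` recovers majority voting and selects "B" instead. Sparse activations, such as sparse autoencoder encodings, could be passed in the same way as dense ones.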
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
