Refine Thought: A Test-Time Inference Method for Embedding Model Reasoning
Guangzhi Wang, Kai Li, Yinghao Jiao, Zhi Liu

TL;DR
RT is a test-time inference method that refines text embeddings through multiple passes, significantly improving semantic reasoning in embedding models without sacrificing general understanding capabilities.
Contribution
Introduces RT, a novel test-time inference approach that enhances semantic reasoning in text embedding models by iterative refinement during inference.
Findings
RT improves semantic reasoning performance on BRIGHT and PJBenchmark.
RT maintains consistent performance on general semantic understanding tasks.
RT activates semantic reasoning abilities learned during pretraining.
Abstract
We propose RT (Refine Thought), a method that can enhance the semantic reasoning ability of text embedding models. The method obtains the final semantic representation by running multiple forward passes of the text embedding model. Experiments show that RT achieves significant improvements on semantic reasoning tasks in BRIGHT and the person-job matching benchmark PJBenchmark, while maintaining consistent performance on general-purpose semantic understanding tasks such as C-MTEB. Our results indicate that RT is effective because it further activates the semantic reasoning ability learned during pretraining by decoder-only text embedding models (e.g., Qwen3-Embedding-8B). RT can be seen as a test-time inference method.
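The abstract says only that the final representation comes from multiple forward passes; it does not say how one pass feeds the next. The sketch below is a minimal illustration under one assumed scheme: after each pass, the resulting embedding is appended to the input sequence and the sequence is encoded again. The names `toy_encode` and `refine_embedding`, the feedback rule, and the pure-Python stand-in encoder are all hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, List

Vector = List[float]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def toy_encode(inputs: List[Vector]) -> Vector:
    """Hypothetical stand-in for an embedding model (NOT a real model).

    Mean-pools the input vectors, mixes neighbouring dimensions through
    tanh, and L2-normalizes the result.
    """
    dim = len(inputs[0])
    pooled = [sum(vec[i] for vec in inputs) / len(inputs) for i in range(dim)]
    mixed = [math.tanh(pooled[i] + 0.5 * pooled[(i + 1) % dim]) for i in range(dim)]
    return l2_normalize(mixed)


def refine_embedding(
    inputs: List[Vector],
    encode: Callable[[List[Vector]], Vector],
    num_passes: int = 3,
) -> Vector:
    """Run `encode` several times, feeding the latest embedding back in.

    Assumed scheme: each extra pass re-encodes the original inputs with
    the previous embedding appended as one more input vector.
    """
    if num_passes < 1:
        raise ValueError("num_passes must be at least 1")
    embedding = encode(inputs)
    for _ in range(num_passes - 1):
        embedding = encode(inputs + [embedding])
    return embedding


# Toy "token" vectors standing in for an encoded input text.
token_vectors = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.5, 0.1, 0.1]]
single = refine_embedding(token_vectors, toy_encode, num_passes=1)
refined = refine_embedding(token_vectors, toy_encode, num_passes=3)
```

With `num_passes=1` the function reduces to an ordinary single-pass embedding, so the number of passes acts as a test-time knob that trades extra compute for a refined representation without changing any model weights.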
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare
