TL;DR
Batch-of-Thought (BoT) is a cross-instance learning method that jointly processes related queries to improve reasoning accuracy, consistency, and efficiency in large language models without additional training.
Contribution
We introduce BoT, a training-free approach enabling cross-instance reasoning, error detection, and cost reduction, demonstrated within a multi-agent reflection architecture across multiple benchmarks.
Findings
BoT-R, the multi-agent reflection instantiation of BoT, improves accuracy across three model families and six benchmarks.
BoT-R reduces inference costs by up to 61%.
BoT-R improves confidence calibration and reasoning quality.
Abstract
Current Large Language Model reasoning systems process queries independently, discarding valuable cross-instance signals such as shared reasoning patterns and consistency constraints. We introduce Batch-of-Thought (BoT), a training-free method that processes related queries jointly to enable cross-instance learning. By performing comparative analysis across batches, BoT identifies high-quality reasoning templates, detects errors through consistency checks, and amortizes computational costs. We instantiate BoT within a multi-agent reflection architecture (BoT-R), where a Reflector performs joint evaluation to unlock mutual information gain unavailable in isolated processing. Experiments across three model families and six benchmarks demonstrate that BoT-R consistently improves accuracy and confidence calibration while reducing inference costs by up to 61%. Our theoretical and…
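To make the idea of joint evaluation concrete, here is a minimal sketch of batching drafts and running one reflection pass per batch. It is not the paper's implementation: `Draft`, `make_batches`, `reflect_on_batch`, and `reflector_calls` are hypothetical names, and the majority-template check is a toy stand-in for the Reflector, which in the described architecture would presumably be an LLM call that sees every draft in the batch at once.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Draft:
    query: str
    answer: str
    template: str  # hypothetical label for the reasoning pattern a draft used


def make_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def reflect_on_batch(batch):
    """Toy stand-in for a joint Reflector pass (assumption, not the paper's method).

    Finds the most common reasoning template in the batch and flags the
    queries whose drafts deviate from it as candidates for revision.
    """
    majority_template, _ = Counter(d.template for d in batch).most_common(1)[0]
    flagged = [d.query for d in batch if d.template != majority_template]
    return majority_template, flagged


def reflector_calls(num_queries, batch_size):
    """Reflection passes needed when each pass covers one whole batch."""
    return -(-num_queries // batch_size)  # ceiling division


drafts = [
    Draft("q1", "4", "unit-conversion"),
    Draft("q2", "10", "unit-conversion"),
    Draft("q3", "7", "guess"),
    Draft("q4", "12", "unit-conversion"),
]
batches = make_batches(drafts, batch_size=4)
template, flagged = reflect_on_batch(batches[0])
```

In this toy run the batch shares one dominant template and only `q3` is flagged for a second look. The call count also shows where amortization could come from: per-query reflection over 8 queries would take 8 passes, while `reflector_calls(8, 4)` gives 2. Actual savings depend on prompt length and model pricing, which this sketch does not model.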
