ToW: Thoughts of Words Improve Reasoning in Large Language Models

Zhikun Xu; Ming Shen; Jacob Dineen; Zhaonan Li; Xiao Ye; Shijie Lu; Aswin RRV; Chitta Baral; Ben Zhou

arXiv:2410.16235 · cs.CL · January 31, 2025

TL;DR

This paper presents ToW, a training-time data-augmentation method for next-word prediction that injects fine-grained thoughts into pre-training text, improving large language models' reasoning performance and reducing hallucination.

Contribution

The paper introduces ToW, a training approach that injects thoughts of words into pre-training text to improve reasoning and reduce hallucination in language models. As a first step, the ToW annotations are acquired by distillation from larger models.

Findings

01. Improves reasoning performance by 7% to 9% on average after continual pre-training with only 70K ToW annotations.

02. Reduces model hallucination by up to 10%.

03. Is task-agnostic and introduces no additional biases.

Abstract

We introduce thoughts of words (ToW), a novel training-time data-augmentation method for next-word prediction. ToW views next-word prediction as a core reasoning task and injects fine-grained thoughts explaining what the next word should be and how it is related to the previous contexts in pre-training texts. Our formulation addresses two fundamental drawbacks of existing next-word prediction learning schemes: they induce factual hallucination and are inefficient for models to learn the implicit reasoning processes in raw texts. While there are many ways to acquire such thoughts of words, we explore the first step of acquiring ToW annotations through distilling from larger models. After continual pre-training with only 70K ToW annotations, we effectively improve models' reasoning performances by 7% to 9% on average and reduce model hallucination by up to 10%. At the same time, ToW is…
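The injection step described in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the `Thought` record, the `inject_thoughts` helper, the `<ToW>` / `</ToW>` delimiters, and the choice to place each thought directly before the word it explains. In the paper's setting the thought text would come from a larger model; here it is written by hand.

```python
from dataclasses import dataclass


@dataclass
class Thought:
    """Hypothetical record: one thought attached to one word position."""
    position: int  # index of the word that the thought explains
    text: str      # explanation of why that word should come next


def inject_thoughts(words, thoughts, open_tag="<ToW>", close_tag="</ToW>"):
    """Return text with each thought inserted right before its target word.

    The tag names and the placement are assumptions made for this sketch.
    Thoughts whose position is outside the word list are ignored.
    """
    by_position = {t.position: t.text for t in thoughts}
    pieces = []
    for index, word in enumerate(words):
        if index in by_position:
            pieces.append(f"{open_tag} {by_position[index]} {close_tag}")
        pieces.append(word)
    return " ".join(pieces)


# Hand-written example; a real pipeline would obtain the thought from a larger model.
words = "Two plus three equals five".split()
thoughts = [
    Thought(4, "The context adds two and three, so the next word should be their sum.")
]
augmented = inject_thoughts(words, thoughts)
print(augmented)
```

A model trained on such augmented text sees the explanation before it has to predict the word, which is the intuition behind treating next-word prediction as a reasoning task. Which words receive a thought and how the thought is phrased are left open here.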

Code & Models

Repositories

ARC-ASU/fine-nwp
PyTorch · Official

Videos

ToW: Thoughts of Words Improve Reasoning in Large Language Models · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies