Advancing Process Verification for Large Language Models via Tree-Based Preference Learning

Mingqian He; Yongliang Shen; Wenqi Zhang; Zeqi Tan; Weiming Lu

arXiv:2407.00390·cs.CL·July 2, 2024

TL;DR

This paper introduces Tree-PLV, a novel preference learning verifier that constructs reasoning trees to more effectively evaluate and improve large language models' reasoning accuracy, significantly outperforming existing methods.

Contribution

The paper proposes Tree-PLV, a new tree-based preference learning approach that captures nuanced reasoning step preferences, enhancing LLM verification and reasoning performance.

Findings

1. Tree-PLV outperforms baseline methods on multiple reasoning benchmarks.

2. Step-level preference learning improves evaluation accuracy.

3. Significant performance gains on GSM8K, MATH, CSQA, and StrategyQA.

Abstract

Large Language Models (LLMs) have demonstrated remarkable potential in handling complex reasoning tasks by generating step-by-step rationales. Some methods have proven effective in boosting accuracy by introducing extra verifiers to assess these paths. However, existing verifiers, typically trained on binary-labeled reasoning paths, fail to fully utilize the relative merits of intermediate steps, thereby limiting the effectiveness of the feedback provided. To overcome this limitation, we propose Tree-based Preference Learning Verifier (Tree-PLV), a novel approach that constructs reasoning trees via a best-first search algorithm and collects step-level paired data for preference training. Compared to traditional binary classification, step-level preferences more finely capture the nuances between reasoning steps, allowing for a more precise evaluation of the complete reasoning path. We…
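The abstract's two ingredients, best-first tree construction and step-level pair collection, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `collect_step_pairs`, `propose`, `score`, `max_depth`, and `margin` are hypothetical, the rule of pairing the best sibling with each clearly worse sibling is an assumption, and the Bradley-Terry style ranking loss is a common choice for preference training that the abstract does not confirm.

```python
import heapq
import math
from typing import Callable, List, Optional, Tuple

Path = Tuple[str, ...]          # a partial reasoning path: a tuple of step strings
StepPair = Tuple[Path, str, str]  # (shared prefix, preferred step, other step)


def collect_step_pairs(
    propose: Callable[[Path], List[str]],  # hypothetical: candidate next steps
    score: Callable[[Path], float],        # hypothetical: quality of a partial path
    max_depth: int,
    margin: float = 0.0,
) -> Tuple[Optional[Path], List[StepPair]]:
    """Best-first expansion of a reasoning tree that records sibling preferences.

    Returns the first path of length max_depth popped from the frontier
    (or None if the frontier empties first) and the collected step pairs.
    """
    pairs: List[StepPair] = []
    tie = 0  # tie-breaker so heapq never has to compare paths
    frontier = [(0.0, tie, ())]  # entries are (negated score, tie, path)
    while frontier:
        _, _, prefix = heapq.heappop(frontier)  # highest-scoring node so far
        if len(prefix) == max_depth:
            return prefix, pairs
        children = sorted(
            ((score(prefix + (step,)), step) for step in propose(prefix)),
            reverse=True,
        )
        if children:
            best_score, best_step = children[0]
            # Assumption: pair the best sibling with each sibling that
            # trails it by at least `margin`.
            for other_score, other_step in children[1:]:
                if best_score - other_score >= margin:
                    pairs.append((prefix, best_step, other_step))
        for child_score, step in children:
            tie += 1
            heapq.heappush(frontier, (-child_score, tie, prefix + (step,)))
    return None, pairs


def pairwise_ranking_loss(reward_preferred: float, reward_other: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(r_preferred - r_other))."""
    return math.log1p(math.exp(-(reward_preferred - reward_other)))
```

In a real setting `propose` would sample next steps from an LLM and `score` would estimate how promising a partial path is; a verifier would then be trained so that, for each collected pair, the preferred step receives the higher reward.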

Taxonomy

Topics: Semantic Web and Ontologies · Business Process Modeling and Analysis · Natural Language Processing Techniques