Abstract2Appendix: Academic Reviews Enhance LLM Long-Context Capabilities

Shengzhi Li; Kittipat Kampa; Rongyu Lin; Bohang Li; Shichao Pei

arXiv:2411.05232·cs.CL·November 11, 2024

TL;DR

This paper demonstrates that fine-tuning large language models with high-quality academic peer review data using DPO significantly improves their long-context reading abilities, outperforming other methods and emphasizing the value of human reviews.

Contribution

It introduces the use of high-quality academic reviews and DPO for fine-tuning LLMs, showing superior performance in long-context tasks and benchmark results.

Findings

01 DPO outperforms SFT in data efficiency and effectiveness.

02 Fine-tuning with 2000 samples yields notable improvements.

03 High-quality human reviews are preferred over LLM responses even for advanced models.

Abstract

Large language models (LLMs) have shown remarkable performance across various tasks, yet their ability to handle long-context reading remains challenging. This study explores the effectiveness of leveraging high-quality academic peer review data for fine-tuning LLMs to enhance their long-context capabilities. We compare the Direct Preference Optimization (DPO) method with the Supervised Fine-Tuning (SFT) method, demonstrating DPO's superiority and data efficiency. Our experiments show that the fine-tuned model achieves a 4.04-point improvement over phi-3 and a 2.6% increase on the Qasper benchmark using only 2000 samples. Despite facing limitations in data scale and processing costs, this study underscores the potential of DPO and high-quality data in advancing LLM performance. Additionally, the zero-shot benchmark results indicate that aggregated high-quality human reviews are…
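As a rough illustration of the DPO versus SFT comparison described in the abstract, the sketch below computes the standard DPO loss for a single preference pair next to a length-normalized SFT loss. It is a minimal sketch under stated assumptions, not the authors' training code: the function names, the default `beta = 0.1`, and the framing of the "chosen" response as a human review and the "rejected" response as a model-generated one are all illustrative choices.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    trainable policy and the frozen reference model. `beta` is an assumed
    value, not one taken from the paper.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def sft_loss(policy_chosen_logp, num_tokens):
    """Length-normalized negative log-likelihood of the chosen response."""
    return -policy_chosen_logp / num_tokens


# Hypothetical log-probabilities, for illustration only.
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)
```

When the policy and reference assign identical log-probabilities, the DPO loss equals log 2; it drops below that as the policy raises the chosen response relative to the rejected one. SFT, by contrast, only uses the chosen response and never sees the rejected one.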

Code & Models

Repositories

findalexli/abstract2appendix
none · Official

Datasets

alexshengzhili/Abstract2Appendix_v1_10k
dataset · 82 downloads

Taxonomy

Topics: Biomedical Text Mining and Ontologies

Methods: Direct Preference Optimization