Learning to Reduce: Towards Improving Performance of Large Language Models on Structured Data

Younghun Lee; Sungchul Kim; Ryan A. Rossi; Tong Yu; Xiang Chen

arXiv:2407.02750·cs.CL·July 4, 2024·1 cites

Open Access

TL;DR

This paper introduces Learning to Reduce, a fine-tuning framework for LLMs that simplifies structured data inputs, improving their performance on table question answering and demonstrating strong generalizability across datasets.

Contribution

The paper presents a novel fine-tuning framework, Learning to Reduce, that enhances LLMs' ability to process and understand structured data by generating reduced representations.

Findings

1. Learning to Reduce improves LLM performance on structured data tasks.

2. The framework generalizes well across different datasets.

3. Inputs reduced by the fine-tuned model help LLMs perform better on table QA tasks, especially when the context is long.

Abstract

Large Language Models (LLMs) have been achieving competent performance on a wide range of downstream tasks, yet existing work shows that inference on structured data is challenging for LLMs. This is because LLMs need to either understand long structured data or select the most relevant evidence before inference, and neither approach is trivial. This paper proposes a framework, Learning to Reduce, that fine-tunes a language model with On-Policy Learning to generate a reduced version of the input structured data. When compared to state-of-the-art LLMs like GPT-4, Learning to Reduce not only achieves outstanding performance in reducing the input, but also shows generalizability on different datasets. We further show that the model fine-tuned with our framework helps LLMs perform better on table QA tasks, especially when the context is longer.
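To make the idea of a "reduced version" of a table concrete, here is a minimal sketch of the reduce-then-answer pipeline. It is not the paper's method: the abstract describes a language model fine-tuned with On-Policy Learning, whereas this example swaps in a plain token-overlap heuristic as a stand-in reducer. The names `reduce_table` and `serialize`, the table layout, and the sample data are all hypothetical.

```python
from typing import Dict, List

# Assumed layout: column name -> cell values, all columns the same length.
Table = Dict[str, List[str]]


def reduce_table(table: Table, question: str) -> Table:
    """Keep only rows and columns that share a token with the question.

    Hypothetical stand-in for a learned reducer: simple token overlap,
    not the fine-tuned model described in the abstract.
    """
    q_tokens = set(question.lower().replace("?", "").split())

    def matches(text: str) -> bool:
        return bool(q_tokens & set(text.lower().split()))

    n_rows = len(next(iter(table.values()))) if table else 0
    # A column is kept if its header or any of its cells overlaps the question.
    cols = [c for c in table if matches(c) or any(matches(v) for v in table[c])]
    # A row is kept if any of its cells overlaps the question.
    rows = [i for i in range(n_rows) if any(matches(table[c][i]) for c in table)]
    # Fall back to the full axis when nothing matched, so no evidence is dropped.
    cols = cols or list(table)
    rows = rows or list(range(n_rows))
    return {c: [table[c][i] for i in rows] for c in cols}


def serialize(table: Table) -> str:
    """Flatten a table into pipe-separated text suitable for an LLM prompt."""
    cols = list(table)
    n_rows = len(table[cols[0]]) if cols else 0
    lines = [" | ".join(cols)]
    lines += [" | ".join(table[c][i] for c in cols) for i in range(n_rows)]
    return "\n".join(lines)


# Hypothetical sample data for illustration only.
table = {
    "city": ["Paris", "Tokyo", "Lima"],
    "country": ["France", "Japan", "Peru"],
    "population": ["2.1M", "14.0M", "10.0M"],
}
question = "What is the population of Tokyo?"

reduced = reduce_table(table, question)
# The shorter, reduced table is what a downstream LLM would receive.
prompt = f"{serialize(reduced)}\n\nQuestion: {question}"
```

The reducer and the downstream question-answering model are separate steps here, which mirrors the abstract's framing: the reduced table shortens the context that the answering LLM has to read.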

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Byte Pair Encoding · Layer Normalization · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam