Data-efficient Fine-tuning for LLM-based Recommendation

Xinyu Lin; Wenjie Wang; Yongqi Li; Shuo Yang; Fuli Feng; Yinwei Wei; Tat-Seng Chua

arXiv:2401.17197 · cs.IR · June 5, 2024 · 5 cites

TL;DR

This paper introduces a data pruning method for LLM-based recommendation that identifies influential samples using influence and effort scores, enabling effective few-shot fine-tuning with significantly reduced data and costs.

Contribution

It proposes a novel data pruning approach utilizing influence and effort scores, improving efficiency and effectiveness of LLM fine-tuning in recommendation systems.

Findings

01 Uses only 2% of the data to outperform full-data fine-tuning

02 Reduces fine-tuning time costs by 97%

03 Validates effectiveness on three real-world datasets

Abstract

Leveraging Large Language Models (LLMs) for recommendation has recently garnered considerable attention, where fine-tuning plays a key role in LLMs' adaptation. However, the cost of fine-tuning LLMs on rapidly expanding recommendation data limits their practical application. To address this challenge, few-shot fine-tuning offers a promising approach to quickly adapt LLMs to new recommendation data. We propose the task of data pruning for efficient LLM-based recommendation, aimed at identifying representative samples tailored for LLMs' few-shot fine-tuning. While coreset selection is closely related to the proposed task, existing coreset selection methods often rely on suboptimal heuristic metrics or entail costly optimization on large-scale recommendation data. To tackle these issues, we introduce two objectives for the data pruning task in the context of LLM-based recommendation: 1)…
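
As a rough illustration of score-based data pruning, the sketch below ranks training samples by a combined score and keeps only a small fraction for few-shot fine-tuning. It assumes that per-sample influence and effort scores have already been computed; how the paper computes them is not shown here. The function name `prune_by_score`, the weight `lam`, and the linear combination of the two scores are hypothetical choices made for illustration, not the authors' confirmed formulation.

```python
from typing import List, Sequence


def prune_by_score(
    influence: Sequence[float],
    effort: Sequence[float],
    keep_ratio: float = 0.02,
    lam: float = 1.0,
) -> List[int]:
    """Return indices of the highest-scoring samples, best first.

    ASSUMPTION: scores are combined as influence + lam * effort.
    This weighting is illustrative only.
    """
    if len(influence) != len(effort):
        raise ValueError("influence and effort must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if len(influence) == 0:
        return []

    # Hypothetical combined score per sample.
    combined = [inf + lam * eff for inf, eff in zip(influence, effort)]

    # Keep at least one sample, otherwise the requested fraction.
    k = max(1, round(len(combined) * keep_ratio))

    # Rank sample indices by combined score, highest first.
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy scores for four samples; keep half of them.
    selected = prune_by_score(
        [0.1, 0.9, 0.5, 0.3], [0.2, 0.1, 0.8, 0.0], keep_ratio=0.5
    )
    print(selected)
```

With `keep_ratio=0.02`, a pool of 100 samples would be reduced to 2, which mirrors the 2% budget reported in the findings. A plain top-k cut like this one ignores coverage of the data distribution, so a real pipeline might add a diversity or stratification step on top.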

Code & Models

Repositories

linxyhaha/dealrec (PyTorch, official)

Taxonomy

Topics: Web Data Mining and Analysis · Topic Modeling · Power Systems and Technologies

Methods: Pruning