UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering

Langming Liu; Shilei Liu; Yujin Yuan; Yizhen Zhang; Bencheng Yan; Zhiyuan Zeng; Zihao Wang; Jiaqi Liu; Di Wang; Wenbo Su; Pengjie Wang; Jian Xu; Bo Zheng

arXiv:2502.19178·cs.IR·April 2, 2025

TL;DR

UQABench is a comprehensive benchmark designed to evaluate the effectiveness of user embeddings in prompting large language models for personalized question answering, addressing challenges of noise and length in user interaction data.

Contribution

This paper introduces UQABench, a standardized evaluation framework for assessing user embeddings in LLM prompting, including diverse tasks and analysis of scaling laws.

Findings

01 User embeddings can effectively improve personalization in LLMs.

02 Scaling laws reveal how embedding size impacts performance.

03 The benchmark facilitates fair comparison of different user embedding methods.

Abstract

Large language models (LLMs) achieve remarkable success in natural language processing (NLP). In practical scenarios like recommendations, as users increasingly seek personalized experiences, it becomes crucial to incorporate user interaction history into the context of LLMs to enhance personalization. However, from a practical utility perspective, user interactions' extensive length and noise present challenges when used directly as text prompts. A promising solution is to compress and distill interactions into compact embeddings, serving as soft prompts to assist LLMs in generating personalized responses. Although this approach brings efficiency, a critical concern emerges: Can user embeddings adequately capture valuable information and prompt LLMs? To address this concern, we propose UQABench, a benchmark designed to evaluate the effectiveness of user embeddings in prompting LLMs for…
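
The soft-prompt idea in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the function names, the chunked mean-pooling (a stand-in for a learned user encoder), the projection matrix `weight`, and all dimensions are assumptions made purely for illustration. It only shows the data flow: a long interaction history is compressed into a few vectors, mapped into the LLM's embedding space, and prepended to the question's token embeddings.

```python
import random


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def compress_history(item_embeddings, num_soft_tokens):
    """Split the history into contiguous chunks and mean-pool each chunk.

    Hypothetical stand-in for a trained user encoder.
    """
    n = len(item_embeddings)
    k = min(num_soft_tokens, n)
    # Integer chunk boundaries; every chunk is non-empty because k <= n.
    bounds = [j * n // k for j in range(k + 1)]
    return [mean_pool(item_embeddings[bounds[j]:bounds[j + 1]]) for j in range(k)]


def project(vector, weight):
    """Linear map from user-embedding space to the LLM embedding space."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]


def build_llm_input(item_embeddings, question_token_embeddings, weight,
                    num_soft_tokens=4):
    """Prepend projected user embeddings (soft prompts) to the question tokens."""
    user_vectors = compress_history(item_embeddings, num_soft_tokens)
    soft_prompts = [project(v, weight) for v in user_vectors]
    return soft_prompts + question_token_embeddings


# Toy usage with random numbers standing in for real embeddings.
rng = random.Random(0)
history = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]    # 50 interactions, dim 8
question = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(5)]   # 5 tokens, LLM dim 16
weight = [[rng.gauss(0, 0.1) for _ in range(8)] for _ in range(16)]   # 16 x 8 projection

llm_input = build_llm_input(history, question, weight)
print(len(llm_input), len(llm_input[0]))
```

In this toy setup, 50 interactions occupy only 4 positions in the LLM input instead of a long text prompt, which is the efficiency argument the abstract makes; whether such compressed vectors retain enough information is the question the benchmark is meant to evaluate.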

Code & Models

Repositories

OpenStellarTeam/UQABench
PyTorch · Official

Datasets

OpenStellarTeam/UQABench
dataset · 15 dl

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Expert finding and Q&A systems