Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis
Jisoo Mok, Ik-hwan Kim, Sangkwon Park, Sungroh Yoon

TL;DR
This paper introduces HiCUPID, a benchmark comprising a conversational dataset and an automated evaluation model, to advance research on personalized responses from Large Language Models and to fill a gap in open-source resources.
Contribution
The paper presents HiCUPID, an open-source dataset and evaluation framework specifically designed for assessing LLMs' personalization capabilities.
Findings
HiCUPID enables more accurate assessment of personalized responses.
The dataset facilitates benchmarking of LLMs in personalized AI tasks.
The evaluation model closely aligns with human preferences.
Abstract
Personalized AI assistants, a hallmark of the human-like capabilities of Large Language Models (LLMs), are a challenging application that intertwines multiple problems in LLM research. Despite the growing interest in the development of personalized assistants, the lack of an open-source conversational dataset tailored for personalization remains a significant obstacle for researchers in the field. To address this research gap, we introduce HiCUPID, a new benchmark to probe and unleash the potential of LLMs to deliver personalized responses. Alongside a conversational dataset, HiCUPID provides a Llama-3.2-based automated evaluation model whose assessment closely mirrors human preferences. We release our dataset, evaluation model, and code at https://github.com/12kimih/HiCUPID.
Taxonomy
Topics: Law, AI, and Intellectual Property · Artificial Intelligence in Law · Legal Education and Practice Innovations
