HELM: A Human-Centered Evaluation Framework for LLM-Powered Recommender Systems

Sushant Mehta

arXiv:2601.19197·cs.IR·January 28, 2026

TL;DR

This paper introduces HELM, a comprehensive human-centered evaluation framework for LLM-powered recommender systems, assessing multiple qualitative dimensions beyond traditional accuracy metrics to better capture user experience.

Contribution

The paper presents HELM, a novel evaluation framework that systematically measures human-centered qualities of LLM-based recommenders across five key dimensions.

Findings

01. GPT-4 has the highest explanation quality and interaction naturalness.

02. GPT-4 shows significant popularity bias compared to traditional methods.

03. HELM reveals critical quality aspects invisible to traditional metrics.

Abstract

The integration of Large Language Models (LLMs) into recommendation systems has introduced unprecedented capabilities for natural language understanding, explanation generation, and conversational interactions. However, existing evaluation methodologies focus predominantly on traditional accuracy metrics, failing to capture the multifaceted human-centered qualities that determine the real-world user experience. We introduce HELM (Human-centered Evaluation for LLM-powered recoMmenders), a comprehensive evaluation framework that systematically assesses LLM-powered recommender systems across five human-centered dimensions: Intent Alignment, Explanation Quality, Interaction Naturalness, Trust & Transparency, and Fairness & Diversity. Through extensive experiments involving three state-of-the-art…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Topic Modeling