Hierarchical Conversational Preference Elicitation with Bandit Feedback

Jinhang Zuo; Songwen Hu; Tong Yu; Shuai Li; Handong Zhao; Carlee Joe-Wong

arXiv:2209.06129·cs.IR·September 14, 2022

TL;DR

This paper introduces a hierarchical bandit framework for conversational recommender systems in which the system adaptively decides, at each round, between asking about a key-term and recommending an item, with the aim of improving user experience and reducing regret.

Contribution

It formulates a new hierarchical bandit problem, proposes two algorithms leveraging key-term and item relationships, and provides theoretical regret bounds and empirical validation.

Findings

01. Key-term rewards are influenced by representative item rewards.

02. Hier-UCB and Hier-LinUCB algorithms outperform baselines.

03. Regret bounds are improved by leveraging hierarchical structure.

Abstract

Recent advances in conversational recommendation provide a promising way to efficiently elicit users' preferences via conversational interactions. To achieve this, the recommender system conducts conversations with users, asking about their preferences for different items or item categories. Most existing conversational recommender systems for cold-start users utilize a multi-armed bandit framework to learn users' preferences in an online manner. However, they rely on a pre-defined conversation frequency for asking about item categories instead of individual items, which may incur excessive conversational interactions that hurt user experience. To enable more flexible questioning about key-terms, we formulate a new conversational bandit problem that allows the recommender system to choose either a key-term or an item to recommend at each round and explicitly models the rewards of these…
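To make the round-by-round choice between a key-term and an item concrete, here is a minimal sketch using the standard UCB1 index over a single pool of arms. It is not the paper's Hier-UCB or Hier-LinUCB: it ignores the relationship between key-term rewards and item rewards, and the arm names, the Bernoulli reward model, and the helper names `ucb_index`, `select_arm`, and `simulate` are all assumptions made for illustration.

```python
import math
import random


def ucb_index(total_reward, pulls, t):
    """UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return float("inf")  # force every arm to be tried once
    return total_reward / pulls + math.sqrt(2.0 * math.log(t) / pulls)


def select_arm(stats, t):
    """Return the arm (key-term or item) with the largest UCB1 index at round t.

    stats maps an arm name to [total_reward, pulls].
    """
    return max(stats, key=lambda arm: ucb_index(stats[arm][0], stats[arm][1], t))


def simulate(true_means, rounds, seed=0):
    """Run UCB1 against hypothetical Bernoulli arms and return per-arm stats."""
    rng = random.Random(seed)
    stats = {arm: [0.0, 0] for arm in true_means}
    for t in range(1, rounds + 1):
        arm = select_arm(stats, t)
        # Simulated user feedback: 1.0 with the arm's (assumed) mean reward.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        stats[arm][0] += reward
        stats[arm][1] += 1
    return stats


if __name__ == "__main__":
    # Hypothetical arms: one key-term and two items, with made-up mean rewards.
    arms = {"key-term:comedy": 0.6, "item:movie_a": 0.9, "item:movie_b": 0.2}
    for name, (total, pulls) in simulate(arms, rounds=200).items():
        print(name, pulls, total)
```

Because key-terms and items share one pool here, no conversation frequency is fixed in advance; how often a key-term is asked about is determined by the indices alone. Exploiting the hierarchy between key-terms and their items is what this sketch leaves out.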

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Smart Grid Energy Management