Multi-level Feedback Web Links Selection Problem: Learning and Optimization

Kechao Cai; Kun Chen; Longbo Huang; John C.S. Lui

arXiv:1709.02664·cs.LG·September 11, 2017

TL;DR

This paper models the web links selection problem as a constrained multi-armed bandit with multi-level feedback, proposing an algorithm that learns link structures to maximize revenue while satisfying feedback constraints.

Contribution

The paper introduces the first model of links selection with multi-level feedback, cast as a constrained bandit problem, and provides an algorithm with theoretical guarantees on regret and constraint violation.

Findings

1. The multi-level feedback structure of web links can be learned from real datasets.

2. The proposed LExp algorithm outperforms state-of-the-art bandit algorithms in links selection tasks.

3. The algorithm achieves sub-linear regret and constraint violation bounds.

Abstract

Selecting the right web links for a website is important because appropriate links not only can provide high attractiveness but can also increase the website's revenue. In this work, we first show that web links have an intrinsic multi-level feedback structure. For example, consider a 2-level feedback web link: the 1st-level feedback provides the Click-Through Rate (CTR) and the 2nd-level feedback provides the potential revenue, which collectively produce the compound 2-level revenue. We consider the context-free links selection problem of selecting links for a homepage so as to maximize the total compound 2-level revenue while keeping the total 1st-level feedback above a preset threshold. We further generalize the problem to links with an $n$-level ($n \geq 2$) feedback structure. The key challenge is that the links' multi-level feedback structures are unobservable unless…
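To make the 2-level objective in the abstract concrete, here is a minimal Python sketch of the offline version of the problem, in which every link's CTR and potential revenue are already known. It is not the paper's LExp algorithm, which must learn these quantities from feedback. Several things here are assumptions for illustration only: the compound 2-level revenue is taken to be the product of CTR and potential revenue, a fixed number of links (`num_links`) is selected, the function names are hypothetical, and all numeric values are made up.

```python
from itertools import combinations


def compound_revenue(ctr, revenue):
    # Assumption: compound 2-level revenue = CTR * potential revenue.
    return ctr * revenue


def best_subset(ctrs, revenues, num_links, ctr_threshold):
    """Brute-force search over all subsets of size num_links.

    Returns the subset (tuple of link indices) with the largest total
    compound revenue among those whose total CTR is at least
    ctr_threshold, or (None, -inf) if no subset is feasible.
    """
    best, best_value = None, float("-inf")
    for subset in combinations(range(len(ctrs)), num_links):
        total_ctr = sum(ctrs[i] for i in subset)
        if total_ctr < ctr_threshold:
            continue  # violates the 1st-level feedback constraint
        value = sum(compound_revenue(ctrs[i], revenues[i]) for i in subset)
        if value > best_value:
            best, best_value = subset, value
    return best, best_value


# Hypothetical links: (CTR, potential revenue) values are illustrative only.
ctrs = [0.5, 0.25, 0.125, 0.75]
revenues = [2.0, 8.0, 16.0, 1.0]

unconstrained = best_subset(ctrs, revenues, num_links=2, ctr_threshold=0.0)
constrained = best_subset(ctrs, revenues, num_links=2, ctr_threshold=1.0)
print("without threshold:", unconstrained)
print("with threshold 1.0:", constrained)
```

With these made-up numbers, the threshold changes which links are chosen: the pair with the highest compound revenue has a low total CTR, so the constrained search falls back to a pair that includes the high-CTR, low-revenue link. This tension between revenue and 1st-level feedback is what the constrained formulation captures.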

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Reinforcement Learning in Robotics