Multi-level Feedback Web Links Selection Problem: Learning and Optimization
Kechao Cai, Kun Chen, Longbo Huang, John C.S. Lui

TL;DR
This paper models the web links selection problem as a constrained multi-armed bandit with multi-level feedback, proposing an algorithm that learns link structures to maximize revenue while satisfying feedback constraints.
Contribution
It introduces the first model for links selection with multi-level feedback as a constrained bandit problem and provides an effective algorithm with theoretical guarantees.
Findings
The multi-level feedback structure of web links can be learned from real datasets.
The proposed LExp algorithm outperforms state-of-the-art bandit algorithms in links selection tasks.
The algorithm achieves sub-linear regret and constraint violation bounds.
Abstract
Selecting the right web links for a website is important because appropriate links not only can provide high attractiveness but can also increase the website's revenue. In this work, we first show that web links have an intrinsic multi-level feedback structure. For example, consider a 2-level feedback web link: the 1st level feedback provides the Click-Through Rate (CTR) and the 2nd level feedback provides the potential revenue, which collectively produce the compound 2-level revenue. We consider the context-free links selection problem of selecting links for a homepage so as to maximize the total compound 2-level revenue while keeping the total 1st level feedback above a preset threshold. We further generalize the problem to links with n-level (n ≥ 2) feedback structure. The key challenge is that the links' multi-level feedback structures are unobservable unless…
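To make the constrained objective in the abstract concrete, here is a minimal Python sketch of the offline version of the 2-level problem, in which every link's CTR and potential revenue are assumed to be known. This is not the paper's LExp algorithm, which has to learn these quantities from feedback; it is only a brute-force illustration. The function name `best_link_subset`, its parameters, and all numbers are hypothetical.

```python
from itertools import combinations


def best_link_subset(ctr, revenue, num_slots, ctr_threshold):
    """Brute-force illustration (not the paper's LExp algorithm).

    ctr[i]      -- assumed-known 1st level feedback (click-through rate) of link i
    revenue[i]  -- assumed-known 2nd level feedback (potential revenue) of link i
    Returns (indices, total_compound_revenue), or None if no subset of
    `num_slots` links keeps the total CTR at or above `ctr_threshold`.
    """
    best = None
    for subset in combinations(range(len(ctr)), num_slots):
        total_ctr = sum(ctr[i] for i in subset)
        if total_ctr < ctr_threshold:
            continue  # violates the 1st level feedback constraint
        # Compound 2-level revenue of a link = CTR * potential revenue.
        total_rev = sum(ctr[i] * revenue[i] for i in subset)
        if best is None or total_rev > best[1]:
            best = (subset, total_rev)
    return best


# Made-up example values: four candidate links, two homepage slots.
ctr = [0.8, 0.6, 0.3, 0.1]
revenue = [0.5, 1.0, 3.0, 5.0]

unconstrained = best_link_subset(ctr, revenue, num_slots=2, ctr_threshold=0.0)
constrained = best_link_subset(ctr, revenue, num_slots=2, ctr_threshold=1.0)
infeasible = best_link_subset(ctr, revenue, num_slots=2, ctr_threshold=5.0)
```

With these made-up numbers, the unconstrained optimum picks the two links with the highest compound revenue, while the CTR threshold forces a high-CTR link into the selection at some cost in revenue. That tension between revenue and the 1st level feedback constraint is what the abstract describes.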
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Reinforcement Learning in Robotics
