Learning Linear Utility Functions From Pairwise Comparison Queries

Luise Ge; Brendan Juba; Yevgeniy Vorobeychik

arXiv:2405.02612·cs.LG·June 21, 2024

Open Access · 3 Reviews

TL;DR

This paper investigates the learnability of linear utility functions from pairwise comparison queries, showing efficient learnability in active settings and highlighting a gap compared to passive learning.

Contribution

It introduces algorithms for learning linear utility functions actively, demonstrating a significant difference in learnability between passive and active query strategies.

Findings

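01 Passive learning efficiently predicts out-of-sample responses, both when responses are noise-free and under Tsybakov noise when the distributions are sufficiently "nice".

02 Utility parameters are not learnable in the passive setting for a large set of data distributions without strong modeling assumptions, even when responses are noise-free.

03 Active learning enables efficient recovery of utility parameters even with noisy responses.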
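To make the third finding concrete, here is a minimal sketch of how chosen queries can pin down utility parameters. It is not the paper's algorithm and it ignores noise entirely. It assumes a hypothetical noise-free oracle `prefers`, a linear utility u(x) = w . x with w[0] > 0, a known bound on every ratio w[j] / w[0], and the freedom to query arbitrary vectors. Because comparisons are invariant to positive rescaling of w, the sketch recovers only the ratios w[j] / w[0], using bisection.

```python
def make_oracle(w):
    """Hypothetical noise-free oracle: True iff u(x) >= u(x') for u(x) = w . x."""
    def prefers(x, x_prime):
        ux = sum(wi * xi for wi, xi in zip(w, x))
        ux_prime = sum(wi * xi for wi, xi in zip(w, x_prime))
        return ux >= ux_prime
    return prefers


def recover_ratios(prefers, dim, bound=10.0, tol=1e-6):
    """Estimate w[j] / w[0] for every j by bisection.

    Assumes w[0] > 0 and |w[j] / w[0]| <= bound (illustrative assumptions).
    Returns the estimated ratios and the number of comparison queries used.
    """
    ratios = [1.0] * dim
    queries = 0
    for j in range(1, dim):
        lo, hi = -bound, bound
        while hi - lo > tol:
            t = 0.5 * (lo + hi)
            x = [0.0] * dim
            x[0] = t            # item worth t * w[0]
            x_prime = [0.0] * dim
            x_prime[j] = 1.0    # item worth w[j]
            queries += 1
            # x preferred  <=>  t * w[0] >= w[j]  <=>  t >= w[j] / w[0]
            if prefers(x, x_prime):
                hi = t
            else:
                lo = t
        ratios[j] = 0.5 * (lo + hi)
    return ratios, queries


w_true = [2.0, -1.0, 0.5, 3.0]
estimated, num_queries = recover_ratios(make_oracle(w_true), dim=len(w_true))
expected = [wj / w_true[0] for wj in w_true]
max_error = max(abs(a - b) for a, b in zip(estimated, expected))
print(estimated, num_queries, max_error)
```

Each coordinate costs a number of queries logarithmic in bound / tol, which is the kind of efficiency that a passive learner, unable to choose near-tie comparisons, cannot count on.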

Abstract

We study learnability of linear utility functions from pairwise comparison queries. In particular, we consider two learning objectives. The first objective is to predict out-of-sample responses to pairwise comparisons, whereas the second is to approximately recover the true parameters of the utility function. We show that in the passive learning setting, linear utilities are efficiently learnable with respect to the first objective, both when query responses are uncorrupted by noise, and under Tsybakov noise when the distributions are sufficiently "nice". In contrast, we show that utility parameters are not learnable for a large set of data distributions without strong modeling assumptions, even when query responses are noise-free. Next, we proceed to analyze the learning problem in an active learning setting. In this case, we show that even the second objective is efficiently…
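The first objective in the noise-free case can be illustrated with a short sketch. Under a linear utility u(x) = w . x, the response to a comparison of x against x' depends only on the sign of w . (x - x'), so predicting responses amounts to learning a halfspace through the origin over difference vectors. The code below uses a plain perceptron as a stand-in learner; this choice, the Gaussian item distribution, the minimum utility gap on training pairs, and all function names are assumptions made for illustration, not details taken from the paper.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sample_pairs(rng, w_true, n, min_gap=0.0):
    """Draw n comparisons; return difference vectors x - x' and +/-1 responses.

    Pairs whose utility gap is at most min_gap are skipped (an illustrative
    margin assumption that keeps the perceptron's update count small).
    """
    dim = len(w_true)
    diffs, labels = [], []
    while len(diffs) < n:
        x = [rng.gauss(0, 1) for _ in range(dim)]
        x_prime = [rng.gauss(0, 1) for _ in range(dim)]
        d = [a - b for a, b in zip(x, x_prime)]
        gap = dot(w_true, d)
        if abs(gap) <= min_gap:
            continue
        diffs.append(d)
        labels.append(1 if gap > 0 else -1)
    return diffs, labels


def fit_perceptron(diffs, labels, max_epochs=10_000):
    """Return w that classifies every training difference correctly, if found."""
    w = [0.0] * len(diffs[0])
    for _ in range(max_epochs):
        mistakes = 0
        for d, y in zip(diffs, labels):
            if y * dot(w, d) <= 0:
                w = [wi + y * di for wi, di in zip(w, d)]
                mistakes += 1
        if mistakes == 0:
            break
    return w


def accuracy(w, diffs, labels):
    correct = sum(1 for d, y in zip(diffs, labels) if y * dot(w, d) > 0)
    return correct / len(diffs)


rng = random.Random(0)
w_true = [2 / 3, -2 / 3, 1 / 3]  # hypothetical unit-norm utility weights
train_diffs, train_labels = sample_pairs(rng, w_true, n=300, min_gap=0.2)
test_diffs, test_labels = sample_pairs(rng, w_true, n=1000)

w_hat = fit_perceptron(train_diffs, train_labels)
train_acc = accuracy(w_hat, train_diffs, train_labels)
test_acc = accuracy(w_hat, test_diffs, test_labels)
print(train_acc, test_acc)
```

Note that a learner like this can predict held-out responses well without `w_hat` being close to `w_true` in any particular norm, which is the distinction the abstract draws between the two objectives.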

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

The problem studied is very clean and interesting to the learning theory community. The writing is very clear.

Weaknesses

I believe the primary weakness of this paper lies in its technical contributions. I didn't find anything particularly novel or technically interesting. For instance, the results in passive learning don't appear to be new. Theorems 1 and 3 have been previously addressed, and the observation in Theorem 2 seems straightforward—naturally, certain noisy models prevent learning. Similarly, the conclusion in Theorem 6 also seems obvious. In the active learning setting, the problem of active learning

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- This paper studied a significant and practical problem of reward learning, when the observations are pairwise comparison queries. This can be a common scenario in real-world examples such as online recommendation platforms. Overall the paper is presented with a good clarity on the model, assumptions and results.
- The theoretical analysis justified an interesting learnability gap between the passive and active learning settings, where the active queries significantly helped with the utility fu

Weaknesses

In the active learning setting, the estimation of the utility function relies on the ability to invert the embedding function, and obtaining the inverse of the embedding can be computationally challenging or infeasible particularly with neural network models.

Reviewer 03 · Rating 6 · Confidence 4

Strengths

- The paper is particularly well written and easy to follow. The organization is clear, the progression makes sense, and the decisions on what to include in the main text and what to defer to the supplementary material are well thought-through.
- The results appear to be rigorous. I did not check every single proof, but the ones I checked appear to be correct and, in general, the arguments make sense.
- Research on this topic is timely, given the increasing interest in practical applications, an

Weaknesses

- While the exact problem studied in the paper does not have a lot of prior work (to my knowledge), there is an extensive literature on learning RUMs from pairwise comparisons over a fixed set of items with an independent utility for each item in the set. This can be recast as sparse "one-hot" features in the model studied in the paper.
- Some of the insights the paper provides are well-known in that literature. For example, sample complexity results for the Bradley-Terry model typically dep

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Data Mining Algorithms and Applications · Data Management and Algorithms

Methods: Sparse Evolutionary Training