# Greedy Optimized Multileaving for Personalization

**Authors:** Kojiro Iizuka, Takeshi Yoneda, Yoshifumi Seki

arXiv: 1907.08346 · 2019-07-22

## TL;DR

This paper introduces Greedy Optimized Multileaving (GOM), a multileaving method for online evaluation of personalized rankings. On the authors' news recommender system, GOM evaluated rankings precisely with less than 1/10 of the sample size that A/B testing required.

## Contribution

The paper describes the first attempt to optimize multileaving for personalization settings. It clarifies the challenges of applying multileaving to personalized rankings and addresses them with GOM and a new credit feedback function, supported by empirical and online results.

## Key findings

- GOM remains stable as ranking lengths and the number of rankers increase.
- Deployed on the authors' production news recommender system, GOM evaluated personalized rankings precisely.
- GOM reached that precision with significantly smaller sample sizes (< 1/10) than A/B testing.

## Abstract

Personalization plays an important role in many services. To evaluate personalized rankings, online evaluation, such as A/B testing, is widely used today. Recently, multileaving has been found to be an efficient method for evaluating rankings in information retrieval fields. This paper describes the first attempt to optimize the multileaving method for personalization settings. We clarify the challenges of applying this method to personalized rankings. Then, to solve these challenges, we propose greedy optimized multileaving (GOM) with a new credit feedback function. The empirical results showed that GOM was stable for increasing ranking lengths and the number of rankers. We implemented GOM on our actual news recommender systems, and compared its online performance. The results showed that GOM evaluated the personalized rankings precisely, with significantly smaller sample sizes (< 1/10) than A/B testing.
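For readers unfamiliar with multileaving, the general idea can be sketched as follows: several rankers' lists are merged into one list shown to the user, and clicks are credited back to the ranker that contributed each clicked item. The sketch below uses team-draft multileaving with a simple one-credit-per-click rule, which is a simpler, well-known variant and not the GOM method or the credit feedback function proposed in this paper. The function names `team_draft_multileave` and `credit_clicks` and the sample data are assumptions made for illustration.

```python
import random


def team_draft_multileave(rankings, length, rng=None):
    """Merge several rankings into one list of at most `length` items.

    Returns the merged list and a dict mapping each item to the index
    of the ranker that contributed it. Illustrative only: this is
    team-draft multileaving, not GOM.
    """
    rng = rng or random.Random()
    combined = []
    owner = {}  # item -> index of contributing ranker
    while len(combined) < length:
        # Each round, rankers pick in a fresh random order.
        order = list(range(len(rankings)))
        rng.shuffle(order)
        added = False
        for r in order:
            if len(combined) >= length:
                break
            # Highest-ranked item of ranker r not yet in the merged list.
            pick = next((d for d in rankings[r] if d not in owner), None)
            if pick is not None:
                combined.append(pick)
                owner[pick] = r
                added = True
        if not added:
            break  # every ranker is exhausted
    return combined, owner


def credit_clicks(owner, clicked_items, n_rankers):
    """Give one credit per click to the ranker that contributed the item."""
    credits = [0] * n_rankers
    for item in clicked_items:
        if item in owner:
            credits[owner[item]] += 1
    return credits


# Hypothetical rankings from three rankers for one user.
rankings = [["a", "b", "c"], ["b", "c", "d"], ["e", "a", "f"]]
combined, owner = team_draft_multileave(rankings, 4, random.Random(0))
credits = credit_clicks(owner, combined[:2], len(rankings))
```

Aggregating `credits` over many impressions gives a preference ordering over rankers from a single shared traffic pool, which is the intuition behind why multileaving can need fewer samples than splitting users across A/B buckets.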

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.08346/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1907.08346/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/1907.08346/full.md

---
Source: https://tomesphere.com/paper/1907.08346