# Learning to Advertise for Organic Traffic Maximization in E-Commerce Product Feeds

**Authors:** Dagui Chen, Junqi Jin, Weinan Zhang, Fei Pan, Lvyin Niu, Chuan Yu, Jun Wang, Han Li, Jian Xu, Kun Gai

arXiv: 1908.06698 · 2019-08-20

## TL;DR

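This paper proposes that advertisers bid strategically on the advertising platform to increase their recommended organic traffic, exploiting the fact that consumer behavior on advertised results becomes part of the recommendation model's training data (the Leverage mechanism). It formulates the problem as a Markov Decision Process and solves it with a hybrid reinforcement learning algorithm, validated in offline experiments and an online deployment.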

## Contribution

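The paper formulates the Leverage optimization problem as a Markov Decision Process and proposes the Hybrid Training Leverage Bidding (HTLB) algorithm, which combines real-world and emulator-generated samples to improve learning speed and stability in organic traffic maximization.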

## Key findings

- The proposed method outperforms baseline approaches in offline tests.
- Online deployment shows a significant increase in organic traffic.
- Hybrid training on real-world and emulator-generated samples accelerates learning and improves stability.

## Abstract

Most e-commerce product feeds provide blended results of advertised products and recommended products to consumers. The underlying advertising and recommendation platforms share similar if not exactly the same set of candidate products. Consumers' behaviors on the advertised results constitute part of the recommendation model's training data and therefore can influence the recommended results. We refer to this process as Leverage. Considering this mechanism, we propose a novel perspective that advertisers can strategically bid through the advertising platform to optimize their recommended organic traffic. By analyzing the real-world data, we first explain the principles of Leverage mechanism, i.e., the dynamic models of Leverage. Then we introduce a novel Leverage optimization problem and formulate it with a Markov Decision Process. To deal with the sample complexity challenge in model-free reinforcement learning, we propose a novel Hybrid Training Leverage Bidding (HTLB) algorithm which combines the real-world samples and the emulator-generated samples to boost the learning speed and stability. Our offline experiments as well as the results from the online deployment demonstrate the superior performance of our approach.
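The abstract describes HTLB as combining real-world samples with emulator-generated samples during training. The following is a minimal sketch of that general idea only, not the paper's algorithm: it draws one training batch from two pools at a fixed ratio. The function name `sample_hybrid_batch`, the `real_fraction` parameter, and the `(state, action, reward, next_state)` tuple layout are all assumptions made for illustration; the paper's actual mixing scheme may differ.

```python
import random
from typing import List, Tuple

# Hypothetical transition layout: (state, action, reward, next_state).
Transition = Tuple[int, int, float, int]


def sample_hybrid_batch(
    real: List[Transition],
    emulated: List[Transition],
    batch_size: int,
    real_fraction: float,
    rng: random.Random,
) -> List[Transition]:
    """Draw a batch mixing real and emulator-generated transitions.

    real_fraction is an assumed knob: the share of the batch taken
    from real-world samples (rounded down), the rest from the emulator.
    """
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must be in [0, 1]")
    # Cap each share by the pool size so sampling never fails.
    n_real = min(len(real), int(batch_size * real_fraction))
    n_emulated = min(len(emulated), batch_size - n_real)
    batch = rng.sample(real, n_real) + rng.sample(emulated, n_emulated)
    rng.shuffle(batch)  # avoid ordering the batch by sample source
    return batch
```

A learner would call this once per update step, for example `sample_hybrid_batch(real_pool, emulator_pool, 8, 0.25, random.Random(0))`, so that the scarce real-world samples are supplemented by cheaper emulator rollouts.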

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.06698/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1908.06698/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1908.06698/full.md

---
Source: https://tomesphere.com/paper/1908.06698