# Aggregating E-commerce Search Results from Heterogeneous Sources via Hierarchical Reinforcement Learning

**Authors:** Ryuichi Takanobu, Tao Zhuang, Minlie Huang, Jun Feng, Haihong Tang, Bo Zheng

arXiv: 1902.08882 · 2021-05-25

## TL;DR

This paper introduces a hierarchical reinforcement learning approach that aggregates e-commerce search results from heterogeneous sources on every result page, not only the first, deciding dynamically which sources to show on the current page and in what order to present their items.

## Contribution

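The paper decomposes aggregation into two subtasks in a hierarchy: a high-level source selection task that models sequential user behavior across pages, and a low-level item presentation task formulated as slot filling. Both are treated as sequential decision problems and learned from user feedback with hierarchical reinforcement learning.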

## Key findings

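- The experiments show marked improvements in search performance metrics.
- The model achieves higher user satisfaction.
- Filling slots sequentially avoids scoring every item on one scale, which matters because relevance scores from different source systems are not directly comparable.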

## Abstract

In this paper, we investigate the task of aggregating search results from heterogeneous sources in an E-commerce environment. First, unlike traditional aggregated web search that merely presents multi-sourced results in the first page, this new task may present aggregated results in all pages and has to dynamically decide which source should be presented in the current page. Second, as pointed out by many existing studies, it is not trivial to rank items from heterogeneous sources because the relevance scores from different source systems are not directly comparable. To address these two issues, we decompose the task into two subtasks in a hierarchical structure: a high-level task for source selection where we model the sequential patterns of user behaviors onto aggregated results in different pages so as to understand user intents and select the relevant sources properly; and a low-level task for item presentation where we formulate a slot filling process to sequentially present the items instead of giving each item a relevance score when deciding the presentation order of heterogeneous items. Since both subtasks can be naturally formulated as sequential decision problems and learn from the future user feedback on search results, we build our model with hierarchical reinforcement learning. Extensive experiments demonstrate that our model obtains remarkable improvements in search performance metrics, and achieves a higher user satisfaction.
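The two-level decomposition described in the abstract can be sketched as a nested control loop. This is only an illustrative sketch, not the authors' model: the function names (`select_sources`, `fill_slots`, `aggregate`), the source names, and the random placeholder policies are all assumptions, standing in for policies that the paper learns with hierarchical reinforcement learning from user feedback.

```python
import random
from typing import Dict, List, Sequence, Tuple

# (source name, index within that source's own ranking); hypothetical representation
Item = Tuple[str, int]


def select_sources(sources: Sequence[str], history: List[List[Item]],
                   rng: random.Random) -> List[str]:
    """High-level placeholder policy: choose a non-empty subset of sources for this page.

    A learned policy would condition on user behavior over earlier pages
    (passed here as `history`); this stand-in ignores it and samples at random.
    """
    k = rng.randint(1, len(sources))
    return rng.sample(list(sources), k)


def fill_slots(selected: List[str], cursors: Dict[str, int], pools: Dict[str, int],
               num_slots: int, rng: random.Random) -> List[Item]:
    """Low-level placeholder policy: fill the page one slot at a time.

    Each step picks which source supplies the next slot and takes that source's
    next unseen item, so scores from different sources are never compared.
    """
    page: List[Item] = []
    for _ in range(num_slots):
        available = [s for s in selected if cursors[s] < pools[s]]
        if not available:  # every selected source is exhausted
            break
        src = rng.choice(available)
        page.append((src, cursors[src]))
        cursors[src] += 1
    return page


def aggregate(pools: Dict[str, int], num_pages: int, num_slots: int,
              seed: int = 0) -> List[List[Item]]:
    """Run the hierarchy: one high-level decision per page, then slot filling."""
    rng = random.Random(seed)
    cursors = {s: 0 for s in pools}  # next unseen item per source
    history: List[List[Item]] = []
    for _ in range(num_pages):
        selected = select_sources(list(pools), history, rng)
        history.append(fill_slots(selected, cursors, pools, num_slots, rng))
    return history
```

For example, `aggregate({"products": 50, "topics": 10, "blogs": 10}, num_pages=3, num_slots=10)` returns three pages, each a list of `(source, index)` pairs drawn only from the sources chosen for that page. In this sketch, `pools` maps each (hypothetical) source name to the number of items it can supply.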

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.08882/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1902.08882/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/1902.08882/full.md

---
Source: https://tomesphere.com/paper/1902.08882