Generalized Ensemble Model for Document Ranking in Information Retrieval

Yanshan Wang; In-Chan Choi; Hongfang Liu

arXiv:1507.08586·cs.IR·February 3, 2017

TL;DR

This paper introduces a generalized ensemble model (gEnM) for document ranking that optimally combines multiple retrieval models to improve relevance ranking, utilizing both supervised and unsupervised learning algorithms, validated on benchmark datasets.

Contribution

The paper presents a novel generalized ensemble model for document ranking, with new algorithms for optimal linear combination using both supervised (batch and online) and unsupervised methods.

Findings

1. gEnM outperforms individual models on benchmark datasets.
2. Supervised algorithms achieve higher accuracy than unsupervised methods.
3. The proposed methods are effective in diverse information retrieval scenarios.

Abstract

A generalized ensemble model (gEnM) for document ranking is proposed in this paper. The gEnM linearly combines basis document retrieval models and aims to rank relevant documents at high positions. To obtain the optimal linear combination of multiple document retrieval models, or rankers, an optimization program is formulated that directly maximizes the mean average precision. Both supervised and unsupervised learning algorithms are presented to solve this program. For the supervised scheme, two approaches are considered depending on the data setting: batch and online. In the batch setting, we propose a revised Newton's algorithm, gEnM.BAT, which approximates the derivative and Hessian matrix. In the online setting, we advocate a stochastic gradient descent (SGD) based algorithm, gEnM.ON. As for the unsupervised scheme, an unsupervised ensemble model (UnsEnM) by…
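
To make the objective in the abstract concrete, the following is a minimal Python sketch of a linear combination of basis rankers scored by mean average precision. It is not the authors' gEnM.BAT or gEnM.ON algorithm: it only evaluates the objective for a given weight vector. The function names, the data layout, and the toy scores are all assumptions introduced for illustration.

```python
def average_precision(scores, relevant):
    """Average precision for one query.

    scores[i] is the ensemble score of document i; relevant is the set of
    indices of relevant documents (hypothetical layout, for illustration).
    """
    # Rank documents by descending score.
    ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


def ensemble_scores(weights, model_scores):
    """Linear combination: model_scores[k][i] is ranker k's score for doc i."""
    n_docs = len(model_scores[0])
    return [
        sum(w * m[i] for w, m in zip(weights, model_scores))
        for i in range(n_docs)
    ]


def mean_average_precision(weights, queries):
    """Mean of per-query average precision; queries is a list of
    (model_scores, relevant) pairs."""
    aps = [
        average_precision(ensemble_scores(weights, ms), rel)
        for ms, rel in queries
    ]
    return sum(aps) / len(aps)


# Toy data (made up): one query, two basis rankers, four documents.
queries = [
    (
        [
            [0.9, 0.1, 0.5, 0.3],  # scores from hypothetical ranker A
            [0.2, 0.8, 0.1, 0.7],  # scores from hypothetical ranker B
        ],
        {0, 3},  # documents 0 and 3 are relevant
    )
]

map_a = mean_average_precision([1.0, 0.0], queries)      # ranker A alone
map_b = mean_average_precision([0.0, 1.0], queries)      # ranker B alone
map_equal = mean_average_precision([0.5, 0.5], queries)  # equal weights
```

On this toy data the equal-weight combination places both relevant documents at the top of the ranking, so it scores higher than either ranker alone. A learning algorithm such as those named in the abstract would search over the weights rather than fix them by hand.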
