Weakly Supervised Label Smoothing

Gustavo Penha; Claudia Hauff

arXiv:2012.08575·cs.IR·December 17, 2020

TL;DR

This paper introduces Weakly Supervised Label Smoothing (WSLS), a method that uses the retrieval scores of negative samples to improve neural learning-to-rank models, and reports consistent gains across multiple retrieval tasks.

Contribution

The paper proposes WSLS, a simple, effective technique that enhances label smoothing with weak supervision from retrieval scores, without altering model architecture.

Findings

01 WSLS improves performance of BERT-based rankers across tasks.

02 Incorporating retrieval scores as weak supervision enhances label smoothing.

03 WSLS shows consistent effectiveness gains in experiments.

Abstract

We study Label Smoothing (LS), a widely used regularization technique, in the context of neural learning to rank (L2R) models. LS combines the ground-truth labels with a uniform distribution, encouraging the model to be less confident in its predictions. We analyze the relationship between the non-relevant documents (specifically, how they are sampled) and the effectiveness of LS, discussing how LS may be capturing "hidden similarity knowledge" between the relevant and non-relevant document classes. We further analyze LS by testing whether a curriculum-learning approach, i.e., starting with LS and after a number of iterations using only ground-truth labels, is beneficial. Inspired by our investigation of LS in the context of neural L2R models, we propose a novel technique called Weakly Supervised Label Smoothing (WSLS) that takes advantage of the retrieval scores of the negative sampled documents…
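To make the abstract's terms concrete, here is a minimal sketch in plain Python. Only `smooth_labels` follows a definition given above (ground-truth labels mixed with a uniform distribution). The other two functions are illustrative assumptions: `epsilon_at` is one possible reading of the curriculum idea (LS first, ground-truth labels afterwards), and `weakly_supervised_targets` swaps the uniform term for min-max normalized retrieval scores. The function names, the min-max normalization, the `switch_step` parameter, and the default `epsilon` are hypothetical and may differ from the paper's actual formulation.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Standard label smoothing: mix one-hot labels with a uniform distribution."""
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]


def epsilon_at(step, switch_step, epsilon=0.1):
    """Assumed curriculum schedule: smooth until switch_step, then use raw labels."""
    return epsilon if step < switch_step else 0.0


def weakly_supervised_targets(labels, retrieval_scores, epsilon=0.1):
    """Assumed WSLS-style targets for a pointwise ranker.

    labels: 1 for a relevant document, 0 for a sampled negative.
    retrieval_scores: scores from the retrieval step that produced the
        candidates (for example BM25); min-max normalized here, which is
        an assumption rather than something the abstract specifies.
    """
    lo, hi = min(retrieval_scores), max(retrieval_scores)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all scores are equal
    targets = []
    for y, score in zip(labels, retrieval_scores):
        weak = (score - lo) / span  # weak supervision signal in [0, 1]
        targets.append((1.0 - epsilon) * y + epsilon * weak)
    return targets


if __name__ == "__main__":
    print(smooth_labels([1, 0], epsilon=0.2))
    print(weakly_supervised_targets([1, 0, 0], [12.0, 9.0, 3.0], epsilon=0.2))
    print([epsilon_at(s, switch_step=2, epsilon=0.2) for s in range(4)])
```

Under this sketch, a negative with a high retrieval score receives a slightly larger target than a negative with a low score, whereas uniform smoothing gives every negative the same target.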

Code & Models

Repositories

Guzpenha/transformer_rankers
pytorch

Taxonomy

Methods: Label Smoothing