Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity

Charlie Hou; Kiran Koshy Thekumparampil; Michael Shavlovsky; Giulia Fanti; Yesh Dattatreya; Sujay Sanghavi

arXiv:2308.00177 · cs.LG · September 25, 2025

Open Access · 3 Reviews

TL;DR

This paper demonstrates that pretrained deep learning models can outperform Gradient Boosted Decision Trees in tabular Learning-to-Rank tasks when labeled data is scarce, especially by leveraging unlabeled data through pretraining.

Contribution

The study introduces the use of unsupervised pretraining for deep models in tabular Learning-to-Rank, showing significant improvements over GBDTs in label-scarce scenarios.

Findings

01. Pretrained DL rankers outperform GBDTs by up to 38%.

02. DL models excel on outliers in ranking tasks.

03. Unsupervised pretraining effectively exploits unlabeled data.

Abstract

On tabular data, a significant body of literature has shown that current deep learning (DL) models perform at best similarly to Gradient Boosted Decision Trees (GBDTs), while significantly underperforming them on outlier data. However, these works often study idealized problem settings which may fail to capture complexities of real-world scenarios. We identify a natural tabular data setting where DL models can outperform GBDTs: tabular Learning-to-Rank (LTR) under label scarcity. Tabular LTR applications, including search and recommendation, often have an abundance of unlabeled data, and scarce labeled data. We show that DL rankers can utilize unsupervised pretraining to exploit this unlabeled data. In extensive experiments over both public and proprietary datasets, we show that pretrained DL rankers consistently outperform GBDT rankers on ranking metrics -- sometimes by as much as 38%…
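The abstract does not spell out the pretraining objective. The reviews of this paper call the method SimCLR-Rank and note that hard negatives are easy to identify in LTR. As a minimal sketch, assuming a SimCLR-style contrastive loss whose negatives are the other items of the same query group, pretraining on unlabeled query groups might look like the code below. This is not the authors' implementation: the feature-dropout augmentation, the toy one-layer encoder, the temperature, and every function name are illustrative assumptions.

```python
import math
import random


def augment(item, rng, drop_prob=0.1):
    """Zero each feature independently with probability drop_prob (assumed augmentation)."""
    return [0.0 if rng.random() < drop_prob else v for v in item]


def encode(item, weights):
    """Toy one-layer tanh encoder with L2-normalised output (stand-in for an MLP)."""
    z = [math.tanh(sum(w * v for w, v in zip(row, item))) for row in weights]
    norm = math.sqrt(sum(c * c for c in z)) + 1e-12
    return [c / norm for c in z]


def in_query_contrastive_loss(views_a, views_b, temperature=0.1):
    """InfoNCE within one query group.

    Item i's positive is its own second view; its negatives are the second
    views of the other items belonging to the same query.
    """
    total = 0.0
    for i, za in enumerate(views_a):
        logits = [sum(a * b for a, b in zip(za, zb)) / temperature for zb in views_b]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(views_a)


def pretraining_loss(query_groups, weights, rng):
    """Average the in-query loss over all unlabeled query groups."""
    losses = []
    for items in query_groups:
        views_a = [encode(augment(x, rng), weights) for x in items]
        views_b = [encode(augment(x, rng), weights) for x in items]
        losses.append(in_query_contrastive_loss(views_a, views_b))
    return sum(losses) / len(losses)


# Two synthetic unlabeled query groups with 5 and 3 items, 8 features each.
rng = random.Random(0)
groups = [[[rng.gauss(0, 1) for _ in range(8)] for _ in range(n)] for n in (5, 3)]
weights = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]  # 4-d embedding
loss = pretraining_loss(groups, weights, rng)
```

In this sketch no relevance labels are used anywhere. A real pipeline would minimise the loss with a gradient-based optimiser and then fine-tune the encoder on the scarce labeled queries.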

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 · marginally below the acceptance threshold · Confidence 3

Strengths

- Representation learning in tabular LTR has not been studied much.
- SimCLR-Rank provides a strategy specific to the structure of the LTR task.

Weaknesses

- Technical novelty is limited.
- The experiments are limited to three datasets and one private dataset.
- Representation learning on tabular data does not necessarily need to consider the LTR setting, since the multi-layer MLP will be fine-tuned for the LTR task. As a paper that proposes a new tabular self-supervised learning method, it lacks a comparison with other existing methods, such as
  - Hajiramezanali, E., Diamant, N. L., Scalia, G., & Shen, M. W. (2022, October). STab: Self-supervised Lear

Reviewer 02 · Rating 5 · marginally below the acceptance threshold · Confidence 5

Strengths

1. This article is characterized by clear and comprehensible writing, presenting methods that are straightforward and easily implementable.
2. Through experimentation, this paper demonstrates that pre-trained deep models can achieve performance levels close to, or even surpass, GBDT in ranking tasks. This discovery holds practical value.
3. The paper introduces a pre-training approach that leverages the nature of learning to rank problems, demonstrating reasonable effectiveness, and in certain

Weaknesses

1. This paper exhibits notable deficiencies in the aspects of experimental comparisons and discussions on related work. The experimental comparison methodology only considers comparisons between fine-tuning or probing methods based on pre-trained models, as well as MLP models. I believe that in the realm of learning to rank and tabular data, there are likely more recent deep learning methods that could serve as baselines for comparison. Proper discussions about these methods should also be incor

Reviewer 03 · Rating 6 · marginally above the acceptance threshold · Confidence 4

Strengths

S1: To the reviewer’s knowledge, this is the first work that shows some promise for pre-trained DNNs on the LTR task. The reviewer had thought about this direction, but it was not intuitively clear how to do it or whether it has benefits. The paper still has several caveats but is a decent exploration in some aspects.
S2: The motivation of the ranking contrastive loss is clear and easy to understand. It is clear what the hard negatives are for LTR problems, so it is good to leverage that.
S3: It is go

Weaknesses

W1: The change over SimCLR is incremental.
- The major weakness of SimCLR for non-ranking problems was complexity. The authors made a good point that hard negatives are clear for LTR (mentioned in S1), but wouldn't SimCLR with small batches largely resolve the issue? So the necessity of a new loss is not very convincing. In fact, the authors did not comprehensively compare with that baseline.
- Also, sometimes having easy negatives may improve the generalization of learning - this may need dee

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Imbalanced Data Classification Techniques

Methods: Balanced Selection · Residual Block · Residual Connection · 1x1 Convolution · Batch Normalization · Color Jitter · Kaiming Initialization · Dense Connections · Random Resized Crop