Distilled Neural Networks for Efficient Learning to Rank

F.M. Nardini, C. Rulli, S. Trani, R. Venturini

arXiv:2202.10728·cs.LG·October 28, 2024

TL;DR

This paper introduces a method combining distillation, pruning, and optimized matrix multiplication to create efficient neural networks for ranking tasks, achieving up to 4x faster scoring while maintaining effectiveness.

Contribution

The paper presents an approach that speeds up neural network scoring in learning to rank by up to 4x, integrating distillation, pruning, and high-performance matrix multiplication.

Findings

01 Neural networks can be distilled and pruned for efficient ranking.

02 The proposed method achieves up to 4x faster scoring.

03 Effectiveness is maintained despite increased efficiency.

Abstract

Recent studies in Learning to Rank have shown the possibility to effectively distill a neural network from an ensemble of regression trees. This result leads neural networks to become a natural competitor of tree-based ensembles on the ranking task. Nevertheless, ensembles of regression trees outperform neural models both in terms of efficiency and effectiveness, particularly when scoring on CPU. In this paper, we propose an approach for speeding up neural scoring time by applying a combination of Distillation, Pruning and Fast Matrix multiplication. We employ knowledge distillation to learn shallow neural networks from an ensemble of regression trees. Then, we exploit an efficiency-oriented pruning technique that performs a sparsification of the most computationally-intensive layers of the neural network that is then scored with optimized sparse matrix multiplication. Moreover, by…
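
The pruning and sparse-scoring steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain magnitude pruning in place of the paper's efficiency-oriented technique, and a pure-Python CSR matrix-vector product in place of an optimized sparse kernel. The function names (`magnitude_prune`, `to_csr`, `csr_matvec`, `dense_matvec`) and the toy weights are hypothetical, chosen only for illustration.

```python
# Sketch only: magnitude pruning of one layer, then scoring with a CSR
# (compressed sparse row) matrix-vector product. All names are hypothetical.

def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `sparsity` is the fraction of entries to drop (0.0 keeps everything).
    """
    n_cols = len(weights[0])
    # Rank every (row, col) position by the absolute value of its weight.
    order = sorted(
        ((r, c) for r in range(len(weights)) for c in range(n_cols)),
        key=lambda rc: abs(weights[rc[0]][rc[1]]),
    )
    n_drop = int(len(order) * sparsity)
    pruned = [row[:] for row in weights]
    for r, c in order[:n_drop]:
        pruned[r][c] = 0.0
    return pruned


def to_csr(matrix):
    """Convert a dense list-of-lists matrix to (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for c, w in enumerate(row):
            if w != 0.0:
                values.append(w)
                col_idx.append(c)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(csr, x):
    """Multiply a CSR matrix by a dense vector, touching only nonzeros."""
    values, col_idx, row_ptr = csr
    out = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out


def dense_matvec(matrix, x):
    """Reference dense product, used to check the sparse result."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


# Toy 3x4 layer (made-up weights) and one 4-feature input vector.
W = [
    [0.8, -0.05, 0.1, -0.6],
    [0.02, 0.7, -0.03, 0.2],
    [-0.9, 0.04, 0.5, -0.01],
]
x = [1.0, 2.0, 3.0, 4.0]

pruned = magnitude_prune(W, sparsity=0.5)  # drop the 6 smallest of 12 weights
csr = to_csr(pruned)
sparse_scores = csr_matvec(csr, x)
dense_scores = dense_matvec(pruned, x)
```

The sparse product performs one multiply-add per surviving weight, so the work scales with the number of nonzeros rather than with the full layer size. In practice the gain depends on the sparsity level and on the kernel used, which is why the paper pairs pruning with optimized sparse matrix multiplication.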

Code & Models

Repositories

hpclab/efficient_nn_for_ltr (PyTorch, official)

Taxonomy

Methods: Pruning · Knowledge Distillation