TL;DR
This paper compares BERT4Rec and SASRec on sequential recommendation tasks. It shows that SASRec outperforms BERT4Rec when both are trained with the same loss function, or when SASRec is trained with a larger number of negative samples, which challenges the common finding that BERT4Rec is the stronger model.
Contribution
It demonstrates that SASRec can outperform BERT4Rec when trained with BERT4Rec's loss (cross-entropy over a softmax across all items) or with a large number of sampled negatives, providing a new perspective on the two models' relative effectiveness.
Findings
SASRec outperforms BERT4Rec in both quality and training speed when the two are trained with the same loss.
Increasing the number of negative samples improves SASRec's performance.
BERT4Rec's advantage diminishes when training conditions are aligned.
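The second finding concerns training with more than one sampled negative. One common way to do that is a cross-entropy loss computed over the positive item plus N sampled negatives. The sketch below illustrates that idea only; this summary does not state which sampled loss the paper uses, and the function name `sampled_softmax_loss`, its parameters, and the uniform sampling scheme are all assumptions made for illustration.

```python
import math
import random


def sampled_softmax_loss(scores, positive, num_negatives, rng):
    """Cross-entropy over the positive item plus `num_negatives` sampled negatives.

    Hypothetical helper: `scores` is a list of model scores, one per catalog item,
    and negatives are drawn uniformly without replacement (an assumption).
    """
    candidates = [i for i in range(len(scores)) if i != positive]
    negatives = rng.sample(candidates, num_negatives)
    subset = [scores[positive]] + [scores[i] for i in negatives]
    m = max(subset)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in subset))
    return log_z - scores[positive]


# Toy scores for a 10-item catalog; item 3 is the true next item.
toy_scores = [0.1 * i for i in range(10)]
loss_few = sampled_softmax_loss(toy_scores, positive=3, num_negatives=1, rng=random.Random(0))
loss_all = sampled_softmax_loss(toy_scores, positive=3, num_negatives=9, rng=random.Random(0))
```

With `num_negatives` equal to the catalog size minus one, this reduces to the full softmax cross-entropy, so the number of negatives interpolates between the two training regimes compared in the findings.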
Abstract
Recently, sequential recommendation and the next-item prediction task have become increasingly popular in the field of recommender systems. Currently, two state-of-the-art baselines are the Transformer-based models SASRec and BERT4Rec. Over the past few years, there have been quite a few publications comparing these two algorithms and proposing new state-of-the-art models. In most of these publications, BERT4Rec achieves better performance than SASRec. However, BERT4Rec uses cross-entropy over a softmax across all items, while SASRec uses negative sampling and calculates binary cross-entropy loss for one positive and one negative item. In our work, we show that if both models are trained with the same loss, namely the one used by BERT4Rec, then SASRec significantly outperforms BERT4Rec in terms of both quality and training speed. In addition, we show that SASRec could be effectively trained with negative…
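The two losses contrasted in the abstract can be written out for a single prediction step. This is a minimal sketch, not the authors' implementation: the function names and the toy `scores` list are hypothetical, and the scores stand in for whatever item logits a model would produce.

```python
import math


def softmax_cross_entropy(scores, positive):
    """Cross-entropy over a softmax across all item scores."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[positive]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bce_one_negative(scores, positive, negative):
    """Binary cross-entropy for one positive and one sampled negative item."""
    return -(log_sigmoid(scores[positive]) + log_sigmoid(-scores[negative]))


# Hypothetical scores for a 4-item catalog; item 0 is the true next item.
scores = [2.0, 0.5, -1.0, 0.0]
ce = softmax_cross_entropy(scores, positive=0)
bce = bce_one_negative(scores, positive=0, negative=2)
```

The softmax loss compares the positive item against every item in the catalog on each step, whereas the binary loss only sees the positive and one sampled negative, which is the difference in training signal the abstract points to.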
Taxonomy
Methods: Softmax
