Beyond Double Ascent via Recurrent Neural Tangent Kernel in Sequential Recommendation
Ruihong Qiu, Zi Huang, Hongzhi Yin

TL;DR
This paper introduces OverRec, a method using Recurrent Neural Tangent Kernel to achieve large-model performance in sequential recommendation without training, revealing overfitting as a temporary phenomenon and demonstrating state-of-the-art results.
Contribution
It proposes OverRec, leveraging RNTK for recommendation, enabling large-scale model performance without training, and provides theoretical proof of RNTK's suitability for recommendation tasks.
Findings
OverRec achieves state-of-the-art results on four datasets.
Overfitting in large models is temporary: as model scale keeps increasing, performance drops and then rises again (double ascent).
RNTK effectively captures user sequence similarities without training.
Abstract
Overfitting has long been considered a common issue for large neural network models in sequential recommendation. In our study, an interesting phenomenon is observed: overfitting is temporary. When the model scale is increased, the performance first ascends, then descends (i.e., overfitting) and finally ascends again, which is named double ascent in this paper. We therefore raise the assumption that a considerably larger model will generalise better and achieve higher performance. In the extreme case of infinite width, performance is expected to reach the limit of this specific structure. Unfortunately, it is impractical to directly build a huge model due to limited resources. In this paper, we propose the Overparameterised Recommender (OverRec), which utilises a recurrent neural tangent kernel (RNTK) as a similarity measurement for user sequences to successfully…
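To make the idea of a kernel acting as a similarity measurement for user sequences concrete, here is a minimal sketch of training-free next-item scoring with a precomputed kernel. It is an illustration under stated assumptions, not the paper's method: the kernel below is a cosine similarity over position-decayed item counts, used as a stand-in because the RNTK itself is not reproduced here; kernel ridge regression is assumed as the prediction rule; and the names `encode`, `gram`, `fit_predict`, and the toy sequences are all hypothetical.

```python
import numpy as np

def encode(seq, n_items, decay=0.8):
    """Position-decayed bag-of-items vector; more recent items weigh more."""
    v = np.zeros(n_items)
    for age, item in enumerate(reversed(seq)):
        v[item] += decay ** age
    return v

def gram(A, B):
    """Cosine-similarity Gram matrix. A simple stand-in kernel, NOT the RNTK."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def fit_predict(train_seqs, train_next, test_seqs, n_items, ridge=1e-3):
    """Kernel ridge regression (assumed here) from sequences to next-item scores."""
    X = np.stack([encode(s, n_items) for s in train_seqs])
    Xt = np.stack([encode(s, n_items) for s in test_seqs])
    Y = np.eye(n_items)[train_next]              # one-hot next-item targets
    K = gram(X, X)                               # similarities between training users
    alpha = np.linalg.solve(K + ridge * np.eye(len(X)), Y)
    return gram(Xt, X) @ alpha                   # scores, shape (n_test, n_items)

# Toy data (invented for illustration): item ids are integers in [0, 5).
train_seqs = [[0, 1, 2], [0, 1, 2], [3, 4, 0], [4, 3, 0]]
train_next = [3, 3, 1, 1]
scores = fit_predict(train_seqs, train_next, [[0, 1, 2]], n_items=5)
top_item = int(np.argmax(scores[0]))
```

No gradient-based training happens in this sketch: the only fitting step is a single linear solve against the Gram matrix, so replacing `gram` with a different sequence kernel changes the similarity measure without changing the rest of the pipeline.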
Taxonomy
Topics: Recommender Systems and Techniques · Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare
