Aligning GPTRec with Beyond-Accuracy Goals with Reinforcement Learning
Aleksandr Petrov, Craig Macdonald

TL;DR
This paper introduces a two-stage training method for GPTRec, combining imitation learning and reinforcement learning, to optimize beyond-accuracy metrics such as diversity and reduced popularity bias in sequential recommendation.
Contribution
It proposes a novel training framework for GPTRec that aligns its recommendations with complex beyond-accuracy goals using reinforcement learning.
Findings
GPTRec achieves better tradeoffs between accuracy and diversity.
Reinforcement learning improves recommendation diversity and reduces popularity bias.
The approach outperforms traditional re-ranking methods in most cases.
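To make "diversity" and "popularity bias" concrete, here is a minimal sketch of two commonly used beyond-accuracy measures. This summary does not say which exact metrics the paper reports, so the function names, the genre-based Jaccard distance, and the popularity normalisation below are illustrative assumptions rather than the authors' definitions.

```python
from itertools import combinations


def intra_list_diversity(recs, genres):
    """Average pairwise genre dissimilarity (1 - Jaccard) within one list.

    `genres` maps an item id to a set of genre labels (hypothetical data layout).
    """
    pairs = list(combinations(recs, 2))
    if not pairs:
        return 0.0

    def distance(a, b):
        union = genres[a] | genres[b]
        if not union:  # both items have no genres: treat as identical
            return 0.0
        return 1.0 - len(genres[a] & genres[b]) / len(union)

    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def mean_popularity(recs, interaction_counts):
    """Mean popularity of the recommended items, scaled by the most popular item.

    Higher values mean the list leans more heavily on popular items.
    """
    max_count = max(interaction_counts.values())
    return sum(interaction_counts[i] for i in recs) / (len(recs) * max_count)


# Toy catalogue: "a" and "b" share a genre and are popular, "c" is niche.
genres = {"a": {"x"}, "b": {"x"}, "c": {"y"}}
counts = {"a": 100, "b": 50, "c": 10}

similar_list = ["a", "b"]
mixed_list = ["a", "c"]
```

On this toy data, the mixed list scores higher on diversity and lower on mean popularity than the list of two similar, popular items, which is the direction of improvement the findings describe.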
Abstract
Adaptations of Transformer models, such as BERT4Rec and SASRec, achieve state-of-the-art performance in the sequential recommendation task according to accuracy-based metrics, such as NDCG. These models treat items as tokens and then utilise a score-and-rank approach (Top-K strategy), where the model first computes item scores and then ranks them according to this score. While this approach works well for accuracy-based metrics, it is hard to use it for optimising more complex beyond-accuracy metrics such as diversity. Recently, the GPTRec model, which uses a different Next-K strategy, has been proposed as an alternative to the Top-K models. In contrast with traditional Top-K recommendations, Next-K generates recommendations item-by-item and, therefore, can account for complex item-to-item interdependencies important for the beyond-accuracy measures. However, the original GPTRec paper…
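The difference between the Top-K and Next-K strategies described in the abstract can be sketched as follows. This is a toy illustration, not GPTRec itself: the real model scores items with a Transformer, whereas `toy_score_fn`, the base scores, and the genre penalty here are hypothetical stand-ins chosen only to show how item-by-item generation can react to items already in the list.

```python
import numpy as np


def top_k(scores, k):
    """Score-and-rank: score every item once, then return the k best."""
    return [int(i) for i in np.argsort(-scores)[:k]]


def next_k(score_fn, k):
    """Generate the list item by item.

    `score_fn(chosen)` returns scores for all items given the items chosen
    so far, so later picks can depend on earlier ones.
    """
    chosen = []
    for _ in range(k):
        scores = np.asarray(score_fn(chosen), dtype=float).copy()
        for i in chosen:  # never recommend the same item twice
            scores[i] = -np.inf
        chosen.append(int(np.argmax(scores)))
    return chosen


# Toy catalogue of 4 items: items 0 and 1 share a genre, as do items 2 and 3.
base_scores = np.array([0.9, 0.8, 0.7, 0.1])
item_genre = [0, 0, 1, 1]


def toy_score_fn(chosen):
    """Hypothetical scorer: penalise genres that are already in the list."""
    seen = {item_genre[i] for i in chosen}
    penalty = np.array([0.5 if g in seen else 0.0 for g in item_genre])
    return base_scores - penalty


top_k_list = top_k(base_scores, 2)        # ranks by score alone
next_k_list = next_k(toy_score_fn, 2)     # second pick reacts to the first
```

With these toy numbers, Top-K returns the two highest-scoring items, which share a genre, while Next-K picks the best item first and then switches to the other genre because the second pick is scored in the context of the first.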
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms
Methods: Attention Is All You Need · Linear Layer · Dropout · Multi-Head Attention · Layer Normalization · Absolute Position Encodings · Softmax · Dense Connections · Label Smoothing · Adam
