M2TRec: Metadata-aware Multi-task Transformer for Large-scale and Cold-start free Session-based Recommendations
Walid Shalaby, Sejoon Oh, Amir Afsharinejad, Srijan Kumar, Xiquan Cui

TL;DR
M2TRec is a metadata-aware multi-task transformer that improves session-based recommendations, especially for cold-start and sparse items, by learning item representations from metadata rather than individual embeddings.
Contribution
The paper introduces M2TRec, a novel item-ID free model that leverages item metadata and multi-task learning to enhance scalability and performance in large-scale, sparse, and cold-start scenarios.
Findings
Significant performance improvements on sparse items.
Effective handling of cold-start and unpopular items.
Faster convergence due to multi-task training.
Abstract
Session-based recommender systems (SBRSs) have shown superior performance over conventional methods. However, they show limited scalability on large-scale industrial datasets since most models learn one embedding per item. This leads to a large memory requirement (of storing one vector per item) and poor performance on sparse sessions with cold-start or unpopular items. Using one public and one large industrial dataset, we experimentally show that state-of-the-art SBRSs have low performance on sparse sessions with sparse items. We propose M2TRec, a Metadata-aware Multi-task Transformer model for session-based recommendations. Our proposed method learns a transformation function from item metadata to embeddings, and is thus, item-ID free (i.e., does not need to learn one embedding per item). It integrates item metadata to learn shared representations of diverse item attributes. During…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsMulti-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Softmax · Dropout · Adam · Dense Connections · Residual Connection
