Spatial-Temporal Knowledge Distillation for Takeaway Recommendation
Shuyuan Zhao, Wei Chen, Boyan Shi, Liyong Zhou, Shuohao Lin, Huaiyu Wan

TL;DR
This paper introduces a novel spatial-temporal knowledge distillation model for takeaway recommendation that effectively captures dynamic user preferences and integrates complex geospatial data with low computational costs.
Contribution
It proposes a two-stage training framework combining a spatial-temporal knowledge graph encoder and a Transformer, enabling efficient fusion of graph and sequence data for improved recommendations.
Findings
Outperforms state-of-the-art baselines on three datasets
Effectively models dynamic user preferences and geospatial information
Reduces computational costs compared to existing methods
Abstract
The takeaway recommendation system aims to recommend users' future takeaway purchases based on their historical purchase behaviors, thereby improving user satisfaction and boosting merchant sales. Existing methods focus on incorporating auxiliary information or leveraging knowledge graphs to alleviate the sparsity issue of user purchase sequences. However, two main challenges limit the performance of these approaches: (1) capturing dynamic user preferences on complex geospatial information and (2) efficiently integrating spatial-temporal knowledge from both graph and sequence data with low computational costs. In this paper, we propose a novel spatial-temporal knowledge distillation model for takeaway recommendation (STKDRec) based on a two-stage training process. Specifically, during the first pre-training stage, a spatial-temporal knowledge graph (STKG) encoder is trained to…
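The abstract is cut off before it describes the distillation objective, so the sketch below is not STKDRec's loss. It only illustrates the generic soft-label form of knowledge distillation, in which a student (here, standing in for a sequence model) is trained against both the true next item and the temperature-softened scores of a pre-trained teacher (standing in for a graph encoder). The function names, the temperature of 2.0, and the mixing weight alpha are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target,
                      temperature=2.0, alpha=0.5):
    """Generic soft-label distillation loss (hypothetical helper).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between softened teacher and student distributions.
    """
    # Hard-label term: cross-entropy against the item actually purchased.
    hard = -math.log(softmax(student_logits)[target])

    # Soft-label term: KL(teacher || student) at the given temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0.0)

    # The temperature**2 factor is the conventional rescaling of the
    # soft term so its gradients stay comparable to the hard term.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


# Scores over four candidate items; item 2 is the true next purchase.
student = [0.1, 0.4, 1.2, -0.3]
teacher = [0.0, 0.2, 2.0, -1.0]
loss = distillation_loss(student, teacher, target=2)
```

With alpha set to 0 the loss reduces to the scaled KL term alone, which is zero whenever the student reproduces the teacher's scores exactly.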
Taxonomy
Topics: Data Management and Algorithms · Geographic Information Systems Studies · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Knowledge Distillation · Multi-Head Attention · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection
