HHFT: Hierarchical Heterogeneous Feature Transformer for Recommendation Systems
Liren Yu, Wenming Zhang, Silu Zhou, Tao Zhang, Zhixuan Zhang, Dan Ou

TL;DR
This paper introduces HHFT, a Transformer-based model for industrial CTR prediction that effectively captures heterogeneous feature interactions, leading to improved performance and business metrics in real-world deployment.
Contribution
The paper presents a novel hierarchical Transformer architecture tailored for heterogeneous features in CTR prediction, addressing semantic confusion and capturing high-order interactions.
Findings
Transformers outperform DNN baselines in CTR prediction.
HHFT achieves a +0.4% improvement in CTR AUC over DNN baselines at scale.
Deployment on Taobao's platform increases GMV by +0.6%.
Abstract
We propose HHFT (Hierarchical Heterogeneous Feature Transformer), a Transformer-based architecture tailored for industrial CTR prediction. HHFT addresses the limitations of DNNs through three key designs: (1) Semantic Feature Partitioning: grouping heterogeneous features (e.g., user profile, item information, behaviour sequence) into semantically coherent blocks to preserve domain-specific information; (2) Heterogeneous Transformer Encoder: adopting block-specific QKV projections and FFNs to avoid semantic confusion between distinct feature types; (3) Hiformer Layer: capturing high-order interactions across features. Our findings reveal that Transformers significantly outperform DNN baselines, achieving a +0.4% improvement in CTR AUC at scale. We have successfully deployed the model on Taobao's production platform, observing a significant uplift in key business metrics, including a +0.6%…
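Design (2) in the abstract, block-specific QKV projections and FFNs, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes one token per semantic block, a single attention head, ReLU FFNs, random weights, and no layer normalization, and the names `init_block_params` and `heterogeneous_encoder_layer` are hypothetical. The Hiformer layer is not covered here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_block_params(num_blocks, d_model, d_ff, seed=0):
    # Hypothetical helper: one independent weight set per semantic block.
    rng = np.random.default_rng(seed)
    s = 1.0 / np.sqrt(d_model)
    return {
        "Wq": rng.normal(0, s, (num_blocks, d_model, d_model)),
        "Wk": rng.normal(0, s, (num_blocks, d_model, d_model)),
        "Wv": rng.normal(0, s, (num_blocks, d_model, d_model)),
        "W1": rng.normal(0, s, (num_blocks, d_model, d_ff)),
        "W2": rng.normal(0, 1.0 / np.sqrt(d_ff), (num_blocks, d_ff, d_model)),
    }

def heterogeneous_encoder_layer(tokens, params):
    """tokens: (num_blocks, d_model), one embedding per semantic block."""
    d_model = tokens.shape[-1]
    # Block-specific projections: token b is projected with its own
    # Wq[b], Wk[b], Wv[b] instead of weights shared across all tokens.
    q = np.einsum("bd,bde->be", tokens, params["Wq"])
    k = np.einsum("bd,bde->be", tokens, params["Wk"])
    v = np.einsum("bd,bde->be", tokens, params["Wv"])
    # Standard scaled dot-product attention across blocks.
    attn = softmax(q @ k.T / np.sqrt(d_model), axis=-1)
    h = tokens + attn @ v  # residual connection
    # Block-specific FFN: token b uses its own W1[b], W2[b].
    hidden = np.maximum(0.0, np.einsum("bd,bdf->bf", h, params["W1"]))
    out = h + np.einsum("bf,bfd->bd", hidden, params["W2"])
    return out, attn

# Usage: three blocks standing in for the feature groups named in the abstract.
blocks = ["user_profile", "item_info", "behaviour_sequence"]
tokens = np.random.default_rng(1).normal(size=(len(blocks), 8))
params = init_block_params(len(blocks), d_model=8, d_ff=16)
out, attn = heterogeneous_encoder_layer(tokens, params)
print(out.shape, attn.shape)
```

The only difference from a standard encoder layer in this sketch is the leading block axis on every weight tensor, so each feature type is projected in its own parameter space before attention mixes information across blocks.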
Taxonomy
Topics: Recommender Systems and Techniques · Explainable Artificial Intelligence (XAI) · Topic Modeling
