1$^{st}$ Place Solution of WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge

Junwei Xu; Zehao Zhao; Xiaoyu Hu; Zhenjie Song

arXiv:2505.03543·cs.IR·May 7, 2025

TL;DR

This paper presents the winning solution for a multimodal CTR prediction challenge, leveraging sequential modeling and feature interaction to improve click-through rate predictions using multimodal embeddings.

Contribution

The paper introduces a simple yet effective method of integrating multimodal embeddings with user-item interaction modeling for CTR prediction.

Findings

01

Achieved 0.9839 AUC on the challenge dataset

02

Outperformed baseline models significantly

03

Demonstrated effectiveness of multimodal embedding integration
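
For readers unfamiliar with the metric behind the 0.9839 figure, the following is a minimal sketch of ROC AUC using its standard pairwise definition: the fraction of (clicked, not clicked) pairs in which the clicked sample receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not taken from the paper or the challenge's evaluation code.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC: P(score of a positive > score of a negative).

    Illustrative O(P * N) implementation; not the challenge's scorer.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 clicks, 2 non-clicks.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 5 of the 6 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of clicked items above non-clicked ones.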

Abstract

The WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge focuses on effectively applying multimodal embedding features to improve click-through rate (CTR) prediction in recommender systems. This technical report presents our 1$^{st}$ place winning solution for Task 2, combining sequential modeling and feature interaction learning to effectively capture user-item interactions. For multimodal information integration, we simply append the frozen multimodal embeddings to each item embedding. Experiments on the challenge dataset demonstrate the effectiveness of our method, achieving superior performance with a 0.9839 AUC on the leaderboard, much higher than the baseline model. Code and configuration are available in our GitHub repository and the checkpoint of our model can be found in HuggingFace.
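
The integration step described in the abstract, appending frozen multimodal embeddings to each item embedding, can be sketched roughly as below. This is a NumPy illustration under stated assumptions, not the authors' implementation (their repository uses PyTorch): the table sizes, the random values, and the names `id_table`, `mm_table`, and `embed_items` are all hypothetical, and "frozen" is mimicked here by marking the multimodal table read-only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
NUM_ITEMS, ID_DIM, MM_DIM = 5, 4, 8

# Learnable item-ID embedding table (would be updated during training).
id_table = rng.normal(size=(NUM_ITEMS, ID_DIM)).astype(np.float32)

# Precomputed multimodal embeddings; read-only to mimic "frozen".
mm_table = rng.normal(size=(NUM_ITEMS, MM_DIM)).astype(np.float32)
mm_table.setflags(write=False)


def embed_items(item_ids):
    """Look up both tables and concatenate along the feature axis."""
    item_ids = np.asarray(item_ids)
    return np.concatenate([id_table[item_ids], mm_table[item_ids]], axis=-1)


# A toy batch of user histories: (batch, seq_len) item indices.
history = np.array([[0, 2, 4], [1, 1, 3]])
item_repr = embed_items(history)
print(item_repr.shape)  # (2, 3, 12): ID_DIM + MM_DIM features per item
```

The concatenated representation would then feed whatever sequential and feature-interaction layers sit downstream; only the ID table would receive gradient updates in a trainable version of this sketch.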

Code & Models

Repositories

pinskyrobin/www2025_mmctr (PyTorch, official)
