FITRep: Attention-Guided Item Representation via MLLMs

Guoxiao Zhang; Ao Li; Tan Qu; Qianlong Xie; Xingxing Wang

arXiv:2511.21389·cs.IR·November 27, 2025

Open Access

TL;DR

This paper introduces FITRep, a novel attention-guided framework utilizing MLLMs for fine-grained item deduplication by preserving structural relationships, leading to improved online advertising performance.

Contribution

The paper presents what it describes as the first white-box, attention-guided item representation method, combining hierarchical semantic extraction with structure-preserving compression for deduplication.

Findings

01. Achieves +3.60% CTR (click-through rate) in online tests.

02. Achieves +4.25% CPM (cost per mille) in online tests.

03. Demonstrates effectiveness and real-world impact.

Abstract

Online platforms usually suffer from user experience degradation due to near-duplicate items with similar visuals and text. While Multimodal Large Language Models (MLLMs) enable multimodal embedding, existing methods treat representations as black boxes and ignore structural relationships (e.g., primary vs. auxiliary elements), leading to a local structural collapse problem. To address this, inspired by Feature Integration Theory (FIT), we propose FITRep, the first attention-guided, white-box item representation framework for fine-grained item deduplication. FITRep consists of: (1) Concept Hierarchical Information Extraction (CHIE), which uses MLLMs to extract hierarchical semantic concepts; (2) Structure-Preserving Dimensionality Reduction (SPDR), an adaptive UMAP-based method for efficient information compression; and (3) FAISS-Based Clustering (FBC), a FAISS-based clustering that assigns…
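
As a rough illustration of the deduplication step the abstract describes, the sketch below groups items whose embeddings are nearly identical. It is not the authors' implementation: it assumes items have already been embedded as vectors, uses brute-force cosine similarity in place of a FAISS index, and merges near-duplicates with union-find. The function names, the 0.95 threshold, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def dedup_clusters(embeddings, threshold=0.95):
    """Return one cluster id per item; near-duplicates share an id.

    Brute-force O(n^2) comparison, standing in for a nearest-neighbor
    index such as FAISS (an assumption, not the paper's code).
    """
    n = len(embeddings)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as the cluster representative.
                    parent[max(ri, rj)] = min(ri, rj)

    return [find(i) for i in range(n)]


# Toy embeddings (hypothetical): items 0/1 and items 2/3 are near-duplicates.
items = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.98, 0.1],
]
labels = dedup_clusters(items, threshold=0.95)
print(labels)
```

Each label is the index of the lowest-numbered item in its group, so items sharing a label would be treated as duplicates of that representative. At production scale the quadratic loop would presumably be replaced by an index-backed neighbor search, which is the role FAISS plays in the abstract.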

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Advanced Graph Neural Networks · Topic Modeling