LookBench: A Live and Holistic Open Benchmark for Fashion Image Retrieval
Gensmo.ai, Chao Gao, Siqiao Xue, Jiwen Fu, Tingyi Gu, Shanshan Li, Fan Zhou

TL;DR
LookBench is a comprehensive, live benchmark for fashion image retrieval that includes real and AI-generated images, aiming to evaluate and advance models in realistic e-commerce scenarios.
Contribution
The paper introduces LookBench, a new challenging, live, and periodically updated benchmark for fashion image retrieval, with open-source tools and a leaderboard.
Findings
Many models achieve below 60% Recall@1 on LookBench.
The proprietary model outperforms others on LookBench.
Both the proprietary and open-source models set a new state of the art on Fashion200K.
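For readers unfamiliar with the metric in these findings, Recall@1 is the fraction of queries whose top-ranked gallery item is a correct match. The following is a minimal sketch, not the benchmark's evaluation code: it assumes cosine similarity over embedding vectors and treats a match as equality of item IDs, and the function names and toy data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recall_at_k(queries, query_ids, gallery, gallery_ids, k=1):
    """Fraction of queries with at least one correct item among the top-k results."""
    hits = 0
    for q, qid in zip(queries, query_ids):
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(range(len(gallery)),
                        key=lambda i: cosine(q, gallery[i]),
                        reverse=True)
        if any(gallery_ids[i] == qid for i in ranked[:k]):
            hits += 1
    return hits / len(queries)

# Toy embeddings (hypothetical): three gallery items, three queries.
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gallery_ids = [10, 20, 30]
queries = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.2]]
query_ids = [10, 20, 30]

r1 = recall_at_k(queries, query_ids, gallery, gallery_ids, k=1)
r2 = recall_at_k(queries, query_ids, gallery, gallery_ids, k=2)
print(f"Recall@1 = {r1:.3f}, Recall@2 = {r2:.3f}")
```

In this toy setup the third query ranks the wrong item first and its true match second, so it counts as a miss at k=1 but a hit at k=2.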
Abstract
In this paper, we present LookBench, a live, holistic, and challenging benchmark for fashion image retrieval in real e-commerce settings. (We use the term "look" to reflect retrieval that mirrors how people shop: finding the exact item, a close substitute, or a visually consistent alternative.) LookBench includes both recent product images sourced from live websites and AI-generated fashion images, reflecting contemporary trends and use cases. Each test sample is time-stamped, and we intend to update the benchmark periodically, enabling contamination-aware evaluation aligned with declared training cutoffs. Grounded in our fine-grained attribute taxonomy, LookBench covers single-item and outfit-level retrieval across. Our experiments reveal that LookBench poses a significant challenge to strong baselines, with many models achieving below 60% Recall@1. Our proprietary model achieves…
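The contamination-aware evaluation mentioned in the abstract can be pictured as filtering test samples by their time stamp against a model's declared training cutoff. The sketch below is only an illustration under assumptions: the `timestamp` field, the `contamination_free` helper, and the sample records are hypothetical, and the benchmark's actual data schema is not described here.

```python
from datetime import date

def contamination_free(samples, training_cutoff):
    """Keep only samples time-stamped strictly after the declared training cutoff."""
    return [s for s in samples if s["timestamp"] > training_cutoff]

# Hypothetical test samples; real records would carry images and labels as well.
samples = [
    {"id": "a", "timestamp": date(2024, 11, 3)},
    {"id": "b", "timestamp": date(2025, 2, 14)},
    {"id": "c", "timestamp": date(2025, 6, 1)},
]

# A model declaring a cutoff of 2025-01-01 is scored only on later samples.
kept = contamination_free(samples, date(2025, 1, 1))
kept_ids = [s["id"] for s in kept]
print(kept_ids)
```

Using a strict comparison is a conservative choice: a sample dated on the cutoff day itself is treated as potentially seen during training.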
