OmniFashion: Towards Generalist Fashion Intelligence via Multi-Task Vision-Language Learning
Zhengwei Yang, Andi Long, Hao Li, Zechao Hu, Kui Jiang, Zheng Wang

TL;DR
OmniFashion introduces a unified vision-language framework trained on a large, exhaustively annotated fashion dataset, enabling multi-task reasoning and dialogue to advance generalist fashion intelligence.
Contribution
The paper presents OmniFashion, a novel multi-task vision-language model for fashion that unifies diverse tasks and is trained on the new FashionX dataset with detailed annotations.
Findings
Achieves strong accuracy on multiple fashion tasks
Demonstrates effective cross-task generalization
Enables interactive fashion dialogue
Abstract
Fashion intelligence spans multiple tasks, i.e., retrieval, recommendation, recognition, and dialogue, yet remains hindered by fragmented supervision and incomplete fashion annotations. These limitations jointly restrict the formation of consistent visual-semantic structures, preventing recent vision-language models (VLMs) from serving as a generalist fashion brain that unifies understanding and reasoning across tasks. Therefore, we construct FashionX, a million-scale dataset that exhaustively annotates visible fashion items within an outfit and organizes attributes from global to part-level. Built upon this foundation, we propose OmniFashion, a unified vision-language framework that bridges diverse fashion tasks under a unified fashion dialogue paradigm, enabling both multi-task reasoning and interactive dialogue. Experiments on multi-subtasks and retrieval benchmarks show that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Face recognition and analysis
