Efficient Image Gallery Representations at Scale Through Multi-Task Learning
Benjamin Gutelman, Pavel Levin

TL;DR
This paper presents a multi-task learning approach to create a scalable, universal image gallery encoder that generalizes well across various recommendation and retrieval tasks, especially benefiting low-resource binary tasks.
Contribution
It introduces a practical multi-task learning framework for building generalizable image gallery representations and analyzes its effectiveness compared to more expensive solutions.
Findings
MTL-trained solutions achieve competitive performance
MTL helps address sparsity in low-resource tasks
Universal encoder improves scalability and generalization
Abstract
Image galleries provide a rich source of diverse information about a product, which can be leveraged across many recommendation and retrieval applications. We study the problem of building a universal image gallery encoder through a multi-task learning (MTL) approach and demonstrate that it is a practical way to make the learned representations generalize to new downstream tasks. Additionally, we compare the predictive performance of MTL-trained solutions against optimal but substantially more expensive solutions, and find signals that MTL can be a useful mechanism to address sparsity in low-resource binary tasks.
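To make the general idea concrete, the following is a minimal sketch of a shared gallery encoder with one head per binary task. It is not the authors' architecture: the abstract does not specify one, so the mean pooling over per-image embeddings, the single dense layer, the summed cross-entropy loss, and all names (`SharedGalleryEncoder`, `BinaryTaskHead`, `multi_task_loss`, the task keys) are illustrative assumptions.

```python
import math
import random


def mean_pool(gallery):
    """Average per-image embedding vectors into one gallery vector (assumed pooling)."""
    n = len(gallery)
    dim = len(gallery[0])
    return [sum(image[d] for image in gallery) / n for d in range(dim)]


def linear(weights, bias, x):
    """Dense layer; `weights` holds one row of coefficients per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(p, y, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


class SharedGalleryEncoder:
    """Hypothetical universal encoder: pooled gallery -> dense layer -> ReLU."""

    def __init__(self, in_dim, out_dim, rng):
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(out_dim)]
        self.bias = [0.0] * out_dim

    def encode(self, gallery):
        pooled = mean_pool(gallery)
        return [max(0.0, h) for h in linear(self.weights, self.bias, pooled)]


class BinaryTaskHead:
    """Hypothetical task-specific head: one logistic unit on the shared representation."""

    def __init__(self, in_dim, rng):
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]]
        self.bias = [0.0]

    def predict(self, representation):
        return sigmoid(linear(self.weights, self.bias, representation)[0])


def multi_task_loss(encoder, heads, gallery, labels, task_weights=None):
    """Sum weighted per-task losses; tasks without a label for this gallery are skipped."""
    representation = encoder.encode(gallery)  # computed once, shared by all heads
    total = 0.0
    for task, head in heads.items():
        if task not in labels:
            continue  # sparse supervision: this gallery has no label for this task
        weight = 1.0 if task_weights is None else task_weights.get(task, 1.0)
        total += weight * binary_cross_entropy(head.predict(representation), labels[task])
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    encoder = SharedGalleryEncoder(in_dim=4, out_dim=3, rng=rng)
    heads = {"task_a": BinaryTaskHead(3, rng), "task_b": BinaryTaskHead(3, rng)}
    gallery = [[1.0, 0.0, 2.0, 1.0], [3.0, 2.0, 0.0, 1.0]]  # two image embeddings
    print(multi_task_loss(encoder, heads, gallery, {"task_a": 1}))
```

Under these assumptions, every task's gradient would flow into the same encoder weights, which is one way a low-resource binary task could benefit from the supervision available to better-resourced tasks. A new downstream task would only need a new head on top of the frozen or fine-tuned encoder.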
