A Multimodal In-Context Tuning Approach for E-Commerce Product Description Generation
Yunxin Li, Baotian Hu, Wenhan Luo, Lin Ma, Yuxin Ding, Min Zhang

TL;DR
This paper introduces ModICT, a multimodal in-context tuning method that leverages similar product samples and frozen large language models to generate more accurate and diverse e-commerce product descriptions from images and keywords.
Contribution
The paper proposes an in-context tuning approach for product description generation. It keeps the large language model frozen to preserve its generation capability and conditions generation on a similar product sample used as a multimodal reference, which improves the accuracy and diversity of the output.
Findings
Improves accuracy by up to 3.3% on ROUGE-L.
Improves description diversity by up to 9.4% on the D-5 metric.
Effective across various language model scales and product categories.
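For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how they are commonly computed. It is not the authors' evaluation code. It assumes that D-5 denotes Distinct-5 (the ratio of unique 5-grams to all 5-grams in the generated texts) and that ROUGE-L is reported as the F1 variant based on the longest common subsequence. The function names are illustrative, and tokenization is left to the caller.

```python
def distinct_n(texts, n):
    """Ratio of unique n-grams to total n-grams across tokenized texts.

    Assumption: this is the usual Distinct-n definition; the paper's exact
    D-5 computation may differ.
    """
    total = 0
    unique = set()
    for tokens in texts:
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """LCS-based F1 between a generated and a reference token list.

    Assumption: equal weighting of precision and recall (beta = 1).
    """
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)
```

A higher Distinct-n value means the generated descriptions repeat fewer phrases, while a higher ROUGE-L value means the generated description shares a longer in-order word sequence with the reference.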
Abstract
In this paper, we propose a new setting for generating product descriptions from images, augmented by marketing keywords. It leverages the combined power of visual and textual information to create descriptions that are more tailored to the unique features of products. For this setting, previous methods utilize visual and textual encoders to encode the image and keywords and employ a language model-based decoder to generate the product description. However, the generated description is often inaccurate and generic since same-category products have similar copy-writings, and optimizing the overall framework on large-scale samples makes models concentrate on common words yet ignore the product features. To alleviate the issue, we present a simple and effective Multimodal In-Context Tuning approach, named ModICT, which introduces a similar product sample as the reference and utilizes the…
Taxonomy
Topics: Advanced Text Analysis Techniques · Web Applications and Data Management · Web Data Mining and Analysis
