TL;DR
This paper proposes a framework for automatic video titling in e-commerce that integrates multiple information sources, using graph neural networks and multi-grained analysis to generate product-aware, catchy titles.
Contribution
It introduces an end-to-end model combining granular interaction modeling and storyline summarization, leveraging heterogeneous data sources and graph neural networks for improved video titling.
Findings
Developed a large-scale dataset from Taobao for training and evaluation.
Achieved significant improvements over baseline methods in video titling accuracy.
Demonstrated the effectiveness of multi-source information integration and graph-based modeling.
Abstract
In e-commerce, consumer-generated videos, which generally convey consumers' individual preferences for different aspects of certain products, are massive in volume. To recommend these videos to potential consumers more effectively, diverse and catchy video titles are critical. However, consumer-generated videos are seldom accompanied by appropriate titles. To bridge this gap, we integrate comprehensive sources of information, including the content of consumer-generated videos, the narrative comment sentences supplied by consumers, and the product attributes, in an end-to-end modeling framework. Although automatic video titling is very useful and in demand, it is much less addressed than video captioning. The latter focuses on generating sentences that describe videos as a whole, while our task requires product-aware, multi-grained video analysis. To tackle this issue, the proposed method…
