Employ Multimodal Machine Learning for Content quality analysis
Eric Du, Xiaoyong Li

TL;DR
This paper introduces a multimodal machine learning approach that combines image and text features using a Siamese network to improve content quality analysis accuracy on media sites.
Contribution
It proposes a novel multimodal quality recognition method utilizing separate feature extractors and a Siamese network, advancing beyond single-modal approaches.
Findings
Higher accuracy in quality score prediction compared to existing methods
Effective integration of image and text features
Demonstrated improvement in content quality assessment
Abstract
Identifying high-quality content is becoming increasingly important, since it can improve overall reading time and CTR (click-through rate). Quality analysis has generally focused on a single modality, such as image or text, but on today's mainstream media sites much information is presented in graphic form. In this paper we propose a multimodal quality recognition approach for predicting the quality score. First, we use two feature extractors, one for the image and another for the text. We then train a Siamese network with a rank loss as the optimization objective. Compared with other approaches, our approach achieves more accurate results.
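The pipeline described in the abstract (two per-modality feature vectors, a shared scorer applied to both items of a pair, and a rank loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the extractors, the fusion step, or the exact loss, so fusion by concatenation, the linear scoring head, the hinge-style margin loss, and all names and numbers below (`fuse`, `score`, `pairwise_rank_loss`, `weights`) are assumptions made for illustration.

```python
from typing import List, Sequence


def fuse(image_feat: Sequence[float], text_feat: Sequence[float]) -> List[float]:
    """Combine the two modality features. Concatenation is an assumption."""
    return list(image_feat) + list(text_feat)


def score(fused: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Shared scoring head: both items of a pair are scored with the same
    weights, which is what makes the setup Siamese. A linear head is assumed."""
    if len(fused) != len(weights):
        raise ValueError("feature and weight lengths differ")
    return sum(f * w for f, w in zip(fused, weights)) + bias


def pairwise_rank_loss(score_better: float, score_worse: float, margin: float = 1.0) -> float:
    """Hinge-style margin ranking loss (assumed form of the rank loss):
    zero once the better item outscores the worse one by at least `margin`."""
    return max(0.0, margin - (score_better - score_worse))


# Hypothetical toy inputs: 2-d image features and 2-d text features per item.
weights = [0.5, -0.25, 1.0, 0.75]
better = fuse([1.0, 0.0], [2.0, 1.0])  # item labeled as higher quality
worse = fuse([0.0, 1.0], [0.5, 0.0])   # item labeled as lower quality

s_better = score(better, weights)
s_worse = score(worse, weights)
loss = pairwise_rank_loss(s_better, s_worse)
print(s_better, s_worse, loss)
```

In a real system the toy feature lists would be replaced by the outputs of trained image and text extractors, and the weights would be learned by minimizing the loss over many labeled pairs. The loss only penalizes pairs whose scores are in the wrong order or too close together, so the model learns a relative ordering rather than absolute scores.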
Taxonomy
Topics: Video Analysis and Summarization · Image Retrieval and Classification Techniques · Text and Document Classification Technologies
Methods: Siamese Network
