Don't only Feel Read: Using Scene text to understand advertisements
Arka Ujjal Dey, Suman K. Ghosh, Ernest Valveny

TL;DR
This paper presents a framework that enhances advertisement image classification by integrating visual features with textual cues extracted from embedded scene text, demonstrating improved semantic understanding.
Contribution
It introduces a novel approach combining visual and textual features for ad classification, emphasizing the importance of scene text in semantic interpretation.
Findings
Textual cues improve classification accuracy
Scene text provides meaningful semantic information
Framework outperforms visual-only methods
Abstract
We propose a framework for automated classification of advertisement images, using not just visual features but also textual cues extracted from embedded text. Our approach is motivated by the assumption that ad images contain meaningful textual content that can provide discriminative semantic interpretation and can thus aid classification tasks. To this end, we develop a framework using off-the-shelf components and demonstrate the effectiveness of textual cues in semantic classification tasks.
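The idea of combining visual features with scene-text cues can be illustrated with a minimal late-fusion sketch. This is not the authors' pipeline: the abstract only says the framework uses off-the-shelf components, so every name here (`embed_tokens`, `fuse`, `classify`, the toy word vectors, and the linear classifier weights) is a hypothetical stand-in. The sketch assumes a visual feature vector and a list of recognized scene-text tokens are already available from some upstream extractor.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so neither modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def embed_tokens(tokens, word_vectors, dim):
    """Average the vectors of recognized tokens found in a (hypothetical) vocabulary."""
    hits = [word_vectors[t.lower()] for t in tokens if t.lower() in word_vectors]
    if not hits:
        return [0.0] * dim  # no usable scene text: fall back to a zero vector
    return [sum(col) / len(hits) for col in zip(*hits)]

def fuse(visual_feat, text_feat):
    """Late fusion by concatenating the normalized visual and textual vectors."""
    return l2_normalize(visual_feat) + l2_normalize(text_feat)

def classify(fused, weights, biases):
    """Linear classifier: return the index of the highest-scoring class."""
    scores = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data, invented for illustration only.
word_vectors = {"sale": [1.0, 0.0], "vote": [0.0, 1.0]}
visual_feat = [3.0, 4.0]            # stand-in for image features
scene_tokens = ["SALE", "50%"]      # stand-in for recognized scene text

text_feat = embed_tokens(scene_tokens, word_vectors, dim=2)
fused = fuse(visual_feat, text_feat)

# Two hypothetical classes: 0 = retail, 1 = political.
weights = [[0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
biases = [0.0, 0.0]
label = classify(fused, weights, biases)
```

In this toy setup the visual features are identical for both classes, so the predicted label is decided entirely by the text half of the fused vector, which is the kind of case where scene text carries the discriminative signal.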
Taxonomy
Topics: Video Analysis and Summarization · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
