A multimodal deep learning architecture for smoking detection with a small data approach
Robert Lakatos, Peter Pollner, Andras Hajdu, Tamas Joo

TL;DR
This paper introduces a multimodal deep learning system that detects smoking in media content using minimal training data, combining text and image analysis with human reinforcement for improved accuracy.
Contribution
The study presents a novel integrated deep learning model that effectively detects smoking in images and text with limited data, incorporating human reinforcement for enhanced performance.
Findings
Achieves 74% accuracy on images and 98% on text
Utilizes pre-trained multimodal models for small data scenarios
Incorporates human reinforcement to improve detection reliability
Abstract
Introduction: Covert tobacco advertisements often prompt regulatory measures. This paper shows that artificial intelligence, particularly deep learning, has great potential for detecting hidden advertising and allows unbiased, reproducible, and fair quantification of tobacco-related media content. Methods: We propose an integrated text and image processing model based on deep learning, generative methods, and human reinforcement, which can detect smoking cases in both textual and visual formats, even with little available training data. Results: Our model can achieve 74% accuracy for images and 98% for text. Furthermore, our system integrates the possibility of expert intervention in the form of human reinforcement. Conclusions: Using the pre-trained multimodal, image, and text processing models available through deep learning makes it possible to detect smoking in different media…
Taxonomy
Topics: Digital Media Forensic Detection · Smoking Behavior and Cessation · Radio, Podcasts, and Digital Media
