Multimodal Analysis of State-Funded News Coverage of the Israel-Hamas War on YouTube Shorts
Daniel Miehling, Sandra Kuebler

TL;DR
This paper introduces a multimodal pipeline for analyzing geopolitical event coverage in YouTube Shorts. It combines automatic transcription, aspect-based sentiment analysis, and scene classification to compare how state-funded outlets cover the Israel-Hamas war in both transcripts and visual frames.
Contribution
The paper presents a resource-efficient multimodal analysis pipeline and applies it to war reporting in short-form videos, showing that transcript sentiment differs across outlets and that scene classifications track visual cues tied to real-world events.
Findings
Smaller domain-adapted models outperform large transformers and LLMs in sentiment analysis.
Sentiment in transcripts varies across outlets and over time.
Scene classifications reflect visual cues consistent with real-world events.
Abstract
YouTube Shorts have become central to news consumption on the platform, yet research on how geopolitical events are represented in this format remains limited. To address this gap, we present a multimodal pipeline that combines automatic transcription, aspect-based sentiment analysis (ABSA), and semantic scene classification. The pipeline is first assessed for feasibility and then applied to analyze short-form coverage of the Israel-Hamas war by state-funded outlets. Using over 2,300 conflict-related Shorts and more than 94,000 visual frames, we systematically examine war reporting across major international broadcasters. Our findings reveal that the sentiment expressed in transcripts regarding specific aspects differs across outlets and over time, whereas scene-type classifications reflect visual cues consistent with real-world events. Notably, smaller domain-adapted models outperform…
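As a rough illustration of the comparison "across outlets and over time", the sketch below groups aspect-level sentiment labels by outlet, month, and aspect and averages them. It is not the authors' implementation: the record layout, the function name `mean_sentiment`, the -1/0/+1 score encoding, and all outlet and aspect names are hypothetical placeholders, and the values are made up for the example rather than taken from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ABSA output: one record per (Short, aspect) pair.
# Scores are an assumed encoding: -1 negative, 0 neutral, +1 positive.
# All values below are invented placeholders, not data from the paper.
records = [
    {"outlet": "outlet_a", "month": "2023-10", "aspect": "aspect_x", "score": -1},
    {"outlet": "outlet_a", "month": "2023-10", "aspect": "aspect_x", "score": 0},
    {"outlet": "outlet_a", "month": "2023-11", "aspect": "aspect_x", "score": 1},
    {"outlet": "outlet_b", "month": "2023-10", "aspect": "aspect_x", "score": 1},
]


def mean_sentiment(records):
    """Average sentiment score per (outlet, month, aspect) group."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["outlet"], r["month"], r["aspect"])].append(r["score"])
    return {key: mean(scores) for key, scores in groups.items()}


if __name__ == "__main__":
    # Print one line per group, sorted for stable output.
    for key, value in sorted(mean_sentiment(records).items()):
        print(key, value)
```

Keeping the aspect in the grouping key matters here: averaging over all aspects at once would hide cases where an outlet is positive toward one aspect and negative toward another.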
