MOMENTA: Mixture-of-Experts Over Multimodal Embeddings with Neural Temporal Aggregation for Misinformation Detection

Yeganeh Abdollahinejad; Ahmad Mousavi; Naeemul Hassan; Kai Shu; Nathalie Japkowicz; Shahriar Khosravi; Amir Karami

arXiv:2604.16172 · cs.MM · April 20, 2026

TL;DR

MOMENTA is a multimodal misinformation detection framework that models cross-modal semantic inconsistencies, their evolution over time, and differences between domains, using modality-specific mixture-of-experts modules, bidirectional co-attention, and temporal aggregation.

Contribution

The paper introduces a unified architecture that combines mixture-of-experts modules, bidirectional co-attention, and temporal aggregation to improve robustness and accuracy in misinformation detection.
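To make the mixture-of-experts idea concrete, here is a minimal sketch of softmax-gated expert mixing over a single modality embedding. It is not the authors' implementation: the names `moe_forward`, `gate_w`, and `expert_ws`, the use of plain linear experts, the dense (non-sparse) gate, and all dimensions are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def moe_forward(x, gate_w, expert_ws):
    """Hypothetical dense mixture-of-experts layer.

    x         : embedding vector for one modality (e.g. text or image)
    gate_w    : (n_experts x dim) gating weights (assumed linear gate)
    expert_ws : list of (out_dim x dim) matrices, one per expert
    Returns the gate-weighted sum of expert outputs and the gate itself.
    """
    gate = softmax(linear(gate_w, x))             # one weight per expert
    outputs = [linear(w, x) for w in expert_ws]   # each expert's output
    out_dim = len(outputs[0])
    mixed = [
        sum(g * out[d] for g, out in zip(gate, outputs))
        for d in range(out_dim)
    ]
    return mixed, gate


# Illustrative setup with random, untrained weights (assumed sizes).
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


EMB_DIM, OUT_DIM, N_EXPERTS = 4, 3, 2
gate_w = rand_matrix(N_EXPERTS, EMB_DIM)
expert_ws = [rand_matrix(OUT_DIM, EMB_DIM) for _ in range(N_EXPERTS)]

text_embedding = [0.5, -0.2, 0.1, 0.9]  # stand-in for a real encoder output
mixed, gate = moe_forward(text_embedding, gate_w, expert_ws)
```

In a modality-specific setup, one such layer would presumably be instantiated per modality, with the mixed outputs passed on to the co-attention and temporal aggregation stages; how MOMENTA wires these together is not specified in the summary above.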

Findings

01. Achieves strong performance across multiple datasets and metrics.

02. Effectively captures temporal dynamics and cross-modal inconsistencies.

03. Demonstrates robustness across heterogeneous domains.

Abstract

The widespread dissemination of multimodal content on social media has made misinformation detection increasingly challenging, as misleading narratives often arise not only from textual or visual content alone, but also from semantic inconsistencies between modalities and their evolution over time. Existing multimodal misinformation detection methods typically model cross-modal interactions statically and often show limited robustness across heterogeneous datasets, domains, and narrative settings. To address these challenges, we propose MOMENTA, a unified framework for multimodal misinformation detection that captures modality heterogeneity, cross-modal inconsistency, temporal dynamics, and cross-domain generalization within a single architecture. MOMENTA employs modality-specific mixture-of-experts modules to model diverse misinformation patterns, bidirectional co-attention to align…
