MMCOMET: A Large-Scale Multimodal Commonsense Knowledge Graph for Contextual Reasoning

Eileen Wang; Hiba Arnaout; Dhita Pratama; Shuo Yang; Dangyang Liu; Jie Yang; Josiah Poon; Jeff Pan; Caren Han

arXiv:2603.01055·cs.AI·March 3, 2026

Open Access

TL;DR

MMCOMET is a comprehensive multimodal knowledge graph integrating visual, physical, social, and event knowledge, enabling advanced reasoning and storytelling tasks.

Contribution

MMCOMET extends the ATOMIC2020 knowledge graph with a visual dimension, obtained through an image retrieval process, yielding over 900K multimodal triples.

Findings

01 Enables richer, coherent storytelling

02 Supports complex reasoning tasks

03 Addresses limitations of existing MMKGs

Abstract

We present MMCOMET, the first multimodal commonsense knowledge graph (MMKG) that integrates physical, social, and eventive knowledge. MMCOMET extends the ATOMIC2020 knowledge graph with a visual dimension, obtained through an efficient image retrieval process, resulting in over 900K multimodal triples. This new resource addresses a major limitation of existing MMKGs in supporting complex reasoning tasks such as image captioning and storytelling. Through a standard visual storytelling experiment, we show that our holistic approach enables the generation of richer, more coherent, and more contextually grounded stories than those produced using text-only knowledge. This resource establishes a new foundation for multimodal commonsense reasoning and narrative generation.
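To make the idea of a "multimodal triple" concrete, the following is a minimal sketch of how a text triple in the ATOMIC2020 style (head, relation, tail) might be paired with retrieved images. The class name, field names, image identifiers, and example triples are all hypothetical and chosen for illustration; the abstract does not specify the actual schema or storage format used by MMCOMET.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MultimodalTriple:
    """Hypothetical record: a commonsense triple plus linked image IDs."""
    head: str                          # event or entity, e.g. "PersonX bakes a cake"
    relation: str                      # ATOMIC2020-style relation label, e.g. "xNeed"
    tail: str                          # inferred commonsense statement
    image_ids: Tuple[str, ...] = ()    # assumed: IDs of images retrieved for the triple


def index_by_head(triples: List[MultimodalTriple]) -> Dict[str, List[MultimodalTriple]]:
    """Group triples by head so all knowledge about one event can be looked up together."""
    index: Dict[str, List[MultimodalTriple]] = {}
    for triple in triples:
        index.setdefault(triple.head, []).append(triple)
    return index


# Illustrative data only; these are not entries taken from MMCOMET.
triples = [
    MultimodalTriple("PersonX bakes a cake", "xNeed", "to buy flour", ("img_0001",)),
    MultimodalTriple("PersonX bakes a cake", "xIntent", "to celebrate a birthday",
                     ("img_0002", "img_0003")),
    MultimodalTriple("bread", "ObjectUse", "make a sandwich"),
]

index = index_by_head(triples)
for triple in index["PersonX bakes a cake"]:
    print(triple.relation, "->", triple.tail, "| images:", len(triple.image_ids))
```

In a storytelling setting, a lookup of this kind could supply both the textual inference and the associated images for an event, which is the kind of grounding the abstract contrasts with text-only knowledge.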

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Topic Modeling