MMIS: Multimodal Dataset for Interior Scene Visual Generation and Recognition

Hozaifa Kassab; Ahmed Mahmoud; Mohamed Bahaa; Ammar Mohamed; Ali Hamdi

arXiv:2407.05980·cs.CV·July 9, 2024

TL;DR

The MMIS dataset offers a large, multimodal collection of interior scene images with text and audio annotations, supporting advancements in multi-modal scene generation and recognition tasks.

Contribution

We introduce MMIS, a comprehensive multimodal dataset with images, text, and audio for interior scene understanding and generation, facilitating multi-modal learning research.

Findings

01

Supports diverse interior scene tasks like generation, retrieval, captioning, and classification.

02

Enables research on multi-modal representation learning.

03

Contains nearly 160,000 richly annotated images.

Abstract

We introduce MMIS, a novel dataset designed to advance MultiModal Interior Scene generation and recognition. MMIS consists of nearly 160,000 images. Each image within the dataset is accompanied by its corresponding textual description and an audio recording of that description, providing rich and diverse sources of information for scene generation and recognition. MMIS encompasses a wide range of interior spaces, capturing various styles, layouts, and furnishings. To construct this dataset, we employed careful processes involving the collection of images, the generation of textual descriptions, and corresponding speech annotations. The presented dataset contributes to research in multi-modal representation learning tasks such as image generation, retrieval, captioning, and classification.
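The image, description, and audio triplet described in the abstract could be represented roughly as follows. This is only a minimal sketch: the abstract does not specify how the released files are organized, so the `InteriorSample` class, the `pair_by_stem` helper, the file names, and the convention of matching files by shared stem are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class InteriorSample:
    """One image with its textual description and the spoken version of it."""
    image_path: Path
    caption: str
    audio_path: Path


def pair_by_stem(image_names, captions, audio_names):
    """Group images, captions, and audio clips that share a file stem.

    Assumption (not from the paper): the three modalities are linked by
    file stem. `captions` maps a stem to its description. Images that lack
    a caption or an audio clip are skipped.
    """
    audio_by_stem = {Path(name).stem: Path(name) for name in audio_names}
    samples = []
    for name in sorted(image_names):
        stem = Path(name).stem
        if stem in captions and stem in audio_by_stem:
            samples.append(
                InteriorSample(Path(name), captions[stem], audio_by_stem[stem])
            )
    return samples


# Hypothetical file names, used only to exercise the helper.
samples = pair_by_stem(
    ["bedroom_001.jpg", "kitchen_002.jpg"],
    {"bedroom_001": "A bright bedroom with a wooden bed frame."},
    ["bedroom_001.wav"],
)
```

Keeping the three modalities in one record makes it straightforward to feed the same sample to a captioning, retrieval, or generation pipeline; incomplete triplets are dropped here, though a real loader might prefer to report them.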

Taxonomy

Topics: Cultural Heritage Management and Preservation · Remote Sensing and Land Use