Multi-Modal Experience Inspired AI Creation

Qian Cao; Xu Chen; Ruihua Song; Hao Jiang; Guang Yang; Zhao Cao

arXiv:2209.02427 · cs.AI · September 5, 2024

TL;DR

This paper introduces a multi-modal, sequential AI creation task inspired by human experience and addresses it with a multi-channel sequence-to-sequence model that uses attention and curriculum negative sampling. The approach is validated on a new dataset.

Contribution

It proposes a new multi-modal, sequential AI creation task, along with a multi-channel architecture, a curriculum negative sampling strategy, and a new dataset for benchmarking.
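This page does not describe how the curriculum negative sampling strategy works, so the following is only a generic sketch of the idea as it is commonly understood: negatives start easy and get harder as training progresses. The function name `curriculum_negatives`, the `similarity` scores, and the `progress` schedule are all hypothetical choices made for illustration and are not taken from the paper or its repository.

```python
import random


def curriculum_negatives(candidates, similarity, progress, k, rng=None):
    """Sample k negatives from a pool that widens toward harder items.

    Illustrative assumption, not the paper's method:
    candidates -- list of candidate negative items
    similarity -- one score per candidate; higher means closer to the
                  positive example, i.e. a harder negative
    progress   -- training progress in [0, 1]
    """
    if len(candidates) != len(similarity):
        raise ValueError("candidates and similarity must have equal length")
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    if not 0 < k <= len(candidates):
        raise ValueError("k must be between 1 and len(candidates)")
    rng = rng or random.Random()

    # Rank candidates from easiest (least similar) to hardest.
    ranked = [c for _, c in sorted(zip(similarity, candidates), key=lambda p: p[0])]

    # Early in training only the easiest items are eligible; the eligible
    # pool grows linearly until it covers every candidate.
    pool_size = min(len(ranked), max(k, round(len(ranked) * progress)))
    return rng.sample(ranked[:pool_size], k)
```

Called with `progress=0.0`, the function can only return the `k` easiest candidates; with `progress=1.0` every candidate is eligible. A linear schedule is the simplest option, and other pacing functions would fit the same interface.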

Findings

01 Significant improvements over baselines in automatic metrics
02 Effective modeling of multi-modal sequential information
03 Validated on a newly labeled multi-modal experience dataset

Abstract

AI creation, such as poem or lyrics generation, has attracted increasing attention from both industry and academia, with many promising models proposed in the past few years. Existing methods usually estimate the outputs from single, independent pieces of visual or textual information. In reality, however, humans usually create based on their experiences, which may involve different modalities and be sequentially correlated. To model such human capabilities, in this paper we define and solve a novel AI creation problem based on human experiences. More specifically, we study how to generate texts from sequential multi-modal information. Compared with previous work, this task is much more difficult because the model must understand and adapt the semantics across different modalities and effectively convert them into the output in a sequential…

Code & Models

Repositories

Aman-4-Real/MMTG
PyTorch · Official

Models

🤗 Aman/MMTG (model)
