i-Code Studio: A Configurable and Composable Framework for Integrative AI
Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, Ziyi Yang, Reid Pryzant,, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong, Huang

TL;DR
The paper introduces i-Code Studio, a flexible framework that orchestrates multiple pre-trained models for complex multimodal tasks without finetuning, advancing the development of integrative AI towards Artificial General Intelligence.
Contribution
It presents a configurable, composable platform for integrating models to perform diverse multimodal tasks efficiently without additional training.
Findings
Achieves strong zero-shot performance on multimodal tasks
Enables rapid development of multimodal agents
Demonstrates versatility across various modalities
Abstract
Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities. Integrative AI is one important direction to approach AGI, through combining multiple models to tackle complex multimodal tasks. However, there is a lack of a flexible and composable platform to facilitate efficient and effective model composition and coordination. In this paper, we propose the i-Code Studio, a configurable and composable framework for Integrative AI. The i-Code Studio orchestrates multiple pre-trained models in a finetuning-free fashion to conduct complex multimodal tasks. Instead of simple model composition, the i-Code Studio provides an integrative, flexible, and composable setting for developers to quickly and easily compose cutting-edge services and technologies tailored to their specific…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsMultimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems
