A unified multimodal understanding and generation model for cross-disciplinary scientific research
Xiaomeng Yang, Zhiyu Tan, Xiaohui Zhong, Mengping Yang, Qiusheng Huang, Lei Chen, Libo Wu, and Hao Li

TL;DR
FuXi-Uni is a unified multimodal model that advances scientific understanding and generation across disciplines, outperforming state-of-the-art models in Earth science and biomedicine tasks by integrating heterogeneous data within a shared architecture.
Contribution
The paper introduces FuXi-Uni, a novel unified model capable of understanding and generating multimodal scientific data across disciplines within a single architecture.
Findings
Outperforms state-of-the-art physical models in weather forecasting and cyclone prediction.
Achieves superior biomedical visual question answering results.
Supports cross-disciplinary scientific tasks with high fidelity.
Abstract
Scientific discovery increasingly relies on integrating heterogeneous, high-dimensional data across disciplines. While AI models have achieved notable success in various scientific domains, they typically remain domain-specific or cannot simultaneously understand and generate multimodal scientific data, particularly high-dimensional data. Yet many pressing global challenges and scientific problems are inherently cross-disciplinary and require coordinated progress across multiple fields. Here, we present FuXi-Uni, a native unified multimodal model for scientific understanding and high-fidelity generation across scientific domains within a single architecture. Specifically, FuXi-Uni aligns cross-disciplinary scientific tokens within natural language tokens and employs a science decoder to reconstruct scientific tokens, thereby supporting both natural…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Data Visualization and Analytics
