MIMIC: A Generative Multimodal Foundation Model for Biomolecules

Siavash Golkar; Jake Kovalic; Irina Espejo Morales; Samuel Sledzieski; Minhuan Li; Ksenia Sokolova; Geraud Krawezik; Alberto Bietti; Claudia Skok Gibbs; Roman Klypa; Shengwei Xiong; Francois Lanusse; Liam Parker; Kyunghyun Cho; Miles Cranmer; Tom Hehir; Michael McCabe; Lucas Meyer; Rudy Morel; Payel Mukhopadhyay; Mariel Pettee; Helen Qu; Jeff Shen; David Fouhey; Hadi Sotoudeh; Vikram Mulligan; Pilar Cossio; Sonya M. Hanson; Alisha N. Jones; Olga G. Troyanskaya; Shirley Ho

arXiv:2604.24506·cs.AI·April 28, 2026

TL;DR

MIMIC is a generative multimodal foundation model for biomolecules. It conditions on any subset of observed modalities (nucleic acid, protein, evolutionary, structural, regulatory, and semantic/contextual) and reconstructs or generates the missing ones.

Contribution

The paper introduces MIMIC, a generative multimodal foundation model trained on LORE, a newly curated and aligned dataset, enabling integrated biomolecular modeling and design across the genome, transcriptome, and proteome.

Findings

01. MIMIC achieves state-of-the-art splicing prediction.

02. Multimodal conditioning improves sequence reconstruction.

03. MIMIC enables targeted biomolecular design and editing.

Abstract

Biological function emerges from coupled constraints across sequence, structure, regulation, evolution, and cellular context, yet most foundation models in biology are trained within one modality or for a fixed forward task. We present MIMIC, a generative multimodal foundation model trained on our newly curated and aligned dataset, LORE, linking nucleic acid, protein, evolutionary, structural, regulatory, and semantic/contextual modalities within partially observed biomolecular states. MIMIC uses a split-track encoder-decoder architecture to condition on arbitrary subsets of observed modalities and reconstruct or generate missing components of molecular state across the genome, transcriptome, and proteome. Multimodal conditioning consistently improves MIMIC's sequence reconstruction relative to sequence-only inputs, while its learned representations enable state-of-the-art performance…
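
To make the idea of conditioning on arbitrary subsets of observed modalities concrete, here is a minimal sketch of how one partially observed state might be split into conditioning inputs and reconstruction targets. It is an illustration only, not the paper's training procedure: the modality names, the `split_observed` helper, and the uniform random split are all assumptions introduced here.

```python
import random

# Hypothetical modality names, loosely following the list in the abstract.
MODALITIES = [
    "nucleic_acid", "protein", "evolutionary",
    "structural", "regulatory", "semantic",
]

def split_observed(state, rng, min_context=1):
    """Split a partially observed state into (context, targets).

    `state` maps modality name -> data, with None (or a missing key)
    marking an unobserved modality. A random non-empty subset of the
    observed modalities becomes the conditioning context; the remaining
    observed modalities become reconstruction targets.
    """
    available = [m for m in MODALITIES if state.get(m) is not None]
    if len(available) < min_context + 1:
        raise ValueError("need at least one context and one target modality")
    # Leave at least one observed modality out of the context as a target.
    k = rng.randint(min_context, len(available) - 1)
    context_keys = set(rng.sample(available, k))
    context = {m: state[m] for m in available if m in context_keys}
    targets = {m: state[m] for m in available if m not in context_keys}
    return context, targets

# Toy example: the structural modality is unobserved for this record.
example_state = {
    "nucleic_acid": "ATGGCC",
    "protein": "MA",
    "structural": None,
    "regulatory": [0.1, 0.9],
}
context, targets = split_observed(example_state, random.Random(0))
```

Unobserved modalities never appear in either dictionary, so a model trained on such splits would only be scored on data that actually exists for the record.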
