Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?
Fei Lin, Ziyang Gong, Cong Wang, Tengchao Zhang, Yonglin Tian, Yining Jiang, Ji Dai, Chao Guo, Xiaotong Yu, Xue Yang, Gen Luo, Fei-Yue Wang

TL;DR
This paper introduces ToxiMol, a benchmark for evaluating multimodal large language models on molecular toxicity repair, highlighting current capabilities and challenges in generating less toxic, structurally valid molecules.
Contribution
The paper presents ToxiMol, a comprehensive benchmark dataset and evaluation framework for assessing MLLMs in molecular toxicity repair tasks, filling a critical gap in systematic evaluation.
Findings
MLLMs show promising toxicity understanding capabilities.
Current models struggle with candidate diversity and failure attribution.
Evaluation metrics need further refinement for better assessment.
Abstract
Toxicity remains a leading cause of early-stage drug development failure. Despite advances in molecular design and property prediction, the task of molecular toxicity repair, generating structurally valid molecular alternatives with reduced toxicity, has not yet been systematically defined or benchmarked. To fill this gap, we introduce ToxiMol, the first benchmark task for general-purpose Multimodal Large Language Models (MLLMs) focused on molecular toxicity repair. We construct a standardized dataset covering 11 primary tasks and 660 representative toxic molecules spanning diverse mechanisms and granularities. We design a prompt annotation pipeline with mechanism-aware and task-adaptive capabilities, informed by expert toxicological knowledge. In parallel, we propose an automated evaluation framework, ToxiEval, which integrates toxicity endpoint prediction, synthetic accessibility,…
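The abstract describes ToxiEval as integrating several automated checks on each repaired molecule. A minimal sketch of that kind of multi-criteria check follows. It is not the authors' implementation: the `Criterion` class, the toy score tables, the thresholds, and the rule that every criterion must pass are all assumptions made for illustration, and only the two criteria named before the abstract is cut off are shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Criterion:
    """One evaluation check (hypothetical structure, not from the paper)."""
    name: str
    score: Callable[[str], float]    # maps a SMILES string to a numeric score
    passes: Callable[[float], bool]  # decides whether that score is acceptable


def evaluate_candidate(smiles: str, criteria: List[Criterion]) -> Dict[str, bool]:
    """Return a per-criterion pass/fail report for one candidate molecule."""
    return {c.name: c.passes(c.score(smiles)) for c in criteria}


def is_successful_repair(smiles: str, criteria: List[Criterion]) -> bool:
    """Assumed rule: a candidate counts as repaired only if all checks pass."""
    return all(evaluate_candidate(smiles, criteria).values())


# Toy lookup tables standing in for real predictors (made-up values).
TOY_TOXICITY = {"CCO": 0.1, "c1ccccc1N": 0.8}
TOY_SYNTH_ACCESS = {"CCO": 1.5, "c1ccccc1N": 2.0}

# Thresholds below are illustrative assumptions, not values from the paper.
criteria = [
    Criterion("toxicity_endpoint", lambda s: TOY_TOXICITY[s], lambda p: p < 0.5),
    Criterion("synthetic_accessibility", lambda s: TOY_SYNTH_ACCESS[s], lambda v: v <= 6.0),
]

if __name__ == "__main__":
    for candidate in ("CCO", "c1ccccc1N"):
        print(candidate, evaluate_candidate(candidate, criteria))
```

Keeping each check as a separate named criterion makes the per-criterion report available alongside the overall verdict, which is one way to see which check a failed candidate missed.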
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Topic Modeling
