The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA

Lee Youngmin; Lang S.I.D. Andrew; Cai Duoduo; Wheat R. Stephen

arXiv:2405.00949 · cs.LG · May 3, 2024 · 2 cites

TL;DR

This paper systematically compares the performance of RoBERTa, BART, and LLaMA models in predicting molecular properties from SMILES data, highlighting the impact of model architecture, size, and training data on effectiveness.

Contribution

It introduces a uniform framework for evaluating LLMs in cheminformatics, revealing the influence of model type, size, and dataset scale on molecular property prediction performance.

Findings

01. LLaMA models generally achieve lower validation loss.

02. Model size significantly influences performance, more than architecture.

03. Absolute validation loss is not always indicative of model quality.

Abstract

This study introduces a systematic framework to compare the efficacy of Large Language Models (LLMs) for fine-tuning across various cheminformatics tasks. Employing a uniform training methodology, we assessed three well-known models (RoBERTa, BART, and LLaMA) on their ability to predict molecular properties using the Simplified Molecular Input Line Entry System (SMILES) as a universal molecular representation format. Our comparative analysis involved pre-training 18 configurations of these models, with varying parameter sizes and dataset scales, followed by fine-tuning them on six benchmarking tasks from DeepChem. We maintained consistent training environments across models to ensure reliable comparisons. This approach allowed us to assess the influence of model type, size, and training dataset size on model performance. Specifically, we found that LLaMA-based models generally offered the…
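The experimental grid described in the abstract can be sketched in a few lines. This is only an illustration of how 18 pre-training configurations might arise from three architectures; the abstract does not state how the 18 are split between parameter sizes and dataset scales, so the 2 x 3 split and the placeholder labels (`PARAM_SIZES`, `DATASET_SCALES`) below are assumptions, not the authors' actual settings. Only the three architecture names, the total of 18 configurations, and the six fine-tuning tasks come from the abstract.

```python
from itertools import product

# Taken from the abstract.
ARCHITECTURES = ["RoBERTa", "BART", "LLaMA"]
NUM_FINETUNE_TASKS = 6  # six DeepChem benchmarking tasks

# ASSUMPTION: hypothetical placeholder labels. The abstract only says that
# parameter sizes and dataset scales were varied; a 2 x 3 split is one way
# to reach 18 configurations across three architectures.
PARAM_SIZES = ["size_a", "size_b"]
DATASET_SCALES = ["scale_a", "scale_b", "scale_c"]


def build_pretraining_grid():
    """Enumerate every (architecture, parameter size, dataset scale) combination."""
    return [
        {"architecture": arch, "param_size": size, "dataset_scale": scale}
        for arch, size, scale in product(ARCHITECTURES, PARAM_SIZES, DATASET_SCALES)
    ]


configs = build_pretraining_grid()

# Each pre-trained configuration is fine-tuned once per benchmarking task.
finetune_runs = [
    (config, task_index)
    for config in configs
    for task_index in range(NUM_FINETUNE_TASKS)
]

print(len(configs), "pre-training configurations")
print(len(finetune_runs), "fine-tuning runs")
```

Holding the grid fixed across architectures is what makes it possible to attribute differences in results to model type, size, or dataset scale individually.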

Taxonomy

Topics: Machine Learning in Materials Science

Methods: Attention Is All You Need · Linear Layer · Adam · Layer Normalization · Multi-Head Attention · Dropout · Softmax · Byte Pair Encoding · Dense Connections