MolTRES: Improving Chemical Language Representation Learning for Molecular Property Prediction
Jun-Hyung Park, Yeachan Kim, Mingyu Lee, Hyuntae Park, SangKeun Lee (Korea University)

TL;DR
MolTRES is a chemical language representation learning framework for molecular property prediction. It combines generator-discriminator training with knowledge transferred from scientific literature to counter the overfitting and early convergence seen in current SMILES pre-training.
Contribution
Introduces MolTRES, a new chemical language learning method combining generator-discriminator training and literature-based knowledge transfer for better molecular property prediction.
Findings
Outperforms existing models on key benchmarks
Learns from challenging structural examples
Effectively integrates external scientific literature
Abstract
Chemical representation learning has gained increasing interest due to the limited availability of supervised data in fields such as drug and materials design. This interest particularly extends to chemical language representation learning, which involves pre-training Transformers on SMILES sequences -- textual descriptors of molecules. Despite its success in molecular property prediction, current practices often lead to overfitting and limited scalability due to early convergence. In this paper, we introduce a novel chemical language representation learning framework, called MolTRES, to address these issues. MolTRES incorporates generator-discriminator training, allowing the model to learn from more challenging examples that require structural understanding. In addition, we enrich molecular representations by transferring knowledge from scientific literature by integrating external…
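The abstract says only that MolTRES uses generator-discriminator training over SMILES sequences. As a rough illustration of what such a setup can look like, the sketch below assumes an ELECTRA-style replaced-token-detection objective: a generator corrupts some tokens, and a discriminator would be trained to flag which positions were changed. The regex tokenizer, the random-sampling "generator", and the names `tokenize`, `corrupt`, and `mask_rate` are all hypothetical stand-ins, not the authors' implementation.

```python
import random
import re

# Simplified SMILES tokenizer (an assumption, not the paper's tokenizer):
# bracket atoms first, then two-letter halogens, then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def corrupt(tokens, vocab, mask_rate=0.3, rng=None):
    """Toy 'generator': randomly replace tokens with samples from vocab.

    A real generator would be a small masked language model; uniform
    sampling is used here only to keep the sketch self-contained.
    Returns (corrupted_tokens, labels), where labels[i] is 1 if the
    token at position i differs from the original and 0 otherwise.
    These labels are what a discriminator would be trained to predict.
    """
    rng = rng or random.Random(0)
    corrupted = list(tokens)
    for i in range(len(tokens)):
        if rng.random() < mask_rate:
            corrupted[i] = rng.choice(vocab)
    labels = [int(c != t) for c, t in zip(corrupted, tokens)]
    return corrupted, labels


tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
vocab = sorted(set(tokens))
corrupted, labels = corrupt(tokens, vocab, rng=random.Random(42))
print("".join(corrupted))
print(labels)
```

Note that a sampled replacement can coincide with the original token, in which case the position is labeled 0. In this kind of objective the discriminator receives a training signal at every position rather than only at masked ones, which is one plausible reading of the "more challenging examples" the abstract mentions.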
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · History and advancements in chemistry
