Superior Molecular Representations from Intermediate Encoder Layers
Luis Pinto

TL;DR
This paper demonstrates that using intermediate encoder layers rather than just final layers in pretrained molecular models can significantly enhance property prediction performance and efficiency.
Contribution
It provides a comprehensive analysis of information retention across encoder layers and empirically shows the benefits of leveraging intermediate layers for molecular tasks.
Findings
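Intermediate layers retain more general-purpose features, whereas the final layer specializes and compresses information.
Frozen embeddings from the optimal intermediate layer improve downstream performance by 5.4% on average, and by up to 28.6%, compared to the final layer.
Finetuning encoders truncated at intermediate depths yields an average improvement of 8.5%, with gains of up to 40.8%.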
Abstract
Pretrained molecular encoders have become indispensable in computational chemistry for tasks such as property prediction and molecular generation. However, the standard practice of relying solely on final-layer embeddings for downstream tasks may discard valuable information. In this work, we first analyze the information flow in five diverse molecular encoders and find that intermediate layers retain more general-purpose features, whereas the final layer specializes and compresses information. We then perform an empirical layer-wise evaluation across 22 property prediction tasks. We find that using frozen embeddings from optimal intermediate layers improves downstream performance by an average of 5.4%, up to 28.6%, compared to the final layer. Furthermore, finetuning encoders truncated at intermediate depths achieves even greater average improvements of 8.5%, with increases as high as…
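Illustrative sketch
The layer-wise evaluation described in the abstract can be pictured with a small, self-contained toy. This is not the authors' code or protocol. The three-layer "encoder", the nearest-centroid probe, the synthetic data, and the rule of picking the layer with the best validation score are all assumptions made for illustration. The toy is built so that the final layer discards the feature the label depends on, mirroring the claim that the final layer can compress away useful information.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def hidden_states(layers: Sequence[Layer], x: Vector) -> List[Vector]:
    """Run x through the stack and keep the output of every layer."""
    states, h = [], list(x)
    for layer in layers:
        h = layer(h)
        states.append(h)
    return states


def centroid(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(u: Vector, v: Vector) -> float:
    return sum((a - b) ** 2 for a, b in zip(u, v))


def probe_accuracy(train: Sequence[Tuple[Vector, int]],
                   val: Sequence[Tuple[Vector, int]]) -> float:
    """Fit a nearest-centroid probe on train and return accuracy on val."""
    cents = {
        label: centroid([v for v, y in train if y == label])
        for label in {y for _, y in train}
    }
    correct = sum(
        min(cents, key=lambda c: sq_dist(cents[c], v)) == y for v, y in val
    )
    return correct / len(val)


def layerwise_scores(layers, train_x, train_y, val_x, val_y) -> List[float]:
    """Score a frozen probe on the embeddings of each layer separately."""
    train_states = [hidden_states(layers, x) for x in train_x]
    val_states = [hidden_states(layers, x) for x in val_x]
    scores = []
    for i in range(len(layers)):
        train = [(s[i], y) for s, y in zip(train_states, train_y)]
        val = [(s[i], y) for s, y in zip(val_states, val_y)]
        scores.append(probe_accuracy(train, val))
    return scores


# Hypothetical toy encoder (assumption): the last layer drops feature 0.
toy_layers: List[Layer] = [
    lambda h: [math.tanh(h[0]), math.tanh(h[1])],  # keeps both features
    lambda h: [h[0] + 0.1 * h[1], h[1]],           # still keeps both
    lambda h: [h[1]],                              # "final" layer: feature 0 lost
]

rng = random.Random(0)
xs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(400)]
ys = [int(x[0] > 0) for x in xs]  # synthetic label depends only on feature 0

scores = layerwise_scores(toy_layers, xs[:300], ys[:300], xs[300:], ys[300:])
best_layer = max(range(len(scores)), key=scores.__getitem__)
print("accuracy per layer:", scores)
print("best layer index:", best_layer)
```

In this toy, truncating the encoder at the selected depth amounts to keeping `toy_layers[: best_layer + 1]`. With a real pretrained encoder, the per-layer states would come from the model's own hidden-state outputs rather than from a hand-written stack.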
Taxonomy
Topics: thermodynamics and calorimetric analyses · Computational Drug Discovery Methods · Machine Learning in Materials Science
