Unveiling Latent Knowledge in Chemistry Language Models through Sparse Autoencoders
Jaron Cohen, Alexander G. Hasson, Sara Tanovic

TL;DR
This paper introduces a method using sparse autoencoders to interpret and analyze the internal chemical knowledge encoded in large chemistry language models, revealing meaningful features related to chemical concepts.
Contribution
The study extends sparse autoencoder techniques to uncover and analyze interpretable chemical features within chemistry language models, providing a framework for understanding their internal representations.
Findings
Chemistry language models encode diverse chemical concepts, including structural motifs and properties.
Latent features correlate with specific chemical knowledge domains.
The framework can be applied to other chemistry-focused AI systems.
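One simple way to check a claim like the second finding is to correlate each latent feature's activation with a binary chemical label, such as the presence of a given substructure. The sketch below is illustrative only and is not taken from the paper: the function names, the choice of Pearson correlation, and the toy numbers are all assumptions made for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_features(activations, labels):
    """Rank latent features by |correlation| with a 0/1 label.

    activations: one latent vector per molecule (hypothetical SAE outputs).
    labels: 0/1 per molecule, e.g. "contains the substructure of interest".
    Returns (feature_index, correlation) pairs, strongest first.
    """
    n_features = len(activations[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in activations]
        scores.append((j, pearson(column, labels)))
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)

# Toy, made-up values for four molecules and two latent features.
toy_activations = [[0.9, 0.0], [0.8, 0.1], [0.0, 0.2], [0.1, 0.0]]
toy_labels = [1, 1, 0, 0]
ranked = rank_features(toy_activations, toy_labels)
```

In this toy setup the first feature tracks the label closely and ranks first. A real analysis would need many molecules and some control for confounded labels, since correlation alone does not show that a feature encodes the concept.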
Abstract
Since the advent of machine learning, interpretability has remained a persistent challenge, becoming increasingly urgent as generative models support high-stakes applications in drug and material discovery. Recent advances in large language model (LLM) architectures have yielded chemistry language models (CLMs) with impressive capabilities in molecular property prediction and molecular generation. However, how these models internally represent chemical knowledge remains poorly understood. In this work, we extend sparse autoencoder techniques to uncover and examine interpretable features within CLMs. Applying our methodology to the Foundation Models for Materials (FM4M) SMI-TED chemistry foundation model, we extract semantically meaningful latent features and analyse their activation patterns across diverse molecular datasets. Our findings reveal that these models encode a rich landscape…
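For readers unfamiliar with the technique, the following is a minimal sketch of a generic sparse autoencoder applied to a single model activation vector. It assumes the common formulation of a ReLU encoder, a linear decoder, and an L1 sparsity penalty, with the decoder bias subtracted before encoding. The abstract does not specify the exact variant used, so the architecture, the names, and the dimensions here are assumptions, and the weights are random rather than trained.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Return (latent code z, reconstruction x_hat) for one activation vector x."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_act = [p + b for p, b in zip(matvec(w_enc, centered), b_enc)]
    z = [max(0.0, p) for p in pre_act]  # ReLU keeps latents non-negative
    x_hat = [p + b for p, b in zip(matvec(w_dec, z), b_dec)]
    return z, x_hat

def sae_loss(x, z, x_hat, l1_coeff):
    """Squared reconstruction error plus an L1 penalty on the latent code."""
    reconstruction = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    sparsity = sum(abs(zi) for zi in z)
    return reconstruction + l1_coeff * sparsity

def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes: a 4-dimensional activation expanded to 16 latent features.
rng = random.Random(0)
d_model, d_latent = 4, 16
w_enc = random_matrix(d_latent, d_model, rng)
w_dec = random_matrix(d_model, d_latent, rng)
b_enc = [0.0] * d_latent
b_dec = [0.0] * d_model

x = [rng.gauss(0.0, 1.0) for _ in range(d_model)]  # stand-in for a CLM activation
z, x_hat = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
loss = sae_loss(x, z, x_hat, l1_coeff=1e-3)
```

The latent layer is deliberately wider than the input, and the L1 term pushes most latents to zero, which is what makes individual features candidates for interpretation. In practice the weights would be trained by gradient descent on many activations using an array or deep learning library.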
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Topic Modeling
