Learning disentangled latent representations facilitates discovery and design of functional materials

Jaehoon Cha; Tingyao Lu; Matthew Walker; Keith T. Butler

arXiv:2507.19602 · cond-mat.mtrl-sci · July 29, 2025

TL;DR

This paper demonstrates that Disentangling Autoencoders (DAEs) can learn interpretable spectral features related to photovoltaic performance without any labels, which makes the search for high-performing materials more efficient.

Contribution

The paper applies DAEs to the unsupervised learning of features in optical absorption spectra that correlate with photovoltaic efficiency, and reports that DAEs outperform PCA and a beta-VAE in reconstruction and discovery tasks.

Findings

01 DAEs capture physically meaningful spectral features.

02 DAEs outperform PCA and beta-VAE in reconstruction fidelity.

03 DAEs enable more efficient discovery of high-performing materials.

Abstract

The discovery of new materials is often constrained by the need for large labelled datasets or expensive simulations. In this study, we explore the use of Disentangling Autoencoders (DAEs) to learn compact and interpretable representations of spectral data in an entirely unsupervised manner. We demonstrate that the DAE captures physically meaningful features in optical absorption spectra, relevant to photovoltaic (PV) performance, including a latent dimension strongly correlated with the Spectroscopic Limited Maximum Efficiency (SLME)--despite being trained without access to SLME labels. This feature corresponds to a well-known spectral signature: the transition from direct to indirect optical band gaps. Compared to Principal Component Analysis (PCA) and a beta-Variational Autoencoder (beta-VAE), the DAE achieves superior reconstruction fidelity, improved correlation with efficiency…
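
The evaluation idea in the abstract (learn latents without labels, then check afterwards whether any latent dimension tracks an efficiency label) can be illustrated with a minimal sketch. This is not the authors' DAE: it uses PCA, the baseline named in the abstract, on synthetic toy spectra. The energy grid, the sigmoid-shaped "absorption onset", the `toy_label`, and both helper functions are assumptions made for illustration only.

```python
import numpy as np


def pca_latents(spectra, n_components):
    """Project mean-centred spectra onto their leading principal components."""
    centred = spectra - spectra.mean(axis=0)
    # SVD of the centred data; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


def latent_label_correlation(latents, labels):
    """Pearson correlation between each latent dimension and a held-out label."""
    return np.array(
        [np.corrcoef(latents[:, k], labels)[0, 1] for k in range(latents.shape[1])]
    )


# Synthetic stand-in data (NOT from the paper): toy spectra whose
# absorption onset shifts from sample to sample.
rng = np.random.default_rng(0)
energy = np.linspace(0.0, 5.0, 200)        # hypothetical photon-energy grid
onset = rng.uniform(1.0, 3.0, size=300)    # hypothetical onset per sample
spectra = 1.0 / (1.0 + np.exp(-(energy[None, :] - onset[:, None]) / 0.1))
spectra += 0.01 * rng.standard_normal(spectra.shape)
toy_label = -onset                         # placeholder for an efficiency-like label

# The latents are fitted without the label; it is used only for the check.
latents = pca_latents(spectra, n_components=4)
corr = latent_label_correlation(latents, toy_label)
best = int(np.argmax(np.abs(corr)))
print(f"latent {best} has |r| = {abs(corr[best]):.2f} with the toy label")
```

Swapping `pca_latents` for the encoder of a trained autoencoder would leave the correlation check unchanged, which is what makes this kind of comparison across PCA, beta-VAE, and DAE representations straightforward.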
