Joint Embedding Predictive Architecture for self-supervised pretraining on polymer molecular graphs

Francesco Piccoli; Gabriel Vogel; Jana M. Weber

arXiv:2506.18194 · cs.LG · June 25, 2025

TL;DR

This paper explores the use of the Joint Embedding Predictive Architecture (JEPA) for self-supervised pretraining on polymer molecular graphs, demonstrating improved downstream property prediction especially in low-label data scenarios.

Contribution

The paper applies JEPA to self-supervised learning on polymer graphs and shows that this pretraining improves downstream property prediction when labeled data is limited.

Findings

01 JEPA-based SSL improves downstream performance.

02 Pretraining benefits are most significant with scarce labels.

03 All tested datasets show performance gains.

Abstract

Recent advances in machine learning (ML) have shown promise in accelerating the discovery of polymers with desired properties by aiding in tasks such as virtual screening via property prediction. However, progress in polymer ML is hampered by the scarcity of high-quality labeled datasets, which are necessary for training supervised ML models. In this work, we study the use of the very recent 'Joint Embedding Predictive Architecture' (JEPA), a type of architecture for self-supervised learning (SSL), on polymer molecular graphs to understand whether pretraining with the proposed SSL strategy improves downstream performance when labeled data is scarce. Our results indicate that JEPA-based self-supervised pretraining on polymer graphs enhances downstream performance, particularly when labeled data is very scarce, achieving improvements across all tested datasets.
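
To make the JEPA idea mentioned in the abstract concrete, the following is a minimal sketch of a generic JEPA-style objective on a toy graph, not the authors' implementation. It assumes the usual JEPA ingredients: a context encoder, a target encoder kept as an exponential moving average (EMA) of the context encoder, and a predictor that maps the context embedding to the target embedding, with the loss computed in embedding space rather than on raw inputs. All names (`jepa_loss`, `ema_update`, `mean_pool`), the single linear layers standing in for graph neural networks, and the mean pooling over node subsets are hypothetical simplifications chosen for illustration.

```python
def linear(weights, x):
    """Apply a bias-free dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mean_pool(node_features, node_ids):
    """Average the feature vectors of the selected nodes (a stand-in for a GNN readout)."""
    dim = len(node_features[0])
    return [
        sum(node_features[n][d] for n in node_ids) / len(node_ids)
        for d in range(dim)
    ]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def ema_update(target_w, context_w, momentum=0.99):
    """Move target-encoder weights toward the context-encoder weights (assumed EMA rule)."""
    return [
        [momentum * t + (1 - momentum) * c for t, c in zip(t_row, c_row)]
        for t_row, c_row in zip(target_w, context_w)
    ]


def jepa_loss(node_features, context_ids, target_ids,
              context_w, target_w, predictor_w):
    """Predict the target-subgraph embedding from the context-subgraph embedding."""
    context_emb = linear(context_w, mean_pool(node_features, context_ids))
    # In training, no gradient would flow through the target branch.
    target_emb = linear(target_w, mean_pool(node_features, target_ids))
    predicted_emb = linear(predictor_w, context_emb)
    return mse(predicted_emb, target_emb)


# Toy usage: a 4-node "graph" with 2-dimensional node features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
loss = jepa_loss(features, context_ids=[0, 2], target_ids=[1, 3],
                 context_w=identity, target_w=identity, predictor_w=identity)
```

In a pretraining loop under these assumptions, the context encoder and predictor would be updated by gradient descent on this loss, followed by a call to `ema_update` for the target encoder. The pretrained context encoder would then be fine-tuned on the small labeled dataset for property prediction.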
