Dual-Axis RCCL: Representation-Complete Convergent Learning for Organic Chemical Space
Dejun Hu, Zhiming Li, Jia-Rui Shen, Jia-Ning Tu, Zi-Hao Ye, Junliang Zhang

TL;DR
This paper introduces a novel dual-axis molecular representation framework, RCCL, enabling convergent learning across vast chemical spaces and demonstrating high accuracy and generalization in molecular property prediction.
Contribution
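The paper presents the RCCL framework, which combines graph convolutional network (GCN) encoding of local valence environments with no-bridge graph (NBG) encoding of ring/cage topologies, together with the FD25 dataset built under that framework, to achieve representation-complete convergent learning in organic chemistry modeling.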
Findings
Graph neural networks trained on FD25 achieve a mean absolute error (MAE) of ~1.0 kcal/mol.
The RCCL framework formalizes representation completeness and supports out-of-distribution generalization.
FD25 covers over 165,000 topologies, nearly exhaustively representing organic molecules.
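For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how a mean absolute error is computed. The function name and the numeric values are hypothetical illustrations, not data or code from the paper.

```python
def mean_absolute_error(predicted, reference):
    """Average of |prediction - reference| over paired values."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical energies in kcal/mol (illustrative only, not from FD25).
predicted = [10.0, -5.0, 3.0]
reference = [11.0, -4.0, 3.5]

mae = mean_absolute_error(predicted, reference)
print(f"MAE = {mae:.3f} kcal/mol")
```

An MAE near 1.0 kcal/mol means that predictions deviate from the reference values by about 1 kcal/mol on average.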
Abstract
Machine learning is profoundly reshaping molecular and materials modeling; however, given the vast scale of chemical space (10^30-10^60), it remains an open scientific question whether models can achieve convergent learning across this space. We introduce a Dual-Axis Representation-Complete Convergent Learning (RCCL) strategy, enabled by a molecular representation that integrates graph convolutional network (GCN) encoding of local valence environments, grounded in modern valence bond theory, together with no-bridge graph (NBG) encoding of ring/cage topologies, providing a quantitative measure of chemical-space coverage. This framework formalizes representation completeness, establishing a principled basis for constructing datasets that support convergent learning for large models. Guided by this RCCL framework, we develop the FD25 dataset, systematically covering 13,302 local valence…
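The abstract describes the representation as providing a quantitative measure of chemical-space coverage but does not state the formula in the text available here. One plausible reading, offered only as an assumption, is the fraction of an enumerated reference set (for example, local valence environments or ring/cage topologies) that appears at least once in a dataset. The function name and the string keys below are hypothetical placeholders.

```python
def coverage_fraction(dataset_keys, reference_keys):
    """Fraction of reference keys that occur at least once in the dataset.

    Assumption: each key is a hashable identifier for one enumerated
    environment or topology; the paper's actual definition may differ.
    """
    reference = set(reference_keys)
    if not reference:
        raise ValueError("reference set must not be empty")
    covered = reference & set(dataset_keys)
    return len(covered) / len(reference)


# Hypothetical identifiers (illustrative only).
reference_envs = {"env_a", "env_b", "env_c", "env_d"}
dataset_envs = ["env_a", "env_b", "env_b", "env_x"]  # env_x is outside the reference

coverage = coverage_fraction(dataset_envs, reference_envs)
print(f"coverage = {coverage:.2f}")
```

Under this reading, a dataset is representation-complete along one axis when the fraction reaches 1.0 for that axis.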
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Graph Neural Networks · Computational Drug Discovery Methods
