When Molecular Similarity Works: Property Cliffs Reveal Hidden Errors

Di Hu; Kun Li; Haojie Rao; Longtao Hu; Jiameng Chen; Wenbin Hu; Yizhen Zheng; Jiajun Yu; Duanhua Cao

arXiv:2605.17265·cs.LG·May 19, 2026

TL;DR

This paper introduces CliffSplit and CliffLoss, new methods to evaluate and improve molecular property prediction models by focusing on regions where similar molecules have sharply different properties.

Contribution

The paper presents a novel cliff-aware evaluation protocol and a model-agnostic training mechanism to address localized failure modes in molecular property prediction.

Findings

01 CliffSplit reveals at least 15% higher error in cliff-heavy regions.

02 CliffLoss reduces the error gap by up to 30% on Lipophilicity.

03 Overall MAE improves by 9.7% with CliffLoss.
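
The findings above are stated as an error gap between cliff-heavy and ordinary regions, and as a relative reduction of that gap. The following is a minimal sketch of how such quantities could be computed, assuming the gap is the difference between MAE on cliff-flagged test points and MAE on the remaining points. That definition, the helper names (`cliff_error_gap`, `relative_reduction`), and the toy numbers are illustrative assumptions, not the paper's protocol or results.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error over a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def cliff_error_gap(y_true, y_pred, is_cliff):
    """MAE on cliff-flagged points minus MAE on all other points.

    Assumed definition for illustration; the paper may define the gap differently.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    is_cliff = np.asarray(is_cliff, dtype=bool)
    return mae(y_true[is_cliff], y_pred[is_cliff]) - mae(y_true[~is_cliff], y_pred[~is_cliff])


def relative_reduction(before, after):
    """Fraction by which a positive quantity shrank, e.g. 0.3 for a 30% reduction."""
    return (before - after) / before


# Made-up toy data: two cliff-flagged points followed by two ordinary points.
y_true = np.array([0.0, 0.0, 0.0, 0.0])
is_cliff = np.array([True, True, False, False])
baseline_pred = np.array([1.0, 1.0, 0.5, 0.5])   # hypothetical baseline model
mitigated_pred = np.array([0.8, 0.8, 0.5, 0.5])  # hypothetical mitigated model

gap_before = cliff_error_gap(y_true, baseline_pred, is_cliff)
gap_after = cliff_error_gap(y_true, mitigated_pred, is_cliff)
reduction = relative_reduction(gap_before, gap_after)
print(f"gap before: {gap_before:.2f}, gap after: {gap_after:.2f}, reduction: {reduction:.0%}")
```

Reporting the gap alongside overall MAE matters because a model can improve on average while the cliff-specific error stays unchanged.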

Abstract

Accurate prediction of molecular properties underpins drug discovery and material design, yet even state-of-the-art models remain vulnerable to localized failure modes that aggregate metrics cannot detect. The places where molecular similarity should be most helpful are also the places where standard evaluation can be most misleading. Property cliffs expose this gap: structurally similar molecules can still differ sharply in the target property, so models with competitive overall performance may fail in high-risk local neighborhoods. To expose and mitigate this failure mode, the paper introduces CliffSplit, a cliff-aware evaluation protocol that constructs locally supported, cliff-exposed test cases, and CliffLoss, a model-agnostic, train-only mitigation mechanism for cliff-sensitive errors. Experiments on three QM9 targets and three MoleculeNet tasks across five backbones show that CliffSplit…
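
To make the notion of a property cliff concrete, here is a minimal sketch that flags pairs of molecules that are highly similar yet far apart in the target property. It assumes a precomputed pairwise similarity matrix (for example, fingerprint-based) and uses hypothetical thresholds `sim_threshold` and `gap_threshold`. The function name `find_cliff_pairs` and the toy values are illustrative; this is not the CliffSplit construction described in the paper.

```python
import numpy as np


def find_cliff_pairs(similarity, y, sim_threshold=0.9, gap_threshold=1.0):
    """Return index pairs (i, j) with i < j that are similar yet differ sharply in y.

    similarity: (n, n) symmetric matrix of pairwise structural similarities.
    y: length-n array of property values.
    Both thresholds are hypothetical and would need tuning per dataset.
    """
    similarity = np.asarray(similarity, dtype=float)
    y = np.asarray(y, dtype=float)
    # Absolute property difference for every pair of molecules.
    gap = np.abs(y[:, None] - y[None, :])
    mask = (similarity >= sim_threshold) & (gap >= gap_threshold)
    # Keep only the strict upper triangle so each pair appears once and i != j.
    rows, cols = np.nonzero(np.triu(mask, k=1))
    return list(zip(rows.tolist(), cols.tolist()))


# Toy example: molecules 0 and 1 are near-duplicates with very different properties.
similarity = np.array([
    [1.00, 0.95, 0.20],
    [0.95, 1.00, 0.30],
    [0.20, 0.30, 1.00],
])
y = np.array([1.0, 3.5, 1.1])

cliff_pairs = find_cliff_pairs(similarity, y)
print(cliff_pairs)  # [(0, 1)]
```

Molecules that appear in many such pairs sit in the kind of high-risk local neighborhood the abstract describes, where a nearest-neighbor intuition about the property breaks down.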

Code & Models

Repositories

https://anonymous.4open.science/r/Cliff_Loss