Chemical knowledge-informed framework for privacy-aware retrosynthesis learning
Guikun Chen, Xu Zhang, Xiaolin Hu, Yong Liu, Yi Yang, Wenguan Wang

TL;DR
This paper introduces CKIF, a privacy-preserving framework for retrosynthesis learning that enables multiple chemical organizations to collaboratively train models without sharing proprietary data, leveraging chemical knowledge for model aggregation.
Contribution
The paper presents a distributed learning framework in which chemical knowledge guides model aggregation, so that organizations can keep their reaction data private while improving retrosynthesis prediction accuracy.
Findings
CKIF outperforms baseline methods on various reaction datasets.
Chemical properties guide adaptive model aggregation effectively.
Framework maintains data confidentiality across organizations.
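As a rough illustration of the adaptive model aggregation named in the findings above, the sketch below averages model parameters from several organizations using per-organization weights. It is not the paper's algorithm: the function name `aggregate`, the parameter layout, and the `scores` input are hypothetical, and this summary does not say how CKIF derives its chemistry-based weights. The sketch also assumes that model parameters, not reaction data, are what gets exchanged.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def aggregate(client_params: List[Params], scores: List[float]) -> Params:
    """Return the weighted average of the clients' parameters.

    `scores` stands in for a chemistry-informed relevance score per
    client (an assumption). Scores are normalized to sum to 1.
    """
    if not client_params or len(client_params) != len(scores):
        raise ValueError("need one score per client")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    weights = [s / total for s in scores]

    merged: Params = {}
    for name, reference in client_params[0].items():
        merged[name] = [
            sum(w * params[name][i] for w, params in zip(weights, client_params))
            for i in range(len(reference))
        ]
    return merged


# Two organizations share parameters only; the second gets more weight.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
merged = aggregate(clients, scores=[1.0, 3.0])
```

With equal scores this reduces to a plain parameter average; unequal scores let one organization's model contribute more to the merged result.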
Abstract
Chemical reaction data is a pivotal asset, driving advances in competitive fields such as pharmaceuticals, materials science, and industrial chemistry. Its proprietary nature renders it sensitive, as it often includes confidential insights and competitive advantages organizations strive to protect. However, in contrast to this need for confidentiality, the current standard training paradigm for machine learning-based retrosynthesis gathers reaction data from multiple sources into a single central location to train prediction models. This paradigm poses considerable privacy risks as it necessitates broad data availability across organizational boundaries and frequent data transmission between entities, potentially exposing proprietary information to unauthorized access or interception during storage and transfer. In the present study, we introduce the chemical knowledge-informed framework (CKIF),…
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Graph Neural Networks · Computational Drug Discovery Methods
