Ontology Pre-training for Poison Prediction
Martin Glauer, Fabian Neuhaus, Till Mossakowski, Janna Hastings

TL;DR
This paper introduces ontology pre-training, a method that embeds structured human knowledge into neural networks, improving toxicity prediction accuracy, interpretability, and training efficiency in life sciences chemistry.
Contribution
The paper presents ontology pre-training, a novel approach that integrates ontological knowledge into a Transformer network by first training it to predict membership in ontology classes and then fine-tuning it for toxicity prediction.
Findings
Improved toxicity prediction accuracy over state-of-the-art methods.
Model attention focuses on more meaningful chemical groups.
Reduced training time after ontology pre-training.
Abstract
Integrating human knowledge into neural networks has the potential to improve their robustness and interpretability. We have developed a novel approach to integrate knowledge from ontologies into the structure of a Transformer network which we call ontology pre-training: we train the network to predict membership in ontology classes as a way to embed the structure of the ontology into the network, and subsequently fine-tune the network for the particular prediction task. We apply this approach to a case study in predicting the potential toxicity of a small molecule based on its molecular structure, a challenging task for machine learning in life sciences chemistry. Our approach improves on the state of the art, and moreover has several additional benefits. First, we are able to show that the model learns to focus attention on more meaningful chemical groups when making predictions with…
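The two-stage procedure the abstract describes (first predict ontology class membership, then fine-tune for the target task) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a Transformer, whereas the sketch below substitutes a tiny feed-forward network in pure Python so that it stays self-contained. The class `TinyNet`, the helper names, the synthetic three-bit "molecule" features, and the toy labelling rules are all assumptions made purely for illustration.

```python
import math
import random
from itertools import product


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """Hypothetical stand-in for the paper's Transformer:
    a shared tanh encoder plus a replaceable sigmoid output head."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.rng = rng
        self.encoder = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                        for _ in range(n_hidden)]
        self.reset_head(n_out)

    def reset_head(self, n_out):
        # Fresh task-specific head; the encoder weights are left untouched.
        self.head = [[self.rng.uniform(-0.5, 0.5) for _ in self.encoder]
                     for _ in range(n_out)]

    def forward(self, x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)))
                  for row in self.encoder]
        out = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
               for row in self.head]
        return hidden, out

    def sgd_step(self, x, target, lr):
        hidden, out = self.forward(x)
        # For sigmoid + binary cross-entropy, dLoss/dlogit = output - target.
        d_out = [o - t for o, t in zip(out, target)]
        # Backpropagate through the head and the tanh nonlinearity.
        d_hidden = [(1.0 - h * h) * sum(d * row[j] for d, row in zip(d_out, self.head))
                    for j, h in enumerate(hidden)]
        for d, row in zip(d_out, self.head):
            for j, h in enumerate(hidden):
                row[j] -= lr * d * h
        for d, row in zip(d_hidden, self.encoder):
            for i, v in enumerate(x):
                row[i] -= lr * d * v


def mean_bce(net, data):
    """Mean binary cross-entropy per sample, summed over outputs."""
    eps = 1e-12
    total = 0.0
    for x, target in data:
        _, out = net.forward(x)
        total -= sum(t * math.log(o + eps) + (1 - t) * math.log(1 - o + eps)
                     for o, t in zip(out, target))
    return total / len(data)


def train(net, data, epochs=300, lr=0.2):
    for _ in range(epochs):
        for x, target in data:
            net.sgd_step(x, target, lr)


# Synthetic inputs: three binary features plus a constant bias feature.
inputs = [list(bits) + [1.0] for bits in product([0.0, 1.0], repeat=3)]
# Stage 1 targets: multi-label membership in three made-up ontology classes.
ontology_data = [(x, x[:3]) for x in inputs]
# Stage 2 targets: an arbitrary toy "toxicity" rule (features 0 AND 1).
toxicity_data = [(x, [x[0] * x[1]]) for x in inputs]

net = TinyNet(n_in=4, n_hidden=4, n_out=3, rng=random.Random(0))

# Stage 1: ontology pre-training.
pre_before = mean_bce(net, ontology_data)
train(net, ontology_data)
pre_after = mean_bce(net, ontology_data)

# Stage 2: swap the head, keep the pre-trained encoder, fine-tune.
encoder_snapshot = [row[:] for row in net.encoder]
net.reset_head(n_out=1)
encoder_kept = net.encoder == encoder_snapshot
ft_before = mean_bce(net, toxicity_data)
train(net, toxicity_data)
ft_after = mean_bce(net, toxicity_data)

print("pre-training loss:", pre_before, "->", pre_after)
print("fine-tuning loss: ", ft_before, "->", ft_after)
```

The design point the sketch tries to capture is that only the output head is replaced between stages, so whatever structure the encoder picked up from the ontology labels is carried into the fine-tuning stage. In this sketch the encoder continues to be updated during fine-tuning, which is one possible choice and an assumption here.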
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Biomedical Text Mining and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Linear Layer · Dropout · Byte Pair Encoding · Residual Connection · Label Smoothing · Dense Connections · Layer Normalization
