AGATHA: Automatic Graph-mining And Transformer based Hypothesis generation Approach
Justin Sybrandt, Ilya Tyagin, Michael Shtutman, Ilya Safro

TL;DR
AGATHA is a deep-learning system that leverages graph mining and transformers to predict plausible biomedical hypotheses, enabling earlier insights in drug discovery and research direction identification.
Contribution
It introduces a novel data-driven hypothesis generation approach combining graph mining and transformer models, validated across biomedical sub-domains and relationship types.
Findings
Achieves best-in-class performance on established benchmarks.
Successfully predicts connections introduced after 2015 using prior data.
Demonstrates high recommendation scores across multiple biomedical subdomains.
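The temporal holdout behind the second finding can be illustrated with a minimal sketch. This is not AGATHA's model: the toy records, the function names, and the shared-neighbor score are all hypothetical stand-ins for the learned ranking criterion, and treating connections dated 2015 or earlier as "prior data" is an assumption. The sketch only shows the evaluation idea: build a graph from pre-cutoff connections, rank candidate term-pairs, and check whether pairs that first appear after the cutoff land near the top.

```python
CUTOFF_YEAR = 2015  # cutoff year named in the findings; the split rule is an assumption

# Hypothetical toy data: (term_a, term_b, year the connection first appeared).
connections = [
    ("gene_a", "disease_x", 2010),
    ("drug_b", "gene_a", 2012),
    ("drug_b", "disease_x", 2017),
    ("gene_c", "disease_y", 2018),
]


def temporal_split(records, cutoff_year):
    """Split connections into those known by the cutoff and those published later."""
    train = {(a, b) for a, b, year in records if year <= cutoff_year}
    future = {(a, b) for a, b, year in records if year > cutoff_year}
    return train, future


def build_adjacency(train):
    """Undirected adjacency map built only from pre-cutoff connections."""
    adj = {}
    for a, b in train:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def score_pair(adj, a, b):
    """Placeholder score (NOT the paper's model): shared neighbors before the cutoff."""
    return len(adj.get(a, set()) & adj.get(b, set()))


def rank_candidates(train, candidates):
    """Order candidate term-pairs from most to least plausible under the placeholder score."""
    adj = build_adjacency(train)
    return sorted(candidates, key=lambda pair: score_pair(adj, *pair), reverse=True)


def hits_at_k(ranked, future, k):
    """Count how many of the top-k ranked pairs actually appear after the cutoff."""
    return sum(1 for pair in ranked[:k] if pair in future)


train, future = temporal_split(connections, CUTOFF_YEAR)
candidates = [
    ("gene_c", "disease_x"),  # never published in the toy data
    ("drug_b", "disease_x"),  # published after the cutoff
    ("gene_c", "disease_y"),  # published after the cutoff
]
ranked = rank_candidates(train, candidates)
```

Here the ranking only ever sees pre-cutoff connections, so any post-cutoff pair that scores highly counts as a genuine prediction rather than a lookup.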
Abstract
Medical research is risky and expensive. Drug discovery, as an example, requires that researchers efficiently winnow thousands of potential targets to a small candidate set for more thorough evaluation. However, research groups spend significant time and money to perform the experiments necessary to determine this candidate set long before seeing intermediate results. Hypothesis generation systems address this challenge by mining the wealth of publicly available scientific information to predict plausible research directions. We present AGATHA, a deep-learning hypothesis generation system that can introduce data-driven insights earlier in the discovery process. Through a learned ranking criterion, this system quickly prioritizes plausible term-pairs among entity sets, allowing us to recommend new research directions. We massively validate our system with a temporal holdout wherein we…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Computational Drug Discovery Methods · Bioinformatics and Genomic Networks
