Predicting the pathway involvement of metabolites annotated in the MetaCyc knowledgebase
Erik D. Huckvale, Hunter N. B. Moseley

TL;DR
This paper trains machine learning models on the MetaCyc knowledgebase to predict which pathways a metabolite is involved in, achieving performance comparable to models trained on KEGG.
Contribution
The study demonstrates that MetaCyc can be used effectively for pathway prediction, with improved performance on metabolic pathways relative to KEGG.
Findings
Models trained on MetaCyc achieved a mean Matthews correlation coefficient (MCC) of 0.845 for pathway predictions.
Models trained on MetaCyc showed a 5.6% improvement in metabolic pathway prediction over those trained on KEGG.
The results indicate that MetaCyc supports pathway prediction at state-of-the-art performance levels.
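Since the headline result is reported as a Matthews correlation coefficient, a minimal sketch of how that metric is computed for one binary pathway-membership task may help. The function name and the toy labels below are illustrative assumptions, not data or code from the paper; the paper's own evaluation pipeline is not reproduced here.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC. Labels: 1 = compound belongs to the pathway, 0 = it does not."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is defined as 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy labels (not from MetaCyc): 3 positives, 5 negatives.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy_mcc = matthews_corrcoef(toy_true, toy_pred)  # (2*4 - 1*1) / sqrt(3*3*5*5) = 7/15
```

MCC ranges from -1 (every prediction inverted) through 0 (no better than chance) to 1 (perfect), and it stays informative when positives are rare, which is the usual situation when most compounds are not members of any given pathway.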
Abstract
The associations of metabolites with biochemical pathways are highly useful information for interpreting molecular datasets generated in biological and biomedical research. However, such pathway annotations are sparse in most molecular datasets, limiting their utility for pathway-level interpretation. To address these shortcomings, several past publications have presented machine learning models for predicting the pathway associations of small biomolecules (metabolites and xenobiotics) using data from the Kyoto Encyclopedia of Genes and Genomes (KEGG). However, other similar knowledgebases exist, for example MetaCyc, which has more compound entries and pathway definitions than KEGG. As a logical next step, we trained and evaluated multilayer perceptron models on compound entries and pathway annotations obtained from MetaCyc. From the models trained on this dataset, we observed a mean Matthews…
Taxonomy
Topics: Bioinformatics and Genomic Networks · Computational Drug Discovery Methods · Machine Learning in Bioinformatics
