A deep language model to predict metabolic network equilibria
François Charton, Amaury Hayat, Sean T. McQuade, Nathaniel J. Merrill, Benedetto Piccoli

TL;DR
This paper demonstrates that small, shallow deep learning models, especially Transformers, can accurately predict the equilibria of metabolic networks using large synthetic datasets, with potential applications in biology and pharmacology.
Contribution
The paper trains deep learning models on large, randomly generated datasets (40 million networks) to predict metabolic network equilibria, and shows that they reach high accuracy and generalize to graph structures not seen during training.
Findings
Trained models predict the network equilibrium correctly in more than 99% of cases on random graphs.
Models generalize to graphs whose structure differs from those encountered during training.
Models predict almost perfectly the equilibria of a small set of known biological networks.
Abstract
We show that deep learning models, and especially architectures like the Transformer, originally intended for natural language, can be trained on randomly generated datasets to predict to very high accuracy both the qualitative and quantitative features of metabolic networks. Using standard mathematical techniques, we create large sets (40 million elements) of random networks that can be used to train our models. These trained models can predict network equilibrium on random graphs in more than 99% of cases. They can also generalize to graphs with a different structure from those encountered at training. Finally, they can predict almost perfectly the equilibria of a small set of known biological networks. Our approach is both very economical in experimental data and uses only small and shallow deep-learning models, far from the large architectures commonly used in machine translation. Such…
Taxonomy
Topics: Bioinformatics and Genomic Networks · Computational Drug Discovery Methods · Microbial Metabolic Engineering and Bioproduction
Methods: Attention Is All You Need · Linear Layer · Dropout · Position-Wise Feed-Forward Layer · Layer Normalization · Byte Pair Encoding · Label Smoothing · Multi-Head Attention · Absolute Position Encodings · Softmax
