Application of Protein Structure Encodings and Sequence Embeddings for Transporter Substrate Prediction
Andreas Denger, Volkhard Helms

TL;DR
This paper explores using deep learning and protein structure data to better predict which molecules membrane transporters move across cell membranes.
Contribution
The study introduces new deep learning features combining sequence and structure data for improved transporter substrate prediction.
Findings
Deep learning features and feedforward neural network (FNN) models outperformed previous methods in transporter substrate classification.
Structure encodings from FoldSeek and ProstT5 matched the performance of top sequence embeddings like ProtT5-XL.
The approach was tested on sugar and amino acid carriers in A. thaliana and human ion channels with consistent results.
Abstract
Membrane transporters play a crucial role in any cell. Identifying the substrates they translocate across membranes is important for many fields of research, such as metabolomics, pharmacology, and biotechnology. In this study, we leverage recent advances in deep learning, such as amino acid sequence embeddings with protein language models (pLMs), highly accurate 3D structure predictions with AlphaFold 2, and structure-encoding 3Di sequences from FoldSeek, for predicting substrates of membrane transporters. We test new deep learning features derived from both sequence and structure, and compare them to the previously best-performing protein encodings, which were made up of amino acid k-mer frequencies and evolutionary information from PSSMs. Furthermore, we compare the performance of these features either using a previously developed SVM model, or with a regularized feedforward neural…
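The baseline encodings mentioned in the abstract include amino acid k-mer frequencies. As a rough illustration of that kind of feature, the sketch below turns a protein sequence into a fixed-length vector of relative k-mer frequencies. The function name `kmer_frequencies`, the default `k=2`, and the choice to skip k-mers containing non-standard residues are assumptions made for this example; the summary does not state the values of k or the normalization the authors used.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so that feature
# positions are comparable across proteins.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_frequencies(sequence, k=2):
    """Return relative frequencies of all 20**k amino acid k-mers.

    Hypothetical helper for illustration only. K-mers containing
    non-standard residues (e.g. 'X') are ignored and do not count
    toward the total.
    """
    # Enumerate every possible k-mer in a stable order.
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    # Count overlapping k-mers along the sequence.
    counts = Counter(
        sequence[i:i + k] for i in range(len(sequence) - k + 1)
    )
    # Normalize by the number of valid k-mers observed.
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Example: dipeptide (k=2) frequencies for a short toy sequence.
features = kmer_frequencies("ACAC", k=2)
print(len(features))
```

A vector like this has the same length for every protein, which is what lets it be fed to a classifier such as an SVM or a feedforward network regardless of sequence length.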
Taxonomy
Topics: Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms · Protein Structure and Dynamics
