Seq-SetNet: Exploring Sequence Sets for Inferring Structures
Fusong Ju, Jianwei Zhu, Guozheng Wei, Qi Zhang, Shiwei Sun, Dongbo Bu

TL;DR
Seq-SetNet is a neural network framework that processes sequence sets directly. It captures structural information from multiple sequence alignments (MSAs) for protein secondary structure prediction and outperforms existing methods.
Contribution
The paper introduces Seq-SetNet, a neural network that handles unordered sequence sets using symmetric functions, enabling direct exploitation of MSAs for structural inference.
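To make the idea of a symmetric function over an unordered sequence set concrete, here is a minimal sketch. It is not the authors' architecture: the one-hot `encode` function stands in for a learned per-sequence encoder, and element-wise max is assumed as the symmetric function purely for illustration. The names `ALPHABET`, `encode`, `symmetric_pool`, and `set_features` are hypothetical.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the gap symbol


def encode(seq):
    """Toy per-sequence encoder (assumption): one-hot vector per aligned position."""
    return [[1.0 if ch == a else 0.0 for a in ALPHABET] for ch in seq]


def symmetric_pool(encoded_set):
    """Element-wise max across sequences; the result ignores sequence order."""
    length = len(encoded_set[0])
    width = len(encoded_set[0][0])
    return [
        [max(enc[pos][k] for enc in encoded_set) for k in range(width)]
        for pos in range(length)
    ]


def set_features(msa):
    """Encode every sequence independently, then aggregate with a symmetric function."""
    return symmetric_pool([encode(seq) for seq in msa])


msa = ["MKV-LA", "MRVQLA", "MKI-LG"]  # made-up aligned sequences of equal length
features = set_features(msa)

# Reordering the sequences leaves the aggregated features unchanged.
assert features == set_features(list(reversed(msa)))
```

Because each sequence is encoded on its own and only then combined by an order-independent operation, swapping any two sequences in the input cannot change the output. Sum or mean would satisfy the same property.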
Findings
Outperforms state-of-the-art approaches in protein secondary structure prediction by 3.6%
Demonstrates robustness to sequence order in MSAs
Applicable to various fields beyond bioinformatics
Abstract
Sequence sets are a widely used type of data source in a large variety of fields. A typical example is protein structure prediction, which takes a multiple sequence alignment (MSA) as input and aims to infer structural information from it. Almost all existing approaches exploit MSAs indirectly, i.e., they transform MSAs into position-specific scoring matrices (PSSMs) that represent the distribution of amino acid types at each column. A PSSM can capture column-wise characteristics of an MSA; however, the column-wise characteristics embedded in each individual component sequence are almost entirely neglected. The drawback of PSSMs is rooted in the fact that an MSA is essentially an unordered sequence set rather than a matrix. Specifically, interchanging any two sequences does not affect the MSA as a whole. In contrast, the pixels in an image essentially form a matrix since…
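The limitation of column-wise summaries described in the abstract can be illustrated with a small sketch. It computes only the per-column residue distribution. Real PSSMs typically add sequence weighting, pseudocounts, and log-odds scoring, all omitted here. The function name `column_profile` and the two toy alignments are hypothetical.

```python
from collections import Counter


def column_profile(msa):
    """Return, for each column, a dict mapping residue -> relative frequency."""
    n = len(msa)
    return [
        {res: count / n for res, count in Counter(col).items()}
        for col in zip(*msa)  # zip(*msa) iterates over alignment columns
    ]


# Two toy alignments with the same residues in each column,
# but paired differently within the individual sequences.
msa_a = ["AK", "GE"]
msa_b = ["AE", "GK"]

profile_a = column_profile(msa_a)
profile_b = column_profile(msa_b)

# Both alignments collapse to the same column-wise profile, so the
# information about which residues co-occur in one sequence is lost.
assert profile_a == profile_b
```

The two alignments differ in which residues appear together in the same sequence, yet a purely column-wise summary cannot tell them apart. This is the kind of per-sequence information that a model operating on the sequence set directly can still see.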
Taxonomy
Topics: Machine Learning in Bioinformatics · Protein Structure and Dynamics · Genomics and Phylogenetic Studies
