Triangle Multiplication Is All You Need For Biomolecular Structure Representations
Jeffrey Ouyang-Zhang, Pranav Murugan, Daniel J. Diaz, Gianluca Scarpellini, Richard Strong Bowen, Nate Gruver, Adam Klivans, Philipp Krähenbühl, Aleksandra Faust, Maruan Al-Shedivat

TL;DR
Pairmixer is a backbone for AlphaFold3-style structure prediction models that eliminates triangle attention from the Pairformer, significantly reducing computational cost while maintaining high accuracy and enabling large-scale applications.
Contribution
The paper introduces Pairmixer, a new method that simplifies geometric reasoning in protein modeling, improving efficiency over existing models like Pairformer.
Findings
Up to 4x faster inference on long sequences
Reduces training cost by 34%
Scales to sequences 30% longer than previous models
Abstract
AlphaFold has transformed protein structure prediction, but emerging applications such as virtual ligand screening, proteome-wide folding, and de novo binder design demand predictions at a massive scale, where runtime and memory costs become prohibitive. A major bottleneck lies in the Pairformer backbone of AlphaFold3-style models, which relies on computationally expensive triangular primitives, especially triangle attention, for pairwise reasoning. We introduce Pairmixer, a streamlined alternative that eliminates triangle attention while preserving higher-order geometric reasoning capabilities that are critical for structure prediction. Pairmixer substantially improves computational efficiency, matching state-of-the-art structure predictors across folding and docking benchmarks, delivering up to 4x faster inference on long sequences while reducing training cost by 34%. Its efficiency…
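To make the "triangular primitive" retained by the title concrete, here is a minimal sketch of the core contraction in an outgoing triangle multiplicative update, as commonly described for AlphaFold-style models. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy inputs, and the omission of the learned projections, gating, and normalization are all choices made for this example, and production code would express the loop as a tensor contraction rather than nested Python loops.

```python
def triangle_multiply_outgoing(a, b):
    """Core contraction of an 'outgoing' triangle multiplicative update.

    a, b: N x N x C nested lists standing in for two projections of the
    pair representation (hypothetical inputs; real models compute them
    with learned, gated linear layers, which are omitted here).
    Returns out with out[i][j][c] = sum_k a[i][k][c] * b[j][k][c].
    """
    n = len(a)
    c_dim = len(a[0][0])
    out = [[[0.0] * c_dim for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):  # third vertex closing the triangle (i, j, k)
                for c in range(c_dim):
                    out[i][j][c] += a[i][k][c] * b[j][k][c]
    return out


def toy_pair_features(n, c_dim, scale):
    """Deterministic toy pair features, for illustration only."""
    return [
        [[scale * (i + 2 * j + c) for c in range(c_dim)] for j in range(n)]
        for i in range(n)
    ]


a = toy_pair_features(3, 2, 1.0)
b = toy_pair_features(3, 2, 0.5)
z = triangle_multiply_outgoing(a, b)
print(z[0][1])  # updated features for the edge (0, 1)
```

Each edge (i, j) is updated by summing over every third residue k, so the cost is cubic in sequence length, but the operation is a plain contraction with no per-triangle softmax or attention logits to compute.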
Peer Reviews
Decision: ICLR 2026 Poster
1. Interesting conceptual direction: The idea of moving from pairwise attention to triangle-based relational modeling is novel and aligns with emerging research on geometric and higher-order attention.
2. Simplicity of the operator: The formulation is elegant and could potentially be a computationally efficient substitute for attention in specific contexts.
3. Potential for extension: The proposed mechanism could inspire further work in 3D molecular or graph-structured domains, where triplet re
- Limited Comparative Breadth: The experiments benchmark against a few baselines but omit several directly relevant contemporary models, including: (1) higher-order attention variants (e.g., Tensor Attention, Relational Transformer); (2) geometric and 3D reasoning frameworks (e.g., SE(3)-Transformer, EGNN); (3) diffusion-based relational models and equivariant graph networks. Without these comparisons, it is difficult to judge whether Triangle Multiplication provides a fundamentally better abstra
Reviewer appreciates the following contributions:
- **Impactful and practical**: The paper addresses a real bottleneck in AlphaFold-style models by removing triangle attention for faster, more scalable inference and training. Furthermore, it enables long-sequence and large-complex modeling that was previously infeasible due to memory or computational limits.
- **Empirical validation**: Demonstrates near-identical accuracy to AlphaFold3-class baselines across folding, docking, and binder design
- **Method Novelty**: The novelty is somewhat incremental. Prior works (e.g., Genie2, MiniFold) already suggested that triangle multiplication is key; this paper mainly extends the idea to AF3 scale rather than introducing it conceptually. Further analysis is needed to highlight the key differences between Pairmixer and prior works, e.g., what must be adapted for the protein structure design task?
- **Limited generalization tests:** While the paper benchmarks extensively on protein–protein and
1. This paper focuses on an important research problem, i.e., accelerating AlphaFold and making it more lightweight, which is critical for downstream applications like virtual screening.
2. This paper is well-written and clearly structured. The figures effectively support the understanding of the proposed method, and the authors provide sufficient background and preliminaries to contextualize their work.
1. The contribution of this paper appears limited. The proposed method can be viewed primarily as an engineering optimization of the original Pairformer, without introducing substantial new insights. Without deeper analysis or justification of the design choices, the current contribution may not meet the novelty threshold typically expected for a venue such as ICLR. Furthermore, the finding that the triangle attention module contributes minimally to performance is not particularly surprising; th
Taxonomy
Topics: Protein Structure and Dynamics · Computational Drug Discovery Methods · Machine Learning in Bioinformatics
