Orthogonal Quadratic Complements for Vision Transformer Feed-Forward Networks
Wang Zixian

TL;DR
This paper introduces Orthogonal Quadratic Complements (OQC) for vision transformer feed-forward networks. A low-rank quadratic auxiliary branch is projected onto the orthogonal complement of the main branch before injection, so it contributes second-order information that the main hidden representation does not already capture.
Contribution
The paper proposes OQC, a design principle in which quadratic auxiliary features are kept orthogonal to the main branch. It also studies an efficient low-rank realization (OQC-LR) and two gated extensions (OQC-static and OQC-dynamic).
Findings
Full OQC improves CIFAR-100 accuracy over an AFBO baseline from 64.25 +/- 0.22% to 65.59 +/- 0.22% under a parameter-matched Deep-ViT protocol.
OQC-LR achieves similar accuracy with a better speed-accuracy tradeoff.
OQC-dynamic outperforms the baseline on TinyImageNet, achieving 51.88%.
Abstract
Recent bilinear feed-forward replacements for vision transformers can substantially improve accuracy, but they often conflate two effects: stronger second-order interactions and increased redundancy relative to the main branch. We study a complementary design principle in which auxiliary quadratic features contribute only information not already captured by the dominant hidden representation. To this end, we propose Orthogonal Quadratic Complements (OQC), which construct a low-rank quadratic auxiliary branch and explicitly project it onto the orthogonal complement of the main branch before injection. We further study an efficient low-rank realization (OQC-LR) and gated extensions (OQC-static and OQC-dynamic). Under a parameter-matched Deep-ViT and CIFAR-100 protocol with a fixed penultimate residual readout, full OQC improves an AFBO baseline from 64.25 +/- 0.22 to 65.59 +/- 0.22,…
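The projection step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the per-token projection along the main hidden vector, the ReLU main branch, the product-of-two-projections form of the quadratic branch, and all names (oqc_hidden_sketch, w_main, u, v, w_up) are assumptions made for illustration only.

```python
import numpy as np

def orthogonal_complement(aux, main, eps=1e-8):
    """Remove from `aux` its component along `main`, per token (last axis)."""
    dot = np.sum(aux * main, axis=-1, keepdims=True)
    norm_sq = np.sum(main * main, axis=-1, keepdims=True)
    return aux - (dot / (norm_sq + eps)) * main

def oqc_hidden_sketch(x, w_main, u, v, w_up):
    """Hypothetical hidden layer: main branch plus an orthogonalized quadratic branch."""
    # Assumed main branch: linear map followed by ReLU.
    main = np.maximum(x @ w_main, 0.0)
    # Assumed low-rank quadratic branch: elementwise product of two
    # rank-r projections, lifted back to the hidden width.
    quad = ((x @ u) * (x @ v)) @ w_up
    # Keep only the part of the quadratic features orthogonal to the main branch.
    comp = orthogonal_complement(quad, main)
    return main, comp, main + comp

rng = np.random.default_rng(0)
tokens, d_model, d_hidden, rank = 4, 16, 32, 8  # illustrative sizes only
x = rng.standard_normal((tokens, d_model))
w_main = rng.standard_normal((d_model, d_hidden)) / np.sqrt(d_model)
u = rng.standard_normal((d_model, rank)) / np.sqrt(d_model)
v = rng.standard_normal((d_model, rank)) / np.sqrt(d_model)
w_up = rng.standard_normal((rank, d_hidden)) / np.sqrt(rank)

main, comp, hidden = oqc_hidden_sketch(x, w_main, u, v, w_up)
# Per-token inner product between the injected features and the main branch.
overlap = np.sum(comp * main, axis=-1)
```

In this sketch, overlap is numerically zero for every token, which is the sense in which the auxiliary branch adds no component already present in the main branch. How the paper defines the projection, and where the gates of OQC-static and OQC-dynamic enter, is not specified in the text above.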
