BSA: Ball Sparse Attention for Large-scale Geometries
Catalin E. Brita, Hieu Nguyen, Lohithsai Yadala Chanchu, Domonkos Nagy, Maksim Zhdanov

TL;DR
This paper introduces Ball Sparse Attention (BSA), a method for applying self-attention efficiently to large, irregular geometric data. It uses a ball tree structure to impose regularity on unordered point sets, reaching accuracy close to full attention at sub-quadratic cost.
Contribution
BSA adapts Native Sparse Attention to irregular geometries using ball tree structures, enabling scalable self-attention for large-scale physical systems.
Findings
Achieves accuracy comparable to full attention on airflow pressure prediction.
Reduces the theoretical computational complexity of attention from quadratic to sub-quadratic in input size.
Provides an implementation available on GitHub.
Abstract
Self-attention scales quadratically with input size, limiting its use for large-scale physical systems. Although sparse attention mechanisms provide a viable alternative, they are primarily designed for regular structures such as text or images, making them inapplicable for irregular geometries. In this work, we present Ball Sparse Attention (BSA), which adapts Native Sparse Attention (NSA) (Yuan et al., 2025) to unordered point sets by imposing regularity using the Ball Tree structure from the Erwin Transformer (Zhdanov et al., 2025). We modify NSA's components to work with ball-based neighborhoods, yielding a global receptive field at sub-quadratic cost. On an airflow pressure prediction task, we achieve accuracy comparable to Full Attention while significantly reducing the theoretical computational complexity. Our implementation is available at https://github.com/britacatalin/bsa.
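The locality idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce NSA's components. It only shows, under stated assumptions, how grouping an unordered point set into contiguous fixed-size balls lets attention run per ball instead of over all pairs. The helper names `ball_order` and `ball_attention` are hypothetical, the median split is a simplified stand-in for a real ball tree, and the point count is assumed to be the ball size times a power of two.

```python
import numpy as np

def ball_order(points, leaf_size):
    """Return a permutation that groups points into contiguous leaves ("balls").

    Simplified stand-in for a ball tree (an assumption of this sketch):
    recursively split at the median along the axis of largest spread.
    Assumes len(points) == leaf_size * 2**k so all leaves have equal size.
    """
    def split(idx):
        if len(idx) <= leaf_size:
            return [idx]
        pts = points[idx]
        axis = np.argmax(pts.max(axis=0) - pts.min(axis=0))
        order = idx[np.argsort(pts[:, axis], kind="stable")]
        half = len(order) // 2
        return split(order[:half]) + split(order[half:])
    return np.concatenate(split(np.arange(len(points))))

def ball_attention(q, k, v, ball_size):
    """Softmax attention restricted to each contiguous block of ball_size rows."""
    n, d = q.shape
    num_balls = n // ball_size
    qb = q.reshape(num_balls, ball_size, d)
    kb = k.reshape(num_balls, ball_size, d)
    vb = v.reshape(num_balls, ball_size, d)
    # One (ball_size x ball_size) score matrix per ball, not one (n x n) matrix.
    scores = qb @ kb.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights @ vb).reshape(n, d)

rng = np.random.default_rng(0)
n, d, ball_size = 1024, 16, 64
points = rng.normal(size=(n, 3))      # unordered 3D point set
features = rng.normal(size=(n, d))    # one feature vector per point

perm = ball_order(points, ball_size)  # spatially nearby points become contiguous
x = features[perm]
out = ball_attention(x, x, x, ball_size)

full_pairs = n * n           # query-key pairs scored by full attention
ball_pairs = n * ball_size   # query-key pairs scored by ball-local attention
```

With these sizes, ball-local attention scores n * ball_size query-key pairs instead of n * n. Attention confined to balls has only a local receptive field; according to the abstract, BSA obtains a global receptive field by adapting NSA's components to ball-based neighborhoods, which this sketch leaves out.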
Taxonomy
Topics: Image and Object Detection Techniques · Augmented Reality Applications · Medical Imaging and Analysis
Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · Attention Is All You Need
