HATSolver: Learning Groebner Bases with Hierarchical Attention Transformers
Mohamed Malhou, Ludovic Perret, Kristin Lauter

TL;DR
This paper introduces HATSolver, a Hierarchical Attention Transformer-based method that significantly improves the efficiency of computing Groebner bases for polynomial systems, enabling larger problem instances to be solved.
Contribution
We adapt Hierarchical Attention Transformers for Groebner bases computation, incorporating a tree structure that captures hierarchical data relationships, leading to computational savings and scalability.
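To make the tree structure concrete, here is a minimal sketch of how a polynomial system nests as system, polynomials, terms, and tokens, and how that nesting restricts which token pairs attend to each other. The token names (`"+1"`, `"E2"`, ...), the helpers `flatten_with_groups` and `local_mask`, and the masking scheme are illustrative assumptions, not the paper's actual tokenization or architecture.

```python
# Hypothetical toy encoding (an assumption, not the paper's tokenizer):
# a system is a list of polynomials, a polynomial is a list of terms,
# and a term is a list of tokens (coefficient, then one exponent per variable).
system = [
    [["+1", "E2", "E0"], ["+1", "E0", "E2"], ["-1", "E0", "E0"]],  # x^2 + y^2 - 1
    [["+1", "E1", "E0"], ["-1", "E0", "E1"]],                      # x - y
]

def flatten_with_groups(system):
    """Flatten the tree, recording each token's term index and polynomial index."""
    tokens, term_id, poly_id = [], [], []
    t = 0
    for p, poly in enumerate(system):
        for term in poly:
            for tok in term:
                tokens.append(tok)
                term_id.append(t)
                poly_id.append(p)
            t += 1
    return tokens, term_id, poly_id

def local_mask(group_ids):
    """Boolean mask: position i may attend to position j only within the same group."""
    n = len(group_ids)
    return [[group_ids[i] == group_ids[j] for j in range(n)] for i in range(n)]

tokens, term_id, poly_id = flatten_with_groups(system)
term_mask = local_mask(term_id)   # attention restricted to a single term
poly_mask = local_mask(poly_id)   # attention restricted to a single polynomial

flat_pairs = len(tokens) ** 2               # every token attends to every token
term_pairs = sum(map(sum, term_mask))       # pairs allowed at the term level
poly_pairs = sum(map(sum, poly_mask))       # pairs allowed at the polynomial level
print(flat_pairs, term_pairs, poly_pairs)
```

In this toy system, the term-level mask allows far fewer pairs than flat attention over the same tokens, which is the intuition behind the claimed savings. A real hierarchical model would also need a mechanism for passing information between groups, which this sketch omits.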
Findings
Achieves significant computational savings over flat attention models
Successfully solves larger polynomial systems than previous transformer-based methods
Provides a detailed computational cost analysis and generalizes the hierarchy to arbitrary depths
Abstract
At NeurIPS 2024, Kera et al. introduced the use of transformers for computing Groebner bases, a central object in computer algebra with numerous practical applications. In this paper, we improve this approach by applying Hierarchical Attention Transformers (HATs) to solve systems of multivariate polynomial equations via Groebner bases computation. The HAT architecture incorporates a tree-structured inductive bias that enables the modeling of hierarchical relationships present in the data and thus achieves significant computational savings compared to conventional flat attention models. We generalize to arbitrary depths and include a detailed computational cost analysis. Combined with curriculum learning, our method solves instances that are much larger than those in Kera et al. (2024), "Learning to compute Groebner bases."
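The savings over flat attention at arbitrary depth can be illustrated with a simplified counting model. The sketch below assumes a balanced tree in which each group at a given level attends only within itself and is then summarized as a single unit for the next level. This model and the function names `flat_cost` and `hierarchical_cost` are assumptions for illustration, not the cost analysis given in the paper.

```python
from math import prod

def flat_cost(branching):
    """Attention pairs when all leaf tokens attend to each other."""
    return prod(branching) ** 2

def hierarchical_cost(branching):
    """Attention pairs in an assumed balanced hierarchy.

    branching[0] is the group size at the lowest level (e.g. tokens per term),
    branching[1] the group size one level up (e.g. terms per polynomial), etc.
    """
    units = prod(branching)   # number of units at the current level
    total = 0
    for b in branching:
        # (units // b) groups, each with b * b within-group pairs
        total += units * b
        units //= b           # each group collapses to one unit for the next level
    return total

branching = [8, 8, 8]         # hypothetical: 8 tokens/term, 8 terms/poly, 8 polys
print(flat_cost(branching), hierarchical_cost(branching))
```

Under these assumptions, 512 tokens cost 262144 pairwise interactions with flat attention but 4672 with three levels of local attention, and the gap widens as the system grows.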
Taxonomy
Topics: Polynomial and algebraic computation · Numerical Methods and Algorithms · Model Reduction and Neural Networks
