THG: Transformer with Hyperbolic Geometry
Zhe Liu, Yibin Xu

TL;DR
This paper introduces THG, a Transformer model leveraging hyperbolic geometry to improve efficiency and generalization, demonstrating effectiveness across various NLP tasks.
Contribution
The paper proposes a novel hyperbolic linear transformation within the Transformer architecture, combining Euclidean and hyperbolic spaces for enhanced performance.
Findings
Improves performance on sequence labeling, reading comprehension, and classification tasks.
Alleviates overfitting in Transformer models.
Demonstrates generalizability across multiple NLP tasks.
Abstract
Transformer model architectures have recently become an indispensable staple in deep learning for their effectiveness across a range of tasks. A surge of "X-former" models has been proposed to improve upon the original Transformer architecture. However, most of these variants only address the quadratic time and memory complexity of self-attention, i.e. the dot product between the query and the key. Moreover, they compute solely in Euclidean space. In this work, we propose a novel Transformer with Hyperbolic Geometry (THG) model, which takes advantage of both Euclidean space and hyperbolic space. THG improves the linear transformations of self-attention, which are applied to the input sequence to obtain the query and the key, by replacing them with the proposed hyperbolic linear transformation. Extensive experiments on sequence labeling task, machine reading comprehension task…
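The abstract does not give the formula for the hyperbolic linear transformation, so the following is only a minimal sketch of one common construction from the hyperbolic neural network literature: Möbius matrix-vector multiplication on the Poincaré ball. Whether THG uses this exact form is an assumption. The names `expmap0`, `logmap0`, and `hyperbolic_linear`, the curvature parameter `c`, and the weight matrix `W` are all hypothetical and chosen for illustration.

```python
import numpy as np

def expmap0(v, c=1.0, eps=1e-9):
    """Exponential map at the origin: tangent vector -> point in the Poincare ball."""
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def logmap0(y, c=1.0, eps=1e-9):
    """Logarithmic map at the origin: point in the ball -> tangent vector."""
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(y, axis=-1, keepdims=True), eps)
    # Clip so arctanh stays finite for points very close to the boundary.
    arg = np.clip(sqrt_c * norm, 0.0, 1.0 - 1e-7)
    return np.arctanh(arg) * y / (sqrt_c * norm)

def hyperbolic_linear(x, W, c=1.0):
    """Assumed form: lift Euclidean inputs to the ball, then apply W as a
    Mobius matrix-vector product (log map, matmul, exp map)."""
    x_ball = expmap0(x, c)                      # Euclidean features -> ball
    return expmap0(logmap0(x_ball, c) @ W, c)   # result stays inside the ball

# Hypothetical usage: project a toy input sequence to queries.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))           # (sequence length, model dim)
W_q = rng.normal(size=(8, 8)) * 0.1   # illustrative query weights
q = hyperbolic_linear(x, W_q)
```

In this sketch the projected queries lie inside the ball of radius 1/sqrt(c). How THG combines such outputs with the Euclidean dot-product attention is not stated in the truncated abstract and is not modeled here.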
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Graph Neural Networks · Topic Modeling
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Byte Pair Encoding · Multi-Head Attention · Attention Is All You Need · Adam · Label Smoothing · Residual Connection
