Betti numbers of attention graphs is all you really need

Laida Kushnareva; Dmitri Piontkovski; Irina Piontkovskaya

arXiv:2207.01903·cs.CL·July 6, 2022

TL;DR

This paper shows that a classifier built on Betti numbers of BERT attention graphs achieves text classification results on par with the conventional classification method on three benchmarks.

Contribution

It introduces the novel application of persistent topological features to analyze attention graphs in neural networks for NLP tasks.

Findings

01

Betti numbers effectively classify text data.

02

Topological features match conventional classification accuracy.

03

To the authors' knowledge, the first topological analysis of an attention-based neural network.

Abstract

We apply methods of topological analysis to the attention graphs, calculated on the attention heads of the BERT model (arXiv:1810.04805v2). Our research shows that the classifier built upon basic persistent topological features (namely, Betti numbers) of the trained neural network can achieve classification results on par with the conventional classification method. We show the relevance of such topological text representation on three text classification benchmarks. To the best of our knowledge, it is the first attempt to analyze the topology of an attention-based neural network, widely used for Natural Language Processing.
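To make the idea of "Betti numbers of an attention graph" concrete, here is a minimal sketch, not the authors' pipeline. It assumes one attention head is given as a square matrix of token-to-token weights, that the graph is made undirected by taking the larger of the two directed weights, and that an edge is kept when its weight reaches a threshold. For a graph viewed as a 1-dimensional complex, the zeroth Betti number is the number of connected components and the first is the number of independent cycles, which equals edges minus vertices plus components. The function names, the toy matrix, and the thresholds are illustrative assumptions.

```python
import numpy as np


def betti_numbers(attention, threshold):
    """Return (b0, b1) of the undirected graph with edges of weight >= threshold."""
    n = attention.shape[0]
    # Assumption: symmetrize by taking the larger of the two directed weights.
    sym = np.maximum(attention, attention.T)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    n_edges = 0
    components = n
    for i in range(n):
        for j in range(i + 1, n):  # self-loops on the diagonal are ignored
            if sym[i, j] >= threshold:
                n_edges += 1
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j
                    components -= 1

    b0 = components                # connected components
    b1 = n_edges - n + components  # independent cycles
    return b0, b1


def betti_features(attention, thresholds):
    """Concatenate (b0, b1) over several thresholds into one feature vector."""
    return [b for t in thresholds for b in betti_numbers(attention, t)]


# Toy 4-token attention matrix (rows sum to 1); values are made up.
toy_attention = np.array([
    [0.10, 0.60, 0.30, 0.00],
    [0.50, 0.10, 0.40, 0.00],
    [0.45, 0.45, 0.10, 0.00],
    [0.00, 0.00, 0.00, 1.00],
])

features = betti_features(toy_attention, [0.7, 0.5, 0.4])
```

Lowering the threshold adds edges, so components merge and cycles appear; tracking the Betti numbers across thresholds is what makes the features "persistent" in a basic sense. A feature vector like this, computed per head, could then be fed to any standard classifier.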

Code & Models

Repositories

upunaprosk/la-tda

Taxonomy

Topics: Cognitive Computing and Networks · Advanced Graph Neural Networks · Topological and Geometric Data Analysis

Methods: Attention Is All You Need · Linear Layer · Adam · Linear Warmup With Linear Decay · Weight Decay · Layer Normalization · WordPiece · Softmax · Multi-Head Attention