On Bi-gram Graph Attributes

Thomas Konstantinovsky; Matan Mizrachi

arXiv:2107.02128·cs.LG·July 30, 2021

TL;DR

This paper introduces a bi-gram graph representation for text and corpus analysis, demonstrating its computational efficiency, versatility, and scalability for various semantic and corpus-level insights.

Contribution

It presents a novel bi-gram graph method for text analysis, highlighting its computational simplicity and broad applicability for large datasets.

Findings

01 Bi-gram graphs are computationally cheap to create.

02 The approach provides unique insights through graph attributes.

03 The approach scales to large datasets and supports diverse use-cases.

Abstract

We propose a new approach to text semantic analysis and general corpus analysis using, as termed in this article, a "bi-gram graph" representation of a corpus. The different attributes derived from graph theory are measured and analyzed, either as unique insights on their own or in comparison with other corpus graphs. We observe a vast domain of tools and algorithms that can be developed on top of the graph representation; creating such a graph proves to be computationally cheap, and much of the heavy lifting is achieved via basic graph calculations. Furthermore, we showcase the different use-cases for bi-gram graphs and how scalable the representation proves to be when dealing with large datasets.
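To make the idea in the abstract concrete, here is a minimal sketch of how such a graph could be built and measured. It assumes that nodes are unique tokens and that a directed edge links each token to the token that immediately follows it; the paper's exact construction may differ. The function names `build_bigram_graph` and `graph_attributes`, the whitespace tokenization, and the choice of attributes are illustrative assumptions and are not taken from the paper or from the MuteJester/BiGramGraph repository.

```python
from collections import defaultdict


def build_bigram_graph(text):
    """Build a directed bi-gram graph as {token: {next_token: count}}.

    Assumption: tokens are lowercased and split on whitespace. A real
    pipeline would likely use a proper tokenizer.
    """
    tokens = text.lower().split()
    graph = defaultdict(lambda: defaultdict(int))
    # One pass over consecutive token pairs, so the cost is linear
    # in the number of tokens.
    for left, right in zip(tokens, tokens[1:]):
        graph[left][right] += 1
    # Tokens with no outgoing edge must still appear as nodes.
    for token in tokens:
        graph.setdefault(token, defaultdict(int))
    return {node: dict(edges) for node, edges in graph.items()}


def graph_attributes(graph):
    """Compute a few basic attributes of the directed graph."""
    n_nodes = len(graph)
    n_edges = sum(len(edges) for edges in graph.values())
    out_degree = {node: len(edges) for node, edges in graph.items()}
    in_degree = {node: 0 for node in graph}
    for edges in graph.values():
        for target in edges:
            in_degree[target] += 1
    # Density of a simple directed graph: edges / (n * (n - 1)).
    # Self-loops (a repeated word such as "the the") are counted as
    # edges here, so this is only an approximation when they occur.
    density = n_edges / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "in_degree": in_degree,
        "out_degree": out_degree,
        "density": density,
    }


corpus = "the cat sat on the mat"
attributes = graph_attributes(build_bigram_graph(corpus))
```

Because each graph is a plain dictionary, attributes computed for two corpora can be compared side by side, which is in the spirit of the corpus-against-corpus analysis the abstract mentions.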

Code & Models

Repositories

MuteJester/BiGramGraph
Official
