Text analysis and deep learning: A network approach
Ingo Marquart

TL;DR
This paper introduces a novel unsupervised method combining transformer models with network analysis to extract semantic networks from text, enabling insights into language use and model behavior over time.
Contribution
It presents what the author describes as, to the best of his knowledge, the first unsupervised approach to derive semantic networks directly from deep language models, reducing discretionary choices of representation and distance measures.
Findings
Semantic ties reflect discourse semantics over time
Clusters of semantic and syntactic relations are identified
Method can inform analysis of deep learning model behavior
Abstract
Much information available to applied researchers is contained within written language or spoken text. Deep language models such as BERT have achieved unprecedented success in many applications of computational linguistics. However, much less is known about how these models can be used to analyze existing text. We propose a novel method that combines transformer models with network analysis to form a self-referential representation of language use within a corpus of interest. Our approach produces linguistic relations strongly consistent with the underlying model as well as mathematically well-defined operations on them, while reducing the amount of discretionary choices of representation and distance measures. It represents, to the best of our knowledge, the first unsupervised method to extract semantic networks directly from deep language models. We illustrate our approach in a…
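As a rough illustration of how model output might be turned into a network, the sketch below aggregates per-occurrence substitution probabilities into a directed, weighted graph. This is a minimal sketch under stated assumptions, not the paper's procedure: the aggregation rule (summing probabilities), the function name `build_semantic_network`, the `min_weight` parameter, and the toy numbers are all hypothetical. In practice the distributions would come from a masked language model such as BERT rather than being written by hand.

```python
from collections import defaultdict


def build_semantic_network(occurrences, min_weight=0.0):
    """Aggregate substitution probabilities into a directed weighted graph.

    occurrences: iterable of (focal_word, {substitute_word: probability}),
        one entry per occurrence of a word in the corpus (hypothetical input
        format; the probabilities stand in for language model predictions).
    min_weight: ties with a total weight below this value are dropped.

    Returns a nested dict: network[focal_word][substitute_word] = weight.
    """
    network = defaultdict(lambda: defaultdict(float))
    for focal, distribution in occurrences:
        for substitute, prob in distribution.items():
            if substitute == focal:
                continue  # skip self-loops: a word replacing itself adds no tie
            network[focal][substitute] += prob  # assumed rule: sum over occurrences
    # Convert to plain dicts and prune weak ties.
    return {
        focal: {s: w for s, w in ties.items() if w >= min_weight}
        for focal, ties in network.items()
    }


# Toy data (invented for illustration): two occurrences of "leader", one of "manager".
toy_occurrences = [
    ("leader", {"manager": 0.6, "boss": 0.3, "leader": 0.1}),
    ("leader", {"manager": 0.2, "coach": 0.5, "leader": 0.3}),
    ("manager", {"leader": 0.7, "boss": 0.3}),
]

net = build_semantic_network(toy_occurrences)
for focal, ties in net.items():
    for substitute, weight in sorted(ties.items(), key=lambda kv: -kv[1]):
        print(f"{focal} -> {substitute}: {weight:.2f}")
```

Because the result is an ordinary weighted directed graph, standard network operations such as centrality or clustering could then be applied to it. Which of these the paper actually uses is not stated in the text above.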
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Complex Network Analysis Techniques
Methods: Attention Is All You Need · Linear Layer · Weight Decay · Adam · Multi-Head Attention · Residual Connection · Dropout · WordPiece · Layer Normalization
