Architectures of Meaning, A Systematic Corpus Analysis of NLP Systems
Oskar Wysocki, Malina Florea, Donal Landers, Andre Freitas

TL;DR
This paper introduces a new statistical corpus analysis framework to interpret NLP system architectures at scale, revealing coherent patterns and enabling data-driven understanding of the field.
Contribution
It presents a novel combination of saturation-based lexicon construction, statistical analysis, and graph collocations for systematic NLP architecture interpretation.
Findings
Identified coherent architectural patterns in NLP systems
Validated the framework on the full corpus of SemEval tasks
Provides a systematic method for interpreting NLP architectures
Abstract
This paper proposes a novel statistical corpus analysis framework for interpreting Natural Language Processing (NLP) architectural patterns at scale. The proposed approach combines saturation-based lexicon construction, statistical corpus analysis methods and graph collocations to induce a synthesized representation of NLP architectural patterns from corpora. The framework is validated on the full corpus of SemEval tasks and reveals coherent architectural patterns that can be used to answer architectural questions in a data-driven fashion, providing a systematic mechanism for interpreting a highly dynamic and exponentially growing field.
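To make the idea of graph collocations over a lexicon more concrete, here is a minimal sketch. It is not the authors' implementation: the function name `collocation_graph`, the toy documents, the four-term lexicon, the substring matching, and the choice of document-level pointwise mutual information (PMI) as the edge weight are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def collocation_graph(docs, lexicon, min_pmi=0.0):
    """Return {(term_a, term_b): pmi} for lexicon terms that co-occur in documents.

    Hypothetical helper: edges are weighted by document-level PMI,
    which is an assumed scoring choice, not one taken from the paper.
    """
    n_docs = len(docs)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing both terms of a pair
    for text in docs:
        lowered = text.lower()
        # Crude substring matching; a real pipeline would tokenize first.
        present = sorted(t for t in lexicon if t in lowered)
        term_df.update(present)
        pair_df.update(combinations(present, 2))

    edges = {}
    for (a, b), n_ab in pair_df.items():
        # PMI = log2( P(a, b) / (P(a) * P(b)) ) with document-frequency estimates.
        pmi = math.log2(n_ab * n_docs / (term_df[a] * term_df[b]))
        if pmi > min_pmi:
            edges[(a, b)] = pmi
    return edges


# Toy system descriptions, invented for this example only.
docs = [
    "We use an LSTM encoder with attention for sentiment classification.",
    "An SVM over n-gram features handles sentiment classification.",
    "Our LSTM with attention ranks answers.",
    "A CRF tagger labels entities.",
]
lexicon = {"lstm", "attention", "svm", "crf"}

graph = collocation_graph(docs, lexicon)
for (a, b), weight in sorted(graph.items()):
    print(f"{a} -- {b}: PMI = {weight:.2f}")
```

In this toy setting, terms that appear together more often than chance would predict end up connected, so recurring component combinations surface as edges in the graph.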
Taxonomy
Topics: Advanced Text Analysis Techniques · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
