CompText: Visualizing, Comparing & Understanding Text Corpus
Suvi Varshney, Divjeet Singh Jas

TL;DR
CompText introduces a visualization method for comparing text corpora by focusing on sentiment-carrying words, providing insights into emotional differences rather than just topical similarities.
Contribution
This paper proposes a novel approach to compare text corpora based on sentiment words, emphasizing emotional content over traditional topic-based comparisons.
Findings
Sentiment-focused comparison reveals emotional differences between corpora.
Highlighting sentiment words helps identify key emotional pivot points.
The method offers a new perspective beyond topic similarity analysis.
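As an illustration of the idea summarized above, here is a minimal sketch of comparing two corpora by their sentiment-carrying words. It is not the authors' method: the toy lexicon, the function names (`sentiment_word_freqs`, `compare_corpora`), and the frequency-difference score are all assumptions chosen for brevity.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption); a real study would use an
# established sentiment lexicon instead.
SENTIMENT_LEXICON = {"good", "great", "love", "bad", "poor", "hate"}


def sentiment_word_freqs(docs):
    """Relative frequency of each lexicon word among all tokens in a corpus."""
    tokens = [t for d in docs for t in re.findall(r"[a-z']+", d.lower())]
    counts = Counter(t for t in tokens if t in SENTIMENT_LEXICON)
    total = len(tokens) or 1  # avoid division by zero on an empty corpus
    return {w: c / total for w, c in counts.items()}


def compare_corpora(docs_a, docs_b):
    """Rank sentiment words by how much their relative frequency differs.

    Positive scores mean the word is more frequent in corpus A,
    negative scores mean it is more frequent in corpus B.
    """
    fa, fb = sentiment_word_freqs(docs_a), sentiment_word_freqs(docs_b)
    diffs = {w: fa.get(w, 0.0) - fb.get(w, 0.0) for w in set(fa) | set(fb)}
    # Largest absolute difference first; ties broken alphabetically.
    return sorted(diffs.items(), key=lambda kv: (-abs(kv[1]), kv[0]))


corpus_a = ["I love this phone, great battery", "good screen"]
corpus_b = ["I hate this phone, bad battery", "poor screen, bad sound"]

for word, score in compare_corpora(corpus_a, corpus_b):
    print(f"{word:>6}  {score:+.3f}")
```

Both toy corpora share the same topic words (phone, battery, screen), so a topic-based comparison would treat them as similar, while the sentiment-word ranking separates them.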
Abstract
A common practice in Natural Language Processing (NLP) is to visualize a text corpus so that its central idea and key points can be grasped without reading the entire text. For a long time, researchers focused on extracting topics from the text and visualizing them based on their relative significance in the corpus. Recently, however, researchers have started building more complex systems that expose not only the topics of the corpus but also the words closely related to each topic, giving users a holistic view. These detailed visualizations spawned research on comparing text corpora based on their visualizations. Topics are often compared to identify the differences between corpora. However, to capture greater semantics from different corpora, researchers have started to compare texts based on the sentiment of the topics related to the text. Comparing the words carrying the…
Taxonomy
Topics: Advanced Text Analysis Techniques · Semantic Web and Ontologies · Natural Language Processing Techniques
