Semantic Novelty Detection and Characterization in Factual Text Involving Named Entities
Nianzu Ma, Sahisnu Mazumder, Alexander Politowicz, Bing Liu, Eric Robertson, Scott Grigsby

TL;DR
This paper introduces PAT-SND, a model for fine-grained semantic novelty detection in factual text involving named entities. The model outperforms existing methods, and the paper also contributes a new annotated dataset.
Contribution
The paper presents a novel model, PAT-SND, for semantic-level novelty detection involving named entities, and provides an annotated dataset for evaluation.
Findings
PAT-SND outperforms 10 baseline methods significantly.
The model effectively characterizes types of novelty.
An annotated dataset for semantic novelty detection is introduced.
Abstract
Much of the existing work on text novelty detection has been studied at the topic level, i.e., identifying whether the topic of a document or a sentence is novel or not. Little work has been done at the fine-grained semantic level (or contextual level). For example, given that we know Elon Musk is the CEO of a technology company, the sentence "Elon Musk acted in the sitcom The Big Bang Theory" is novel and surprising because normally a CEO would not be an actor. Existing topic-based novelty detection methods work poorly on this problem because they do not perform semantic reasoning involving relations between named entities in the text and their background knowledge. This paper proposes an effective model (called PAT-SND) to solve the problem, which can also characterize the novelty. An annotated dataset is also created. Evaluation shows that PAT-SND outperforms 10 baselines by large…
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining
