Represent, Aggregate, and Constrain: A Novel Architecture for Machine Reading from Noisy Sources
Jason Naradowsky, Sebastian Riedel

TL;DR
This paper introduces a neural architecture for extracting event information from noisy clusters of related news articles. It aggregates individual value hypotheses and applies global constraints to improve extraction accuracy.
Contribution
The work presents a novel neural model that explicitly handles noisy data and incorporates global constraints via factor graphs, achieving state-of-the-art results with less annotation.
Findings
An improvement of more than 12.1 F1 over previous models
A gain of up to 2.8 F1 points from factor graph constraints
A 50% relative improvement over the prior state of the art
Abstract
In order to extract event information from text, a machine reading model must learn to accurately read and interpret the ways in which that information is expressed. But it must also, as the human reader must, aggregate numerous individual value hypotheses into a single coherent global analysis, applying global constraints which reflect prior knowledge of the domain. In this work we focus on the task of extracting plane crash event information from clusters of related news articles whose labels are derived via distant supervision. Unlike previous machine reading work, we assume that while most target values will occur frequently in most clusters, they may also be missing or incorrect. We introduce a novel neural architecture to explicitly model the noisy nature of the data and to deal with these aforementioned learning issues. Our models are trained end-to-end and achieve an…
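The aggregation step the abstract describes, combining many individual value hypotheses into one cluster-level analysis, can be illustrated with a toy sketch. This is not the authors' model: it assumes each mention has already been assigned a scalar score by some reader, sums scores per candidate value, and normalizes with a softmax. The function name `aggregate_hypotheses`, the slot, the values, and the scores are all hypothetical.

```python
import math
from collections import defaultdict


def aggregate_hypotheses(mention_scores):
    """Combine per-mention scores into a distribution over candidate values.

    mention_scores: iterable of (value, score) pairs, one per mention in the
    cluster. Summation followed by a softmax is an assumed aggregation rule,
    chosen here only for illustration.
    """
    totals = defaultdict(float)
    for value, score in mention_scores:
        totals[value] += score

    # Subtract the maximum before exponentiating for numerical stability.
    max_total = max(totals.values())
    exp_totals = {v: math.exp(s - max_total) for v, s in totals.items()}
    normalizer = sum(exp_totals.values())
    return {v: e / normalizer for v, e in exp_totals.items()}


# Hypothetical mention-level scores for a single slot in one article cluster.
# The value "AB132" stands in for a noisy or incorrect mention.
mentions = [("AB123", 1.2), ("AB123", 0.8), ("AB132", 0.5), ("AB123", 1.0)]

cluster_belief = aggregate_hypotheses(mentions)
best_value = max(cluster_belief, key=cluster_belief.get)
```

In this sketch a value repeated across several mentions outweighs a single conflicting mention, which is the intuition behind aggregating over a cluster rather than trusting any one sentence. Global constraints across slots, which the paper handles with factor graphs, are outside the scope of this example.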
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
