DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction
Pengda Qin, Weiran Xu, William Yang Wang

TL;DR
This paper introduces DSGAN, an adversarial training framework for distant supervision relation extraction. It learns a sentence-level generator that identifies true positive samples, and the dataset filtered by that generator improves relation extraction performance.
Contribution
The paper proposes a novel adversarial learning approach, DSGAN, for sentence-level noise reduction in distant supervision relation extraction, outperforming existing soft bag-level methods.
Findings
Significant performance improvement over state-of-the-art systems
Effective filtering of false positive instances
Enhanced dataset quality for relation classification
Abstract
Distant supervision can effectively label data for relation extraction, but suffers from the noisy labeling problem. Recent works mainly apply soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision about false positive samples at the sentence level. In this paper, we introduce an adversarial learning framework, which we name DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator. The optimal generator is obtained when the discrimination ability of the discriminator has declined the most. We adopt the generator to filter the distant supervision training dataset and redistribute the false positive instances into the negative set, in which…
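The filtering step described in the abstract (keep the sentences the generator judges to be true positives, move the rest into the negative set) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `redistribute`, the `score` callable standing in for a trained generator, the toy scores, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def redistribute(
    positives: List[str],
    negatives: List[str],
    score: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[List[str], List[str]]:
    """Split distantly labeled positives using a sentence-level scorer.

    `score` is a hypothetical stand-in for a trained generator that returns
    the probability that a sentence is a true positive for its relation.
    Sentences scoring below `threshold` are treated as false positives and
    appended to the negative set.
    """
    kept: List[str] = []
    moved: List[str] = []
    for sentence in positives:
        if score(sentence) >= threshold:
            kept.append(sentence)
        else:
            moved.append(sentence)
    return kept, negatives + moved


# Toy usage with hand-written scores (hypothetical values).
toy_scores = {
    "Obama was born in Hawaii.": 0.9,
    "Obama visited Hawaii last week.": 0.2,
}
kept, new_negatives = redistribute(
    positives=list(toy_scores),
    negatives=["The meeting was held on Monday."],
    score=lambda s: toy_scores[s],
)
```

In this toy run, the first sentence stays in the positive set and the second is moved into the negative set, which is the hard sentence-level decision the abstract contrasts with soft bag-level weighting.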
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
