Information Extraction with Character-level Neural Networks and Free Noisy Supervision

Philipp Meerkamp (Bloomberg LP); Zhengyi Zhou (AT&T Labs Research)

arXiv:1612.04118 · cs.CL · January 25, 2017 · 2 cites

TL;DR

This paper introduces an information extraction architecture that augments an existing parser with a character-level neural network trained on noisy supervision derived from existing databases. The network raises the parser's precision, yielding large improvements over a mature constraint-based production system used at Bloomberg for financial text.

Contribution

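The paper presents a hybrid architecture that pairs a constraint-based parser with a character-level neural network trained on noisy supervision, combining the parser's domain knowledge and constraints with features the network learns from large amounts of data.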

Findings

01 Significant precision improvements over existing systems

02 Effective use of noisy supervision from databases

03 Combines neural networks with constraint-based methods

Abstract

We present an architecture for information extraction from text that augments an existing parser with a character-level neural network. The network is trained using a measure of consistency of extracted data with existing databases as a form of noisy supervision. Our architecture combines the ability of constraint-based information extraction systems to easily incorporate domain knowledge and constraints with the ability of deep neural networks to leverage large amounts of data to learn complex features. Boosting the existing parser's precision, the system led to large improvements over a mature and highly tuned constraint-based production information extraction system used at Bloomberg for financial language text.
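The noisy-supervision idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` record, the relative-tolerance consistency check, the byte-level character encoding, and all names and values below are assumptions made for illustration. The sketch labels each parser candidate by whether its extracted value agrees with a reference database, and pairs that label with a character-level encoding of the source text, which is the kind of training example a character-level network could consume.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical parser output: one candidate extraction."""
    text: str     # source snippet the parser read
    key: str      # identifier of the entity the value is attributed to
    value: float  # numeric value the parser extracted


def noisy_label(candidate: Candidate, database: Dict[str, float],
                rel_tol: float = 0.01) -> Optional[float]:
    """Return 1.0 if the candidate agrees with the database within rel_tol,
    0.0 if it disagrees, and None if the database has no entry for the key.
    The tolerance rule is an assumption, not the paper's consistency measure."""
    known = database.get(candidate.key)
    if known is None:
        return None
    return 1.0 if abs(candidate.value - known) <= rel_tol * abs(known) else 0.0


def encode_chars(text: str, max_len: int = 32) -> List[int]:
    """Encode text as a fixed-length list of UTF-8 byte codes, zero-padded."""
    codes = list(text.encode("utf-8"))[:max_len]
    return codes + [0] * (max_len - len(codes))


def build_training_set(candidates: List[Candidate],
                       database: Dict[str, float]) -> List[Tuple[List[int], float]]:
    """Pair each labelable candidate's character encoding with its noisy label.
    Candidates whose key is missing from the database are skipped."""
    examples = []
    for cand in candidates:
        label = noisy_label(cand, database)
        if label is not None:
            examples.append((encode_chars(cand.text), label))
    return examples


# Illustrative, made-up data.
database = {"SPX": 2000.0}
candidates = [
    Candidate("S&P 500 closed at 2,001", "SPX", 2001.0),  # consistent
    Candidate("S&P 500 up 12 points", "SPX", 12.0),       # inconsistent
    Candidate("unknown index at 5", "XYZ", 5.0),          # not in database
]
examples = build_training_set(candidates, database)
```

The labels are noisy by construction: a correct extraction can disagree with a stale database entry, and a wrong one can agree by coincidence, which is why the abstract describes the signal as a form of noisy supervision rather than ground truth.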

Taxonomy

Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Mathematics, Computing, and Information Processing