Foreground and Background Lexicons and Word Sense Disambiguation for Information Extraction
Adam Kilgarriff (ITRI, University of Brighton)

TL;DR
This paper examines how an information extraction (IE) system's foreground and background lexicons differ in how they should be built, and where word sense disambiguation (WSD) fits for each.
Contribution
It analyses the widely used but little-discussed distinction between foreground and background lexicons in IE, assigning human lexicography to the foreground lexicon and automatic acquisition to the background lexicon, and relating WSD techniques to each role.
Findings
The foreground lexicon requires human lexicography
The background lexicon can be acquired automatically
WSD techniques are suited mainly to the background lexicon
Abstract
Lexicon acquisition from machine-readable dictionaries and corpora is currently a dynamic field of research, yet it is often not clear how lexical information so acquired can be used, or how it relates to structured meaning representations. In this paper I look at this issue in relation to Information Extraction (hereafter IE), and one subtask for which both lexical and general knowledge are required, Word Sense Disambiguation (WSD). The analysis is based on the widely-used, but little-discussed distinction between an IE system's foreground lexicon, containing the domain's key terms which map onto the database fields of the output formalism, and the background lexicon, containing the remainder of the vocabulary. For the foreground lexicon, human lexicography is required. For the background lexicon, automatic acquisition is appropriate. For the foreground lexicon, WSD will occur as a…
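The foreground/background split described in the abstract can be pictured as two separate lookup structures. The following is a minimal sketch, not the paper's implementation: the entries, the field names (`event_type`, `acquirer`), and the helper functions `classify` and `template_fields` are all hypothetical and chosen only for illustration. It assumes a hand-written foreground lexicon whose entries point at output fields, and a background lexicon holding only shallow information of the kind that could plausibly be acquired automatically.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ForegroundEntry:
    """A hand-written entry for a key domain term (hypothetical structure)."""
    term: str
    template_field: str  # name of the output field this term maps onto


# Foreground lexicon: few entries, written by a lexicographer (illustrative data).
FOREGROUND = {
    "acquire": ForegroundEntry("acquire", "event_type"),
    "buyer": ForegroundEntry("buyer", "acquirer"),
}

# Background lexicon: the rest of the vocabulary, with shallow information only
# (here just a part-of-speech label; illustrative data).
BACKGROUND = {
    "yesterday": "adverb",
    "large": "adjective",
}


def classify(token: str) -> str:
    """Report which lexicon, if any, covers the token."""
    key = token.lower()
    if key in FOREGROUND:
        return "foreground"
    if key in BACKGROUND:
        return "background"
    return "unknown"


def template_fields(tokens: list[str]) -> dict[str, str]:
    """Collect the output fields triggered by foreground terms in the input."""
    fields: dict[str, str] = {}
    for token in tokens:
        entry = FOREGROUND.get(token.lower())
        if entry is not None:
            fields[entry.template_field] = token
    return fields
```

In this sketch only foreground terms can contribute to the output record; background words are recognised but never map to a field, which mirrors the division of labour the abstract describes.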
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
