Treatment of Semantic Heterogeneity in Information Retrieval
Heiko Hellweg, Jürgen Krause, Thomas Mandl, Jutta Marx, Matthias N.O. Müller, Peter Mutschke, Robert Strötgen

TL;DR
This paper discusses methods to address semantic heterogeneity in information retrieval by enriching document metadata through rule-based extraction and mapping between different terminologies using various transfer modules.
Contribution
It introduces a set of cascading extraction rules and transfer modules, including intellectual, statistical, and neural network approaches, for handling semantic heterogeneity.
Findings
Effective extraction rules for social science documents
Implementation of transfer modules for terminology mapping
Enhanced metadata enrichment improves retrieval accuracy
Abstract
The first step in handling semantic heterogeneity should be an attempt to enrich the semantic information about documents, i.e. to fill the gaps in the documents' metadata automatically. Section 2 describes a set of cascading deductive and heuristic extraction rules, which were developed in the CARMEN project for the domain of Social Sciences. The mapping between different terminologies can be done by using intellectual, statistical and/or neural network transfer modules. Intellectual transfers use cross-concordances between different classification schemes or thesauri. Section 3 describes the creation, storage and handling of such transfers.
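The two ideas in the abstract, a cascade of extraction rules and a cross-concordance lookup, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the rule functions, the HTML patterns, the `CROSS_CONCORDANCE` entries, and all names are hypothetical and are not taken from the CARMEN project.

```python
import re
from typing import Callable, Dict, List, Optional

# A rule inspects the raw document text and returns a value or None.
Rule = Callable[[str], Optional[str]]


def _first_group(pattern: str, doc: str) -> Optional[str]:
    """Return the stripped first capture group, or None if absent or empty."""
    match = re.search(pattern, doc, re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    value = match.group(1).strip()
    return value or None


def title_from_title_tag(doc: str) -> Optional[str]:
    # Hypothetical rule tried first: trust an explicit <title> element.
    return _first_group(r"<title>(.*?)</title>", doc)


def title_from_first_heading(doc: str) -> Optional[str]:
    # Hypothetical fallback rule: guess the title from the first <h1>.
    return _first_group(r"<h1[^>]*>(.*?)</h1>", doc)


def extract_with_cascade(doc: str, rules: List[Rule]) -> Optional[str]:
    """Apply rules in order; the first rule that yields a value wins."""
    for rule in rules:
        value = rule(doc)
        if value:
            return value
    return None


# Hypothetical cross-concordance: a term in one vocabulary maps to
# intellectually assigned equivalents in another vocabulary.
CROSS_CONCORDANCE: Dict[str, List[str]] = {
    "unemployment": ["joblessness", "labour market"],
}


def transfer_term(term: str, table: Dict[str, List[str]]) -> List[str]:
    """Map a term to the target vocabulary; keep it unchanged if unmapped."""
    return table.get(term.lower(), [term])


TITLE_RULES: List[Rule] = [title_from_title_tag, title_from_first_heading]
```

For a document whose `<title>` element is empty, the cascade falls through to the heading rule, and an unmapped query term is passed on unchanged by `transfer_term`.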
Taxonomy
Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Advanced Text Analysis Techniques
