Building a Large-Scale Knowledge Base for Machine Translation
Kevin Knight (USC/Information Sciences Institute), Steve K. Luk (USC/Information Sciences Institute)

TL;DR
This paper describes the semi-automatic construction of a large-scale knowledge base for knowledge-based machine translation (KBMT), with the goal of scaling KBMT from constrained domains to newspaper text.
Contribution
It introduces semi-automatic methods for building an ontology of some 70,000 entries by merging online dictionaries, semantic networks, and bilingual resources to support KBMT.
Findings
Constructed an ontology of some 70,000 entries for KBMT
Developed semi-automatic methods for knowledge base merging
Enabled bilingual indexing for multiple languages
Abstract
Knowledge-based machine translation (KBMT) systems have achieved excellent results in constrained domains, but have not yet scaled up to newspaper text. The reason is that knowledge resources (lexicons, grammar rules, world models) must be painstakingly handcrafted from scratch. One of the hypotheses being tested in the PANGLOSS machine translation project is whether or not these resources can be semi-automatically acquired on a very large scale. This paper focuses on the construction of a large ontology (or knowledge base, or world model) for supporting KBMT. It contains representations for some 70,000 commonly encountered objects, processes, qualities, and relations. The ontology was constructed by merging various online dictionaries, semantic networks, and bilingual resources, through semi-automatic methods. Some of these methods (e.g., conceptual matching of semantic taxonomies) are…
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling
