Building a Large-Scale Knowledge Base for Machine Translation

Kevin Knight (USC/Information Sciences Institute); Steve K. Luk (USC/Information Sciences Institute)

arXiv:cmp-lg/9407029 · cmp-lg · February 3, 2008 · 185 cites

TL;DR

This paper describes a large, semi-automatically constructed knowledge base intended to help knowledge-based machine translation systems scale from constrained domains to broader, real-world text such as newspaper articles.

Contribution

It introduces semi-automatic methods for building an ontology of some 70,000 entries by merging online dictionaries, semantic networks, and bilingual resources to support knowledge-based machine translation.

Findings

01 Constructed a 70,000-entry ontology for KBMT

02 Developed semi-automatic methods for knowledge base merging

03 Enabled bilingual indexing for multiple languages

Abstract

Knowledge-based machine translation (KBMT) systems have achieved excellent results in constrained domains, but have not yet scaled up to newspaper text. The reason is that knowledge resources (lexicons, grammar rules, world models) must be painstakingly handcrafted from scratch. One of the hypotheses being tested in the PANGLOSS machine translation project is whether or not these resources can be semi-automatically acquired on a very large scale. This paper focuses on the construction of a large ontology (or knowledge base, or world model) for supporting KBMT. It contains representations for some 70,000 commonly encountered objects, processes, qualities, and relations. The ontology was constructed by merging various online dictionaries, semantic networks, and bilingual resources, through semi-automatic methods. Some of these methods (e.g., conceptual matching of semantic taxonomies) are…
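
To make the idea of merging two semantic taxonomies more concrete, here is a minimal Python sketch. It is not the paper's algorithm, which this excerpt does not spell out; it only illustrates one plausible flavor of semi-automatic matching. The `Concept` class, the `match_taxonomies` function, and the toy taxonomies are all hypothetical. The assumed strategy is: first link concepts whose word is unambiguous in both resources, then use already-linked parents to disambiguate words with several senses, and leave everything else for a human to review.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Concept:
    """Hypothetical record for one node in a taxonomy."""
    ident: str             # unique id within its own taxonomy
    word: str              # word that names the concept
    parent: Optional[str]  # id of the parent concept, None for a root


def index_by_word(taxonomy: Dict[str, Concept]) -> Dict[str, List[str]]:
    """Map each word to the ids of all concepts it names."""
    index: Dict[str, List[str]] = defaultdict(list)
    for concept in taxonomy.values():
        index[concept.word].append(concept.ident)
    return index


def match_taxonomies(
    a: Dict[str, Concept], b: Dict[str, Concept]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (matches from a-ids to b-ids, a-ids left for manual review)."""
    words_a, words_b = index_by_word(a), index_by_word(b)
    matches: Dict[str, str] = {}

    # Pass 1: a word with exactly one sense on each side is linked directly.
    for word, ids_a in words_a.items():
        ids_b = words_b.get(word, [])
        if len(ids_a) == 1 and len(ids_b) == 1:
            matches[ids_a[0]] = ids_b[0]

    # Pass 2: use matched parents to pick among ambiguous senses,
    # repeating until no new link is found.
    changed = True
    while changed:
        changed = False
        used_b = set(matches.values())
        for concept in a.values():
            if concept.ident in matches or concept.parent not in matches:
                continue
            target_parent = matches[concept.parent]
            candidates = [
                ident
                for ident in words_b.get(concept.word, [])
                if b[ident].parent == target_parent and ident not in used_b
            ]
            if len(candidates) == 1:
                matches[concept.ident] = candidates[0]
                used_b.add(candidates[0])
                changed = True

    unmatched = sorted(ident for ident in a if ident not in matches)
    return matches, unmatched


def build(rows: List[Tuple[str, str, Optional[str]]]) -> Dict[str, Concept]:
    """Build a taxonomy from (ident, word, parent) tuples."""
    return {ident: Concept(ident, word, parent) for ident, word, parent in rows}


# Toy data (invented for illustration): "bat" has two senses in each resource.
TAX_A = build([
    ("a_animal", "animal", None),
    ("a_artifact", "artifact", None),
    ("a_bat1", "bat", "a_animal"),
    ("a_bat2", "bat", "a_artifact"),
    ("a_club", "club", "a_artifact"),
])
TAX_B = build([
    ("b_animal", "animal", None),
    ("b_artifact", "artifact", None),
    ("b_bat_mammal", "bat", "b_animal"),
    ("b_bat_club", "bat", "b_artifact"),
])

matches, unmatched = match_taxonomies(TAX_A, TAX_B)
print(matches)
print(unmatched)
```

In this toy run the two senses of "bat" are separated only because their parents were already linked, and "club", which has no counterpart in the second taxonomy, is returned for manual review. That leftover list is what makes the sketch semi-automatic rather than fully automatic.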

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling