A Lexicalist Approach to the Translation of Colloquial Text
Fred Popowich, Davide Turcato, Olivier Laurens, Paul McFetridge, J. Devlan Nicholson, Patrick McGivern, Maricela Corzo Pena, Lisa Pidruchney, and Scott MacDonald (Simon Fraser University, Burnaby, Canada; TCC Communications, Victoria, Canada)

TL;DR
This paper presents a fully automatic, lexicalist translation system for colloquial English that translates closed-caption TV text into simple target sentences, relying heavily on lexical resources.
Contribution
It presents a large-scale multilingual translation system designed specifically for colloquial English, built on the lexicalist Shake and Bake paradigm, and discusses the theoretical and implementation issues involved.
Findings
Successfully translates colloquial English TV captions into Spanish.
Demonstrates the effectiveness of a lexicalist approach for colloquial language translation.
Designed as a multilingual system: English-to-Spanish translation is operational, and modules for Brazilian Portuguese are under development.
Abstract
Colloquial English (CE), as found in television programs or typical conversations, is different from the text found in technical manuals, newspapers, and books. Phrases tend to be shorter and less sophisticated. In this paper, we look at some of the theoretical and implementational issues involved in translating CE. We present a fully automatic, large-scale, multilingual natural language processing system for translating CE input text, as found in the commercially transmitted closed-caption television signal, into simple target sentences. Our approach is based on Whitelock's Shake and Bake machine translation paradigm, which relies heavily on lexical resources. The system currently translates from English to Spanish, with translation modules for Brazilian Portuguese under development.
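To make the Shake and Bake idea concrete, here is a minimal Python sketch of its general shape: source words are mapped through a bilingual lexicon into an unordered bag of target items, and generation searches for an ordering that the target grammar accepts. Everything in it is an assumption made for illustration and is not taken from the paper: the tiny `LEXICON`, the category patterns in `TARGET_PATTERNS`, and the function names are hypothetical. A real Shake and Bake system works with feature-rich lexical signs and a constraint-based grammar rather than category patterns, and it would not rely on exhaustive permutation, which grows factorially with sentence length.

```python
from itertools import permutations

# Hypothetical toy bilingual lexicon: English word -> (Spanish word, category).
LEXICON = {
    "i": ("yo", "PRON"),
    "want": ("quiero", "V"),
    "the": ("el", "DET"),
    "book": ("libro", "N"),
    "red": ("rojo", "ADJ"),
}

# Hypothetical toy target grammar: the category sequences it accepts.
# The adjective follows the noun, unlike in the English source.
TARGET_PATTERNS = {
    ("PRON", "V", "DET", "N"),
    ("PRON", "V", "DET", "N", "ADJ"),
}


def transfer(source_words):
    """Map each source word to a target item; the result is treated as a bag."""
    return [LEXICON[word.lower()] for word in source_words]


def generate(target_bag):
    """Return the first ordering of the bag that the toy grammar accepts."""
    for ordering in permutations(target_bag):
        if tuple(category for _, category in ordering) in TARGET_PATTERNS:
            return " ".join(word for word, _ in ordering)
    return None  # no ordering of this bag is grammatical


def translate(sentence):
    """Lexical transfer followed by generation from the unordered bag."""
    return generate(transfer(sentence.split()))


print(translate("I want the red book"))
```

Under these assumptions the target word order comes entirely from the target-side patterns, not from the source sentence, which is why the approach leans so heavily on lexical resources. A word missing from the toy lexicon raises a `KeyError` here; a production system would need a fallback strategy.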
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Language, Metaphor, and Cognition
