Designing the Business Conversation Corpus
Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa

TL;DR
This paper introduces a new Japanese-English business conversation parallel corpus to improve machine translation of spoken dialogues. It analyzes the corpus, highlights examples that are challenging for automatic translation, and shows that a translation system benefits from training on it.
Contribution
The paper presents a newly constructed parallel corpus for Japanese-English business conversations and evaluates its impact on machine translation quality.
Findings
The corpus enhances translation accuracy for conversational texts.
Analysis reveals challenging translation examples in business dialogues.
Adding the corpus improves machine translation system performance.
Abstract
While the progress of machine translation of written text has come far in the past several years thanks to the increasing availability of parallel corpora and corpora-based training technologies, automatic translation of spoken text and dialogues remains challenging even for modern systems. In this paper, we aim to boost the machine translation quality of conversational texts by introducing a newly constructed Japanese-English business conversation parallel corpus. A detailed analysis of the corpus is provided along with challenging examples for automatic translation. We also experiment with adding the corpus in a machine translation training scenario and show how the resulting system benefits from its use.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · SentencePiece · Residual Connection · Label Smoothing · Dropout · Adam · Dense Connections · Softmax
