FFN: a Fine-grained Chinese-English Financial Domain Parallel Corpus
Yuxin Fu, Shijing Si, Leyi Mai, Xi-ang Li

TL;DR
This paper introduces FFN, a fine-grained Chinese-English parallel corpus of financial news, and evaluates the translation performance of two LLMs, ChatGPT and ERNIE-bot, highlighting the challenges they face and the need for domain-specific optimization.
Contribution
The creation of a fine-grained, manually corrected Chinese-English financial news corpus and an analysis of LLM translation performance within the financial domain.
Findings
LLMs show limitations in financial translation accuracy
OpenNMT achieves competitive results on the dataset
Analysis highlights specific challenges faced by LLMs in financial translation
Abstract
Large Language Models (LLMs) have stunningly advanced the field of machine translation, though their effectiveness within the financial domain remains largely underexplored. To probe this issue, we constructed a fine-grained Chinese-English parallel corpus of financial news called FFN. We acquired financial news articles published between January 1, 2014 and December 31, 2023 from mainstream media websites such as CNN, FOX, and China Daily. The dataset consists of 1,013 main texts and 809 titles, all of which have been manually corrected. We measured the translation quality of two LLMs, ChatGPT and ERNIE-bot, using BLEU, TER, and chrF scores as the evaluation metrics. For comparison, we also trained an OpenNMT model on our dataset. We detail the problems of LLMs and provide in-depth analysis, intending to stimulate further research and solutions in this largely uncharted…
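To make one of the metrics named in the abstract concrete, here is a minimal pure-Python sketch of a simplified sentence-level chrF score (character n-gram F-score). It is an illustration only, not the authors' evaluation code: the function names `char_ngrams` and `simple_chrf`, the choice of n = 1..6 with beta = 2, the whitespace stripping, and the sample sentences are all assumptions, and the numbers it produces will not exactly match those of a standard toolkit.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of order n after removing whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF in [0, 1] (hypothetical helper).

    Precision and recall are averaged over n-gram orders 1..max_n,
    then combined into an F-beta score (beta > 1 weights recall more).
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # Illustrative sentences only; they are not taken from the FFN corpus.
    reference = "The central bank raised interest rates."
    hypothesis = "The central bank raised rates."
    print(f"chrF (simplified): {simple_chrf(hypothesis, reference):.3f}")
```

Because the score is computed over characters rather than words, it needs no tokenizer, which is convenient when either side of the language pair is Chinese. For reported results, an established implementation would be the safer choice.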
Taxonomy
Topics: Stock Market Forecasting Methods · Mathematics, Computing, and Information Processing · Natural Language Processing Techniques
