Predictability and Causality in Spanish and English Natural Language Generation

Andrea Busto-Castiñeira; Francisco J. González-Castaño; Silvia García-Méndez; Francisco de Arriba-Pérez

arXiv:2408.14283·cs.CL·August 27, 2024

TL;DR

This paper investigates how causal and non-causal language models perform in English and Spanish NLG, revealing language-dependent differences in predictability and optimal generation strategies.

Contribution

It introduces a novel information-theoretic metric to compare causal and non-causal models across languages and demonstrates their differing effectiveness in English and Spanish.
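This summary does not spell out the metric, so the following is only a rough sketch of what a causal versus non-causal predictability comparison could look like, not the authors' method. It estimates the conditional entropy of a token category given its left neighbour (causal context) and given both neighbours (non-causal context). The tag sequence, the one-token context window, and the function names are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Estimate H(X | C) in bits from a list of (context, target) pairs."""
    pairs = list(pairs)
    joint = Counter(pairs)                      # counts of (context, target)
    context = Counter(c for c, _ in pairs)      # counts of each context
    total = len(pairs)
    # H(X | C) = -sum_{c,x} p(c, x) * log2 p(x | c)
    return -sum(
        (n / total) * log2(n / context[c])
        for (c, _), n in joint.items()
    )


def causal_pairs(tags):
    """Context is the left neighbour only (hypothetical one-token window)."""
    return [(tags[i - 1], tags[i]) for i in range(1, len(tags) - 1)]


def non_causal_pairs(tags):
    """Context is the left and right neighbours (hypothetical window)."""
    return [((tags[i - 1], tags[i + 1]), tags[i]) for i in range(1, len(tags) - 1)]


# Made-up sequence of grammatical categories, for illustration only.
toy_tags = ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN",
            "VERB", "DET", "NOUN", "ADP", "DET", "NOUN"]

h_causal = conditional_entropy(causal_pairs(toy_tags))
h_non_causal = conditional_entropy(non_causal_pairs(toy_tags))

print(f"causal context:     {h_causal:.3f} bits")
print(f"non-causal context: {h_non_causal:.3f} bits")
```

Because both estimates use the same positions, the non-causal value can never exceed the causal one in this sketch. The size of the gap between them is the kind of quantity one would compare across languages.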

Findings

01

Spanish is more predictable with non-causal models.

02

Causal models perform better for English.

03

Language structure influences optimal NLG strategies.

Abstract

In recent years, the field of Natural Language Generation (NLG) has been boosted by advances in deep learning technologies. Nonetheless, these new data-intensive methods introduce language-dependent disparities in NLG, as the main training data sets are in English. Also, most neural NLG systems use decoder-only (causal) transformer language models, which work well for English but were not designed with other languages in mind. In this work, we start from the hypothesis that they may introduce generation bias in target languages with less rigid word ordering, subject omission, or different attachment preferences for relative clauses, so that other language generation strategies may be more desirable for these target languages. This paper first compares causal and non-causal language modeling for English and Spanish, two languages with different grammatical structures and over…
