Corpus non alignés et ADT. Essai de comparaison entre les présidents français et brésiliens de l'ère contemporaine

Carlos Maciel; Damon Mayaffre; Laurent Vanni

arXiv:2211.10197·stat.AP·November 21, 2022



TL;DR

This paper investigates whether a textual data analysis method (ADT, from the French "analyse de données textuelles") can handle non-aligned bilingual corpora, and whether textual genre constrains discourse strongly enough to make speeches in different languages comparable. It uses a large corpus of French and Brazilian presidential speeches from 1950 to 2020.

Contribution

It proposes a methodological path, running from simple frequency dictionaries to factorial analysis of word co-occurrence profiles, for comparing presidential speeches of the same genre across languages.

Findings

01 ADT can be adapted for non-aligned corpora

02 Genre influences speech comparability across languages

03 A large bilingual corpus enables cross-national discourse analysis

Abstract

Is there an ADT method that can deal with non-aligned bilingual corpora? Does textual genre exert a sufficiently strong constraint on discourse to make texts written in different languages comparable, provided they belong to the same genre? To answer these two questions, one methodological, the other linguistic, this contribution gathers in a single corpus French and Brazilian presidential speeches of the contemporary era (1950-2020), from de Gaulle to Macron and from Kubitschek to Lula, i.e. 15 million words. A methodological path is proposed from the simple frequency dictionary to the factorial treatment of the co-occurrence profiles of words, in order to establish a generic transnational presidential speech.
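The path the abstract describes, from a frequency dictionary to a factorial treatment of word co-occurrence profiles, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy texts, the function names, the frequency threshold, the choice of document-level co-occurrence, and the use of correspondence analysis as the factorial method are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

import numpy as np

# Hypothetical toy texts; the paper's corpus (about 15 million words) is not reproduced here.
texts = {
    "fr_1": "la france et la république sont la nation",
    "fr_2": "la nation et le peuple de france",
    "br_1": "o brasil e a república são a nação",
    "br_2": "a nação e o povo do brasil",
}


def frequency_dictionary(docs):
    """Count word occurrences over all documents."""
    counts = Counter()
    for text in docs.values():
        counts.update(text.split())
    return counts


def cooccurrence_matrix(docs, vocab):
    """Symmetric table: number of documents in which two words appear together."""
    index = {word: i for i, word in enumerate(vocab)}
    table = np.zeros((len(vocab), len(vocab)))
    for text in docs.values():
        present = sorted(set(text.split()) & set(vocab))
        for a, b in combinations(present, 2):
            table[index[a], index[b]] += 1
            table[index[b], index[a]] += 1
    return table


def correspondence_analysis(table, n_axes=2):
    """Row coordinates obtained by SVD of the standardized residuals.

    Assumes every row and column of the table has a non-zero total.
    """
    p = table / table.sum()
    r = p.sum(axis=1)  # row masses
    c = p.sum(axis=0)  # column masses
    expected = np.outer(r, c)
    residuals = (p - expected) / np.sqrt(expected)
    u, s, _ = np.linalg.svd(residuals, full_matrices=False)
    return (u[:, :n_axes] * s[:n_axes]) / np.sqrt(r)[:, None]


# Step 1: frequency dictionary, keeping words seen at least twice (arbitrary threshold).
freq = frequency_dictionary(texts)
vocab = sorted(word for word, n in freq.items() if n >= 2)

# Step 2: co-occurrence profile of each retained word.
table = cooccurrence_matrix(texts, vocab)

# Step 3: factorial treatment of the profiles.
coords = correspondence_analysis(table)
for word, (x, y) in zip(vocab, coords):
    print(f"{word:8s} {x:+.3f} {y:+.3f}")
```

In this toy input the French and Portuguese texts share no word forms, so the co-occurrence table splits into one block per language and the first axis only separates the two languages. That is a small-scale picture of the difficulty non-aligned corpora raise, not a result from the paper.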


Taxonomy

Topics: Linguistics and Discourse Analysis · Linguistic Studies and Language Acquisition · Linguistics and Terminology Studies