# Computational Register Analysis and Synthesis

**Authors:** Shlomo Engelson Argamon

arXiv: 1901.02543 · 2019-01-10

## TL;DR

This paper reviews computational approaches to register analysis and register synthesis, and argues that integrating the two would enable large-scale empirical studies of language varieties and more rigorous validation of theories of register.

## Contribution

The paper proposes integrating computational register analysis and synthesis to support large-scale empirical register studies and the validation of register theories across multiple languages.

## Key findings

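- Integrating analysis and synthesis would support large-scale mapping of functional language varieties and the relationships between them
- Computational methods combined with systematically collected data would enable empirical validation and refinement of register theories
- The same approach would facilitate register analysis across multiple languages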

## Abstract

The study of register in computational language research has historically been divided into register analysis, seeking to determine the registerial character of a text or corpus, and register synthesis, seeking to generate a text in a desired register. This article surveys the different approaches to these disparate tasks. Register synthesis has tended to use more theoretically articulated notions of register and genre than analysis work, which often seeks to categorize on the basis of intuitive and somewhat incoherent notions of prelabeled 'text types'. I argue that an integration of computational register analysis and synthesis will benefit register studies as a whole, by enabling a new large-scale research program in register studies. It will enable comprehensive global mapping of functional language varieties in multiple languages, including the relationships between them. Furthermore, computational methods, together with high-coverage, systematically collected and analyzed data, will enable rigorous empirical validation and refinement of different theories of register, which will also have implications for our understanding of linguistic variation in general.

---
Source: https://tomesphere.com/paper/1901.02543