# ConversationAlign: Open-source software for analyzing patterns of lexical use and alignment in conversation transcripts

**Authors:** Benjamin Sacks, Virginia Ulichney, Anna Duncan, Chelsea Helion, Sarah M. Weinstein, Tania Giovannetti, Gus Cooney, Jamie Reilly

PMC · DOI: 10.3758/s13428-026-02954-w · 2026-02-20

## TL;DR

This paper introduces ConversationAlign, an open-source tool for analyzing language patterns and alignment in conversation transcripts.

## Contribution

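The contribution is an open-source R package that computes indices of linguistic alignment, along with main effects of language use, across semantic, affective, and lexical dimensions (e.g., valence, concreteness, frequency, word length) in naturalistic conversations.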

## Key findings

- ConversationAlign cleans raw language data and transforms it into simultaneous time series aggregated by interlocutor, turn, and conversation.
- The package computes complementary alignment indices that capture both local relations (synchrony in turn-by-turn scores) and global relations (overall proximity) between interlocutors.
- A use case applies the package to transcripts of Terry Gross's interviews with her guests, spanning 15 years.
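
To make the first finding concrete, here is a minimal sketch of how a transcript might be turned into a turn-level time series. It is written in Python for illustration and does not use or reproduce the ConversationAlign R API. The lexicon `TOY_VALENCE`, the function `turn_scores`, and the choice to skip words missing from the lexicon are all assumptions made for this example, not the package's documented behavior.

```python
import re
from statistics import mean

# Hypothetical toy lexicon (word -> valence). Illustrative values only;
# these are not the norms shipped with ConversationAlign.
TOY_VALENCE = {
    "love": 8.0, "great": 7.5, "show": 5.5,
    "hate": 2.0, "sad": 2.5, "book": 6.0,
}


def turn_scores(transcript, lexicon):
    """Aggregate word-level scores into one mean score per turn.

    transcript: list of (speaker, utterance) tuples in spoken order.
    Consecutive utterances by the same speaker are merged into one turn.
    Words absent from the lexicon are skipped (an assumption of this sketch).
    """
    rows = []
    prev_speaker = None
    for speaker, text in transcript:
        if speaker != prev_speaker:
            # A change of speaker starts a new turn.
            rows.append({"turn": len(rows) + 1, "speaker": speaker, "scores": []})
            prev_speaker = speaker
        words = re.findall(r"[a-z']+", text.lower())
        rows[-1]["scores"].extend(lexicon[w] for w in words if w in lexicon)
    for row in rows:
        scores = row.pop("scores")
        # A turn with no scorable words yields a missing value.
        row["mean_valence"] = mean(scores) if scores else None
    return rows


if __name__ == "__main__":
    demo = [
        ("A", "I love this show"),
        ("A", "Great book"),
        ("B", "I hate sad endings"),
    ]
    for row in turn_scores(demo, TOY_VALENCE):
        print(row)
```

Turns with no scorable words come back as missing values rather than zeros, which keeps gaps in the data visible; the abstract lists missing data among the potential sources of bias.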

## Abstract

Much of our scientific understanding of language processing has been informed by controlled experiments divorced from the real-world demands of naturalistic communication. Conversation requires synchronization of rate, amplitude, lexical complexity, affective coloring, shared reference, and countless other verbal and nonverbal dimensions. Conversation is not merely a vector for information transfer but also serves as a mechanism for establishing or maintaining social relationships. This process of language calibration between interlocutors is known as linguistic alignment. We developed an open-source R package, ConversationAlign, capable of computing novel indices of linguistic alignment and main effects of language use between interlocutors by evaluating word choice across numerous semantic, affective, and lexical dimensions (e.g., valence, concreteness, frequency, word length). We describe the operations of ConversationAlign, including its primary functions of cleaning and transforming raw language data into simultaneous time series objects aggregated by interlocutor, turn, and conversation. We then outline mathematical operations involved in computing complementary indices of linguistic alignment that capture both local (synchrony in turn-by-turn scores) and global relations (overall proximity) between interlocutors. We present a use case of ConversationAlign applied to interview transcripts from American radio legend Terry Gross and her many guests spanning 15 years. We identify caveats for use and potential sources of bias (e.g., polysemy, missing data, robustness to brief language samples) and close with a discussion of potential applications to other populations. ConversationAlign (v 0.4.0) is freely available for download and use via CRAN or GitHub. For technical instructions and download, visit https://github.com/Reilly-ConceptsCognitionLab/ConversationAlign.
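
The distinction between local and global alignment can be sketched as follows. This summary does not give the package's formulas, so the code below is one plausible reading and not the authors' implementation: local synchrony is approximated by a Spearman rank correlation between two interlocutors' turn-by-turn scores, and global proximity by the mean area under the absolute-difference curve (trapezoidal rule). The function names `local_synchrony` and `global_distance` are hypothetical, and the code is Python, not the R package.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns None if either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


def local_synchrony(a, b):
    """Spearman correlation of two equal-length turn-level series (assumed index)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have equal length of at least 2")
    return pearson(average_ranks(a), average_ranks(b))


def global_distance(a, b):
    """Mean trapezoidal area under |a - b|; smaller means closer overall (assumed index)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have equal length of at least 2")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    area = sum((d0 + d1) / 2 for d0, d1 in zip(diffs, diffs[1:]))
    return area / (len(diffs) - 1)


if __name__ == "__main__":
    speaker_a = [5.0, 6.0, 7.0, 8.0]
    speaker_b = [6.0, 7.0, 8.0, 9.0]
    print(local_synchrony(speaker_a, speaker_b))
    print(global_distance(speaker_a, speaker_b))
```

The two indices are complementary: a pair of speakers can move in lockstep from turn to turn (high synchrony) while staying far apart in absolute terms (large distance), or the reverse.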

## Full-text entities

- **Diseases:** Communication Disorders (MESH:D003147), Deafness (MESH:D003638)
- **Chemicals:** DC013063 (-)
- **Species:** Canis lupus familiaris (dog, subspecies) [taxon 9615], Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12923439/full.md

---
Source: https://tomesphere.com/paper/PMC12923439