# Natural Selection in Transcription Factor–DNA Interaction Motifs: A Comparative and Population Genomics Perspective

**Authors:** Manas Joshi, Pablo Duchen, Adamandia Kapopoulou, Stefan Laurent

PMC · DOI: 10.1093/gbe/evaf212 · 2025-11-12

## TL;DR

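This paper combines population and comparative genomics to examine how natural selection acts on the two sides of transcription factor–DNA interactions, the protein-coding transcription factor-binding domains and the noncoding transcription factor-binding sites, in Homo sapiens, Arabidopsis thaliana, and Drosophila melanogaster.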

## Contribution

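The study builds species-specific lists of transcription factor-binding domain regions and of candidate transcription factor-binding sites from publicly available annotations, then compares the selection pressures acting on them across three species. The resulting dataset and analysis pipeline are offered as a baseline for future studies of regulatory evolution across coding and noncoding regions.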

## Key findings

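- Summary statistics show signals of purifying selection on transcription factor-binding domains, consistent with their functional importance.
- Noncoding transcription factor-binding sites show levels of constraint similar to those of coding regions in populations with large effective population size (Ne).
- Signals of positive selection were limited, but McDonald–Kreitman estimates of α were consistently higher for transcription factor-binding domains than for adjacent nonbinding domains in Drosophila melanogaster and Arabidopsis thaliana, with no such difference in humans.
- The efficiency of negative selection, and to a lesser extent positive selection, on transcription factor–DNA interface elements scales with effective population size.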
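
To illustrate how a summary statistic can point to purifying selection, here is a minimal Python sketch of Tajima's D computed from per-site derived-allele counts. This summary does not say which statistics the authors applied, so Tajima's D is an assumption chosen because it is widely used. The function name `tajimas_d` and the toy allele counts are hypothetical. An excess of rare variants, as expected under purifying selection, pushes D below zero.

```python
import math


def tajimas_d(derived_counts, n):
    """Tajima's D from per-site derived-allele counts in a sample of n chromosomes.

    Illustrative sketch only; not the pipeline used in the paper.
    """
    # Keep segregating sites only (derived allele neither absent nor fixed).
    counts = [k for k in derived_counts if 0 < k < n]
    s = len(counts)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 chromosomes and 1 segregating site")

    # Nucleotide diversity: mean number of pairwise differences.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between pi and Watterson's estimator, scaled by its std. dev.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy data: 10 segregating sites in a sample of 10 chromosomes.
d_rare = tajimas_d([1] * 10, n=10)  # all singletons: excess of rare variants
d_mid = tajimas_d([5] * 10, n=10)   # all intermediate-frequency variants
print(f"D (rare variants) = {d_rare:.2f}, D (intermediate) = {d_mid:.2f}")
```

In this toy example the all-singleton sample gives a negative D and the intermediate-frequency sample a positive one. Negative D is not specific to selection, since demographic events such as population expansion produce it too, which is one reason studies typically compare functional regions against a putatively neutral reference class.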

## Abstract

Natural selection heavily influences the evolutionary trajectories of species by impacting their genotype-to-phenotype transitions. On the molecular level, these transitions are shaped by regulatory sequences. In this study, we employed a combination of population and comparative genomics to investigate how natural selection affects specific regulatory sequence classes involved in regulatory transcription factor–DNA interactions. These interactions involve two motifs: transcription factor-binding domains and transcription factor-binding sites. Using publicly available annotation data for Homo sapiens, Arabidopsis thaliana, and Drosophila melanogaster, we first constructed species-specific lists of the transcription factor-binding domain regions. Applying commonly used summary statistics, we found signals of purifying selection acting on transcription factor-binding domains, consistent with their functional importance. Next, using biochemical assay-based annotations, we identified potential transcription factor-binding site regions and used variants within them as nonsynonymous equivalents. Interestingly, we also observed that noncoding transcription factor-binding site regions showed levels of constraint similar to those of coding regions in populations with large effective population size (Ne). Signals of positive selection were limited. Nevertheless, McDonald–Kreitman estimates revealed that, in both fruit fly and thale cress, α for transcription factor-binding domains was consistently higher than for adjacent nonbinding domains, whereas no such difference was apparent in humans. Taken together, our comparative analysis shows that the efficiency of negative selection, and to a lesser extent positive selection, on transcription factor–DNA interface elements scales with effective population size. The dataset and analysis pipeline provide a baseline for future studies of regulatory evolution across coding and noncoding regions.
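
For readers unfamiliar with the McDonald–Kreitman α mentioned in the abstract, the sketch below shows the classic point estimate, α = 1 − (Ds · Pn) / (Dn · Ps). Here Dn and Ds are nonsynonymous and synonymous substitutions between species, and Pn and Ps are nonsynonymous and synonymous polymorphisms within a species. The abstract does not say which variant of the test the authors used, so this is an assumption. The function name `mk_alpha` and all counts are hypothetical. For noncoding binding sites, variants inside the sites would stand in for the nonsynonymous class, as the abstract describes.

```python
def mk_alpha(dn, ds, pn, ps):
    """Classic McDonald-Kreitman estimate of the proportion of adaptive substitutions.

    dn, ds: nonsynonymous / synonymous substitutions (divergence between species)
    pn, ps: nonsynonymous / synonymous polymorphisms (within species)
    Illustrative sketch only; not the estimator confirmed to be used in the paper.
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts for a binding-domain class and an adjacent nonbinding class.
alpha_binding = mk_alpha(dn=40, ds=100, pn=20, ps=100)
alpha_nonbinding = mk_alpha(dn=30, ds=100, pn=25, ps=100)
print(f"alpha (binding) = {alpha_binding:.2f}")
print(f"alpha (nonbinding) = {alpha_nonbinding:.2f}")
```

With these made-up counts the binding class has the higher α, the pattern the abstract reports for fruit fly and thale cress. Slightly deleterious polymorphisms inflate Pn and bias this simple estimator downward, so the comparison between classes is more informative than either absolute value.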

## Linked entities

- **Species:** Homo sapiens (taxon 9606), Arabidopsis thaliana (taxon 3702), Drosophila melanogaster (taxon 7227)

## Full-text entities

- **Species:** Arabidopsis thaliana (mouse-ear cress, species) [taxon 3702], Homo sapiens (human, species) [taxon 9606], Drosophila melanogaster (fruit fly, species) [taxon 7227]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12645836/full.md

---
Source: https://tomesphere.com/paper/PMC12645836