# Identification of novel DNA sequence motifs that modulate transcription in T cells

**Authors:** Nicole Knoetze, Eric Yung, Anthony Bayega, Scott D. Brown, Robert A. Holt

PMC · DOI: 10.1186/s12864-025-12425-9 · BMC Genomics · 2026-01-09

## TL;DR

This study identifies new DNA sequence motifs that influence gene expression in T cells, revealing gaps in our understanding of how genes are regulated in these immune cells.

## Contribution

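The paper reports 2,036 novel DNA motifs, alongside 629 previously curated transcription factor binding sites (TFBS), that are enriched in the regulatory regions of genes with T-cell-specific expression. STARR-seq testing of a subset of 18 candidate motifs shows that some novel motifs have stronger regulatory effects than TFBS for transcription factors with established roles in T cells.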

## Key findings

- Some novel motifs modulated transcription in T cells more strongly than TFBS for transcription factors with established roles in T cells.
- Regulatory activity of motifs depends on their orientation, position, and copy number.
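- STARR-seq assays of all three-way combinations of 18 candidate motifs, run in Jurkat E6 (T-cell origin) and K562 (myeloid origin) cells, supported the functional relevance of these motifs for transcription in T cells.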
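
The abstract describes testing all possible three-way combinations of 18 candidate motifs. As a rough illustration of how such a design couples position and copy number, the sketch below enumerates ordered three-slot constructs with repeats allowed. The motif names are hypothetical placeholders (the 18 motifs are not listed in this summary), and treating a "combination" as an ordered triple with repeats is an assumption, not the authors' documented library design.

```python
from itertools import product

# Hypothetical placeholder names: the paper's 18 candidate motifs
# are not listed in this summary.
MOTIFS = [f"motif_{i:02d}" for i in range(1, 19)]


def three_slot_constructs(motifs):
    """Return every ordered assignment of motifs to three slots.

    Assumption: repeats are allowed and slot order matters, so the same
    motif can appear at different positions and in 1, 2, or 3 copies.
    """
    return list(product(motifs, repeat=3))


def copy_number(construct, motif):
    """Count how many of the three slots hold the given motif."""
    return construct.count(motif)


constructs = three_slot_constructs(MOTIFS)
print(len(constructs))  # 18 ** 3 ordered triples under this assumption
```

Under this assumption the 18 motifs yield 18³ = 5,832 constructs. The real library may differ, for example by also varying motif orientation or by excluding some arrangements.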

## Abstract

Considerable progress has been made towards associating transcription factor binding sites (TFBS) with cell-type-specific gene expression; however, the full repertoire of DNA sequence motifs that regulate transcription remains unknown. Improving our understanding of transcriptional regulation is especially important in T cells, given the enormous potential of genetically engineered T cells as an emerging class of therapeutics. Here, we report results from a comprehensive and unbiased survey investigating whether there are novel motifs enriched in regulatory regions of genes with the highest constitutive and selective expression across diverse T-cell subsets. Using computational and experimental methods, we identified 2,036 novel motifs and 629 previously curated TFBS that are enriched, both individually and in specific combinations, in the regulatory regions of genes exhibiting T-cell-specific gene expression. We then used the self-transcribing active regulatory region sequencing (STARR-seq) assay to evaluate all possible three-way combinations of a subset of 18 candidate motifs to test their ability to modulate transcription in immortalized lymphoblastic cell lines of T-cell origin (Jurkat E6) versus myeloid origin (K562). Our results revealed novel motifs that modulate gene transcription in T cells, with some exhibiting stronger regulatory effects than TFBS for TFs with established roles in T cells. The regulatory activity of these novel motifs was influenced by the motif’s orientation, position, and copy number. Overall, these results highlight our incomplete understanding of the relationship between sequence composition and T-cell gene regulation, and indicate that previously annotated TFBS represent only a subset of motifs capable of modulating gene transcription in T cells.

The online version contains supplementary material available at 10.1186/s12864-025-12425-9.
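
For readers unfamiliar with STARR-seq, regulatory activity is commonly summarized as the ratio of a construct's RNA read count to its input DNA read count, usually on a log2 scale after library-size normalization. The sketch below shows that generic calculation and a simple comparison between two cell lines. The function names, the pseudocount, and the counts-per-million normalization are illustrative assumptions; this summary does not describe the authors' actual analysis pipeline.

```python
import math


def log2_activity(rna_count, dna_count, rna_total, dna_total, pseudocount=1.0):
    """Generic STARR-seq style activity: log2 of normalized RNA over DNA.

    Assumption: counts-per-million normalization plus a pseudocount to
    avoid division by zero. Not taken from the paper's methods.
    """
    rna_cpm = (rna_count + pseudocount) / rna_total * 1e6
    dna_cpm = (dna_count + pseudocount) / dna_total * 1e6
    return math.log2(rna_cpm / dna_cpm)


def cell_type_difference(activity_t_cell, activity_other):
    """Positive values mean higher activity in the T-cell line."""
    return activity_t_cell - activity_other


# Made-up read counts for one hypothetical construct in two cell lines.
jurkat = log2_activity(rna_count=399, dna_count=99, rna_total=1_000_000, dna_total=1_000_000)
k562 = log2_activity(rna_count=199, dna_count=99, rna_total=1_000_000, dna_total=1_000_000)
print(cell_type_difference(jurkat, k562))
```

The read counts here are invented purely to exercise the functions. A real analysis would also need replicates and a statistical model for count noise.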

## Full-text entities

- **Genes:** CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, FOS (Fos proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 2353] {aka AP-1, C-FOS, p55}, SP1 (Sp1 transcription factor) [NCBI Gene 6667], CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, LEF1 (lymphoid enhancer binding factor 1) [NCBI Gene 51176] {aka ECTD1, ECTD17, LEF-1, TCF10, TCF1ALPHA, TCF7L3}, TCF7 (transcription factor 7) [NCBI Gene 6932] {aka TCF-1}, GATA3 (GATA binding protein 3) [NCBI Gene 2625] {aka HDR, HDRS}, TRC-GCA24-1 (tRNA-Cys (GCA) 24-1) [NCBI Gene 7183] {aka TRC, TRNAC1}, DNASE1 (deoxyribonuclease 1) [NCBI Gene 1773] {aka DNL1, DRNI}, JUN (Jun proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3725] {aka AP-1, AP1, c-Jun, cJUN, p39}, CTCF (CCCTC-binding factor) [NCBI Gene 10664] {aka CFAP108, FAP108, MRD21}, FLI1 (Fli-1 proto-oncogene, ETS transcription factor) [NCBI Gene 2313] {aka BDPLT21, EWSR2, FLI-1, SIC-1}
- **Diseases:** PPMs (MESH:C536741)
- **Chemicals:** Poly(A) (MESH:D011061), streptomycin (MESH:D013307), Penicillin (MESH:D010406), HEPES (MESH:D006531), Carbenicillin (MESH:D002228), oligonucleotide (MESH:D009841), PBS (MESH:D007854), S (MESH:D013455), polypropylene (MESH:D011126), 5microl EB (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Bacteria Latreille et al. 1825 (Bacteria stick insect, genus) [taxon 629395], Mycoplasma (genus) [taxon 2093]
- **Cell lines:** Jurkat — Homo sapiens (Human), Childhood T acute lymphoblastic leukemia, Cancer cell line (CVCL_0065), CCL-243 — Mus musculus (Mouse), Undefined cell line type (CVCL_M023), Jurkat E6 — Homo sapiens (Human), Childhood T acute lymphoblastic leukemia, Cancer cell line (CVCL_0367), K562 — Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_0004), K552 — Homo sapiens (Human), Xeroderma pigmentosum, complementation group D, Finite cell line (CVCL_ZR70)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12879379/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12879379/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC12879379/full.md

---
Source: https://tomesphere.com/paper/PMC12879379