# Positional distribution of transcription factor binding sites in the human genome

**Authors:** Chun-Ping Yu, Zhi Thong Soh, Maloyjo Joyraj Bhattacharjee, Wen-Hsiung Li, Zhenguo Lin

PMC · DOI: 10.1371/journal.pone.0329226 · PLOS One · 2025-07-30

## TL;DR

This paper explores where transcription factors bind in the human genome, revealing that most binding sites are in introns and intergenic regions, with higher density near gene promoters.

## Contribution

The study derives new PWMs for 196 TFs and revised PWMs for 119 TFs from recent ENCODE ChIP-seq data, infers canonical PWMs for 500 TFs, and characterizes where the resulting binding sites fall in the human genome.

## Key findings

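- Most TFBSs are located in introns (42.6%) and intergenic regions (31.6%); only 11.3% are in promoters.
- TFBS density is nonetheless considerably higher in promoters, where it follows a bell-shaped distribution that peaks at the transcription start site.
- Many TFBSs lie close to CTCF binding sites.
- Tethered binding is far more frequent than co-binding, and co-binding often requires co-factors.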
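
The first two findings can look contradictory: promoters hold a small share of all sites yet have the highest density. The sketch below illustrates why, by dividing site counts by region length. The counts are simply scaled from the percentages reported in the abstract, and the region lengths are hypothetical placeholders chosen for illustration, not values from the paper; the function name `tfbs_density` is likewise an assumption.

```python
def tfbs_density(site_counts, region_lengths_bp):
    """Return binding sites per kilobase for each region class."""
    return {
        region: site_counts[region] / (region_lengths_bp[region] / 1000)
        for region in site_counts
    }

# Counts proportional to the reported shares (11.3%, 42.6%, 31.6%).
counts = {"promoter": 113, "intron": 426, "intergenic": 316}

# HYPOTHETICAL total lengths in bp; promoters cover far less sequence
# than introns or intergenic regions, which is the only point made here.
lengths = {"promoter": 50_000, "intron": 1_000_000, "intergenic": 1_500_000}

density = tfbs_density(counts, lengths)
for region, sites_per_kb in density.items():
    print(f"{region}: {sites_per_kb:.3f} sites/kb")
```

Because the promoter class spans much less sequence, a small share of sites still yields the highest per-kilobase density under these assumed lengths.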

## Abstract

As transcription factors (TFs) play a major role in gene regulation, we studied their binding motifs (positional weight matrices, PWMs) and binding sites (TFBSs) in the human genome, and how TFs bind DNA motifs, including the involvement of binding co-factors. Using the chromatin immunoprecipitation sequencing data recently released by ENCODE (Encyclopedia of DNA Elements), we obtained new PWMs for 196 TFs and revised PWMs for 119 TFs. From these and the PWMs previously obtained for 235 TFs, we inferred the canonical PWMs for 500 TFs, including 243 new PWMs. Analysis revealed that most TFBSs are in introns (42.6%) and intergenic regions (31.6%), with only 11.3% in promoters. However, the TFBS density is considerably higher in promoters, showing a bell-shaped distribution of TFBSs with a peak at the transcription start site. Many TFBSs lie close to CTCF (CCCTC-binding factor) binding sites. Tethered binding is far more frequent than co-binding, with the latter often requiring co-factors.
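
As a rough illustration of how a PWM is used to locate candidate binding sites, the following sketch converts a toy count matrix into log-odds scores and scans a short sequence for the best-scoring window. It is a generic textbook approach, not the authors' pipeline; the motif, the uniform background of 0.25, the pseudocount, and all function names are assumptions made for this example.

```python
import math

BASES = "ACGT"

def counts_to_log_odds(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into a log2-odds matrix."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score_window(pwm, window):
    """Sum the log-odds of each base in a window as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    return max(
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )

# TOY count matrix for a made-up 4-bp motif favouring G, A, T, A.
toy_counts = [
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]

pwm = counts_to_log_odds(toy_counts)
score, start = best_hit(pwm, "CCGATACC")
print(f"best window starts at {start} with score {score:.2f}")
```

The scan assumes the sequence contains only A, C, G, and T and checks one strand; a practical scanner would also score the reverse complement and apply a score threshold.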

## Linked entities

- **Genes:** CTCF (CCCTC-binding factor) [NCBI Gene 10664]
- **Proteins:** tf.S (transferrin S homeolog), LINC02145 (long intergenic non-protein coding RNA 2145)

## Full-text entities

- **Genes:** CREB5 (cAMP responsive element binding protein 5) [NCBI Gene 9586] {aka CRE-BPA, CREB-5, CREBPA}, PBX3 (PBX homeobox 3) [NCBI Gene 5090], CBX4 (chromobox 4) [NCBI Gene 8535] {aka NBP16, PC2}, FOS (Fos proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 2353] {aka AP-1, C-FOS, p55}, CEBPB (CCAAT enhancer binding protein beta) [NCBI Gene 1051] {aka C/EBP-beta, IL6DBP, NF-IL6, TCF5}, YY1 (YY1 transcription factor) [NCBI Gene 7528] {aka DELTA, GADEVS, INO80S, NF-E1, UCRBP, YIN-YANG-1}, FOXC2 (forkhead box C2) [NCBI Gene 2303] {aka FKHL14, LD, MFH-1, MFH1}, GATA2 (GATA binding protein 2) [NCBI Gene 2624] {aka DCML, IMD21, MONOMAC, NFE1B}, TBX3 (T-box transcription factor 3) [NCBI Gene 6926] {aka TBX3-ISO, UMS, XHL}, CTCF (CCCTC-binding factor) [NCBI Gene 10664] {aka CFAP108, FAP108, MRD21}, ELK4 (ETS transcription factor ELK4) [NCBI Gene 2005] {aka SAP1}, QRSL1 (glutaminyl-tRNA amidotransferase subunit QRSL1) [NCBI Gene 55278] {aka COXPD40, GatA}, CISH (cytokine inducible SH2 containing protein) [NCBI Gene 1154] {aka BACTS2, CIS, CIS-1, G18, SOCS}, ATF4 (activating transcription factor 4) [NCBI Gene 468] {aka CREB-2, CREB2, TAXREB67, TXREB}, RXRA (retinoid X receptor alpha) [NCBI Gene 6256] {aka NR2B1, RXR-alpha, RXRalpha}, GATA3 (GATA binding protein 3) [NCBI Gene 2625] {aka HDR, HDRS}, JUND (JunD proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3727] {aka AP-1}, ZNF239 (zinc finger protein 239) [NCBI Gene 8187] {aka HOK-2, MOK2}, FOSL2 (FOS like 2, AP-1 transcription factor subunit) [NCBI Gene 2355] {aka ACED, FRA2}, PCSK1 (proprotein convertase subtilisin/kexin type 1) [NCBI Gene 5122] {aka BMIQ12, NEC1, PC1, PC1/3, PC3, SPC3}, FOXA1 (forkhead box A1) [NCBI Gene 3169] {aka HNF3A, TCF3A}, IKZF1 (IKAROS family zinc finger 1) [NCBI Gene 10320] {aka CVID13, Hs.54452, IK1, IKAROS, LYF1, LyF-1}, JUN (Jun proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3725] {aka AP-1, AP1, c-Jun, cJUN, p39}, HOXA3 (homeobox A3) [NCBI Gene 3200] {aka HOX1, HOX1E}, FOXA2 (forkhead box A2) [NCBI Gene 3170] {aka HNF-3-beta, HNF3B, TCF3B}, MAZ (MYC associated zinc finger protein) [NCBI Gene 4150] {aka PUR1, Pur-1, SAF-1, SAF-2, SAF-3, ZF87}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}
- **Diseases:** PWM (MESH:D015431), cancer (MESH:D009369)
- **Species:** Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Homo sapiens (human, species) [taxon 9606], Arabidopsis thaliana (mouse-ear cress, species) [taxon 3702], Drosophila melanogaster (fruit fly, species) [taxon 7227]
- **Mutations:** T2T, X > Y, Y > X

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12310040/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12310040/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12310040/full.md

---
Source: https://tomesphere.com/paper/PMC12310040