# Pentanucleotide guanine-rich WGGGW repeats, including CANVAS AGGGA repeats, form a variety of noncanonical structures

**Authors:** Jiawei Wang, Dehui Qiu, Jun Zhou, Jean-Louis Mergny, Patrizia Alberti

PMC · DOI: 10.1093/nar/gkag051 · 2026-01-30

## TL;DR

This paper shows that guanine-rich WGGGW pentanucleotide DNA repeats (W = A or T), including the AGGGA repeats whose expansion in RFC1 is associated with CANVAS, can fold into a range of G-quadruplex (G4) and non-G4 structures.

## Contribution

The study reveals that DNA WGGGW pentanucleotide repeats can adopt different G-quadruplex (G4) and non-G4 structures depending on sequence, repeat number, and ionic conditions.

## Key findings

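- DNA WGGGW motifs (W = A or T) can adopt different G4 and non-G4 structures depending on sequence, repeat number, and ionic conditions.
- Several groups had previously shown that DNA and RNA containing AGGGA repeats fold into G-quadruplexes under physiological K⁺ conditions; this study finds more complex behavior than expected.
- This structural diversity may help explain the genomic instability and pathogenicity specifically associated with AGGGA repeats among the WGGGW motifs.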

## Abstract

Short tandem repeats (STRs) are an important component of the human genome as they contribute to genetic diversity and can influence gene expression and disease susceptibility. STRs are important in the context of CANVAS (Cerebellar Ataxia, Neuropathy, Vestibular Areflexia Syndrome) genetic disease as expansions of AGGGA repeats within the RFC1 gene are associated with the development of this neurodegenerative disorder. Interestingly, the RFC1 expanded motifs are pentanucleotides that differ from the nonpathogenic AGAAA pentanucleotide motif present in reference genomes. The molecular mechanisms underlying the pathogenicity of the mutated pentanucleotide expansion in CANVAS are still unknown. Several groups have shown that DNA and RNA containing AGGGA repeats fold into G-quadruplexes (G4s) under physiological K⁺ conditions. In this study, we reveal a more complex than expected behavior, in which DNA WGGGW motifs (where W is A or T) may adopt different G4 and non-G4 structures depending on sequence, repeat number and ionic conditions. These findings are relevant as they may help explain the genomic instability and pathogenicity specifically associated with AGGGA repeats among the WGGGW motifs.
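
To make the WGGGW notation concrete, here is a minimal Python sketch that enumerates the four pentanucleotides covered by the pattern (W = A or T) and locates tandem runs of one motif in a DNA string. The function names and the toy sequence are illustrative assumptions, not taken from the paper, and the search only reports runs that match the motif in the given frame.

```python
import re
from itertools import product

# IUPAC ambiguity code W stands for A or T.
W = "AT"

def wgggw_motifs():
    """Return the four pentanucleotides matching WGGGW."""
    return [f"{a}GGG{b}" for a, b in product(W, repeat=2)]

def find_tandem_runs(seq, motif, min_repeats=2):
    """Yield (start, repeat_count) for each maximal tandem run of motif in seq."""
    # Builds e.g. (?:AGGGA){2,} so that only uninterrupted repeats match.
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_repeats},}}")
    for m in pattern.finditer(seq.upper()):
        yield m.start(), (m.end() - m.start()) // len(motif)

# Hypothetical toy sequence: four AGGGA units flanked by unrelated bases.
toy = "CC" + "AGGGA" * 4 + "TT"
print(wgggw_motifs())
print(list(find_tandem_runs(toy, "AGGGA")))
```

The same helper can be applied to the nonpathogenic AGAAA reference motif for comparison, since it takes any motif string.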

## Linked entities

- **Genes:** RFC1 (replication factor C subunit 1) [NCBI Gene 5981]
- **Chemicals:** K⁺ (PubChem CID 813)
- **Diseases:** CANVAS (MONDO:0044720)

## Full-text entities

- **Genes:** RFC1 (replication factor C subunit 1) [NCBI Gene 5981] {aka A1, CANVAS, MHCBFB, PO-GA, RECC1, RFC}, VEGFA (vascular endothelial growth factor A) [NCBI Gene 7422] {aka L-VEGF, MVCD1, VEGF, VPF}, RPA1 (replication protein A1) [NCBI Gene 6117] {aka HSSB, MST075, PFBMFT6, REPA1, RF-A, RP-A}
- **Diseases:** Friedreich's ataxia (MESH:D005621), SCA31 (MESH:C566146), neurological disorders (MESH:D009461), Vestibular Areflexia Syndrome (MESH:D000071699), NMM (MESH:C563209), Cerebellar Ataxia (MESH:D002524), Parkinson disease (MESH:D010300), CANVAS (MESH:C000726747), genetic anomaly (MESH:D020022), neurogenerative disease (MESH:D001750), neurodegenerative diseases (MESH:D019636), Neuropathy (MESH:D009422), toxicity (MESH:D064420), genetic disease (MESH:D030342), FAME (MESH:C564313), spinocerebellar ataxia (MESH:D020754)
- **Chemicals:** DMSO (MESH:D004121), K (MESH:D011188), N-methyl mesoporphyrin IX (MESH:C065420), d (MESH:D003903), ThT (MESH:C009462), sucrose (MESH:D013395), FAM (MESH:C031179), poly(A) (MESH:D011061), T (MESH:D014316), Polyacrylamide (MESH:C016679), Oligonucleotides (MESH:D009841), bisacrylamide (MESH:C021221), water (MESH:D014867), KCl (MESH:D011189), NaCl (MESH:D012965), G4 (MESH:D004003), guanine (MESH:D006147), n (MESH:D009584), magnesium (MESH:D008274), A (MESH:D001151), LiCl (MESH:D018021), MgCl2 (MESH:D015636), acrylamide (MESH:D020106), LiOH (MESH:C028467), TBE (-), PhenDC3 (MESH:C000710336), 13C (MESH:C000615229), cacodylic acid (MESH:D002101), Li+ (MESH:D008094), dT (MESH:D013936), D2O (MESH:D017666)
- **Species:** Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Homo sapiens (human, species) [taxon 9606], Giardia duodenalis (species) [taxon 5741]
- **Mutations:** C) in 100, GGGUUA)3GGG, C) for 2, (E) in 100, AGGGT)3GGG, (A) in 100, C at 295

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12856209/full.md

---
Source: https://tomesphere.com/paper/PMC12856209