Thymine DNA glycosylase binds to R-loops and excises 5-formyl and 5-carboxyl cytosine from DNA/RNA hybrids
Baiyu Zhu, Lakshmi S. Pidugu, Mary E. Cook, Xinyu Y. Nie, E.A.P. Tharaka Amarasekara, Jerome S. Menet, Alexander C. Drohat, Jonathan T. Sczepanski

TL;DR
This study shows that TDG can bind to R-loops and remove specific modified cytosines from DNA/RNA hybrids, suggesting a role in DNA demethylation.
Contribution
The study demonstrates TDG's ability to excise 5fC and 5caC from DNA/RNA hybrids and provides mechanistic insights into R-loop-mediated DNA demethylation.
Findings
TDG binds to R-loop substrates and excises 5fC and 5caC from DNA/RNA hybrids.
R-loops enable strand-specific TDG activity at CpGs, explaining asymmetric 5fC/5caC distribution.
19F NMR reveals mechanistic details of base excision in DNA/RNA hybrid duplexes.
Abstract
Once considered rare byproducts of transcription, R-loops are now recognized as important regulators of various nuclear processes. In particular, evidence indicates a role for R-loops in regulating DNA methylation dynamics. R-loops have been shown to promote active DNA demethylation—the enzymatic reversal of 5-methylcytosine back into cytosine—by recruiting associated proteins, providing an attractive targeting mechanism. Nevertheless, many aspects of this process, including whether the associated proteins bind to and function on DNA within R-loops, remain to be substantiated. Herein, we demonstrate that thymine DNA glycosylase (TDG), a key enzyme in the active DNA demethylation pathway, binds to synthetic R-loop substrates in vitro and can excise DNA demethylation intermediates 5-formylcytosine (5fC) and 5-carboxycytosine (5caC) from DNA in DNA/RNA hybrids. We also show that R-loops…
Taxonomy
Topics: DNA and Nucleic Acid Chemistry · Origins and Evolution of Life · DNA Repair Mechanisms
5-methylcytosine (5mC) is a highly conserved epigenetic modification of DNA found predominantly within CpG dinucleotides and is associated with heterochromatin formation leading to silencing of gene expression. Although 5mC was traditionally viewed as a static DNA modification, recent research has revealed that it is dynamic in nature, and reversal of 5mC back into cytosine (i.e., demethylation) is an essential biological pathway in mammals. The reversal process involves two primary mechanisms: passive and active DNA demethylation. Passive DNA demethylation occurs when the methylated DNA is progressively diluted through successive rounds of replication, primarily due to the deactivation or nuclear exclusion of maintenance DNA methyltransferase (DNMT) 1 or its cofactors. Alternatively, the enzymatic reversal of 5mC to cytosine (referred to as “active” DNA demethylation) involves the successive oxidation of 5mC into 5-hydroxymethylcytosine, 5-formylcytosine (5fC), and 5-carboxycytosine (5caC) by the ten-eleven translocation (TET) family of dioxygenases (Fig. 1) (1, 2, 3, 4, 5). Thymine DNA glycosylase (TDG) excises the two most oxidized cytosine derivatives, 5fC and 5caC, from DNA (4, 6), thereby initiating base excision repair (BER) and ultimately restoring unmodified cytosine (7, 8, 9, 10, 11, 12, 13, 14). The active DNA demethylation pathway is essential for maintaining proper epigenetic states during development, regulating hormone-dependent gene expression in differentiated cells, and preventing tumor formation. Therefore, a comprehensive understanding of the active DNA demethylation pathway and its component enzymes is crucial for deciphering how DNA methylation landscapes are established in both normal and disease conditions.
Figure 1. The active DNA demethylation pathway mediated by TET and TDG. TDG, thymine DNA glycosylase; TET, ten-eleven translocation.
TDG is a member of the uracil DNA glycosylase superfamily that was originally discovered to excise pyrimidine bases from G•U and G•T mispairs and to initiate BER at these sites (15, 16). This activity is thought to help protect the genome against C→T transition mutations that arise via the deamination of cytosine and 5mC, respectively. Importantly, as the only mammalian glycosylase currently known to be capable of removing 5fC and 5caC from DNA, TDG plays a central role in the active DNA demethylation pathway (3, 17). Depletion of TDG from mouse embryonic stem cells (mESCs) results in the accumulation of 5fC and 5caC at regulatory sites, such as promoters and enhancers, pointing to a role for TDG in epigenetic regulation of gene expression (13, 14, 18, 19, 20, 21). TDG’s catalytic activity is essential during development for preserving the epigenetic integrity of numerous developmental and tissue-specific genes; its loss in the germline results in embryonic lethality (11, 12). Beyond development, TDG-mediated DNA demethylation plays a key role during transcriptional activation of a subset of hormone-responsive genes in differentiated cells (22, 23, 24, 25, 26). Importantly, evidence suggests that 5fC and 5caC function as stable epigenetic modifications to regulate gene expression in addition to their roles as intermediates in the DNA demethylation pathway (27, 28, 29, 30, 31, 32). Thus, TDG plays a pivotal role in determining whether 5fC and 5caC are retained as potential epigenetic marks or removed via BER.
A key bottleneck in our current understanding of the active DNA demethylation pathway is that it remains unclear how the associated enzymes, including TDG, are targeted and regulated at the genome level. DNA demethylation occurs at specific promoters and enhancers in response to developmental or environmental cues, typically affecting only a limited number of CpG dinucleotides (11, 12, 33). Yet, it is unknown how this precise targeting is achieved. While most studies aimed at answering this question have focused on the interactions of the demethylation machinery with its protein partners, such as sequence-specific transcription factors, emerging evidence now indicates potential regulatory relationships with RNA (26, 34, 35). Indeed, several proteins involved in controlling DNA methylation dynamics, including DNMT1, DNMT3a/b, TETs, TDG, and GADD45A (growth arrest and DNA-damage–inducible, alpha) (25, 36, 37, 38, 39) are localized at target sites in an RNA-dependent manner. One possible mechanism for these regulatory functions involves the formation of R-loops. R-loops are three-stranded nucleic acid structures consisting of a DNA/RNA hybrid and a displaced ssDNA. While prevalent during transcription, R-loops have gained recent attention for having important roles in many other nuclear processes, including recombination and DNA repair. Importantly, evidence indicates a role for R-loops in regulating DNA methylation dynamics. Genome-wide mapping studies show that the formation of R-loops negatively correlates with DNA methylation, especially at gene promoters (40), consistent with their ability to inhibit the binding and catalytic activity of DNMTs. Other evidence suggests that R-loops can promote 5mC reversal. Notably, Arab et al. showed that several long noncoding RNAs (lncRNAs) form R-loops near transcriptional start sites (TSSs) and demonstrated that these structures can promote local DNA demethylation and gene expression (41). 
The proposed mechanism involves specific recognition of R-loops by the scaffolding protein GADD45A, which in turn recruits the DNA demethylation machinery, including TET1 and TDG, to the underlying CpG sites for processing. Because R-loops are formed in a sequence-specific manner, this provides an attractive mechanism for the precise targeting of CpGs for demethylation. Interestingly, many of the R-loops implicated in directing DNA demethylation activities overlap with target CpG sites, suggesting that DNA demethylation may occur directly on these unique hybrid structures. In the case of TDG, for example, this would require recognition and base excision of 5fC and 5caC on DNA/RNA hybrid substrates. However, whether TDG or other components of the demethylation machinery function on R-loops remains to be substantiated.
To address these questions, we investigated the activity of TDG on R-loop substrates. We show that TDG binds synthetic R-loop substrates in vitro and is capable of excising 5fC and 5caC from DNA/RNA hybrid duplexes, establishing a biochemical mechanism by which DNA demethylation could occur directly over R-loops. Furthermore, we demonstrate that R-loops can direct the strand selectivity of TDG and protect against double-strand break (DSB) formation at symmetrically modified CpG dinucleotides. Finally, we provide evidence consistent with TDG association with R-loops in mammalian cells. Together, our results define a previously unrecognized biochemical capability of TDG and provide a framework for future studies examining potential roles of R-loops in epigenetic regulation, particularly in the context of TDG-mediated DNA demethylation.
Results
TDG binds DNA/RNA hybrid duplexes in vitro
To begin our investigation, we asked whether TDG is capable of binding to DNA/RNA hybrid duplexes in vitro. We used agarose gel-based electrophoretic mobility shift assays to measure the affinity of TDG toward either DNA/DNA duplexes or DNA/RNA hybrids containing 2′-deoxy-2′-fluoroarabinouridine (U^F^), a noncleavable substrate analogue of 2′-deoxyuridine (U) (Fig. 2A). The U^F^ modification was placed within a centrally positioned CpG dinucleotide (5′-XpG-3′/5′-CpG-3′), which is the preferred sequence context of TDG and the most relevant biological target in the context of active DNA demethylation (42, 43, 44, 45). Furthermore, two different sequences were used in order to demonstrate the generality of the observations. As shown in Figure 2, B and C, TDG was indeed capable of binding to DNA/RNA hybrid duplexes H1-U^F^ and H2-U^F^ with an apparent Kd of 63 nM and 101 nM, respectively (Table 1). Surprisingly, these interactions were only ∼2- to 3-fold weaker than for binding to the corresponding DNA/DNA duplexes (D1-U^F^ and D2-U^F^, respectively; Fig. 2, B and C and Table 1), indicating that TDG can easily accommodate the noncanonical structure of the DNA/RNA hybrid. When the TDG–hybrid complexes were further resolved using native polyacrylamide gels (Fig. S1), TDG was found to form both a 1:1 complex and a proposed 2:1 TDG:substrate complex with the DNA/RNA hybrids. Thus, in addition to binding to the U^F^ modification, a second molecule of TDG can form a nonspecific complex at an adjacent unmodified site, similar to what has been reported for TDG binding to DNA/DNA duplexes (44). Even in the absence of the U^F^ modification (H1-C and H2-C), both DNA/RNA hybrid substrates were bound tightly by TDG (Fig. S2, A and B and Table 1). This observation is consistent with TDG’s high affinity for CpG sites regardless of their modification state (44). 
In agreement with these similar Kd values, competition binding experiments showed that DNA/RNA hybrid duplexes can effectively compete with DNA/DNA duplexes for TDG binding (Fig. S2C). It is noteworthy that the Hill coefficients for the hybrid substrates (H1/2-U^F^) are approximately 2-fold higher than those for the corresponding DNA duplexes (D1/2-U^F^). While Hill coefficients are not always easily interpreted, it is possible that this observation reflects asymmetric binding to the DNA/RNA hybrid with respect to the strand. Such behavior could arise if one strand (most likely the DNA strand within the hybrid) provides a higher-affinity or catalytically preferred binding interface, resulting in apparent positive cooperativity or strand-biased binding dynamics. This hypothesis requires further investigation. Taken together, these results show that TDG binds to DNA/RNA hybrid duplexes regardless of the modification state and with an affinity that is only slightly lower than for canonical DNA/DNA duplexes.
Figure 2. TDG binds to DNA/RNA hybrid duplexes. A, sequences used in this study. Black and red colors denote DNA and RNA, respectively. See also Table S1. B and C, representative EMSA data and corresponding saturation plots for binding of TDG to either D1/H1 (B) or D2/H2 (C). For each reaction, the indicated substrate (5 nM) was incubated with TDG (0–300 nM) in a buffer containing 100 mM NaCl, 2.5 mM MgCl_2_, 10 mM Tris–HCl (pH 7.5), and 5% glycerol for 30 min at 30 °C. Data are mean ± SD (n = 3). EMSA, electrophoretic mobility shift assay; TDG, thymine DNA glycosylase.
Table 1. Equilibrium dissociation constants for TDG binding to the indicated substrate
Substrate    Kd (nM)    95% CI    Hill coef.
D1-U^F^      32         28–36     1.8
H1-U^F^      63         60–66     4.5
D2-U^F^      32         28–36     2.0
H2-U^F^      101        98–105    5.3
D1-C         43         38–48     2.3
H1-C         67         60–74     4.5
D2-C         47         44–49     2.9
H2-C         57         55–59     6.1
D1-5fC       56         53–58     3.0
H1-5fC       60         57–64     4.9
D1-5caC      25         24–27     2.0
H1-5caC      31         29–33     2.7
95% CI, 95% confidence interval.
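As a rough illustration of what the Kd and Hill coefficient values in Table 1 imply for the shape of a saturation curve, the following sketch evaluates a standard Hill binding model. The text does not state the exact equation used for fitting, so the functional form, the function name `hill_fraction_bound`, and the chosen concentrations are assumptions made for illustration only.

```python
def hill_fraction_bound(tdg_nM: float, kd_nM: float, hill: float) -> float:
    """Fraction of substrate bound at a given TDG concentration.

    Assumed model (not confirmed by the text): theta = [TDG]^n / (Kd^n + [TDG]^n).
    """
    if tdg_nM <= 0:
        return 0.0
    return tdg_nM ** hill / (kd_nM ** hill + tdg_nM ** hill)


# Apparent Kd (nM) and Hill coefficient, copied from Table 1.
TABLE_1 = {
    "D1-UF": (32.0, 1.8),
    "H1-UF": (63.0, 4.5),
}

# Illustrative TDG concentrations within the 0-300 nM range used in the assay.
CONCENTRATIONS_NM = (10, 30, 100, 300)

for name, (kd, n) in TABLE_1.items():
    curve = [round(hill_fraction_bound(c, kd, n), 3) for c in CONCENTRATIONS_NM]
    print(name, curve)
```

Under this model, a larger Hill coefficient gives a steeper transition around Kd, so the hybrid curve would rise more sharply than the duplex curve even though its midpoint lies at a higher TDG concentration.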
TDG can excise G•U pairs, but not G•T mispairs, from DNA/RNA hybrid duplexes
Having shown that TDG is capable of binding to DNA/RNA hybrids, we next asked whether it was catalytically active on these substrates. While the binding of TDG to DNA/RNA hybrids could be potentially explained by a series of nonspecific interactions, base excision requires precise positioning of the substrate within the active site of the enzyme. Therefore, we determined the rate of TDG-mediated excision of G•U or G•T pairs from DNA/RNA hybrid substrates using single-turnover kinetic experiments performed using a large excess of TDG relative to substrate (Fig. S3A). Single-turnover conditions are typically employed to study TDG catalysis due to strong product inhibition (46, 47, 48). Remarkably, TDG exhibited robust excision activity for G•U pairs within both DNA/RNA hybrid duplexes tested (Fig. 3A and Table 2). However, the rate of U excision from H1-U and H2-U (kmax = 1.05 min^−1^ and 0.33 min^−1^, respectively) was slower than for the equivalent DNA/DNA duplexes (Table 2). In sharp contrast, TDG was nearly inactive on DNA/RNA hybrids containing a G•T mispair (H1-T and H2-T) (Fig. 3B and Table 2). TDG is known to have reduced activity toward G•T mispairs compared to G•U pairs within DNA/DNA duplexes due to a steric clash between the dT methyl group and TDG that hinders flipping of the base into the enzyme active site (49, 50). Nucleotide flipping by TDG has been shown to be strongly dependent on the local environment (e.g., sequence context) and is directly related to the efficiency of base excision (49). Thus, the inability of TDG to excise G•T mispairs from a DNA/RNA hybrid (and its reduced activity toward G•U pairs) is possibly the result of impaired nucleotide flipping when the opposing strand is RNA. This hypothesis is explored more carefully below. 
Consistent with these single-turnover results, multiturnover experiments performed under enzyme-limiting conditions showed that TDG remains catalytically active on DNA/RNA hybrids, with turnover enhanced by the addition of APE1 (Fig. S3B).
Figure 3. TDG is catalytically active on DNA/RNA hybrids. A, single-turnover kinetics of TDG (1000 nM) acting on the indicated G•U containing substrate (100 nM). B, single-turnover kinetics of TDG or TDG^A145G^ (1000 nM) acting on the indicated G•T containing substrate (100 nM). The data for H1-T is obscured by that for H2-T. The data for H1-T and H2-T were fitted to a linear equation. C, single-turnover kinetics of TDG (1000 nM) acting on the indicated 5fC/5caC-containing substrate (100 nM). All reactions contained 100 mM NaCl, 2.5 mM MgCl_2_, and 10 mM Tris–HCl (pH 7.5) and were carried out at 30 °C. Data are mean ± SD (n = 3). 5caC, 5-carboxycytosine; 5fC, 5-formylcytosine; TDG, thymine DNA glycosylase.
Table 2. TDG excision activity (kmax) for the indicated substrates at 30 °C
Substrate                kmax (min^−1^)    95% CI
D1-U                     17.7              14.6–21.8
H1-U                     1.05              0.95–1.15
D2-U                     11.1              9.1–13.5
H2-U                     0.33              0.30–0.37
D1-T                     0.19              0.18–0.20
H1-T^a^                  0.0007            0.00065–0.00074
H1-T (TDG^A145G^)^a^     0.02              0.019–0.021
D2-T                     0.38              0.36–0.41
H2-T^a^                  0.00066           0.00048–0.00085
H2-T (TDG^A145G^)^a^     0.008             0.007–0.01
D1-5fC                   0.65              0.6–0.71
H1-5fC                   0.13              0.12–0.14
D1-5caC                  0.72              0.66–0.8
H1-5caC                  0.12              0.11–0.13
TCF21-DNA                8.22              7.0–11.3
TCF21-hybrid             0.44              0.42–0.47
TCF21-loop               0.06              0.05–0.07
TCF21-flap               0.71              0.67–0.75
TCF21-loopEX             0.11              0.10–0.12
TCF21-loopEX-5fC         0.05              0.03–0.06
95% CI, 95% confidence interval.
^a^ Data were fitted to a linear equation. In the case of H1/2-T (TDG^A145G^), only the initial linear phase of the progress curve was fitted.
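To give a feel for what the kmax values in Table 2 mean on the time scale of an experiment, the sketch below models a single-turnover progress curve as a single exponential. The text only states that the slowest substrates were fitted to a linear equation; the exponential form for the others, the function names, and the 5 min time point are assumptions for illustration.

```python
import math


def product_fraction(t_min: float, kmax: float, amplitude: float = 1.0) -> float:
    """Fraction of substrate converted to product after t_min minutes.

    Assumed model: fraction = amplitude * (1 - exp(-kmax * t)).
    """
    return amplitude * (1.0 - math.exp(-kmax * t_min))


def half_time_min(kmax: float) -> float:
    """Time (min) at which half of the final amplitude has formed."""
    return math.log(2) / kmax


# kmax values (min^-1) for G.U substrates, copied from Table 2.
KMAX = {"D1-U": 17.7, "H1-U": 1.05, "H2-U": 0.33}

for name, k in KMAX.items():
    print(
        f"{name}: t1/2 = {half_time_min(k):.2f} min, "
        f"product at 5 min = {product_fraction(5.0, k):.2f}"
    )
```

Under this assumption, every one of these G•U substrates would be at least about 80% processed within 5 min, which is in line with the description of excision from the hybrids as robust but slower than from the DNA/DNA duplexes.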
TDG can excise 5fC and 5caC from DNA/RNA hybrid duplexes
If R-loops are targeted for DNA demethylation in vivo, as prior work suggests (40, 41), then TDG should be capable of excising oxidized cytosine derivatives from these substrates. To test this directly, we examined the capacity of TDG to excise 5fC and 5caC from CpG sites in a DNA/RNA hybrid substrate (H1-5fC and H1-5caC, respectively; Fig. 2A). We note that TDG’s affinity for hybrids H1-5fC and H1-5caC was only slightly weaker than for the corresponding DNA/DNA duplexes (Fig. S4 and Table 1). Using single-turnover kinetic experiments, we found that the rate of 5fC and 5caC excision from the hybrid substrates (kmax = 0.13 min^−1^ and 0.12 min^−1^, respectively) was only about 5-fold slower than from the corresponding DNA/DNA duplex (Table 2). Thus, TDG possesses substantial activity for excising 5fC and 5caC from DNA/RNA hybrids, suggesting that R-loops are compatible with TDG-mediated active DNA demethylation.
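The "about 5-fold" comparison above can be checked directly from the kmax values listed in Table 2. This is only a small arithmetic sketch; the dictionary layout and the helper name `fold_slower` are illustrative choices, not part of the original analysis.

```python
# kmax values (min^-1) copied from Table 2.
KMAX = {
    "D1-5fC": 0.65,
    "H1-5fC": 0.13,
    "D1-5caC": 0.72,
    "H1-5caC": 0.12,
}


def fold_slower(duplex_kmax: float, hybrid_kmax: float) -> float:
    """How many times slower excision is on the hybrid than on the duplex."""
    return duplex_kmax / hybrid_kmax


for base in ("5fC", "5caC"):
    fold = fold_slower(KMAX[f"D1-{base}"], KMAX[f"H1-{base}"])
    print(f"{base}: {fold:.1f}-fold slower on the DNA/RNA hybrid")
```

With these inputs the ratios come out near 5 for 5fC and near 6 for 5caC. Because the reported 95% confidence intervals are not propagated here, the ratios should be read as approximate.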
Nucleotide flipping by TDG is impaired in DNA/RNA hybrids
The kinetic experiments above revealed that TDG has reduced glycosylase activity on DNA/RNA hybrids compared to DNA/DNA duplexes. Given that our rate constants (kmax) were obtained using single-turnover experiments, this observation indicated that the opposing RNA strand adversely impacted the chemical step and/or any preceding steps (e.g., conformational changes) that occur following substrate binding, which includes nucleotide flipping. Previous studies have shown that nucleotide flipping by TDG is strongly dependent on DNA context and that the flipping equilibria and glycosylase activity are positively correlated (49). Thus, we hypothesized that the unique structural properties of the DNA/RNA hybrid may hinder nucleotide flipping, leading to the observed reduction in kmax for base excision. To examine nucleotide flipping by TDG on DNA/RNA hybrids, we employed 2′-fluoro-substituted deoxynucleotides, U^F^ and 2′-deoxy-2′-fluoroarabinothymidine (T^F^) (Fig. 2A), together with ^19^F NMR. The chemical environment of the fluorine atom within U^F^ and T^F^ was previously shown to differ greatly for the stacked and flipped conformations in duplex DNA, leading to different ^19^F chemical shifts when measured by ^19^F NMR (49). This allows the equilibrium constant for reversible nucleotide flipping (Kflip) to be readily determined by comparing the integrals for peaks corresponding to the flipped (IF) and stacked (IS) states (Kflip = IF/IS).
As shown in Figure 4A, the ^19^F NMR spectrum for the DNA duplex containing a G•U^F^ pair (D2-U^F^) featured a single peak (δ^19^F of −116.0 ppm), consistent with the expectation that U^F^ is predominantly stacked within the unbound duplex. This peak was lost upon the addition of TDG and a new broader peak was formed ∼6 ppm upfield (δ^19^F of −122.0 ppm), which was attributed to the flipped conformation(s) of U^F^ (Fig. 4B). These spectra confirmed that the vast majority of U^F^ is flipped for TDG-bound D2-U^F^ (Kflip >49) and were fully consistent with prior ^19^F NMR measurements of G•U^F^ containing DNA duplexes (49, 51). Similar to D2-U^F^, the ^19^F NMR spectrum for the DNA/RNA hybrid containing a G•U^F^ pair (H2-U^F^) showed a single peak (δ^19^F of −116.8 ppm), indicating that U^F^ was also predominantly stacked within the unbound hybrid duplex (Fig. 4C). This peak was shifted slightly upfield compared to the free DNA duplex D2-U^F^ (Δδ^19^F of 0.8 ppm), indicating a different chemical environment around the stacked G•U^F^ pair within the hybrid. The addition of TDG to hybrid H2-U^F^ again resulted in formation of a new upfield peak (δ^19^F of −122.6 ppm) corresponding to U^F^ flipped into the TDG active site (Fig. 4D). However, a small peak corresponding to the stacked G•U^F^ pair remained (δ^19^F of −117.3 ppm), indicating that U^F^ was not fully flipped by TDG in the DNA/RNA hybrid. The Kflip for H2-U^F^ was calculated to be 3.0, which is at least an order of magnitude lower than for D2-U^F^ (Kflip >49). These data reveal that flipping of U^F^ from G•U^F^ pairs by TDG is impaired in DNA/RNA hybrids relative to DNA duplexes.
Figure 4. DNA/RNA hybrids impair nucleotide flipping by TDG. ^19^F NMR spectra of either DNA/DNA duplex D2-U^F^ (A and B) or DNA/RNA hybrid H2-U^F^ (C and D) in the presence or absence of TDG, collected at 25 °C. Downfield peaks (near −116 ppm) correspond to the stacked (nonflipped) conformation of U^F^, while the upfield peaks (near −122 ppm) represent the flipped conformation. TDG, thymine DNA glycosylase.
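The relation Kflip = IF/IS can be turned into a flipped-state population with one line of algebra: the flipped fraction is Kflip / (1 + Kflip). The sketch below applies this to the reported Kflip values. The function names and the example peak integrals are hypothetical and are not taken from the spectra.

```python
def k_flip(flipped_integral: float, stacked_integral: float) -> float:
    """Equilibrium constant for nucleotide flipping, Kflip = IF / IS."""
    if stacked_integral <= 0:
        raise ValueError("stacked peak integral must be positive")
    return flipped_integral / stacked_integral


def fraction_flipped(kflip: float) -> float:
    """Population of the flipped state implied by a given Kflip."""
    return kflip / (1.0 + kflip)


# Reported values: Kflip = 3.0 for H2-UF and a lower bound of 49 for D2-UF.
print("H2-UF flipped fraction:", fraction_flipped(3.0))
print("D2-UF flipped fraction (lower bound):", fraction_flipped(49.0))

# Hypothetical peak integrals (arbitrary units), for illustration only.
print("Kflip from integrals 75 and 25:", k_flip(75.0, 25.0))
```

Read this way, Kflip = 3.0 corresponds to about three quarters of the bound U^F^ being flipped, while Kflip > 49 corresponds to more than 98% flipped, meaning less than roughly 2% remains stacked.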
Next, we examined the excision of G•T mispairs from DNA/RNA hybrids using the A145G mutant of TDG (TDG^A145G^). Previous structural and biochemical studies have shown that Ala145 of TDG hinders nucleotide flipping of dT due to a steric clash between its methyl group and that on the thymine base (50). Consistently, the A145G mutation greatly increases both Kflip and glycosylase activity for G•T mispairs but has little effect on G•U pairs. Thus, if the inability of TDG to excise G•T mispairs from DNA/RNA hybrids was due to hindered nucleotide flipping, we expected TDG^A145G^ to rescue this activity. Indeed, TDG^A145G^ was capable of excising G•T mispairs from both hybrid substrates H1-T and H2-T (Fig. 3B). Despite the relatively slow rates, this represented a 28- and 12-fold increase in the rate of base excision compared to the WT enzyme on the same substrates, respectively (Table 2). Given this result, we attempted to determine Kflip for H2-T^F^ in the presence of TDG^A145G^ using ^19^F NMR as above (Fig. S5). However, only a single peak (δ^19^F −117.4 ppm) corresponding to the stacked conformation of the G•T^F^ pair was observed in both the presence and absence of TDG^A145G^, suggesting that the population of flipped T^F^ is too low to be observed (<∼2%) for H1-T^F^ bound to TDG^A145G^. While we cannot exclude the possibility that the ^19^F chemical shift for T^F^ is identical in the stacked and flipped conformations, this seems highly unlikely, given the large flipping-induced chemical shift perturbation for D2-T^F^ bound to TDG (Fig. S5).
Taken together, these data strongly suggest that the reduced rate of base excision by TDG on DNA/RNA hybrids is due to impaired nucleotide flipping.
TDG functions on authentic R-loop structures
The above experiments showed that DNA/RNA hybrids can act as substrates for TDG. However, we sought to demonstrate TDG’s ability to function on an authentic R-loop structure derived from an endogenous sequence known to be targeted for DNA demethylation. For this, we chose the promoter region of the tumor suppressor gene TCF21. The TCF21 promoter was shown to form an R-loop with the lncRNA TARID (TCF21 antisense RNA inducing promoter demethylation), triggering local DNA demethylation and TCF21 expression (41). Importantly, the R-loop is positioned over several CpG sites shown to undergo demethylation in a TDG-dependent manner, suggesting that TDG can carry out base excision directly on this R-loop. We synthesized a DNA substrate consisting of nucleotides −29 to +36 (relative to the TSS) of the TCF21 promoter and positioned an R-loop near the center (TCF21-loop) such that it covered several of the targeted CpG sites (Fig. 5A and Fig. S6A). To probe the potential impact of the R-loop structure on TDG activity, we also generated the corresponding “flap” (TCF21-flap) and hybrid (TCF21-hybrid) substrates (Fig. 5B). TDG bound tightly to these substrates in the absence of any DNA modifications. The Kd values for the various R-loop substrates (30–50 nM) were similar to the DNA/DNA duplex control (TCF21-DNA, Kd = 49 nM) (Fig. S6, B and C). To investigate TDG’s incision activity, we incorporated a G•U pair at CpG #2 located centrally within the DNA/RNA hybrid region of these substrates and determined the rate of excision using single-turnover kinetics (Fig. 5C and Table 2). Interestingly, compared to TCF21-hybrid, the rate of G•U excision from TCF21-loop was much slower (kmax is reduced 7-fold), suggesting that the presence of the ssDNA loop impeded base excision. In contrast, the rate of G•U excision from the flap substrate (TCF21-flap) was similar to the hybrid (TCF21-hybrid). 
TDG has been shown to bend its DNA substrate by as much as 70° upon binding (52), which is believed to help facilitate flipping of the target nucleobase into the active site for subsequent excision. Therefore, we hypothesized that the reduced rate of excision from TCF21-loop relative to TCF21-flap was due to structural constraints imposed by the DNA loop that impeded TDG’s ability to bend the substrate. Consistently, increasing the size of the ssDNA loop from 15 nt to 30 nt (TCF21-loopEX) led to a ∼2-fold increase in base excision rate (Fig. 5C and Table 2). Given that endogenous R-loops typically span tens to hundreds of nucleotides, often extending over ∼100 to 500 nt in transcriptionally active regions, TCF21-loopEX likely represents the most biologically relevant R-loop architecture with reduced structural constraints (40, 53, 54). Importantly, TDG was also capable of excising 5fC from the TCF21-loopEX substrate (TCF21-loopEX-5fC; Fig. 5C), although at a reduced rate compared to dU (Table 2), consistent with the results from the DNA/RNA hybrid experiments. Together, these results demonstrate that TDG is able to bind authentic R-loop structures and excise bases from them, and suggest that TDG’s activity on R-loops could be dependent on the length and structure of the ssDNA loop.
Figure 5. TDG functions on authentic R-loop structures. A, DNA used in this work. The sequence is derived from the TCF21 gene promoter (−29 to +36 relative to the TSS). The blue line indicates the position of the endogenous R-loop, whereas the red line indicates the position of the R-loop used herein. Individual CpG dinucleotides are numbered and the position of dU incorporation is indicated by blue text. See Figure S6 for sequence details. B, schematic illustration of the TCF21-derived substrates. Black and red colors denote DNA and RNA, respectively. C, single-turnover kinetics of TDG (1000 nM) acting on the indicated substrate (100 nM). Reaction conditions are identical to those described in Figure 3. Data are mean ± SD (n = 3). TDG, thymine DNA glycosylase; TSS, transcriptional start site.
R-loops direct the strand selectivity of TDG and prevent DSB formation at symmetrically modified CpG
Active DNA demethylation has been shown to occur in a strand-selective fashion at transcriptionally active promoters (22, 55). Furthermore, genome-wide mapping studies of 5fC and 5caC suggest that TDG’s processivity for these modifications is strand selective at some CpGs. However, it remains unclear how this selectivity is achieved, especially considering that TDG has no apparent strand preference when presented with symmetrically modified CpGs in vitro (7). In light of our findings above, we reasoned that the strand selectivity of TDG could be influenced by R-loops. Because TDG is catalytically inactive on ssDNA (56), only the DNA strand that is hybridized to the RNA is expected to be a substrate for base excision. Thus, for CpGs within an R-loop, TDG’s activity should be directed toward the DNA strand contained within the DNA/RNA hybrid, even if both sides of the CpG contain a potential substrate. To test this directly, we assembled a 60-bp DNA substrate in which the central CpG site was symmetrically modified with G•U pairs (symGU) (Fig. 6A and Table S1). The top and bottom strands of symGU were labeled with Cy5 and Cy3, respectively, allowing for base excision on both strands to be monitored simultaneously within the same reaction. As shown in Figures 6A and S7, TDG acted evenly on both strands of symGU under single-turnover conditions. The amount of single-strand incision rapidly reached 50% for both strands before starting to plateau. This observation is consistent with prior results showing that the processing of one side of a CpG largely inhibits processing of the other, likely due to the tight interaction of TDG with the AP-site product (7). Nevertheless, the extent of excision reached >80% for both strands after 30 min. In stark contrast, formation of a 15-nt long R-loop over the symmetrically modified CpG (symGU-R) resulted in almost exclusive base excision from the top strand (i.e., the DNA/RNA hybrid) (Figs. 6B and S7). 
The rate of excision from the top strand (kmax = 2.0 min^−1^) was >700-fold faster than from the bottom, unhybridized strand (kmax = 0.0027 min^−1^). These results demonstrate that R-loops can effectively direct the strand selectivity of TDG.
Figure 6. Processing of symmetrically modified R-loops by TDG. A and B, single-turnover kinetics of TDG (1000 nM) acting on either (A) symGU or (B) symGU-R (100 nM). Substrates were labeled on the top strand with Cy5 (red) and on the bottom strand with Cy3 (green). Reaction conditions are identical to those described in Figure 3. Data are mean ± SD (n = 3). C, representative native PAGE gel showing the formation of DSBs. The indicated substrate (100 nM) was treated with TDG (200 nM) and/or APE1 (20 nM) in a buffer containing 100 mM NaCl, 2.5 mM MgCl_2_, and 10 mM Tris–HCl (pH 7.5) for 30 min at 30 °C. Asterisks indicate the nicked duplex. The diamond indicates a DSB control product generated via the treatment of symGU with restriction enzyme Hpy188I. DSB, double strand break; TDG, thymine DNA glycosylase.
Base excision of symGU exceeded 50% on both strands under the conditions used herein (Fig. 6A), indicating that TDG-initiated BER could induce the formation of DNA DSBs at CpGs that are symmetrically modified with its nucleobase substrates. Indeed, when monitored by native gel electrophoresis (Fig. 6C), the combined action of TDG and APE1 produced a substantial fraction of DSBs from DNA symGU (>80%). Similar results have been reported at CpGs symmetrically modified with 5caC (7). However, no detectable DSBs were produced upon incubation of the R-loop substrate symGU-R with TDG and APE1 (Fig. 6C), consistent with the almost exclusive processing of the top, hybrid strand by the two enzymes. We note that efficient incision of DNA AP sites by APE1 in DNA/RNA hybrids has been previously reported (57). Thus, in addition to directing the strand selectivity of TDG at symmetrically modified CpG substrates, the formation of R-loops may help avoid the formation of cytotoxic DSBs during active DNA demethylation.
TDG associates with R-loop–enriched regions in cells
Genome-wide mapping studies of TDG occupancy show that TDG is enriched at promoters and enhancers of active genes, a distribution similar to R-loops (18, 33, 58, 59, 60). To investigate this potential overlap more carefully, we compared the genome-wide distribution of TDG binding sites and R-loops in mESCs using published TDG chromatin immunoprecipitation sequencing (ChIP-seq) (18) and MapR (60) datasets, respectively. MapR employs a catalytically inactive RNase H to target micrococcal nuclease to R-loops to cleave and release them for high-throughput sequencing. As shown in Figure 7A and S8, TDG binding sites were strongly enriched for MapR signals. Of the 71,772 TDG ChIP-seq peaks analysed, 21,602 of them (30.1%) overlapped with a MapR peak (i.e., an R-loop). Overlap was strongest at gene promoters (referred to as TDG/R-loop promoters; 48.4%, TSS −1 kb to +100 bp), with overlapping peaks in these regions being highly enriched for marks of active transcription (H3K27ac and H3K4me3) (Fig. 7, A–C). Because MapR signal can be influenced by chromatin accessibility (60), we acknowledge that some overlap between TDG ChIP-seq and MapR peaks may reflect independent enrichment in transcriptionally active, open chromatin rather than direct engagement of TDG with DNA/RNA hybrids. Nevertheless, the substantial coenrichment of TDG and MapR signals, particularly at gene promoters, is consistent with TDG localizing to transcriptionally active, R-loop–enriched chromatin (61). Given that 5fC and 5caC are also enriched at the promoters of actively transcribed genes (18, 19), we asked whether TDG/R-loop promoters undergo active DNA demethylation by comparing overlapping TDG ChIP-seq and MapR peaks with genome-wide maps of 5fC and 5caC generated using control and TDG-silenced mESCs (18). Indeed, we found that 5fC and 5caC levels at TDG/R-loop promoters increased upon depletion of TDG (Fig. 
7D), consistent with an active TDG-dependent 5fC and 5caC excision mechanism taking place at these sites.
Figure 7. TDG interacts with R-loops in mammalian cells. A, heat map representations on a window of ±3 kb around the center of TDG ChIP-seq peaks. Reads were parsed based on the overlap of TDG peaks with MapR peaks and ordered based on TDG ChIP-seq signal. B, pie charts comparing the genome-wide distribution of all TDG peaks (n = 71,772) to those that overlap with R-loops (n = 21,602). Promoter-TSS: ±1 kb from TSS; extended promoter: −10 kb to −1 kb from TSS. C, percent of TDG peaks at R-loops. D, box plot of the percentage of total 5fC/5caC at TDG/R-loop promoters in TDG knockdown (shTDG) versus control mESCs. ∗∗p < 0.01. E, TDG is associated with cellular R-loops. Lysates from TDG-expressing HeLa cells were immunoprecipitated with the S9.6 antibody and coprecipitated TDG was visualized by Western blot (n = 3 biological replicates). Where indicated, lysates were treated with RNase H1 or the S9.6 antibody was treated with competitor (comp.) prior to the immunoprecipitation step. Quantification of precipitated TDG from three independent DRIP experiments is shown below. Data are mean ± SD normalized to the input (10%). 5caC, 5-carboxycytosine; 5fC, 5-formylcytosine; ChIP-seq, chromatin immunoprecipitation sequencing; DRIP, DNA-RNA immunoprecipitation; mESC, mouse embryonic stem cell; TDG, thymine DNA glycosylase; TSS, transcriptional start site.
To further investigate TDG–R-loop interactions in cells, we carried out DNA-RNA immunoprecipitation (DRIP) using the R-loop–specific antibody S9.6 and monitored the coprecipitation of TDG by Western blotting (40). The assay was carried out using HeLa cells that were transfected with a plasmid encoding full-length human TDG. As shown in Figure 7E, TDG was strongly enriched following immunoprecipitation (IP) of the cell lysates with the S9.6 antibody compared to the IgG control. Treatment of the lysates with RNase H1, which degrades RNA within DNA/RNA hybrids, abolished TDG precipitation. Furthermore, incubation of the S9.6 antibody with a synthetic R-loop competitor (but not a DNA duplex competitor) prior to the IP step abolished its ability to precipitate TDG, further confirming that the assay specifically enriched R-loop–bound TDG. Together, these DRIP results, combined with the observed overlap between TDG ChIP-seq peaks and genomic R-loop regions (Fig. 7A), are consistent with an association between TDG and R-loop–enriched chromatin in cells.
Discussion
Originally thought to be rare byproducts of transcription, R-loops have now been shown to play a key role in a variety of nuclear processes. In particular, the number of connections between R-loop formation and chromatin modifications continues to grow. Studies now indicate that R-loops may act as an epigenetic mark, being read by chromatin remodelers and other proteins to effect changes in chromatin state (62). Notably, R-loops have been shown to promote local DNA demethylation by recruiting associated proteins, including TETs and TDG (41). Herein, we showed that TDG, a central component of the active DNA demethylation machinery, binds DNA/RNA hybrid substrates that model R-loops in vitro and can excise oxidized cytosines from DNA/RNA hybrid duplexes. Furthermore, we demonstrated that R-loops can confer strand selectivity of TDG at CpG dinucleotides, providing a potential explanation for the strand-specific distribution of 5fC/5caC observed at gene promoters (22, 55). Finally, our data are consistent with TDG associating with R-loops in mammalian cells. Overall, our findings indicate that 5fC/5caC can be directly removed from DNA/RNA hybrids, underscoring a mechanistic link between R-loops and active DNA demethylation. While this manuscript was under review, Richina et al. (63) reported related findings demonstrating TDG activity on DNA/RNA hybrid and R-loop substrates. Together, these independent studies support a functional intersection between TDG and R-loop–associated nucleic acid structures. Our work extends these observations by providing a quantitative and mechanistic analysis of TDG activity on defined hybrid and authentic R-loop substrates.
DNA duplexes adopt a B-form helix whereas DNA/RNA hybrids favor A-form (64, 65, 66, 67, 68). Despite these structural differences, our data show TDG can accommodate both types of substrates, but how? A potential explanation lies in the intrinsic structural plasticity of the DNA/RNA hybrid. While DNA/RNA hybrids have many characteristics of the A-form, NMR studies and molecular dynamics simulations indicate that they exist as an ensemble of conformations having both B-like and A-like structures (64, 65, 66, 67, 68). TDG may exploit this structural plasticity to sculpt the hybrid to fit its DNA-binding pocket. Indeed, the ability of TDG and other DNA glycosylases to conformationally distort their substrates (e.g., through DNA bending and pinching) via protein–DNA interactions is well documented (43, 52, 69, 70). The catalytic pocket of TDG has also been shown to be adaptable, which could further facilitate its interactions with the hybrid. For example, cis–trans isomerization of proline 155 allows for large conformational changes to occur near the active site (71). Regardless of how TDG interacts with DNA/RNA hybrids, our ^19^F NMR studies reveal that the resulting complex differs from that made between TDG and duplex DNA and leads to impaired nucleotide flipping. Flipping of the target nucleobase into the TDG active site and stabilization of its extrahelical conformation requires a precise network of interactions with the substrate, which may simply be more difficult to achieve with the hybrid (e.g., due to steric clash). The unique thermodynamic and mechanical properties of the hybrid relative to duplex DNA are also expected to contribute to flipping. The thermal stability of DNA/RNA hybrids is generally greater than for duplex DNA, making the hybrid less prone to base flipping (72, 73, 74). Consistently, the ΔG°37 values for the unmodified hybrid sequences H1-C (ΔG°37 = −44.50 kcal mol^−1^) and H2-C (ΔG°37 = −39.50 kcal mol^−1^) used herein were calculated to be 2.2 kcal mol^−1^ and 2.45 kcal mol^−1^ lower than those of their corresponding DNA duplexes, respectively, using reported nearest neighbor parameters (in 1 M NaCl) (74, 75). Furthermore, theoretical studies predict that DNA/RNA hybrids are stiffer and more resistant to local deformation than DNA duplexes, potentially making them more difficult for TDG to bend (64). Bending of the substrate is believed to facilitate flipping of the target nucleobase into TDG’s active site for subsequent excision (52). Ultimately, addressing questions about how TDG binds to and excises nucleobases from DNA/RNA hybrids will require future structural studies.
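To give a sense of scale for the stability differences quoted above, the following minimal sketch back-calculates the duplex ΔG°37 values implied by the reported differences and converts each difference into a fold change in the helix-formation equilibrium constant using K = exp(−ΔG°/RT). The function names are illustrative, and the conversion assumes a simple two-state model at 37 °C (310.15 K); it is not part of the original analysis.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
T_37C = 310.15        # 37 °C expressed in kelvin


def implied_duplex_dg(hybrid_dg, difference):
    """Duplex dG implied by a hybrid dG that is `difference` kcal/mol lower."""
    return hybrid_dg + difference


def fold_stabilization(ddg, temperature=T_37C):
    """Ratio of equilibrium constants for a free-energy gap `ddg` (kcal/mol).

    Assumes a two-state transition, so K_hybrid / K_duplex = exp(ddg / RT).
    """
    return math.exp(ddg / (R_KCAL * temperature))


if __name__ == "__main__":
    # Hybrid dG values and differences are the ones quoted in the text.
    for name, hybrid_dg, diff in [("H1-C", -44.50, 2.2), ("H2-C", -39.50, 2.45)]:
        duplex_dg = implied_duplex_dg(hybrid_dg, diff)
        fold = fold_stabilization(diff)
        print(f"{name}: duplex dG = {duplex_dg:.2f} kcal/mol, fold = {fold:.1f}")
```

Under these assumptions, a difference of roughly 2 kcal mol^−1^ corresponds to a difference of tens-fold in the equilibrium constant, which is the sense in which the hybrids are described as more stable.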
Because R-loops form in a sequence-specific manner, they provide a compelling mechanism for the precise, locus-specific targeting of CpG sites for active DNA demethylation. Growing evidence indicates that, in certain contexts, R-loops act as molecular guides that recruit components of the DNA demethylation machinery, such as GADD45A, TET1, and TDG, to specific gene promoters (25, 41, 62). A well-characterized example involves the antisense lncRNA TARID (TCF21 antisense RNA inducing promoter demethylation), which forms an R-loop at the TCF21 promoter (41). This R-loop serves as a binding platform for GADD45A, which in turn triggers local DNA demethylation and activates TCF21 transcription through the recruitment of TET1 and TDG. The sequential progression of TARID transcription, R-loop formation, promoter demethylation, and TCF21 expression during the cell cycle underscores the temporal regulation of this process. Moreover, depletion of RNH1 or RNH2, enzymes that degrade the RNA component of R-loops, leads to an accumulation of DNA demethylation intermediates, reinforcing that R-loops are active sites of DNA demethylation in vivo. Consistent with this model, our data showing that TDG binds directly to R-loops and excises 5fC and 5caC from DNA/RNA hybrids provide mechanistic support for R-loop–mediated targeting of DNA demethylation, without asserting a universal role across genomic contexts.
Although TDG activity is reduced on DNA/RNA hybrids compared to fully duplex DNA, intrinsic catalytic rate alone does not necessarily predict substrate relevance in the nuclear environment, particularly given that TDG binds DNA/RNA hybrids with affinity comparable to that observed for duplex DNA (Table 1). Moreover, TDG and other BER enzymes are known to function on kinetically disfavored substrates in chromatin contexts, where accessibility, locus-specific recruitment, and regulatory interactions strongly influence substrate engagement (38, 41, 48, 76, 77). Accordingly, the reduced catalytic rate observed on DNA/RNA hybrids is likely secondary to chromatin- and context-dependent regulatory features associated with R-loops during active DNA demethylation. For example, R-loops are inhibitory for nucleosome formation and are typically associated with increased chromatin accessibility (78). We previously showed that nucleosomes and folded chromatin fibers impede DNA binding and base excision by TDG (48). Thus, R-loop formation may be important for providing TDG access to the underlying DNA, leading to increased 5fC/5caC turnover. Additionally, our results demonstrate that R-loops can target TDG-mediated base excision to the DNA strand contained within the DNA/RNA hybrid, even if both sides of the CpG contain a potential substrate. This provides a potential mechanistic explanation for how TDG can achieve strand-selective processing of 5fC and 5caC at symmetrically modified CpG dinucleotides. While this strand bias is imposed at the level of TDG-mediated excision, it does not require assumptions about the precise timing or context of upstream oxidation events. Notably, TET enzymes have been shown to retain activity on DNA/RNA hybrids and partially ssDNA (79), indicating that oxidation itself may not be intrinsically strand selective within R-loops. In this context, TDG-mediated strand selectivity emerges as a critical determinant of downstream demethylation outcomes. 
Additional biochemical and cellular experiments will be needed to test and validate models in which R-loops direct strand-selective steps of active DNA demethylation. Finally, our data suggest that R-loops play a protective role during active DNA demethylation by preventing the formation of DSBs. Most CpG dinucleotides in mammals are symmetrically methylated, generating a high potential for DNA DSBs upon demethylation and subsequent BER. Previous studies proposed that DSBs are mostly avoided due, in part, to the high affinity of TDG for its AP-site product, which prevents the opposite strand from being processed simultaneously (7). However, APE1 has been shown to stimulate the catalytic turnover of TDG by disrupting the product complex (80), suggesting that tight coupling of TDG and APE1 with the downstream BER machinery is necessary to avoid the formation of DSBs. This is highlighted by our observation that the combined activities of TDG and APE1 yielded significant DSBs in the absence of downstream BER enzymes (Fig. 6C). In contrast, no DNA DSBs were detected with R-loop substrates, which is attributed to the weak activity of TDG and APE1 on the looped-out ssDNA (81, 82). Thus, by essentially masking one side of the CpG from the BER machinery, the formation of R-loops provides a simple mechanism for avoiding DSBs during active DNA demethylation. It is worth noting here that TDG was found to be weakly active on G•T mispairs in DNA/RNA hybrids. This indicates that G•T mispairs, which arise frequently at CpG dinucleotides due to the higher rate of deamination of 5mC than unmethylated cytosine (83), may avoid repair as a result of R-loop formation. In this scenario, the formation of R-loops could be considered a disadvantage because it would promote C to T mutations within CpG dinucleotides.
In conclusion, our findings establish that TDG retains catalytic activity on DNA/RNA hybrid substrates in vitro and reveal previously unrecognized biochemical properties of TDG in this context. These results suggest several potential regulatory roles for R-loops and RNA during active DNA demethylation, but defining their physiological relevance and functional impact in cells will require future investigation. This work therefore provides a biochemical framework for exploring how TDG activity may intersect with R-loop biology.
Experimental procedures
Reagents
All synthetic oligonucleotides were either purchased from Integrated DNA Technologies or prepared by solid-phase synthesis on an Expedite 8909 DNA/RNA synthesizer. Nucleoside phosphoramidites and DNA synthesis reagents were purchased from Glen Research. Sulfo-Cyanine3 (Cy3; cat. no: 21320) and cyanine5 (Cy5; cat. no: 23320) NHS ester dyes were purchased from Lumiprobe Life Science Solutions. Pipes, 1 M, pH 8.0 (cat. no: J60618.AK), sodium deoxycholate detergent (cat. no: 89904), N-lauroylsarcosine sodium salt, 95% (cat. no: J60040.09), Dynabeads Protein A for IPs (cat. no: 10001D), NP-40 Surfact-Amps Detergent (cat. no: 85124), Halt Protease Inhibitor cocktail (cat. no: 78430), NuPAGE LDS sample buffer (cat. no: NP0007), Goat anti-Rabbit IgG (H + L) highly cross-adsorbed secondary antibody, Alexa Fluor Plus 647 (cat. no: A32733), and Lipofectamine 3000 Transfection Reagent (cat. no: L3000008) were purchased from Thermo Fisher Scientific. Hpy188I (cat. no: R0617S) and proteinase K (cat. no: P8107S) were purchased from New England Biolabs. Mouse IgG2a Isotype Control (cat. no: M5409-.1MG) and BL21 Rosetta (DE3) (cat. no: 70954-3) were purchased from Sigma-Aldrich. Anti-DNA-RNA hybrid [S9.6] Antibody (cat. no: ENH001) was purchased from Kerafast, Inc. TDG polyclonal antibody (cat. no: 13370-1-AP) was purchased from Proteintech. Dry Powder Milk (cat. no: M17200-500.0) was purchased from RPI-Research Products International.
Biological resources
HeLa S3 cells (cat. no: CCL-2.2) were purchased from American Type Culture Collection and cultured in Dulbecco's modified Eagle’s medium (DMEM; Thermo Fisher Scientific) supplemented with 25 mM Hepes, 1 mM GlutaMax, 10% FBS, and 100 U/ml Penicillin-Streptomycin (Thermo Fisher Scientific). Cells were maintained at 37 °C in a humidified CO_2_ (5%) atmosphere. The TDG mammalian expression vector (cat. no: HG13000-UT) was purchased from Sino Biological, Inc.
Protein expression and purification
Full-length human TDG was expressed and purified as described previously with minor modifications (84). In brief, the TDG plasmid (pET28a-hTDG; 35 ng) was transformed into BL21 Rosetta (DE3) cells and the outgrowth was used to prepare 4 × 25 ml cultures of Luria-Bertani broth supplemented with 50 μg/ml kanamycin and chloramphenicol. Following overnight shaking at 37 °C, 10 ml of the overnight culture was added to 1 L of Luria-Bertani broth supplemented with 50 μg/ml kanamycin and chloramphenicol. When the culture reached an OD of ∼0.6, the cells were induced with 0.25 mM IPTG at 15 °C for 15 h. They were then pelleted by centrifugation at 4500 rpm for 20 min at 4 °C using a Sorvall RC-5C plus centrifuge. The resulting pellets were stored at −80 °C overnight and then thawed at 4 °C. The cell pellets were then resuspended in lysis buffer (50 mM PO_4_^3-^ pH 8, 300 mM NaCl, 5 mM imidazole, 1 mM BME) supplemented with protease inhibitor. To start lysis, lysozyme (1 mg/ml) and DNase (0.025 U/μl) were added to the resuspended cell pellets, which were kept on ice for 30 min. The cells were sonicated for 6 min. The resulting lysates were centrifuged (10,000 rpm, 60 min) and filtered with a 0.2 μm syringe tip filter (cat. no: SLGPR33RS, Millipore Sigma). The prepacked His GraviTrap TALON column (cat. no: 29000594, Cytiva) was equilibrated with 25 ml of lysis buffer twice and then the filtered lysate was passed through. The TDG-bound resin was then washed with 30 ml of wash 1 (700 mM NaCl in lysis buffer) and again with 30 ml of lysis buffer. TDG was eluted with 3 × 5 ml of elution buffer (500 mM imidazole in lysis buffer). The eluted TDG was exchanged into IEA buffer (20 mM Hepes pH 7.5, 75 mM NaCl, 1 mM DTT, 0.2 mM EDTA, 1% glycerol) using a 5 ml Sephadex G-25 resin HiTrap Desalting column (cat. no: 17140801, Cytiva). TDG was loaded onto a 1 ml HiTrap Q HP anion exchange chromatography column (cat. no: 17115301, Cytiva) that was equilibrated with IEA buffer. Bound TDG was then eluted using a linear gradient (0–100%) of IEB buffer (20 mM Hepes pH 7.5, 1 M NaCl, 1 mM DTT, 0.2 mM EDTA, 1% glycerol) in 500 μl fractions. Fractions containing TDG were pooled and concentrated using an Amicon Ultra centrifugal filter with a 3 kDa MWCO (cat. no: UFC900308, Millipore Sigma).
TDG^82-308^ was expressed in Escherichia coli BL21(DE3) at 22 °C and purified at 4 °C by Ni-affinity, ion-exchange (SP sepharose), and size-exclusion chromatography as described previously (71, 85). Enzyme purity was >99% as judged by SDS-PAGE with Coomassie staining, and the concentration was determined by absorbance (280 nm) using an extinction coefficient of ε^280^ = 17.4 mM^−1^ cm^−1^ (44, 86). Purified enzyme was flash frozen and stored at −80 °C.
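The concentration determination described above follows the Beer–Lambert law, c = A / (ε · l). A minimal sketch is given below; the function name, the 1 cm default path length, and the example absorbance are illustrative assumptions, while the extinction coefficient is the value quoted in the text.

```python
EPSILON_280_TDG = 17.4  # mM^-1 cm^-1, extinction coefficient quoted in the text


def concentration_mM(a280, path_cm=1.0, epsilon=EPSILON_280_TDG):
    """Protein concentration (mM) from A280 via the Beer-Lambert law."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.174 in a 1 cm cuvette.
    print(f"{concentration_mM(0.174) * 1000:.1f} uM")
```

For short-path instruments, the path length argument should be set to the actual optical path rather than relying on the 1 cm default.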
Oligonucleotide synthesis and purification
All oligonucleotides used in this study are shown in Table S1. Synthetic oligonucleotides prepared in-house were made using an Expedite 8909 DNA/RNA synthesizer according to the manufacturer’s recommended protocol. All oligonucleotides (purchased or synthesized in-house) were purified by denaturing PAGE (20%, 19:1 acrylamide:bisacrylamide). Target bands were excised from the gel and eluted overnight at room temperature in elution buffer (200 mM NaCl, 10 mM EDTA, 10 mM Tris–HCl pH 7.6). Samples were then filtered to remove gel fragments and desalted by ethanol precipitation. 6-Carboxyfluorescein-labeled oligonucleotides were either purchased directly or synthesized in-house using the 5′-fluorescein phosphoramidite (cat. no: 10-5901-90E, Glen Research). Sulfo-Cy3 and Sulfo-Cy5 labeling was carried out using the corresponding NHS esters. Oligonucleotides harboring a 5′-amino modifier were purchased from Integrated DNA Technologies. Labeling reactions were performed by mixing the amine-modified oligonucleotide with a 10-fold molar excess of the indicated NHS ester dye in 0.1 M sodium bicarbonate buffer (pH 8.5). The reaction was kept on ice overnight and the excess dye was subsequently removed by ethanol precipitation prior to use. Purified oligonucleotides were dissolved in water and their concentrations were determined by absorbance at 260 nm using a NanoDrop 2000c (Thermo Fisher Scientific). Duplex and R-loop substrates were prepared by annealing 10 μM each of the corresponding strands (Table S1) in annealing buffer (50 mM NaCl, 10 mM Tris–HCl pH 7.5). The mixture was kept at 95 °C for 3 min before being cooled to room temperature over the course of 90 min.
Electrophoretic mobility shift assay
Electrophoretic mobility shift assays were carried out as described previously with minor modifications (84). Briefly, the indicated substrate (5 nM) was mixed with increasing concentrations of TDG (0–300 nM) in binding buffer (100 mM NaCl, 2.5 mM MgCl_2_, 10 mM Tris–HCl pH 7.5, and 5% glycerol). The reaction mixture was incubated at 30 °C for 30 min and an aliquot was resolved by 0.6% agarose gel buffered with 1 × TBE. Electrophoresis was carried out for 1 h (6–8 V/cm) at 4 °C. The gel was visualized using a ChemiDoc MP Imaging System (Bio-Rad Laboratories, Inc), and images were quantified using Image Lab software version 6.1.0 (Bio-Rad Laboratories, Inc.; bio-rad.com/en-us/products/image-lab-software?ID=KRE6P5E8Z). GraphPad Prism 9 Version 9.4.1 (graphpad.com) was used to fit equations for specific binding with Hill slope.
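For readers who wish to reproduce this type of analysis outside GraphPad Prism, the "specific binding with Hill slope" model has the form Y = Bmax · X^h / (Kd^h + X^h). The sketch below implements that model with a coarse grid search over Kd and h, solving for Bmax analytically at each grid point. The function names, grids, and example data are hypothetical, and the grid search is a stand-in for the nonlinear regression that Prism performs.

```python
def hill_binding(x, bmax, kd, h):
    """Specific binding with Hill slope: Bmax * x^h / (Kd^h + x^h)."""
    if x <= 0:
        return 0.0
    xh = x ** h
    return bmax * xh / (kd ** h + xh)


def fit_hill(concs, fractions, kd_grid, h_grid):
    """Least-squares grid search; returns (bmax, kd, h).

    For each (kd, h) pair the optimal Bmax is linear in the model,
    so it is computed in closed form before scoring the fit.
    """
    best = None
    for kd in kd_grid:
        for h in h_grid:
            shape = [hill_binding(x, 1.0, kd, h) for x in concs]
            denom = sum(v * v for v in shape)
            bmax = sum(y * v for y, v in zip(fractions, shape)) / denom if denom else 0.0
            sse = sum((y - bmax * v) ** 2 for y, v in zip(fractions, shape))
            if best is None or sse < best[0]:
                best = (sse, bmax, kd, h)
    return best[1:]


if __name__ == "__main__":
    # Synthetic titration (nM) generated from known parameters, not real data.
    concs = [5, 10, 20, 40, 80, 150, 300]
    fractions = [hill_binding(x, 1.0, 40.0, 1.5) for x in concs]
    kd_grid = [5.0 * i for i in range(1, 41)]
    h_grid = [0.5 + 0.1 * i for i in range(26)]
    print(fit_hill(concs, fractions, kd_grid, h_grid))
```

The resolution of the estimates is limited by the grid spacing, so a finer grid or a proper nonlinear optimizer would be needed for reportable values.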
Glycosylase assays
Single-turnover kinetic reactions were initiated by mixing 100 nM of the indicated substrate with 1 μM of TDG in a buffer containing 100 mM NaCl, 2.5 mM MgCl_2_, and 10 mM Tris–HCl (pH 7.5). Aliquots (2 μl) were removed at the desired time points and added to a solution (2 μl) of 1% SDS in water to quench the reaction. To cleave the abasic site product, an equal volume of 0.2 M NaOH was added to the aliquots, which were subsequently heated at 70 °C for 5 min before adding 8 μl of denaturing loading buffer (90% formamide, 10 mM EDTA pH 8). The products were then resolved using denaturing PAGE (20%, 19:1 acrylamide:bisacrylamide). The gels were visualized using a ChemiDoc MP Imaging System (Bio-Rad Laboratories, Inc), and images were quantified using Image Lab version 6.1.0 (Bio-Rad Laboratories, Inc). The fraction product versus time was fitted to Equation 1:
$$\text{fraction product} = A\left(1 - e^{-k_{\text{obs}} t}\right) \qquad (1)$$
where A is the amplitude, k_obs_ is the rate constant, and t is the reaction time. Experiments were performed with saturating TDG, which was confirmed by obtaining similar rate constants for experiments carried out at higher enzyme concentrations.
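A fit of this kind can be sketched as follows, assuming the single-exponential form commonly used for single-turnover kinetics, fraction product = A(1 − exp(−k_obs · t)). The function names, search bounds, and synthetic time course are illustrative assumptions; the profiled grid search below stands in for whatever nonlinear regression software is actually used.

```python
import math


def single_exponential(t, amplitude, k_obs):
    """Fraction product at time t for a single-exponential time course."""
    return amplitude * (1.0 - math.exp(-k_obs * t))


def _best_amplitude(times, fractions, k_obs):
    """Closed-form least-squares amplitude (and SSE) for a fixed k_obs."""
    shape = [1.0 - math.exp(-k_obs * t) for t in times]
    denom = sum(v * v for v in shape)
    amp = sum(y * v for y, v in zip(fractions, shape)) / denom if denom > 0 else 0.0
    sse = sum((y - amp * v) ** 2 for y, v in zip(fractions, shape))
    return amp, sse


def fit_single_exponential(times, fractions, k_min=1e-4, k_max=1e2,
                           steps=400, rounds=4):
    """Return (amplitude, k_obs) by repeated grid refinement in log(k_obs)."""
    lo, hi = math.log(k_min), math.log(k_max)
    best_k = k_min
    for _ in range(rounds):
        grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
        best_log_k = min(
            grid, key=lambda g: _best_amplitude(times, fractions, math.exp(g))[1]
        )
        width = (hi - lo) / steps
        lo, hi = best_log_k - width, best_log_k + width  # zoom in around the minimum
        best_k = math.exp(best_log_k)
    amp, _ = _best_amplitude(times, fractions, best_k)
    return amp, best_k


if __name__ == "__main__":
    # Synthetic time course (min) generated from known parameters, not real data.
    times = [0.5, 1, 2, 4, 8, 16, 32]
    data = [single_exponential(t, 0.85, 0.3) for t in times]
    print(fit_single_exponential(times, data))
```

Because the amplitude enters the model linearly, only k_obs needs to be searched, which keeps the sketch short; the units of k_obs are the inverse of whatever time unit is supplied.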
19F NMR spectroscopy
The ^19^F NMR experiments were performed at 25 °C on a Bruker 600 MHz spectrometer (564.2 MHz for ^19^F) equipped with four channels, a z-axis gradient, and a 5 mm HFCN cryogenic probe (optimized for ^1^H, ^19^F, ^13^C, and ^15^N), as previously described (51, 87). ^19^F NMR experiments were carried out using a construct of TDG comprising residues 82 to 308 (TDG^82-308^) that is functionally equivalent to full-length TDG. The samples contained DNA at a concentration of 62 to 75 μM and (if present) a 2-fold higher concentration of TDG^82-308^. The buffer for ^19^F NMR experiments consisted of 0.1 M NaCl, 0.03 mM TCEP, 15 mM Tris–HCl pH 7.5, and 10% D_2_O. The ^19^F NMR spectra were collected with 2048 complex points, an acquisition time of 0.66 s, a relaxation delay of 2.0 s, and 5000 scans for free DNA or 8000 to 48,000 scans for TDG complexes with DNA or DNA/RNA hybrids. The NMR data were processed by applying exponential multiplication with 25 Hz line broadening prior to Fourier transformation and baseline correction using TopSpin (Bruker) and analysed using TopSpin and CcpNmr (88). The ^19^F chemical shift values (δ^19^F) reported herein are relative to an external sample of TFA (6.5 mM) in the identical buffer.
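The acquisition parameters above imply substantial measurement times for the enzyme complexes. The short sketch below estimates total experiment time as scans × (acquisition time + relaxation delay); it is an approximation that ignores pulse durations, dummy scans, and any instrument overhead, and the function name is illustrative.

```python
def experiment_time_hours(scans, acquisition_s=0.66, relaxation_delay_s=2.0):
    """Approximate 1D NMR experiment time in hours.

    Defaults are the acquisition time and relaxation delay quoted in the
    text; pulse lengths and dummy scans are neglected.
    """
    if scans < 0:
        raise ValueError("scans must be non-negative")
    return scans * (acquisition_s + relaxation_delay_s) / 3600.0


if __name__ == "__main__":
    for n in (5000, 8000, 48000):
        print(f"{n} scans: about {experiment_time_hours(n):.1f} h")
```

By this estimate, the free DNA spectra take a few hours each, whereas the longest complex spectra run for well over a day.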
DSB assay
Reactions were initiated by mixing 200 nM TDG and 20 nM APE1 with 100 nM of the indicated substrate in a reaction buffer containing 100 mM NaCl, 2.5 mM MgCl_2_, and 10 mM Tris–HCl (pH 7.5) at 30 °C. A control without APE1 was also set up side-by-side. Aliquots (2 μl) were removed at the indicated time points and quenched with 1.6 U of proteinase K (New England Biolabs), followed by the addition of 50% glycerol in water (1 μl). Aliquots were left to sit at room temperature for 30 min. Samples were then resolved by native PAGE (10%, 29:1 acrylamide:bisacrylamide). The gels were visualized using a ChemiDoc MP Imaging System (Bio-Rad Laboratories, Inc) and quantified as described above.
DNA-RNA immunoprecipitation
DRIP assays were performed with noncrosslinked HeLa S3 cells using the S9.6 antibody as previously described with minor modifications (89). HeLa S3 cells were maintained in T-75 size CELLSTAR TC Treated Cell Culture flasks (cat. no: 07-000-225, Thermo Fisher Scientific) in DMEM (Thermo Fisher Scientific) supplemented with 25 mM Hepes, 1 mM GlutaMax, 10% FBS, and 100 U/ml Penicillin-Streptomycin (Thermo Fisher Scientific). Twenty-four hours before transfection, 2.5 × 10^6^ HeLa S3 cells were seeded in T-75 flasks in DMEM supplemented with 25 mM Hepes, 1 mM GlutaMax, and 10% FBS. On the day of transfection, 750 μl of Opti-MEM (Thermo Fisher Scientific) was mixed with 60 μl of Lipofectamine 3000 reagent (Thermo Fisher Scientific). In a separate tube, 750 μl of Opti-MEM, 8 μg of the TDG plasmid (pCMV3-TDG), and 40 μl of the P3000 reagent, which was supplied with the Lipofectamine 3000 reagent, were combined. The contents of the two tubes were then mixed and allowed to incubate at room temperature for 15 min before being added to the cells. After 14 h, cells were washed with 2 × 5 ml of PBS and treated with 3 ml of 0.25% EDTA-trypsin (Thermo Fisher Scientific) at 37 °C for 5 min to detach cells from the flask surface. The trypsin was quenched by addition of 7 ml of DMEM and the cells were pelleted at 500×g at 4 °C. After removing the supernatant, the cells were washed with 5 ml of PBS and pelleted at 500×g (4 °C). The supernatant was removed and the cells were lysed in lysis buffer (85 mM KCl, 5 mM Pipes pH 8.0, 0.5% NP-40). The nuclei were collected by centrifugation at 10,000×g (4 °C) and the resulting pellet was resuspended in RSB buffer (200 mM NaCl, 2.5 mM MgCl_2_, and 10 mM Tris–HCl pH 7.5) supplemented with 0.5% Triton X-100, 0.2% sodium deoxycholate, 0.1% SDS, and 0.05% sodium lauroyl sarcosinate. The nuclear extracts were sonicated for 6 min to obtain DNA fragments of approximately 500 to 700 bp in length, which was confirmed by native PAGE. 
Meanwhile, protein A Dynabeads (Thermo Fisher Scientific) were preblocked with PBS supplemented with 0.5% BSA for 2 h. The preblocked beads were then washed with 3 × 1 ml of RSB buffer supplemented with 0.5% Triton X-100 and incubated with either the IgG control or S9.6 antibody for 2 h. For control experiments involving either the hybrid competitor (H1-C) or DNA duplex competitor (D1-C), the prepared S9.6-coated Dynabeads were further treated with 500 nM of the indicated competitor for 2 h in RSB buffer supplemented with 0.5% Triton X-100, 0.05% sodium deoxycholate, 0.025% SDS, and 0.0125% sodium lauroyl sarcosinate. Excess competitor was removed by washing the beads with RSB buffer supplemented with 0.5% Triton X-100. For control experiments involving RNase H (NEB), nuclear extracts were treated with 0.5 U μl^−1^ of RNase H1 overnight at room temperature. The sheared nuclear extracts were pretreated with 0.1 ng μl^−1^ of RNase A (NEB) and 0.02 U μl^−1^ of RNase III (NEB) for 30 min, then applied to the antibody-coated Dynabeads and allowed to incubate for 2 h at 4 °C. Following IP, the beads were washed with 4 × 1 ml RSB buffer supplemented with 0.5% Triton X-100 and then with 2 × 1 ml RSB buffer. Bound proteins were eluted by incubating the beads with NuPAGE LDS sample buffer (Thermo Fisher Scientific) supplemented with 100 mM DTT for 10 min at 70 °C.
Western blotting
Eluates from the pull-down assays were resolved on 10% SDS-PAGE gels and subsequently transferred onto nitrocellulose membranes using a Trans-Blot Turbo Transfer System (Bio-Rad). Membranes were blocked with 5% nonfat milk in PBS supplemented with 0.5% Tween 20 (PBST; ThermoFisher Scientific, Waltham, MA) (blocking solution) for 1 h at room temperature on a rocker. After blocking, the membranes were washed three times with PBST and incubated overnight at 4 °C with the TDG polyclonal antibody (Proteintech) in blocking solution. Following primary antibody incubation, the membranes were washed three times with PBST for 5 min and incubated with Goat anti-Rabbit IgG secondary antibody Alexa Fluor 647 (Thermo Fisher Scientific) in blocking solution for 1 h at room temperature on a rocker. The membranes were again washed three times with PBST and then imaged by fluorescence (Alexa Fluor 647; excitation/emission 650/671 nm) on a ChemiDoc MP Imaging System (Bio-Rad Laboratories, Inc).
Bioinformatics analysis
Datasets used in this study were retrieved from the NCBI SRA database. They consist of TDG ChIP-Seq (GSE55660) (18), MapR and BisMapR (GSE160578) (60), H3K27ac ChIP-Seq (GSE49847; files SRR566839 and SRR566840) (90), and H3K4me3 ChIP-Seq (GSE49847; files SRR317222 and SRR317223) datasets, all of which were generated in mESCs.
The TDG ChIP-Seq peak list was directly retrieved from GSE55660. Peak coordinates were converted from mouse genome version mm9 to mm39 (liftOver from the UCSC Genome Browser), and blacklisted peaks were removed to yield a final list of 71,772 peaks. Peak location was determined with the annotatePeaks.pl script from the Homer suite (91). Fastq files retrieved from the NCBI SRA database were aligned to the mm39 mouse genome version using bowtie (read length ≤ 50 nucleotides) or bowtie2 (read length > 50 nucleotides), and PCR duplicates were removed. Signal was retrieved at TDG ChIP-Seq peaks (±3 kb window from peak center; bin size of 10 bp) using scripts from Bedtools (92).
Analysis of 5fC/5caC signal at TDG ChIP-Seq peaks was performed using published output files of mESC RRMAB-Seq datasets from control and shTdg mESCs (GSM1341314_RRMAB-Seq_shCtr.bed and GSM1341315_RRMAB-Seq_shTdg.bed files, from GSE55660) (18). The genomic location of each CpG with reported 5fC/5caC signal was converted from mouse genome version mm9 to mm39, and 5fC/5caC signal at TDG peaks was retrieved with the intersectBed function of Bedtools (92). Analysis was carried out using CpGs with reported 5fC/5caC signal in TDG peaks, and the difference in 5fC/5caC signal between control and shTdg mESCs was assessed by Student's t test and considered significant if p < 0.05.
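The peak-overlap statistic reported in the Results (TDG peaks that coincide with a MapR peak) can be illustrated with a small interval-intersection routine. The sketch below is not the authors' pipeline, which relied on Bedtools; it assumes BED-style half-open coordinates given as (chromosome, start, end) tuples and counts a query peak as overlapping when it shares at least 1 bp with any target peak. All function names and the toy coordinates are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def merge_by_chrom(peaks):
    """Group peaks by chromosome and merge overlapping or touching intervals."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def count_overlapping(query_peaks, target_peaks):
    """Count query peaks sharing at least 1 bp with any target peak."""
    merged = merge_by_chrom(target_peaks)
    hits = 0
    for chrom, start, end in query_peaks:
        if chrom not in merged:
            continue
        starts, ends = merged[chrom]
        # Last merged target interval that begins before the query ends.
        i = bisect_left(starts, end) - 1
        if i >= 0 and ends[i] > start:
            hits += 1
    return hits


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    tdg = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
    mapr = [("chr1", 150, 300), ("chr1", 600, 700), ("chr3", 0, 1000)]
    n = count_overlapping(tdg, mapr)
    print(f"{n} of {len(tdg)} peaks overlap ({100 * n / len(tdg):.1f}%)")
```

Applied to the counts quoted in the text, 21,602 overlapping peaks out of 71,772 corresponds to the reported 30.1%.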
Data availability
The data generated during all experiments is available from the author upon reasonable request.
Supporting information
This article contains supporting information.
Conflict of interest
The authors declare that they have no conflicts of interest with the contents of this article.
References
- 1. Tahiliani M., Koh K.P., Shen Y., Pastor W.A., Bandukwala H., Brudno Y. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324 (2009) 930-935. doi: 10.1126/science.1170116. PMID: 19372391. PMCID: PMC2715015.
- 2. Kriaucionis S., Heintz N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 324 (2009) 929-930. doi: 10.1126/science.1169786. PMID: 19372393. PMCID: PMC3263819.
- 3. Wu X., Zhang Y. TET-mediated active DNA demethylation: mechanism, function and beyond. Nat. Rev. Genet. 18 (2017) 517-534. doi: 10.1038/nrg.2017.33. PMID: 28555658.
- 4. He Y.-F., Li B.-Z., Li Z., Liu P., Wang Y., Tang Q. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333 (2011) 1303-1307. doi: 10.1126/science.1210944. PMID: 21817016. PMCID: PMC3462231.
- 5. Ito S., Shen L., Dai Q. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333 (2011) 1300-1303. doi: 10.1126/science.1210597. PMID: 21778364. PMCID: PMC3495246.
- 6. Maiti A., Drohat A.C. Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites. J. Biol. Chem. 286 (2011) 35334-35338. doi: 10.1074/jbc.C111.284620. PMID: 21862836. PMCID: PMC3195571.
- 7. Weber A.R., Krawczyk C., Robertson A.B., Kuśnierczyk A., Vågbø C.B., Schuermann D. Biochemical reconstitution of TET1–TDG–BER-dependent active DNA demethylation reveals a highly coordinated mechanism. Nat. Commun. 7 (2016) 10806. doi: 10.1038/ncomms10806. PMID: 26932196. PMCID: PMC4778062.
- 8. Jones P.A., Takai D. The role of DNA methylation in mammalian epigenetics. Science 293 (2001) 1068-1070. doi: 10.1126/science.1063852. PMID: 11498573.
