Dynamic landscape of transcription initiation by yeast RNA polymerase I

Olena Parilova; Piia Bartos; Anssi M Malinen

PMC · DOI: 10.1093/nar/gkag153 · February 24, 2026

Open Access

TL;DR

This study explores how yeast RNA polymerase I initiates transcription, revealing a two-step mechanism for promoter recognition and complex formation.

Contribution

The paper introduces a dynamic model of promoter recognition by RNA polymerase I using integrated biochemical and biophysical methods.

Findings

01

Core factor identifies promoters via a two-step mechanism involving rapid encounter and conformational transition.

02

Nonpromoter DNA binding is single-step and results in quick dissociation of nonspecific complexes.

03

Correct promoter binding leads to DNA bending and melting as the preinitiation complex activates.

Abstract

RNA polymerase I (Pol I) synthesizes precursor ribosomal RNA, a key step in ribosome biogenesis. Elevated Pol I activity supports rapid cell growth, a hallmark of cancer, making Pol I a therapeutic target. The initial step in synthesis involves assembly of the Pol I transcription initiation complex on the gene promoter; however, its quantitative and dynamic parameters remain poorly defined. Here, we integrate biochemical, biophysical, and molecular dynamics approaches to dissect promoter and transcription start site (TSS) recognition by the Saccharomyces cerevisiae Pol I machinery. We show that core factor (CF) identifies the promoter through a two-step mechanism: a rapid encounter is followed by a slower conformational transition that establishes stabilizing interactions. In contrast, CF binds nonpromoter DNA in a single step without such transitions, forming nonspecific complexes that…

Linked entities

Genes (12)

RRN11 · POLI · BGLAP · POL1 · NET1 · ERCC3 · Poli · RPA190 · Stmn1 · RPA12 · RRN7 · RRN3

Proteins (13)

Species (2)

Saccharomyces cerevisiae · Homo sapiens (human)

Cell lines (3)

TEV protease: Cricetulus griseus (Chinese hamster) · Spontaneously immortalized cell line
LTB5: Homo sapiens (Human) · Human papillomavirus-related endocervical adenocarcinoma · Cancer cell line
HL105-251-15-40: Homo sapiens (Human) · Induced pluripotent stem cell

Chemicals (43)

Diseases (6)

cancer · developmental disorders · OC · biogenesis · ribosomopathy syndromes · CF

Mutations (9)

-28G C > T · -27T · -27T A > G · -13ins · -13del · adenine at bp -20 · thymine at bp -21 · A > G · C > T

Figures (8)

Footprinting of CF on rDNA promoter. (a) The cartoon illustrates Exo III-based footprinting of the CF binding site on WT rDNA promoter templates (span −45/+45, TSS at +1, highlighted in yellow). The 5′ end of either the nontemplate (promoter shown at the top) or the template strand (bottom promoter) is labeled with a cyanine 5.5 fluorophore (red ellipses); the complementary strand is protected at the 3′ end with phosphorothioate bonds (blue asterisks). The direction of Exo III cleavage is indicated by gray arrows. The CF-protected fragment on each promoter is highlighted with cyan rectangles, and the complete CF footprint is marked with a light purple rectangle. (b) Representative gel image of CF binding site mapping by Exo III footprinting. Gel lanes from left to right: TTP (T) and ATP (A) analogue Sanger sequencing ladders; footprinting of the WT rDNA promoter for the downstream boundary (indicated by the left black arrow) in the presence (WT, +CF) and the absence (WT, −) of CF; footprinting of the WT rDNA promoter for the upstream boundary (indicated by right black arrow) in the presence (WT, +CF) and the absence (WT, −) of CF; upstream footprinting of rDNA promoter (−45/+45) substituted at position −27 in the presence (−27T·A > G·C; +CF) and the absence (−27T·A > G·C; −) of CF; upstream footprinting of rDNA promoter (−45/+45) substituted at position −28 in the presence (−28G·C > T·A; +CF) and the absence (−28G·C > T·A; −) of CF; and upstream footprinting of the rDNA promoter fragment spanning −80/−35 in the presence (−80/−35, +CF) and the absence (−80/−35, −) of CF. Blue and red arrows indicate the electrophoretic mobility of specific-length DNA ladder strands. The electrophoretic mobility of full-length promoter is indicated with asterisks, and the CF footprint fragments with a hash symbol. Denaturing 10% polyacrylamide gel was scanned for Cy5.5 fluorescence.

Binding affinity of CF to rDNA promoter. (a) KDapp of the CF·DNA(Cy3) complex was determined by mixing 24 nM DNA(Cy3) promoter with 12.5–200 nM CF. Fluorescence intensities were measured using a spectrofluorometer. Data points and error bars represent the mean and SD from 3–7 independent experiments, respectively. (b) Mass distributions of DNA-free and DNA-bound CF complexes were determined by mass photometry. Samples contained 50 nM CF and varying concentrations (0, 25, 50, or 100 nM) of unlabeled rDNA promoter (span −45/+45). The solid lines over the distributions indicate the best Gaussian fits of molecular masses. (c) Relative amount of CF bound to the rDNA promoter in the mass photometry assay (n = 3 independent experiments). The curves in panels (a) and (b) were obtained using equations 2–4 and the parameter values listed in Supplementary Table 6.

Interaction dynamics of CF and rDNA promoter. (a) SF fluorescence traces monitoring the binding of 12.5–120 nM CF to 24 nM WT DNA(Cy3) are shown. The inset shows data on a linear timebase. The curves were obtained using equation 5. (b) Fluorescence intensity of WT promoter, nonspecific DNA (NS-DNA) or mutant promoter scaffolds, each Cy3 labeled, at the end of CF binding reaction in SF assay. The curves were obtained using the equations 2–4. (c) Observed rate constants (kobs) of CF binding to WT promoter, nonspecific DNA, or mutant DNA(Cy3) scaffolds in SF assay. (d) SF fluorescence traces monitoring the dissociation of CF from WT promoter, nonspecific DNA, or mutant DNA(Cy3) scaffolds after challenging the preformed complexes with an excess of label-free promoter DNA (span −30/+30). The curves were obtained using the equation 6. The parameter values used for each fit equation are listed in Supplementary Table 6. (e) Mechanistic models for the formation of specific CF·promoter and unspecific CF·DNA complexes.

Role of CF and promoter CF-binding site in transcription initiation. (a) Transcription initiation activities on WT promoters of different lengths are shown. The transcription reactions contained rDNA promoter, Pol I, Rrn3, CF, and NTPs. All active promoters initiated transcription at position +1. Data represent the mean and SE from the indicated number (n) of independent experiments. Statistical significance of observed differences is reported using the reference sample (ref.). The inset shows transcription activity in the presence and absence of CF. RNA transcript was quantified by primer extension assay. (b) Representative PAGE gel illustrating the TSS position and transcription activity levels as detected by primer extension assay. T and A indicate Sanger sequencing ladders prepared using TTP and ATP analogs as chain terminators, respectively. The inferred TSS position at +1, the lengths of the corresponding primer–extension products, and the sequence of the rDNA promoter template strand are shown to the left of the gel scan. The free OP018 primer is indicated with dotted line. All reactions shown were performed simultaneously and resolved on the same denaturing 12% gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. All assembled PICs were split into transcription reactions supplemented with NTPs (indicated with +) and negative controls without NTPs (−). Asterisk indicates the sample with fluorophore contamination detected in the presence and the absence of NTPs. (c) The Rrn7 subunit (colored brown) embraces the rDNA promoter near positions −27, −28, and −29. Several residues (R293, H294, T295, R297) are oriented into or near the major groove of the promoter DNA. DNA surfaces are shown in semi-transparent colors: yellow for the template strand and sky blue for the nontemplate strand. The figure was generated using cryo-EM model 6RQL and Chimera 1.14 software [46]. 
(d) The remaining amounts of the CF·DNA(Cy3) complex were determined following incubation with label-free DNA competitors, using the SF instrument to record Cy3 fluorescence intensity. Linear (−0/+30, −28/+30, −26/+30), single base pair indel mutants (−13del, −13ins), and fork (FPt1, FPt2) promoters were used as competitors. The fluorescence intensities of protein-free DNA(Cy3) and the CF·DNA(Cy3) complex were set to 0 and 1, respectively. Data represent the mean ± SE from the indicated number (n) of independent experiments. Statistical significance was assessed using the reference sample (ref.) as the baseline.

The effects of single bp mutations on the promoter recognition and transcription activity. (a) Sequences of the ntDNA strand from the rDNA promoter scaffolds used are shown. Base substitutions and insertions/deletions are highlighted with light blue shading and red font. TSS is highlighted in yellow. The numeric positions of these sites are indicated above the sequences. (b) Transcription initiation activity, determined by primer extension assay, of Pol I on WT and mutant rDNA promoters is shown. (c) The fraction of CF bound to rDNA promoters was determined using MP. Reactions contained 50 nM CF and 25 nM promoter. (d) An illustrative PAGE gel of RNA products from in vitro transcription reactions shows the TSS position. T and A indicate Sanger sequencing ladders prepared using TTP and ATP analogues as chain terminators, respectively. The inferred TSS positions (in parentheses), the lengths of the corresponding primer–extension products, and the sequence of the rDNA promoter template strand are shown to the right of the gel scan. The free OP018 primer is indicated with dotted line. Transcription reactions supplemented with NTPs are indicated with (+NTPs), and negative controls without NTPs with (−). All samples shown were run on the same denaturing 12% PAGE gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. Data in panels (b) and (c) represent the mean and SE from the indicated number (n) of independent experiments. Statistical significance is reported using the reference sample (ref.).

Promoter topology affects the TSS recognition by Pol I. (a) Cartoon represents the structure of fork junction promoters (FPt1, 9 nt overhang in tDNA; FPt2, 24 nt overhang in tDNA; FPnt1, 8 nt overhang in ntDNA; FPnt2, 11 nt overhang in ntDNA) and two pre-melted bubble promoters (Pb1, ntDNA·tDNA mismatch from −10 to +5; Pb2, mismatch from −9 to +2) used to investigate the TSS selection by Pol I. Numbers indicate the span of each DNA strand, the mismatch regions, and the TSS (+1) of WT rDNA promoter. ntDNA and tDNA are shown in dark and light gray, respectively. (b) The transcription initiation activity of fork promoters was measured in the presence of Pol I, Rrn3, and CF by primer extension assay. When multiple TSS were observed, as in the case of FPt1, Pb1, and Pb2, RNA products were separately quantitated and summed up to obtain the reported total Pol I activity. WT linear rDNA promoter (span −90/+30) was used as a control. Reported activities are the mean and SE of three independent experiments. (c) Transcription activity on FPt1 was measured with and without CF in the presence of Pol I and Rrn3. Total activity and TSS-specific transcription, determined by primer extension assay, are shown. (d) Total transcription activity of Pb1 and Pb2 was measured under the same conditions as in panel (b) using primer extension assay. Data represent means and SE from three independent experiments. (e) TSS position(s) on different promoters. T and A indicate Sanger sequencing ladders prepared using TTP and ATP analogues as chain terminators, respectively. The inferred TSS positions (in parentheses), the lengths of the corresponding primer–extension products, and the sequence of the rDNA promoter template strand are shown to the left of the gel scan. The free OP018 primer is indicated with dotted line. All samples shown were run on the same denaturing 12% PAGE gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. 
(f) Primer extension gel and its quantification for the transcription activity level on the Pb2 promoter with different components of the Pol I basal system; transcription activity of Pol I·Rrn3·CF was normalized to 1. All samples shown were run on the same denaturing 12% PAGE gel, which was scanned for Cy5.5 fluorescence in OP003 primer; full gel scan is provided in Supplementary Materials 2. (g) A representative PAGE gel compares the TSS selection by Pol I on the full-length rDNA promoter (span −200/+150) embedded in either linear or negatively supercoiled (plasmid) DNA. Transcription reactions were performed using either the basal Pol I system (Pol I, Rrn3, CF) or the holoenzyme (Pol I, Rrn3). T and C indicate Sanger sequencing ladders prepared using TTP and CTP analogues as chain terminators, respectively. The inferred TSS positions (in parentheses), the lengths of the corresponding primer–extension products, and the terminating base identities in the rDNA promoter template strand are shown to the left of the gel scan. The free OP018 primer is indicated with a dotted line. All samples shown were run on the same denaturing 12% gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. (h) Kernel density estimation plot of transcription levels by the Pol I·Rrn3·CF complex at each TSS position on a negatively supercoiled promoter (n = 8). The y-axis shows relative transcription activity, the x-axis displays TSS positions, and the z-axis (color gradient) indicates TSS density. The calibration bar represents TSS density (probability) in the heatmap from 0 to 0.285, with maximum density 1. Labels (+NTPs) and (−) in panels (e)–(g) refer to transcription reactions performed in the presence or absence of NTPs, respectively. Data are the mean and SE from the indicated number (n) of independent experiments. Statistical significance is reported using ref. as the reference sample.

rDNA promoter bending and deformation in MD simulation trajectories. (a) Snapshots of promoter DNA conformations illustrate the difference in DNA bending between CC and protein-free simulations. The approximate binding sites of Pol I and CF are indicated with gray and green curves, respectively. The template strand is shown in blue and the nontemplate strand in purple. The region of pronounced minor groove widening (bp −18 to −22) is highlighted in yellow. Dashed lines represent the approximate helical axis vectors used to measure the DNA bending angle (θ). (b) Histogram of DNA bending angles in the CC (protein-bound) or protein-free (free DNA) simulations. (c) Bar graph shows the frequency of DNA base pairing at different promoter positions. (d) Major groove widths of promoter DNA in the CC and protein-free simulations. (e) Minor groove widths of promoter DNA in the CC and protein-free simulations. DNA parameters were calculated using do_x3dna and analyzed using the dnaMD Python module. Each plot combines data from 16 independent MD simulations.

Interactions of CF with rDNA promoter in MD simulation trajectories. (a) Cartoon putty representation of CF contacts with the rDNA promoter. The thickness of the cartoon reflects contact frequency. The contacts are color-coded by protein region, as detailed in panels (b) and (c). The DNA template strand is shown in blue, and the nontemplate strand in pink. (b) The frequency of Rrn7 contacts with DNA, defined as the percentage of trajectory frames in which a contact occurs between DNA and a residue in the Rrn7 subunit of CF. (c) Frequency of Rrn11 contacts with DNA, calculated as in panel (b). (d) Representation highlights the protein residues from Rrn11 (red) and Rrn7 (orange) that form hydrogen bonds with DNA bases. (e) Representation highlights the insertion of R293 between bps −26 and −27.

Equations (8)

\begin{equation*} y = \frac{A_1}{w_1\sqrt{\pi/2}}\, e^{-2\frac{(x - m_{c,1})^2}{w_1^2}} + \frac{A_2}{w_2\sqrt{\pi/2}}\, e^{-2\frac{(x - m_{c,2})^2}{w_2^2}} \tag{1} \end{equation*}

\begin{equation*} \left[\mathrm{CF}\cdot\mathrm{DNA(Cy3)}\right]_{\mathrm{fraction}} = \frac{\left([\mathrm{DNA(Cy3)}_{\mathrm{total}}] + [\mathrm{CF}_{\mathrm{total}}] + K_{\mathrm{D}}^{\mathrm{app}}\right) - \sqrt{\left([\mathrm{DNA(Cy3)}_{\mathrm{total}}] + [\mathrm{CF}_{\mathrm{total}}] + K_{\mathrm{D}}^{\mathrm{app}}\right)^2 - 4 \times [\mathrm{DNA(Cy3)}_{\mathrm{total}}] \times [\mathrm{CF}_{\mathrm{total}}]}}{2 \times [\mathrm{DNA(Cy3)}_{\mathrm{total}}]} \tag{2} \end{equation*}

\begin{equation*} \left[\mathrm{DNA(Cy3)}\right]_{\mathrm{fraction}} = 1 - \left[\mathrm{CF}\cdot\mathrm{DNA(Cy3)}\right]_{\mathrm{fraction}} \tag{3} \end{equation*}

\begin{equation*} F_{\mathrm{total}} = \left[\mathrm{DNA(Cy3)}\right]_{\mathrm{fraction}} \times F_{\mathrm{free}} + \left[\mathrm{CF}\cdot\mathrm{DNA(Cy3)}\right]_{\mathrm{fraction}} \times F_{\mathrm{bound}} \tag{4} \end{equation*}

\begin{equation*} y = A \times e^{-k_{\mathrm{obs}} \times t} + m \times t + y_0 \tag{5} \end{equation*}

\begin{equation*} y = A \times e^{-\left(k_{\mathrm{obs}} \times t\right)^{\beta}} + y_0 \tag{6} \end{equation*}

\begin{equation*} \tau = \frac{(\ln 2)^{1/\beta}}{k_{\mathrm{obs}}} \tag{7} \end{equation*}

\begin{equation*} \mathrm{median}\ k_{\mathrm{obs}} = \frac{\ln 2}{\tau} \tag{8} \end{equation*}
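The binding and dissociation equations listed above can be evaluated numerically. The following is a minimal Python sketch, not the authors' fitting code: it implements the quadratic binding isotherm (equation 2), the two-state fluorescence signal (equations 3 and 4), and the stretched-exponential decay with its half-life and median rate (equations 6 to 8). The function names and the example values of KDapp, Ffree, and Fbound are hypothetical; only the 24 nM DNA and 12.5–200 nM CF range come from the figure captions.

```python
import math


def fraction_bound(dna_total, cf_total, kd_app):
    """Equation 2: fraction of DNA(Cy3) in complex with CF.

    All three arguments must share one concentration unit (e.g. nM).
    """
    s = dna_total + cf_total + kd_app
    return (s - math.sqrt(s * s - 4.0 * dna_total * cf_total)) / (2.0 * dna_total)


def total_fluorescence(dna_total, cf_total, kd_app, f_free, f_bound):
    """Equations 3 and 4: signal as the weighted sum of free and bound DNA."""
    bound = fraction_bound(dna_total, cf_total, kd_app)
    free = 1.0 - bound
    return free * f_free + bound * f_bound


def stretched_decay(t, amplitude, k_obs, beta, y0):
    """Equation 6: stretched-exponential dissociation trace."""
    return amplitude * math.exp(-((k_obs * t) ** beta)) + y0


def half_life(k_obs, beta):
    """Equation 7: time at which the decaying term has fallen to A/2."""
    return math.log(2.0) ** (1.0 / beta) / k_obs


def median_k_obs(k_obs, beta):
    """Equation 8: rate constant equivalent to the half-life of equation 7."""
    return math.log(2.0) / half_life(k_obs, beta)


if __name__ == "__main__":
    KD_APP_NM = 20.0  # hypothetical value, for illustration only
    for cf_nm in (12.5, 25.0, 50.0, 100.0, 200.0):
        signal = total_fluorescence(24.0, cf_nm, KD_APP_NM, f_free=1.0, f_bound=1.5)
        print(f"CF {cf_nm:6.1f} nM -> relative fluorescence {signal:.3f}")
```

With beta equal to 1, equation 6 reduces to a single exponential and `median_k_obs` returns `k_obs` unchanged, which is a convenient sanity check before fitting real traces.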

Funding (7)

Research Council of Finland (funder DOI: 10.13039/501100002341)
Sigrid Juséliuksen säätiö (funder DOI: 10.13039/501100006306)
Turun Yliopistosäätiö (funder DOI: 10.13039/501100022793)
University of Turku Graduate School (funder DOI: 10.13039/501100019391)
Suomen Kulttuurirahasto (funder DOI: 10.13039/501100003125)
Emil Aaltosen Säätiö (funder DOI: 10.13039/501100004756)
Sigrid Juseliuksen säätiö

Taxonomy

Topics: RNA modifications and cancer · Genomics and Chromatin Dynamics · RNA Research and Splicing

Full text

Introduction

Cell growth and homeostasis rely on efficient protein synthesis, which is heavily influenced by the number of functional ribosomes [1, 2]. A critical checkpoint in ribosome production is the transcription of ribosomal RNA genes (rDNA), responsible for over 60% of total RNA synthesis in cells [3]. Transcription is a key process in regulating ribosome synthesis, responding to various environmental signals related to growth control. The multicopy 35S rDNA gene is transcribed by RNA polymerase I (Pol I). Dysregulation of Pol I transcription and impaired ribosome biogenesis are implicated in numerous diseases, including ribosomopathy syndromes, developmental disorders, and many cancers [1, 4–6]. Cancer cells hijack transcription to activate ribosomal RNA synthesis [1, 3]. Notably, genes encoding the components of the Pol I transcription machinery are frequently upregulated in various cancers [7].

The Pol I transcription system of Saccharomyces cerevisiae (hereafter referred to as yeast) is homologous to that in mammals, making yeast a valuable model for studying fundamental aspects of eukaryotic Pol I transcription. The basal protein machinery for promoter-specific transcription initiation consists of Pol I, Rrn3, and SL1/TIF-IB in mammals or core factor (CF) in yeast [8, 9]. CF (or SL1) binds the core rDNA promoter element and recruits the Pol I·Rrn3 complex to the adjacent site, enabling the formation of the transcription initiation closed complex (CC) [10, 11]. DNA unwinding and loading into the Pol I active site induce an RNA synthesis-capable conformation, the open complex (OC) [12–16]. Next, the growing nascent RNA triggers further structural rearrangements, including the breakage of Pol I interactions with CF and Rrn3, leading to the formation of the transcription elongation complex, characterized by productive RNA synthesis downstream from the promoter. The suggested mechanism of Pol I specificity for the rDNA promoter relies on a distinct “bendability” and “meltability” of the promoter sequence that enables contacts between initiation factors, DNA, and polymerase [13, 15]. The assembly of the basal Pol I apparatus on the promoter is stimulated by additional transcription factors that bind the upstream element of the promoter DNA, such as UAF [17, 18] in yeast or UBF in mammals. Another class of Pol I activators, e.g. Net1 [19, 20] and TIF-IF [20], enhances essential protein–protein interactions but does not have a specific binding site on the rDNA promoter. While the structural basis of the Pol I machinery, including the initiation complex, has recently been studied extensively using cryo-electron microscopy (cryo-EM) [12–16], the functional aspects require further investigation. Detailed studies of Pol I function can reveal the molecular mechanisms controlling rDNA transcription and identify new druggable targets.

We combined biochemical, biophysical, and computational analyses to study how the basal yeast Pol I transcription initiation apparatus recognizes the rDNA promoter and determines the transcription start site (TSS). Our data show that CF recognizes its binding site on the rDNA using a two-phase mechanism of initial binding and isomerization. We show that the stability of the CF·promoter complex largely depends on the specific interaction at the upstream edge of the CF binding site. Promoter-bound CF then recruits the Pol I·Rrn3 complex to the promoter. TSS recognition by Pol I, at least in part, appears to be guided by its spatial relationship to promoter-bound CF and by the intrinsic physical properties of the DNA, particularly the bendability of the segment near the upstream edge of the transcription bubble. Our findings highlight the importance of rDNA promoter properties and key sequence motifs in defining efficient rDNA transcription, including transcription rate, specificity, and TSS selection.

Materials and methods

Protein production and purification

Purification of Pol I [21], Rrn3 [11], and CF [10, 22] proteins followed previously published protocols with modifications. Detailed purification protocols are described in Supplementary Materials 1. Briefly, Pol I was purified from S. cerevisiae SC1613 (constructed by Cellzome AG, Germany), which carries a TAP tag in the chromosomal gene encoding the AC40 subunit of Pol I [23]; the cells were grown in a bioreactor and harvested in the exponential growth phase. Recombinant Rrn3 and CF were expressed in Escherichia coli using expression plasmids (Supplementary Table S1). The purification of all proteins involved several chromatographic steps, followed by protein concentration and analysis with NanoDrop, sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE), and western analysis.

Size exclusion chromatography of protein preparations

Transcription factor preparations used in biophysical assays [mass photometry (MP) and protein-induced fluorescence enhancement] were further polished using size-exclusion chromatography (SEC). Prior to the gel filtration step, the His-tag of CF was cleaved off by supplementing the concentrated CF sample with TEV protease and incubating the sample either overnight at 4°C or for 2 h at 16°C in a thermo-shaker (Grant Instruments) set to 1000 rpm. The TEV-treated CF sample (~500 µl) was resolved on a Superose 6 Increase 10/300 GL gel filtration column (Cytiva) pre-equilibrated with GF-CF buffer [20 mM HEPES, pH 7.8, 300 mM NaCl, 1 mM MgCl₂, 0.1 mM ethylenediaminetetraacetic acid (EDTA), and 0.1 mM dithiothreitol (DTT)] at a flow rate of 0.4 ml/min using an Äkta purifier FPLC (GE Healthcare). Resolved proteins were collected in 500 µl fractions and analyzed with a NanoDrop spectrophotometer (Thermo Fisher Scientific) and by SDS–PAGE. Size-exclusion chromatography of Rrn3 was carried out similarly to that of CF, except that GF-R buffer (20 mM HEPES, pH 7.8, 150 mM NaCl, 1 mM MgCl₂, 0.1 mM EDTA, 0.1 mM DTT) was used. Final preparations of CF and Rrn3 were concentrated using centrifugal filters (Amicon Ultra-4, Millipore), stabilized by adding glycerol to a final concentration of 20% (v/v), aliquoted, and flash frozen in liquid nitrogen for storage at −80°C.

DNA template preparation

Oligonucleotides encoding wild-type (WT) or mutant S. cerevisiae rDNA promoters were purchased from Merck or Eurofins Genomics. To prepare double-stranded promoter scaffolds, equimolar amounts of template and nontemplate strands (5–10 µM) were mixed in 0.1% (v/v) DEPC-treated buffer (10 mM HEPES-KOH, pH 7.5, 50 mM NaCl, and 100 µM EDTA) and annealed in a polymerase chain reaction (PCR) machine by heating at 94°C for 3 min and then gradually cooling the mixture from 94°C to 4°C.

To prepare circular supercoiled transcription templates, an rDNA promoter fragment from position −200 to +150 (relative to the TSS at +1) or a promoter-free fragment from the ATG13 gene (length 371 bp) was amplified by PCR from the genomic DNA of S. cerevisiae INVSc1 (Thermo Fisher Scientific) and ligated between the SacI and HindIII restriction sites of the pUC18 plasmid (Supplementary Table S1). Plasmids pAM036 (containing the rDNA promoter) and pAM041 (ATG3 fragment) were maintained and amplified using E. coli XL1-blue cells. The plasmids were purified using the GeneJET Plasmid Miniprep Kit (Thermo Fisher Scientific) according to the manufacturer’s protocol. The linear scaffolds of promoter (−200/+150) and nonpromoter DNA (ATG3) were produced by PCR amplification of the target fragment from the pAM036 and pAM041 vectors, respectively (Supplementary Table S2). The DNA sequences of all transcription templates are given in Supplementary Table S3.

In vitro transcription

In vitro transcription reactions were performed as previously described with modifications [24, 25]. Plasmid (circular, negatively supercoiled) and linear templates containing native, truncated native, base-substituted, pre-melted, or fork-junction rDNA promoters were examined. The basal preinitiation complex (PIC) was reconstituted by first mixing 2.4 pmol of synthetic DNA template with 15 pmol of CF and incubating the mixture in TIB10 [TIB buffer (100 mM HEPES/50 mM KOH, pH 7.5, 100 mM potassium glutamate, 0.025 mM ZnCl₂, 5% (v/v) glycerol, 1 mM EDTA, 0.2 mM tris(2-carboxyethyl)phosphine (TCEP), DEPC treated) supplemented with 10 mM Mg-acetate] for 15 min at 22°C. Then, a pre-incubated Pol I·Rrn3 complex (6 pmol Pol I and 12 pmol Rrn3) in TIB2 was added at a 1:1 ratio, and the incubation was continued for another 15 min at 30°C. The transcription reaction was initiated by adding an equal volume of 1 mM NTPs (ATP/CTP/GTP/UTP) in TIB10 to the PIC. The final volume of the transcription reaction mixture was 10 µl and contained 1 pmol template DNA, 2.5 pmol Pol I, 5 pmol Rrn3, 6.25 pmol CF, and 5 nmol (0.5 mM) NTPs in TIB8; the reaction tube was incubated in a thermo-mixer (Grant) at 30°C for 20 min. Control transcription reactions without NTP addition were prepared in parallel. The transcription reaction was terminated by adding 1 µl of 105 mM EDTA (~9.5 mM final) and immediately heating the samples at 75°C for 5 min.
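The amounts and concentrations quoted in this protocol can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two labeled assumptions: that TIB2 and TIB8 denote TIB buffer with 2 mM and 8 mM Mg-acetate (by analogy with the stated definition of TIB10), and that "1:1" and "equal volume" both mean equal-volume mixing. The helper names are hypothetical.

```python
# Amounts (pmol) assembled during PIC reconstitution, as stated in the protocol.
PIC_MIX_PMOL = {"DNA": 2.4, "CF": 15.0, "Pol I": 6.0, "Rrn3": 12.0}


def scale_to_reference(amounts, reference="DNA", reference_final=1.0):
    """Scale every component so that the reference component equals reference_final."""
    factor = reference_final / amounts[reference]
    return {name: value * factor for name, value in amounts.items()}


def mix_equal_volumes(conc_a, conc_b):
    """Concentration after combining equal volumes of two solutions."""
    return (conc_a + conc_b) / 2.0


def dilute(stock_conc, stock_volume, final_volume):
    """Concentration after bringing stock_volume of stock to final_volume."""
    return stock_conc * stock_volume / final_volume


if __name__ == "__main__":
    # Per-reaction amounts when the mixture is scaled to 1 pmol template DNA.
    print(scale_to_reference(PIC_MIX_PMOL))

    # Mg-acetate (mM): TIB10 + TIB2 at equal volumes, then + NTPs in TIB10.
    # ASSUMPTION: TIB2 means 2 mM Mg-acetate.
    mg_after_pol = mix_equal_volumes(10.0, 2.0)
    mg_final = mix_equal_volumes(mg_after_pol, 10.0)
    print("Mg-acetate, mM:", mg_after_pol, mg_final)

    # NTPs: 1 mM stock mixed with an equal volume of NTP-free PIC.
    ntp_mm = mix_equal_volumes(1.0, 0.0)
    print("NTPs, mM:", ntp_mm, "nmol in 10 µl:", ntp_mm * 10.0)

    # EDTA stop: 1 µl of 105 mM added to the 10 µl reaction.
    print("EDTA, mM:", round(dilute(105.0, 1.0, 11.0), 2))
```

Under these assumptions the computed per-reaction amounts, the 8 mM Mg-acetate of the final buffer, the 0.5 mM (5 nmol) NTPs, and the roughly 9.5 mM EDTA all agree with the values given in the text.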

The quantity and TSS of RNA products were determined using a primer extension assay. The inactivated transcription reactions were supplemented with 1.25–6 pmol of fluorophore-labeled DNA primer (OP003, OP021, or OP018; Supplementary Table S2) and 5 mM unchelated Mg²⁺ (by adding 1 µl of 150 mM MgCl₂). The primer was annealed to the target site on the RNA transcripts by heating the samples at 95°C for 80 s followed by cooling on ice for 15 min. The estimated minimal RNA transcript detectable by primer extension was 27 nt for OP018 (annealing to positions +9 to +27) and 51 nt for OP003 (annealing to positions +32 to +51). In the Pb2 scaffold, which contains a deletion of the region from +22 to +31, the detectable length for OP003 decreased to 41 nt. The assay also detects shorter RNA products when Pol I initiates transcription downstream of the +1 site. In contrast, in vitro transcription products shorter than the primer annealing regions remain undetectable by primer extension. To monitor spurious transcription in the opposite direction, primer OP021 was employed; this primer is complementary to the promoter region from −163 to −142. The reverse transcription reaction was initiated by the addition of 12 µl of a mixture containing 100 U SuperScript II Reverse Transcriptase (Invitrogen), 0.5 mM dNTPs, 5 mM DTT, 10 U Murine RNase inhibitor (New England Biolabs), and 50 µg/ml Actinomycin D in 1× SuperScript buffer. The reactions (total vol. ~25 µl) were incubated at 47°C for 1 h and then terminated by the addition of formamide gel loading buffer (92.5% formamide, 0.039 M LiOH, 0.013 M EDTA, 0.23% OrangeG) at a 1:1 volume ratio, followed by heating at 95°C for 3 min. The single-stranded DNA products of the reverse transcription reaction were separated on 12% denaturing polyacrylamide gels and quantitated/visualized with an Odyssey Infrared Imager (Li-Cor Biosciences) in the 700-nm channel. Complete gel scans are provided in Supplementary Materials 2. The TSS of transcribed DNA templates was determined by comparing the electrophoretic mobility of transcription reaction products to that of the manual Sanger sequencing ladders run on the same gel. Sequencing reactions were done using the OP018 primer and the Applied Biosystems Thermo Sequenase Dye Primer Manual Cycle Sequencing Kit (Thermo Fisher Scientific) according to the manufacturer’s protocol.

Gel image analysis

Each transcription product, i.e. a gel band deriving from RNA molecules of a distinct length (length varies because of different TSSs) in the reaction mixture, was quantified by using Fiji software [26] to draw a polygonal selection contour around the gel band and calculating the total fluorescence intensity inside the contour. The background fluorescence was subtracted from the RNA band intensity by using the average background per pixel, determined in at least three random RNA-free gel areas, multiplied by the contour area. The band contours in the negative control lane (no NTPs added to the reaction) were positioned based on the RNA band positions in the NTP-containing lane, and their intensities were analyzed similarly. Finally, the background-corrected fluorescence intensity of each RNA band was converted to molar RNA concentration by comparison with the intensity of the free primer band in the same lane. Because the primer concentration was known for the total sample volume at each step of in vitro transcription, the primer concentration in the aliquot loaded on the gel could be calculated. Because the activity of Pol I varied between different purification batches, transcription activity was normalized to the transcription activity observed on the WT rDNA promoter [(−90/+30), (−45/+45), and (−30/+30)].
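The quantification steps described above can be summarized in a short sketch. It is illustrative only: the function names and example numbers are hypothetical, and it assumes that the primer and the extended product carry one fluorophore each, so that fluorescence per mole is the same for both bands.

```python
def mean_background_per_pixel(background_regions):
    """Average background per pixel from (total intensity, area in px) pairs
    measured in RNA-free gel areas."""
    if len(background_regions) < 3:
        raise ValueError("the protocol uses at least three background regions")
    per_pixel = [intensity / area for intensity, area in background_regions]
    return sum(per_pixel) / len(per_pixel)


def corrected_intensity(total_intensity, area_px, bg_per_pixel):
    """Subtract the background expected for a contour of the given area."""
    return total_intensity - bg_per_pixel * area_px


def rna_pmol(band_corrected, primer_corrected, primer_pmol_in_lane):
    """Convert band intensity to pmol using the free primer band in the same
    lane as an internal standard (assumes equal signal per mole)."""
    return band_corrected / primer_corrected * primer_pmol_in_lane


def normalize_to_wt(sample_pmol, wt_pmol):
    """Express activity relative to the WT promoter of the same Pol I batch."""
    return sample_pmol / wt_pmol


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    bg = mean_background_per_pixel([(500.0, 100), (600.0, 100), (400.0, 100)])
    band = corrected_intensity(1500.0, 100, bg)
    print(bg, band, rna_pmol(band, 20000.0, 0.5))
```

Normalizing to the WT promoter last keeps batch-to-batch variation in Pol I activity from propagating into comparisons between templates.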

Exonuclease III footprinting

DNA scaffolds for footprinting consisted of two complementary DNA strands; the strand designed to be footprinted contained a Cy5.5 fluorophore at the 5′ end, while the other strand was protected against exonuclease activity by six phosphorothioate bonds at the 3′ end (Supplementary Table S3) [27]. CF·DNA complexes were formed by incubating 1.5 pmol of CF with 0.15 pmol of DNA template in DEPC-treated FTB5 buffer [50 mM Tris–HCl, pH 7.5, 100 mM potassium glutamate, 5% (v/v) glycerol, 5 mM MgCl_2_, 0.2 mM TCEP, 0.1 mM EDTA] for 15 min at 30°C. The samples were then treated with 600 units of exonuclease III (Exo III, 200 U/µl stock, Thermo Fisher Scientific, cat. no EN0191) for 4 min at 30°C. The reactions were stopped by the addition of 20 mM EDTA and 1% (w/v) sodium dodecyl sulphate. Control reactions were prepared without CF or without Exo III treatment. DNA was extracted by adding glycogen (RNA grade, Thermo Scientific, cat. no R0551) to a final concentration of 0.5 mg/ml and chloroform:phenol:isoamyl alcohol mixture (pH 8.0, RNAse-, DNAse-, and Proteinase-free, Acros Organics, cat. no 327 111 000), followed by DNA recovery from the aqueous phase by ethanol precipitation. The DNA pellet was dissolved in a 1:1 mixture of DEPC-treated water and formamide gel loading buffer, heated at 95°C for 3 min, and then analyzed on denaturing 10%–12% polyacrylamide gels. Footprinting patterns were visualized with an Odyssey Infrared Imager (Li-Cor Biosciences). The positions of the DNA region protected by protein binding were identified by correlating the electrophoretic mobilities of the sample DNA bands with those of Sanger sequencing ladders, which were prepared similarly to those used for the analysis of in vitro transcription reactions.

Mass photometry

MP was conducted with a Refeyn Two^MP^ Mass Photometer (Refeyn Ltd) at 22°C. In the MP assay, a protein molecule or complex lands on the surface of the microscope slide and interferes with the light reflected from the surface; the change in reflected light scales linearly with mass, allowing accurate determination of molecular weights [28]. The objective was cleaned with isopropanol and covered with Immersol 518F (Zeiss). Microscope coverslips (w × l: 24 × 50 mm, Refeyn) and the alignment chamber (Refeyn) were sequentially cleaned with milli-Q water, isopropanol (HPLC grade), and milli-Q water, followed by drying in an air stream. To prepare a reaction well, a silicone gasket (24 samples well cassettes for automated system, Refeyn) was installed on the coverslip, which was then mounted on the objective. Before each data acquisition, the z plane focus was identified using buffer-free focus. Next, 18 µl of target protein(s) or preformed CF·DNA complex was added to the reaction well, and movies containing 8853 frames were recorded for 180 s using AcquireMP software (Refeyn Ltd).

To determine the molecular masses and oligomeric states of individual proteins, stocks were diluted to 40–50 nM protein concentration in specific reaction buffers: CF (FTB5 buffer), Rrn3 (TIB2), and Pol I (TIB2). To determine the apparent dissociation constant of CF·promoter complex, 50 nM CF was mixed with 0, 25, 50, or 100 nM WT rDNA promoter scaffold (span −45/+45) and incubated for 5 min at 24°C before MP data collection. To study CF binding to promoters harboring a mutated bp, 50 nM CF was mixed with 25 nM rDNA promoter (span −45/+45 and bp substitution at −27 or −28) and incubated for 5 min at 24°C before MP data collection. Each MP analysis was performed in at least three independent experiments.

To obtain the contrast-to-mass calibration of MP data, a protein standard mixture containing bovine serum albumin (molecular mass 65 kDa), IgG (150 kDa monomer and 300 kDa dimer), apoferritin (443 kDa), and IgM (970 kDa) in each specific reaction buffer was analyzed along with the actual samples in each experiment. The mean protein peak contrast was first determined using the Gaussian fit procedure in AcquireMP software (Refeyn Ltd). The mean contrast values from the calibration protein mixture were then plotted and fitted by linear regression to define the contrast-to-mass calibration factor in DiscoverMP v2.3 (Refeyn Ltd). The molecular weights and oligomeric states of CF, Rrn3, and Pol I were quantified using DiscoverMP v2.3. MP data used to determine the apparent molecular weights and abundances of DNA-free CF and the CF·DNA complex were additionally analyzed and plotted using Origin 2016 (OriginLab Corporation) by fitting the histograms of apparent molecular weight values to a two-peak Gaussian equation (area version) (Equation 1). The fit parameters m_c, w, and A are the center, width, and area of each of the two Gaussian distributions, respectively.

\begin{equation*} y = \frac{A_1}{w_1\sqrt{\pi/2}}\, e^{-2\frac{(x - m_{c,1})^2}{w_1^2}} + \frac{A_2}{w_2\sqrt{\pi/2}}\, e^{-2\frac{(x - m_{c,2})^2}{w_2^2}} \tag{1} \end{equation*}
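As a minimal sketch of how the area form of the Gaussian in Equation 1 behaves, the snippet below evaluates the two-peak function and confirms numerically that each peak integrates to its area parameter A, which is why the ratio of the fitted areas can be read as relative abundance. The peak parameters are hypothetical placeholders, not fitted values from the study, and the trapezoid integrator stands in for the fitting performed in Origin.

```python
import math


def gauss_area(x, area, center, width):
    """Area-form Gaussian: integrates to `area`; width w equals 2 x sigma."""
    prefactor = area / (width * math.sqrt(math.pi / 2))
    return prefactor * math.exp(-2 * (x - center) ** 2 / width ** 2)


def two_peak_gauss(x, peak1, peak2):
    """Sum of two area-form Gaussians; each peak is (area, center, width)."""
    return gauss_area(x, *peak1) + gauss_area(x, *peak2)


def trapezoid(func, lower, upper, steps=20000):
    """Plain trapezoid rule, sufficient for a smooth, fully contained peak."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h


# Hypothetical (area, center in kDa, width in kDa) for free CF and CF·DNA.
FREE_CF = (60.0, 220.0, 30.0)
COMPLEX = (40.0, 277.0, 30.0)

# Relative abundance of the complex follows directly from the fitted areas.
fraction_bound = COMPLEX[0] / (FREE_CF[0] + COMPLEX[0])

if __name__ == "__main__":
    print(trapezoid(lambda x: gauss_area(x, *FREE_CF), 0.0, 600.0))
    print(fraction_bound)
```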

We measured blank 50 nM CF samples before each promoter binding experiment to verify the integrity of the CF sample. These data indicated that a minor mass peak (at about 284.4 kDa), which represented 4.3% of the DNA-free CF abundance and is most likely a contaminating protein, overlapped with the mass of the CF·DNA complex. This contribution was subtracted from the abundance of the CF·DNA complex (A_CF·DNA) before calculating the relative amounts of DNA-free CF and CF·DNA complex.

Protein-induced fluorescence enhancement assay: spectrofluorometer detection

The apparent equilibrium dissociation constant (KD^app^) of the CF·promoter complex was determined in manual mixing experiments. First, 25 nM DNA(Cy3) [containing the −40/−10 rDNA promoter region and a Cy3 label at the 3′ end of the nontemplate DNA (ntDNA) strand] in 50 µl of binding buffer was placed in a quartz cuvette (Hellma Analytics, light path 3 × 3 mm, cat. no. HL105-251-15-40). The fluorescence intensity of this protein-free promoter sample was continuously measured for 120 s with an LS-55 fluorescence spectrofluorometer (Perkin Elmer) at 24°C. The excitation and emission wavelengths were set to 545 and 565 nm, respectively, and both the emission and excitation slits to 10 nm. The reaction mixture in the cuvette was then supplemented with 12.5–200 nM CF in FTB5 and thoroughly mixed, and fluorescence recording was continued for an additional 300 s. The promoter concentration after CF addition was 24 nM. KD^app^ was estimated by simultaneously fitting Equations 2–4, which account for the depletion of free CF upon DNA binding, to the measured fluorescence values (Ftotal). In these equations, [CFtotal] represents the total concentration of CF, and [DNA(Cy3)total] denotes the total concentration of Cy3-labeled DNA. The parameters Ffree and Fbound correspond to the limiting fluorescence intensity when all DNA(Cy3) is either protein-free or fully bound to CF, respectively. The fraction of DNA(Cy3) bound to CF is expressed as [CF·DNA(Cy3)]fraction, while the fraction of protein-free DNA(Cy3) is given by [DNA(Cy3)]fraction. Data fitting was performed using Origin 2016 (OriginLab Corporation). For MP experiments using label-free DNA, the same equations were applied, except that Ffree and Fbound were fixed at 0 and 1, respectively.

\begin{equation*} [\mathrm{CF}\cdot\mathrm{DNA(Cy3)}]_{\mathrm{fraction}} = \frac{\left([\mathrm{DNA(Cy3)}_{\mathrm{total}}] + [\mathrm{CF}_{\mathrm{total}}] + K_{\mathrm{D}}^{\mathrm{app}}\right) - \sqrt{\left([\mathrm{DNA(Cy3)}_{\mathrm{total}}] + [\mathrm{CF}_{\mathrm{total}}] + K_{\mathrm{D}}^{\mathrm{app}}\right)^2 - 4 \times [\mathrm{DNA(Cy3)}_{\mathrm{total}}] \times [\mathrm{CF}_{\mathrm{total}}]}}{2 \times [\mathrm{DNA(Cy3)}_{\mathrm{total}}]} \tag{2} \end{equation*}

\begin{equation*} [\mathrm{DNA(Cy3)}]_{\mathrm{fraction}} = 1 - [\mathrm{CF}\cdot\mathrm{DNA(Cy3)}]_{\mathrm{fraction}} \tag{3} \end{equation*}

\begin{equation*} F_{\mathrm{total}} = [\mathrm{DNA(Cy3)}]_{\mathrm{fraction}} \times F_{\mathrm{free}} + [\mathrm{CF}\cdot\mathrm{DNA(Cy3)}]_{\mathrm{fraction}} \times F_{\mathrm{bound}} \tag{4} \end{equation*}
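The binding model in Equations 2–4 can be written as two small functions. This is a sketch for illustration rather than the fitting routine used in Origin; the function names are assumptions, and the example concentrations simply reuse values quoted in the text. The mass-action check at the end shows that the quadratic solution reproduces the input dissociation constant.

```python
import math


def bound_fraction(dna_total, cf_total, kd_app):
    """Equation 2: fraction of DNA(Cy3) bound by CF for 1:1 binding,
    accounting for depletion of free CF. All inputs in the same unit (nM)."""
    s = dna_total + cf_total + kd_app
    return (s - math.sqrt(s * s - 4.0 * dna_total * cf_total)) / (2.0 * dna_total)


def total_fluorescence(dna_total, cf_total, kd_app, f_free, f_bound):
    """Equations 3 and 4: signal as a weighted sum of free and bound DNA."""
    fb = bound_fraction(dna_total, cf_total, kd_app)
    return (1.0 - fb) * f_free + fb * f_bound


def implied_kd(dna_total, cf_total, fb):
    """Recover K_D from mass action: [DNA_free][CF_free] / [complex]."""
    complex_conc = fb * dna_total
    return (dna_total - complex_conc) * (cf_total - complex_conc) / complex_conc


if __name__ == "__main__":
    # 24 nM DNA(Cy3); K_D^app of 106 nM as reported for FTB5.
    for cf in (12.5, 25.0, 50.0, 100.0, 200.0):
        fb = bound_fraction(24.0, cf, 106.0)
        print(cf, round(fb, 3), round(implied_kd(24.0, cf, fb), 3))
```

Setting `f_free` to 0 and `f_bound` to 1 reduces `total_fluorescence` to the bound fraction itself, which corresponds to the treatment described for the label-free MP data.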

The binding of CF to different transcription DNA scaffolds was assessed in competition experiments using different label-free DNA constructs to trigger the dissociation of CF from DNA(Cy3). In competition experiments, the fluorescence intensity traces were continuously measured for a total of 1800 s in three sequential steps: (i) free DNA(Cy3), (ii) the formation of CF·DNA(Cy3) complex, and (iii) competitor-induced dissociation of CF from CF·DNA(Cy3) complex. The fluorescence intensity of 25 nM DNA(Cy3) and of the DNA(Cy3)·CF complex (sample contained 24 nM DNA and 120 nM CF) was determined as described for the KD^app^ experiment. After the addition of competitor DNA to a concentration of 227 nM, the reaction mixture at the third step also contained 23 nM DNA(Cy3) and 115 nM CF. The fluorescence values were corrected for the dilution accompanying the competitor addition (~4% volume increase). The reported fluorescence dissociation traces are means with SE. The presented data were normalized to a 0–1 intensity scale by setting the fluorescence of protein-free DNA(Cy3) to 0 and the fluorescence of the CF·DNA(Cy3) mixture before competitor addition to 1. In addition to presenting data as full kinetic dissociation traces, fluorescence intensity over the last 15 s was averaged to construct bar graphs with error bars (mean ± standard error, SE) that represent the remaining fraction of CF·DNA(Cy3) complex. Data for KD^app^ estimation and competition of label-free DNA templates were measured in FTB5 buffer in at least three independent experiments and averaged. KD^app^ at reduced buffer ionic strength was determined in LTB5 (25 mM Tris–HCl, 50 mM K-glutamate, 5% glycerol, 5 mM MgCl_2_, 0.1 mM TCEP, 0.05 mM EDTA).

Protein-induced fluorescence enhancement assay: stopped-flow detection

Measurements were performed at 24°C using SFM-3000 stopped-flow (SF) instrument equipped with FC-08 reaction cuvette (BioLogic). The instrument was operated using 4 ms sample mixing dead time. Cy3 fluorophore in the promoter templates was excited at 553 nm and emitted light was collected through a 570-nm longpass filter. Total reaction volume in SF experiments was 150 µl, which was obtained by mixing 75 µl from each of two active sample storage syringes (one typically containing CF and the other containing the promoter) in the SF instrument.

The rate of CF binding to the promoter DNA was determined by mixing CF (12.5–120 nM final concentration) with DNA(Cy3) scaffold (24 nM final concentration) in FTB5 buffer. The fluorescence signal (i.e. detector voltage) of the reaction mixture was continuously recorded for 100 s using two time-base settings: for the reaction time range 0.004–7.45 s, data were integrated over 0.5 ms per time point, while for the range 7.47–100 s the integration time was 50 ms. The fluorescence signal of protein-free DNA was measured by mixing DNA(Cy3) with FTB5 buffer.

The rate of CF dissociation from the promoter DNA was determined by mixing preformed CF·DNA(Cy3) complex [concentrations after mixing: 115 nM CF and 23 nM DNA(Cy3); two-fold higher during the preformation] with 227 nM (after mixing) unlabeled competitor rDNA promoter (span −30/+30). The reference signal of the protein-free DNA(Cy3) promoter was determined by mixing 23 nM DNA(Cy3) and FTB5 in the reaction cuvette of the SF instrument. Fluorescence intensity was recorded using 0.5 and 50 ms data integration times for reaction time ranges 0.004–7.45 and 7.47–250 s, respectively.

Each reported SF curve is the average of 6–8 individual SF traces (i.e. mixing reactions or SF shots) from which the fluorescence curve of protein-free DNA(Cy3) has been subtracted. The initial fluorescence level of the reaction (over the first ~0.3 s, where CF binding/dissociation is not yet observed) was then normalized to 1. For CF dissociation data, the reaction end-point was additionally normalized to zero fluorescence. The apparent rate of CF binding to promoter DNA (kobs) was extracted by fitting Equation 5 to the fluorescence data, where A is the fluorescence change amplitude of the exponential reaction phase, y0 is the fluorescence at the end-point of the exponential reaction phase, t is the reaction time (i.e. abscissa), and m is the slope of the linear reaction phase. CF binding data obtained using WT DNA(Cy3) were strictly mono-exponential, whereas mutant scaffolds [DNA(Cy3,−27) and DNA(Cy3,−28)] showed an additional slow linear increase in fluorescence. Consequently, m was set to 0 for WT data and left as a free parameter for mutant data.

\begin{equation*} y = A \times e^{-k_{\mathrm{obs}} \times t} + m \times t + y_0 \tag{5} \end{equation*}

The reaction progress curves for the binding of CF to nonspecific DNA and for the dissociation of any preformed CF·DNA complex deviated from a single-exponential curve. In these cases, the median reaction rate was extracted by first fitting the stretched mono-exponential function (Equation 6) to the fluorescence curves [29, 30], and then calculating the median reaction time (τ) and median kobs using Equations 7 and 8. The stretching parameter β modifies the product of the observed rate parameter (kobs) and reaction time (t) to accommodate the deviation of the reaction progress curve from single-exponential behavior. Data fitting was performed using Origin 2016 (OriginLab Corporation).

\begin{equation*} y = A \times e^{-\left(k_{\mathrm{obs}} \times t\right)^{\beta}} + y_0 \tag{6} \end{equation*}

\begin{equation*} \tau = \frac{(\ln 2)^{1/\beta}}{k_{\mathrm{obs}}} \tag{7} \end{equation*}

\begin{equation*} \mathrm{median}\ k_{\mathrm{obs}} = \frac{\ln 2}{\tau} \tag{8} \end{equation*}
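Equations 6–8 can be checked with a short sketch. The parameter values below are hypothetical and chosen only to illustrate the relationships: at the median reaction time τ the decaying term has dropped to exactly half of its amplitude, and for β = 1 the median rate reduces to the ordinary single-exponential kobs.

```python
import math


def stretched_exponential(t, amplitude, k_obs, beta, y0):
    """Equation 6: stretched mono-exponential progress curve."""
    return amplitude * math.exp(-((k_obs * t) ** beta)) + y0


def median_time(k_obs, beta):
    """Equation 7: time at which half of the amplitude has decayed."""
    return math.log(2) ** (1.0 / beta) / k_obs


def median_rate(k_obs, beta):
    """Equation 8: median rate constant derived from the median time."""
    return math.log(2) / median_time(k_obs, beta)


if __name__ == "__main__":
    k_obs, beta = 0.01, 0.8          # hypothetical fit parameters
    tau = median_time(k_obs, beta)
    print(tau, median_rate(k_obs, beta))
    print(stretched_exponential(tau, 1.0, k_obs, beta, 0.0))
```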

The binding of CF to its cognate binding site on the DNA is a reversible reaction. Such a reversible system approaches equilibrium at an observed rate that is the sum of the forward and reverse reaction rates (kobs = kfor + krev) [31]. The stretched exponential function (Equation 6) is an empirical function often used to describe heterogeneous systems, including enzyme reactions [29, 30]. Based on a previous discussion of different potential sources of heterogeneity [30], we assume that in our case the stretched exponential fit, specifically the stretching parameter β, accommodates both temporal and structural heterogeneity of CF·DNA complexes, as well as deviations from single-exponential behavior caused by the sequential nature of the dissociation reaction (note that specific binding also proceeds through a two-step reaction).

Molecular modeling

System preparation was conducted with Schrödinger Suite software version 2024-1 (Schrödinger Inc.). The starting structure for the modeling was the cryo-EM structure of the S. cerevisiae PIC, showing Pol I, Rrn3, and CF bound to an rDNA promoter at 2.90 Å resolution [15]. We selected closed conformation 2 (CC2) over CC1 because of its better resolution. This structure (accession code: 6RQL) was downloaded from the PDB database and prepared using the Protein Preparation Workflow of Maestro with default settings. This included filling in missing sidechains and modeling shorter (<30 residues) missing loops in chains A, B, C, G, H, N, Q, R, and S, optimizing hydrogen bonds and protonation using PROPKA at pH 7.4, deleting water molecules >5 Å from the heteroatoms, and performing a short energy minimization using the OPLS4 force field [32]. Six structural zinc ions were included from PDB 6RUO. For the free DNA simulations, the DNA was extracted from the complex and further processed as described below.

For the molecular dynamics (MD) simulations, the complex was prepared using tleap from AmberTools24. The complex was solvated with TIP3P water molecules with a 12 Å buffer distance, the system was neutralized, and a physiological salt concentration of 0.15 M was added using Na and Cl ions (542 and 450 ions, respectively). The resulting simulation system of the complex had ~684 000 atoms. MD simulations were conducted with Amber software using the force fields ff19SB for the protein [33] and OL21 [34] for the DNA on the CSC (IT Center for Science, Finland) supercomputer LUMI. The minimization and equilibration steps were as follows: (i) all nonwater atoms constrained, (ii) heavy atoms constrained, (iii) protein backbone constrained, and (iv) no constraints. The constraint force was 50 kcal/mol in the minimizations and 10 kcal/mol in the equilibration simulations. Minimizations 1–4 used the steepest descent algorithm with a maximum of 10 000 steps. Equilibration steps 1–3 consisted of 0.4 ns simulations, and step 4 was a 4 ns simulation. In the first equilibration step, the system was heated to 310 K. The temperature and pressure were maintained with a Langevin thermostat and a Berendsen barostat in the equilibration simulations. During the production runs, the NPT ensemble was used: the pressure of 1.0 bar was maintained with a Monte Carlo barostat and the temperature of 310 K with a Langevin thermostat. The trajectories were prepared for analysis by aligning and centering the complex, stripping away water molecules and Na and Cl ions, and writing the output in xtc format using the cpptraj tool. The simulations are available on Zenodo, DOI 10.5281/zenodo.15687574. The total simulation time was 16 µs (16 × 1000 ns) for both systems, and frames were saved every 0.2 ns (80 000 frames in total). The simulations were analyzed using the MDAnalysis [35, 36] and dnaMD Python packages, 3DNA [37, 38], and cpptraj [39].
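The trajectory bookkeeping quoted above can be verified with integer arithmetic. This is only a consistency sketch: the constant names are illustrative, and the solute charge is an inference that assumes the quoted ion counts are the totals present in the neutralized system.

```python
REPLICAS = 16
REPLICA_LENGTH_PS = 1_000_000   # 1000 ns per production run
SAVE_INTERVAL_PS = 200          # one frame every 0.2 ns

frames_per_replica = REPLICA_LENGTH_PS // SAVE_INTERVAL_PS
total_frames = REPLICAS * frames_per_replica
total_time_us = REPLICAS * REPLICA_LENGTH_PS / 1_000_000

# Assumption: 542 Na+ and 450 Cl- are the total ion counts, so the excess
# cations balance the net negative charge of the protein-DNA complex.
N_SODIUM, N_CHLORIDE = 542, 450
net_ion_charge = N_SODIUM - N_CHLORIDE
implied_solute_charge = -net_ion_charge

if __name__ == "__main__":
    print(frames_per_replica, total_frames, total_time_us, implied_solute_charge)
```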

Statistical analyses

Statistical analysis was performed using the IBM SPSS Statistics 29.0 package. The assumption of normal data distribution was confirmed by kurtosis and skewness assessment, visual examination of Q–Q plots, and the Shapiro–Wilk (S-W) normality test. Next, homogeneity of variances was examined by the Levene test, with the null hypothesis of equal variances rejected when P < .05. Unequal variances (Levene statistic, P < .05) and/or unequal sample sizes justified the use of Welch’s ANOVA model. Finally, when the Welch’s P-value was significant, intergroup comparisons were carried out by post-hoc analysis using the Games–Howell test. Descriptive data are presented as mean and standard deviation (SD) or standard error (SE). The main outputs of statistical analysis are summarized in Supplementary Table S4.
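For readers unfamiliar with Welch’s ANOVA, the test statistic can be sketched in plain Python. This is the textbook Welch (1951) formula written out for illustration; it is not the SPSS implementation used in the study, the function name is an assumption, and the P-value (from the F distribution with the returned degrees of freedom) is deliberately left out.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) of Welch's one-way ANOVA for unequal variances.

    groups: list of samples, each with at least two values and nonzero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    weights = [n / variance(g) for n, g in zip(sizes, groups)]  # n_i / s_i^2
    weight_sum = sum(weights)
    # Variance-weighted grand mean.
    grand_mean = sum(w * m for w, m in zip(weights, means)) / weight_sum
    between = sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / weight_sum) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    print(welch_anova([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0], [5.0, 5.5, 7.0]]))
```

With two groups the statistic reduces to the square of Welch’s t, and df2 to the Welch–Satterthwaite degrees of freedom, which gives a convenient sanity check.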

Results

Oligomeric state of purified RNA polymerase I and transcription initiation factors

We first reconstituted a yeast Pol I in vitro transcription system for functional studies using purified proteins and nucleic acids. To this end, we expressed S. cerevisiae transcription initiation factors CF and Rrn3 in E. coli and purified the recombinant proteins (Supplementary Fig. S1a and b), followed by SEC analysis (Supplementary Fig. S1d and e). Pol I was purified from S. cerevisiae cells harvested in the exponential growth phase (Supplementary Fig. S1c). The oligomeric state of the purified proteins was determined using MP in buffer compositions similar to those used in transcription and footprinting assays. Rrn3 showed a major monomer population (84%) and a minor dimer population (16%) (Supplementary Fig. S1f). The experimental molecular weights of the Rrn3 monomer (82.3 ± 12.9 kDa) and dimer (145 ± 25 kDa) matched the calculated values (72.4 and 144.8 kDa). CF had only one significant protein population with an estimated mass of 219 ± 33.8 kDa, consistent with the calculated monomer mass (221.8 kDa) (Supplementary Fig. S1g). Pol I had a major monomer population (86%) and a minor dimer population (13%) (Supplementary Fig. S1h). The experimental molecular weights of the Pol I monomer (577 ± 88 kDa) and dimer (1141 ± 109 kDa) matched the calculated values (589.5 and 1179 kDa). This analysis confirmed that the purified proteins were intact and correctly assembled.

Binding of CF to rDNA promoter: location, stoichiometry, and affinity

We next aimed to study how CF recognizes the rDNA promoter, how stable the CF·promoter complex is, and how CF recruits Pol I to it [10, 15, 17]. First, we defined the boundaries of the CF binding site on the promoter and the specificity of its selection. To achieve single base pair (bp) resolution, we employed an Exo III footprinting assay based on the detection of a Cy5.5 label at the 5′ end of the DNA template [27]. We built two DNA scaffold models of the rDNA promoter: one covering the promoter region from nucleotide position −45 to +45 relative to the TSS at +1 [Cy5.5(−45/+45) and (−45/+45)Cy5.5], and one covering −80 to −35 [Cy5.5(−80/−35)] as a control for nonspecific interaction (Fig. 1a and Supplementary Fig. S2a). The footprinting results show a CF-protected region within the −45/+45 promoter but not within the −80/−35 promoter (Fig. 1b). PAGE analysis of the untreated scaffolds indicated excellent DNA purity, facilitating the mapping of the exact footprint boundaries (Supplementary Fig. S3), which span the rDNA promoter from −32 to −12 (Fig. 1 and Supplementary Table S5). This aligns with the previously determined CF footprint (from position −32 to −9 bp) [15], the span of the minimal binding-competent DNA fragment (−28 to −17 bp) [22], and structural evidence from cryo-EM models (−35 to −12 bp [13], −40 to −16 bp [14]). We studied DNA unwrapping–wrapping dynamics using the time-course of Exo III digestion (Supplementary Fig. S2b). The downstream edge of the footprint remained sharp at position −12 at all time-points, indicating that DNA at the downstream part of the CF binding site remains tightly anchored to CF with a nondetectable level (if any) of unwrapping dynamics. Close inspection of the upstream CF footprint reveals two bands, suggesting that most CF·DNA complexes have the footprint edge at −32, while a smaller fraction of the complexes have the edge at −33. Overall, CF protects an invariant DNA region over increasing cleavage time, indicating strong and specific CF binding to the promoter DNA (Supplementary Fig. S2b).

Footprinting of CF on rDNA promoter. (a) The cartoon illustrates Exo III-based footprinting of the CF binding site on WT rDNA promoter templates (span −45/+45, TSS at +1, highlighted in yellow). The 5′ end of either the nontemplate (promoter shown at the top) or the template strand (bottom promoter) is labeled with a cyanine 5.5 fluorophore (red ellipses); the complementary strand is protected at the 3′ end with phosphorothioate bonds (blue asterisks). The direction of Exo III cleavage is indicated by gray arrows. The CF-protected fragment on each promoter is highlighted with cyan rectangles, and the complete CF footprint is marked with a light purple rectangle. (b) Representative gel image of CF binding site mapping by Exo III footprinting. Gel lanes from left to right: TTP (T) and ATP (A) analogue Sanger sequencing ladders; footprinting of the WT rDNA promoter for the downstream boundary (indicated by the left black arrow) in the presence (WT, +CF) and the absence (WT, −) of CF; footprinting of the WT rDNA promoter for the upstream boundary (indicated by right black arrow) in the presence (WT, +CF) and the absence (WT, −) of CF; upstream footprinting of rDNA promoter (−45/+45) substituted at position −27 in the presence (−27T·A > G·C; +CF) and the absence (−27T·A > G·C; −) of CF; upstream footprinting of rDNA promoter (−45/+45) substituted at position −28 in the presence (−28G·C > T·A; +CF) and the absence (−28G·C > T·A; −) of CF; and upstream footprinting of the rDNA promoter fragment spanning −80/−35 in the presence (−80/−35, +CF) and the absence (−80/−35, −) of CF. Blue and red arrows indicate the electrophoretic mobility of specific-length DNA ladder strands. The electrophoretic mobility of full-length promoter is indicated with asterisks, and the CF footprint fragments with a hash symbol. Denaturing 10% polyacrylamide gel was scanned for Cy5.5 fluorescence.

To directly track the binding dynamics of CF, we constructed a DNA scaffold containing the promoter sequence from −40 to −10 and a cyanine 3 (Cy3) fluorophore [hereafter called DNA(Cy3)] (Supplementary Fig. S4a). Cy3 was selected due to its tendency to exhibit an increase in fluorescence intensity upon protein binding to the DNA [40], i.e. PIFE. Indeed, upon the addition of CF to DNA(Cy3), there was a ~35% increase in fluorescence intensity (Supplementary Fig. S4b). This PIFE effect was reversed when the dissociation of CF from DNA(Cy3) was triggered by the addition of unlabeled rDNA promoter. These observations suggest that CF binds near Cy3, consistent with the orientation of the CF binding motif on the DNA scaffold. We estimated the binding affinity of CF to DNA(Cy3) using a spectrofluorometer. The obtained equilibrium binding curve was described by a simple model where one CF molecule binds to one DNA(Cy3) molecule (Fig. 2a). The binding affinity was determined in two buffer conditions: FTB5 buffer with an ionic strength of 156 mM and LTB5 buffer with an ionic strength of 85 mM. The apparent dissociation constant (KD^app^) was found to be 106 ± 64 nM (SE) in FTB5 (Fig. 2a) and 52 ± 31 nM (SE) in LTB5 (Supplementary Fig. S4c). All fit parameters are shown in Supplementary Table S6. Based on the binding inhibition by increased ionic strength, it appears that ions and water molecules contribute to the association of CF with the specific promoter binding site, as previously analyzed in depth for the interaction of lac repressor with DNA [41]. With the mass photometer, KD^app^ was estimated as 20.5 ± 4.5 nM (SE) on a WT promoter spanning −45 to +45 bp in FTB5 (Fig. 2b and c). Finally, the acquired mass distributions support a 1:1 stoichiometry of the CF·DNA complex.

Binding affinity of CF to rDNA promoter. (a) KD^app^ of the CF·DNA(Cy3) complex was determined by mixing 24 nM DNA(Cy3) promoter with 12.5–200 nM CF. Fluorescence intensities were measured using a spectrofluorometer. Data points and error bars represent the mean and SD from 3–7 independent experiments, respectively. (b) Mass distributions of DNA-free and DNA-bound CF complexes were determined by mass photometer. Samples contained 50 nM CF and varying concentrations (0, 25, 50, or 100 nM) of unlabeled rDNA promoter (span −45/+45). The solid lines over the distributions indicate the best Gaussian fits of molecular masses. (c) Relative amount of CF bound to the rDNA promoter in mass photometer assay (n = 3 independent experiments). The curves in panels (a) and (c) were obtained using Equations 2–4 and the parameter values listed in Supplementary Table S6.

Binding of CF to rDNA promoter and unspecific DNA: kinetics

The reaction progress curves, measured using SF, started to show a noticeable increase in Cy3 fluorescence intensity ~1 s after mixing DNA(Cy3) with CF, with the intensity reaching its maximum by ~80 s (Fig. 3a). We calculated the change in Cy3 intensity at each CF concentration and fitted Eq. 2–4 to these data to obtain a KD^app^ of 79 ± 35 nM (SE) (Fig. 3b and Supplementary Table S6), a value consistent with the KD^app^ from manual PIFE experiments (Fig. 2a).

Interaction dynamics of CF and rDNA promoter. (a) SF fluorescence traces monitoring the binding of 12.5–120 nM CF to 24 nM WT DNA(Cy3) are shown. The inset shows data on a linear timebase. The curves were obtained using Equation 5. (b) Fluorescence intensity of WT promoter, nonspecific DNA (NS-DNA), or mutant promoter scaffolds, each Cy3 labeled, at the end of the CF binding reaction in the SF assay. The curves were obtained using Equations 2–4. (c) Observed rate constants (kobs) of CF binding to WT promoter, nonspecific DNA, or mutant DNA(Cy3) scaffolds in the SF assay. (d) SF fluorescence traces monitoring the dissociation of CF from WT promoter, nonspecific DNA, or mutant DNA(Cy3) scaffolds after challenging the preformed complexes with an excess of label-free promoter DNA (span −30/+30). The curves were obtained using Equation 6. The parameter values used for each fit equation are listed in Supplementary Table S6. (e) Mechanistic models for the formation of specific CF·promoter and unspecific CF·DNA complexes.

The initial inspection of normalized curves revealed that the reaction half-life (i.e. the time point at which half of the fluorescence increase had occurred) was similar, ~16–20 s, at all CF concentrations on DNA(Cy3) (Supplementary Fig. S5a). The apparent rate of specific CF·DNA complex formation is thus determined not by a bimolecular binding reaction but by a unimolecular reaction, e.g. a conformational change, taking place in the preformed CF·DNA complex. To obtain quantitative rate parameters, we fitted the original reaction progress curves to a single exponential rate equation (Eq. 4) (Fig. 3a and Supplementary Fig. S5b and c). The obtained fit parameters confirm that the observed rate of CF·DNA complex formation (kobs) has no significant dependence on the CF concentration; specifically, the kobs values were 0.033–0.038 s^−1^, the average kobs,avg was 0.036 ± 0.002 s^−1^, and the average reaction half-life was 19.5 ± 1.1 s [calculated as ln(2)/kobs,avg] (Fig. 3a and Supplementary Table S6). Interestingly, CF binding to equal-length (31 bp) nonspecific DNA [NS-DNA(Cy3)] was markedly different. The binding rate increased linearly with CF concentration, consistent with a simple bimolecular association (Fig. 3c and Supplementary Fig. S5d–f). Binding was also faster, reaching ~2 s^−1^ at 80 nM CF, corresponding to an apparent second–order rate constant (kbind) of ~2.3 × 10^7^ M^−1^ s^−1^ (Supplementary Fig. S5f and Supplementary Table S6). The binding progress curves systematically deviated from a single exponential but were well described by a stretched exponential (Eq. 6), which reflects heterogeneous reaction pathways and complexes [29, 30] (Supplementary Fig. S5h and i, and Supplementary Table S6). The binding equilibrium saturated at low CF concentrations, suggesting that CF populates multiple alternative low–affinity sites along NS–DNA, with the overall binding probability becoming additive across these sites (Fig. 3b).
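The two kinetic signatures described above can be checked with simple arithmetic. The sketch below converts the rounded kobs,avg into a half-life and evaluates the rate expected for bimolecular binding to NS-DNA at a few CF concentrations. It assumes pseudo-first-order conditions and neglects dissociation, which is only an approximation here because CF is not in large excess over 24 nM DNA; the function names are illustrative.

```python
import math

K_OBS_AVG = 0.036   # s^-1, WT promoter, from the text
K_BIND = 2.3e7      # M^-1 s^-1, NS-DNA, from the text

def half_life(k_obs):
    """Half-life (s) of a single-exponential process with rate k_obs (s^-1)."""
    return math.log(2) / k_obs

def pseudo_first_order_rate(k_bind, cf_molar):
    """Observed rate (s^-1) for bimolecular binding, dissociation neglected (assumption)."""
    return k_bind * cf_molar

print(f"WT half-life: {half_life(K_OBS_AVG):.1f} s")
for cf_nM in (20, 40, 80):  # illustrative CF concentrations
    rate = pseudo_first_order_rate(K_BIND, cf_nM * 1e-9)
    print(f"NS-DNA, {cf_nM} nM CF: {rate:.2f} s^-1")
```

The half-life computed from the rounded rate constant differs slightly from the reported 19.5 ± 1.1 s, as expected when rounded inputs are used.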

CF dissociated slowly from its specific binding site when the pre-formed CF·DNA(Cy3) complex was disrupted by the addition of competitor DNA containing the CF binding site (Fig. 3d). Specifically, the median reaction time (τ) was 93.6 ± 2.1 s and the median dissociation rate (krev) was 0.0076 ± 0.0002 s^−1^ using a stretched exponential as the fit equation (Eq. 6; Supplementary Table S6). For comparison, data analysis using the single and double exponential rate equations returned dissociation rates of 0.0112 ± 0.0001 s^−1^ and 0.0084 ± 0.0001 s^−1^ (the fit assigned 88.5% of total reaction amplitude to this rate), respectively, indicating that the inferred dissociation rate is only slightly affected by the selected fit equation (Supplementary Fig. S5g). Finally, the true rate of specific CF·DNA complex formation, kfor, was calculated (from kobs = kfor + krev; valid when binding leads to equilibrium) using the known kobs (0.036 ± 0.002 s^−1^) and krev (0.0076 ± 0.0002 s^−1^) to be 0.0284 ± 0.0022 s^−1^. Similar analysis confirmed that CF dissociates from nonspecific DNA [NS-DNA(Cy3)] at a much faster rate of 0.320 ± 0.005 s^−1^ (krev in Supplementary Table S6, Fig. 3d), reflecting the transient nature of these low–affinity interactions and the absence of the conformational stabilization characteristic of the specific CF·DNA complex. Overall, the binding and dissociation kinetics indicate that both the formation and dissociation of the specific CF·DNA complex are limited by slow isomerization steps, with the reaction in the dissociation direction (krev) being ~3.7-fold slower than in the forward direction (kfor) (Fig. 3e). In contrast, CF engages nonspecific DNA through simple bimolecular association, giving rise to a distribution of transient complexes.
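The relation kobs = kfor + krev used above follows from the relaxation of a reversible unimolecular step toward equilibrium. The minimal simulation below integrates such a step with the WT rate constants and recovers the observed rate from the simulated half-time. It is a sketch under a simplifying assumption: the initial encounter is treated as fast and saturated, so only the isomerization is modeled. The state names C1 and C2, the time step, and the function name are hypothetical.

```python
import math

def simulate_isomerization(k_for, k_rev, dt=0.01, t_end=200.0):
    """Euler integration of C1 <-> C2 starting from pure C1.

    Returns the time points (s) and the fraction of complexes in state C2.
    """
    times, c2_values = [0.0], [0.0]
    c2 = 0.0
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        c2 += dt * (k_for * (1.0 - c2) - k_rev * c2)
        times.append(i * dt)
        c2_values.append(c2)
    return times, c2_values

K_FOR, K_REV = 0.0284, 0.0076  # s^-1, WT values from the text
times, c2 = simulate_isomerization(K_FOR, K_REV)

plateau = K_FOR / (K_FOR + K_REV)  # equilibrium fraction in C2
# First time point at which half of the plateau is reached.
t_half = next(t for t, y in zip(times, c2) if y >= plateau / 2.0)
k_obs_sim = math.log(2) / t_half
print(f"simulated k_obs = {k_obs_sim:.4f} s^-1; k_for + k_rev = {K_FOR + K_REV:.4f} s^-1")
```

The approach to equilibrium is governed by the sum of the two rate constants, which is why krev from the competition experiment has to be subtracted from kobs to obtain kfor.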

Promoter sequence determinants of transcription initiation efficiency

We examined the transcription efficiency of the basal system consisting of Pol I, Rrn3, and CF. Previous biochemical and structural studies have defined the yeast rDNA promoter architecture as an upstream element binding UAF and a core element binding CF and Pol I, with the upstream edge of the core promoter near −28 [13, 15, 22, 42–44]. We tested whether the DNA linker between these elements could modulate activity of the core Pol I apparatus, similar to bacterial RNA polymerase (RNAP), where extended upstream regions can enhance initiation via DNA bending [45]. To this end, we trimmed the upstream edge toward the CF binding site. Transcription remained steady when the upstream end was progressively shortened from −90 to −30 (Fig. 4a and b), indicating that the basal Pol I system does not utilize sequences upstream of the CF binding site to tune initiation efficiency. As expected, the deletion of the CF binding site, by placing the promoter upstream end at −15 or −12, abolished all activity. The partial deletion of the CF binding site, by placing the promoter upstream end at −26 or −27, also inactivated transcription. We mapped the critical upstream border of the promoter to −28, as this promoter supported strong activity that was 7.5-fold higher than with a longer upstream sequence. Notably, the cryo-EM based model of the Pol I PIC (PDB 6RQL) [15] indicates a widening of the DNA major groove from positions −22 to −27 in comparison to standard B-DNA, probably because of the insertion of the C-terminal cyclin fold of Rrn7 and the N-terminal DNA-binding helical bundle of Rrn11 into this groove (Fig. 4c) [46].

Role of CF and promoter CF-binding site in transcription initiation. (a) Transcription initiation activities on WT promoters of different lengths are shown. The transcription reactions contained rDNA promoter, Pol I, Rrn3, CF, and NTPs. All active promoters initiated transcription at position +1. Data represent the mean and SE from the indicated number (n) of independent experiments. Statistical significance of observed differences is reported using the reference sample (ref.). The inset shows transcription activity in the presence and absence of CF. RNA transcript was quantified by primer extension assay. (b) Representative PAGE gel illustrating the TSS position and transcription activity levels as detected by primer extension assay. T and A indicate Sanger sequencing ladders prepared using TTP and ATP analogs as chain terminators, respectively. The inferred TSS position at +1, the lengths of the corresponding primer–extension products, and the sequence of the rDNA promoter template strand are shown to the left of the gel scan. The free OP018 primer is indicated with dotted line. All reactions shown were performed simultaneously and resolved on the same denaturing 12% gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. All assembled PICs were split into transcription reactions supplemented with NTPs (indicated with +) and negative controls without NTPs (−). Asterisk indicates the sample with fluorophore contamination detected in the presence and the absence of NTPs. (c) The Rrn7 subunit (colored brown) embraces the rDNA promoter near positions −27, −28, and −29. Several residues (R293, H294, T295, R297) are oriented into or near the major groove of the promoter DNA. DNA surfaces are shown in semi-transparent colors: yellow for the template strand and sky blue for the nontemplate strand. The figure was generated using cryo-EM model 6RQL and Chimera 1.14 software [46]. 
(d) The remaining amounts of the CF·DNA(Cy3) complex were determined following incubation with label-free DNA competitors, using the SF instrument to record Cy3 fluorescence intensity. Linear (−30/+30, −28/+30, −26/+30), single base pair indel mutant (−13del, −13ins), and fork (FPt1, FPt2) promoters were used as competitors. The fluorescence intensities of protein-free DNA(Cy3) and the CF·DNA(Cy3) complex were set to 0 and 1, respectively. Data represent the mean ± SE from the indicated number (n) of independent experiments. Statistical significance was assessed using the reference sample (ref.) as the baseline.

Further visual inspection of the cryo-EM model reveals that Rrn7 residues H294 and R293 are located adjacent to bps −27/−28 and −26/−27, respectively, implying possible sequence-specific recognition of the base or bp identities at these positions (Fig. 4c). To understand the role of single bps at the upstream edge of the CF binding site, we substituted either bp −27 (T·A > G·C, ntDNA·template DNA) or −28 (G·C > T·A) and determined the effect on the transcription initiation activity (Fig. 5a). The substitutions at −27 and −28 caused ~93% and ~95% decreases, respectively, in basal transcription activity, which thereby approached the detection limit of primer extension (Fig. 5b). Interactions involving CF and the promoter region around positions −27 and −28 thus appear important for specific rDNA promoter recognition and for CF positioning in a way that facilitates subsequent Pol I recruitment.

The effects of single bp mutations on the promoter recognition and transcription activity. (a) Sequences of the ntDNA strand from the rDNA promoter scaffolds used are shown. Base substitutions and insertions/deletions are highlighted with light blue shading and red font. TSS is highlighted in yellow. The numeric positions of these sites are indicated above the sequences. (b) Transcription initiation activity, determined by primer extension assay, of Pol I on WT and mutant rDNA promoters is shown. (c) The fraction of CF bound to rDNA promoters was determined using MP. Reactions contained 50 nM CF and 25 nM promoter. (d) An illustrative PAGE gel of RNA products from in vitro transcription reactions shows the TSS position. T and A indicate Sanger sequencing ladders prepared using TTP and ATP analogues as chain terminators, respectively. The inferred TSS positions (in parentheses), the lengths of the corresponding primer–extension products, and the sequence of the rDNA promoter template strand are shown to the right of the gel scan. The free OP018 primer is indicated with dotted line. Transcription reactions supplemented with NTPs are indicated with (+NTPs), and negative controls without NTPs with (−). All samples shown were run on the same denaturing 12% PAGE gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. Data in panels (b) and (c) represent the mean and SE from the indicated number (n) of independent experiments. Statistical significance is reported using the reference sample (ref.).

The comparison of different cryo-EM structures suggests that CF and the upstream DNA can pivot around Pol I [14, 15, 44]. To gain functional insight into how delicate the positioning of CF and Pol I and its dynamics on the promoter are, we either deleted the bp at position −13 (−13del promoter) or duplicated bp −13 between the original positions −13 and −12 (−13ins promoter) (Fig. 5a). If the protein system responded rigidly without any rearrangement, these promoter DNA mutations would simultaneously change the contact distance and angle of CF and Pol I by 3.4 Å and 36°. The deletion and insertion of a bp reduced the basal transcription level by an average of 87% and 70%, respectively (Fig. 5b). Neither mutation appeared to shift the TSS: RNA synthesis still initiated from +1 (Fig. 5d). However, due to the low activity of the mutant promoters, precise assignment of the TSS remains challenging. Our PIFE competition assay additionally confirmed that CF binds to both mutant and WT promoters with comparable efficiency (Fig. 4d and Supplementary Fig. S4d). These findings highlight the stringent spacing requirements between CF and Pol I binding sites on the promoter: altering the distance by just one bp within the core element drastically impaired functional PIC formation. This suggests that if functionally relevant ratcheting motions of CF and linker DNA occur, they do not fully compensate for even minimal changes in promoter architecture; our results therefore do not provide support for the proposed ratcheting–based DNA–melting mechanism in the PIC [44].
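The rigid-body estimate quoted above (3.4 Å and 36° per bp) can be written out explicitly. The sketch below uses the idealized B-DNA values given in the text, i.e. 10 bp per helical turn; the function name and the wrapping of the angle to the range −180° to 180° are illustrative choices.

```python
RISE_PER_BP_ANGSTROM = 3.4  # axial rise per bp, value used in the text
TWIST_PER_BP_DEG = 36.0     # helical twist per bp, value used in the text

def rigid_body_offset(bp_change):
    """Axial shift (Å) and rotation (degrees) caused by inserting (+) or deleting (-) bps.

    The rotation is wrapped to the interval [-180, 180).
    """
    shift = bp_change * RISE_PER_BP_ANGSTROM
    rotation = (bp_change * TWIST_PER_BP_DEG + 180.0) % 360.0 - 180.0
    return shift, rotation

for bp_change, label in ((-1, "-13del"), (+1, "-13ins")):
    shift, rotation = rigid_body_offset(bp_change)
    print(f"{label}: shift {shift:+.1f} Å, rotation {rotation:+.0f}°")
```

Because shift and rotation are coupled in a helix, a single-bp indel changes both at once, which is why such mutants probe the spacing and the rotational register between CF and Pol I simultaneously.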

Effects of binding site mutations on the interaction with CF

To gain quantitative insight into the effects of DNA major groove mutations on CF binding, we reproduced the substitutions at positions −27 or −28 in the DNA(Cy3) scaffold (span: −40/−10, Cy3 at the 3′ end of ntDNA) to create mutant rDNA promoter scaffolds DNA(Cy3,−27) and DNA(Cy3,−28) for the PIFE-based binding studies (Supplementary Fig. S4a). We then mixed 24 nM DNA(Cy3,−27) or DNA(Cy3,−28) with different concentrations of CF in the SF instrument and calculated the change in the Cy3 intensity (Fig. 3b). CF binding produced a significantly smaller fluorescence change on both mutant scaffolds, reaching only 15%–35% of that seen with the WT DNA(Cy3). In principle, this effect is expected if the DNA mutations drastically decrease the overall binding affinity of CF. However, the CF concentration dependence of PIFE saturated within the CF concentrations used (up to 120 nM), as indicated by the estimated KD^app^ values of 7 ± 7 nM [DNA(Cy3,−27)] and 1 ± 3 nM [DNA(Cy3,−28)] (Supplementary Table S6). It thus appears that the DNA major groove mutations compromised the capability of CF to fully recognize its specific binding site (near Cy3), specifically the maturation of the initial CF·DNA complex(es) to the stable conformation producing high PIFE. To corroborate this model, we used MP to analyze CF binding to the −27 or −28 mutated promoters with reference to a WT promoter of the same length (span −45/+45) (Fig. 5a). MP detects every CF–DNA complex that is stable for at least the 0.02 s duration of a data frame in the recorded 180 s movie; thus, it identifies both specific and unspecific CF·DNA complexes. MP data indicate that CF binding occupancy on the mutant promoters is comparable to that on the WT promoter (Welch’s ANOVA: F = 1.237; P = 0.413; Fig. 5c). Also consistent with the model, Exo III footprinting, which only detects the stable CF·DNA complex, failed to show CF binding to the −27 and −28 mutant promoters (Fig. 1b). We conclude that recognition of the bps at positions −27 and −28 is a pivotal mechanistic element in defining the CF binding specificity.

The observed binding rates were 0.112–0.180 s^−1^ (kobs,avg: 0.143 ± 0.023 s^−1^) and 0.281–0.661 s^−1^ (kobs,avg: 0.375 ± 0.112 s^−1^) for DNA(Cy3,−27) and DNA(Cy3,−28), respectively (Fig. 3c and Supplementary Fig. S6 and Supplementary Table S6), and, unlike CF binding to fully nonspecific DNA, these rates did not increase consistently with CF concentration, indicating that binding to the mutant promoters does not follow simple bimolecular association. Observed binding rates are 3–11-fold larger than for WT DNA(Cy3) (kobs,avg: 0.036 ± 0.002 s^−1^). We then determined CF dissociation rates, krev, by mixing the preformed CF·DNA(Cy3,−27) and CF·DNA(Cy3,−28) complexes with unlabeled WT competitor DNA and found them to be 0.0358 ± 0.0002 s^−1^ and 0.3107 ± 0.0056 s^−1^, respectively (Fig. 3d). These values are significantly higher than for WT DNA(Cy3) (0.0076 ± 0.0002 s^−1^) (Supplementary Table S6). Next, kfor values were calculated (from kobs = kfor + krev) to be 0.107 ± 0.023 s^−1^ [DNA(Cy3,−27)] and 0.064 ± 0.118 s^−1^ [DNA(Cy3,−28)], again larger than 0.0284 ± 0.0022 s^−1^ for WT DNA(Cy3). The krev values indicate that CF interaction with bp −28 is essential for the stability of the specific CF·DNA complex; the substitution of −28 caused rapid dissociation of CF from the binding site (krev increased 40-fold from WT) as well as an unfavorable equilibrium constant (K2 = kfor/krev = 0.21 ± 0.38 versus WT 3.75 ± 0.37) for the isomerization of the initial CF·DNA complex to the final complex, i.e. to the high PIFE state (Supplementary Table S6). The substitution of bp −27 also destabilized the specific CF·DNA complex in terms of the dissociation rate, as krev increased by 4.7-fold. However, because the substitution also increased kfor 3.8-fold, the equilibrium constant K2 (3.00 ± 0.66) was little affected.
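The derived quantities in this paragraph can be recomputed from the reported kobs,avg and krev values. The sketch below does so for the WT and the two mutant scaffolds. The error handling is an assumption: absolute errors are summed for the subtraction and relative errors are summed for the ratio, which gives uncertainties of similar size to those quoted, but the procedure actually used for the reported values is not stated in this section. All function and variable names are illustrative.

```python
# Rate constants (s^-1) with their reported uncertainties, copied from the text.
SCAFFOLDS = {
    "WT":  {"k_obs": (0.036, 0.002), "k_rev": (0.0076, 0.0002)},
    "-27": {"k_obs": (0.143, 0.023), "k_rev": (0.0358, 0.0002)},
    "-28": {"k_obs": (0.375, 0.112), "k_rev": (0.3107, 0.0056)},
}

def forward_rate(k_obs, k_rev):
    """k_for = k_obs - k_rev; absolute errors are summed (assumed, conservative)."""
    return k_obs[0] - k_rev[0], k_obs[1] + k_rev[1]

def equilibrium_constant(k_for, k_rev):
    """K2 = k_for / k_rev; relative errors are summed (assumed, conservative)."""
    value = k_for[0] / k_rev[0]
    rel_err = k_for[1] / abs(k_for[0]) + k_rev[1] / k_rev[0]
    return value, value * rel_err

results = {}
for name, rates in SCAFFOLDS.items():
    k_for = forward_rate(rates["k_obs"], rates["k_rev"])
    k2 = equilibrium_constant(k_for, rates["k_rev"])
    results[name] = {"k_for": k_for, "K2": k2}
    print(f"{name:>3}: k_for = {k_for[0]:.4f} ± {k_for[1]:.4f} s^-1, K2 = {k2[0]:.2f} ± {k2[1]:.2f}")
```

The large relative uncertainty of K2 for the −28 scaffold arises because kfor is the small difference of two similar numbers, so its error is comparable to its value.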

Together, these data show that the −27 and −28 mutant promoters form semi–stable CF·DNA complexes: they are still partially (specifically) recognized but undergo the slow, stabilizing isomerization that defines the WT interaction only inefficiently. Formation of these semi–stable complexes likely sequesters CF and suppresses most nonspecific binding modes that would otherwise arise from rapid bimolecular encounters on unrelated DNA.

Promoter determinants in transcription start site selection

Because a one-bp insertion or deletion in the linker region impaired activity but retained the TSS at +1, we hypothesized that Pol I, similar to the bacterial RNA polymerase·σ holoenzyme [47], might recognize specific bases in single-stranded promoter DNA during OC formation. To test this, we used forked rDNA promoter models (Fig. 6a). Fork junction promoter FPt1, exposing the template strand upstream from −6, supported transcription, but initiation occurred mainly from the upstream end of the fragment, indicating unspecific initiation rather than sequence-specific recognition (Fig. 6b–e). CF was not essential for this activity but enhanced RNA yield by 2.7-fold. Binding studies showed that CF interacts with FPt1 only weakly and nonspecifically, as demonstrated by the poor ability of FPt1 (or FPt2) to compete with the WT promoter in PIFE assays (Fig. 4d and Supplementary Fig. S4d), consistent with CF facilitating Pol I indirectly on FPt1 rather than defining the TSS. Constructs exposing the nontemplate strand (FPnt1–2) were inactive (Fig. 6b and e), confirming that Pol I does not specifically recognize single-stranded DNA to define the TSS. Fully double-stranded templates showed no detectable end-initiated transcription (Fig. 4a and b).

Promoter topology affects the TSS recognition by Pol I. (a) Cartoon represents the structure of fork junction promoters (FPt1, 9 nt overhang in tDNA; FPt2, 24 nt overhang in tDNA; FPnt1, 8 nt overhang in ntDNA; FPnt2, 11 nt overhang in ntDNA) and two pre-melted bubble promoters (Pb1, ntDNA·tDNA mismatch from −10 to +5; Pb2, mismatch from −9 to +2) used to investigate the TSS selection by Pol I. Numbers indicate the span of each DNA strand, the mismatch regions, and the TSS (+1) of WT rDNA promoter. ntDNA and tDNA are shown in dark and light gray, respectively. (b) The transcription initiation activity of fork promoters was measured in the presence of Pol I, Rrn3, and CF by primer extension assay. When multiple TSS were observed, as in the case of FPt1, Pb1, and Pb2, RNA products were separately quantitated and summed up to obtain the reported total Pol I activity. WT linear rDNA promoter (span −90/+30) was used as a control. Reported activities are the mean and SE of three independent experiments. (c) Transcription activity on FPt1 was measured with and without CF in the presence of Pol I and Rrn3. Total activity and TSS-specific transcription, determined by primer extension assay, are shown. (d) Total transcription activity of Pb1 and Pb2 was measured under the same conditions as in panel (b) using primer extension assay. Data represent means and SE from three independent experiments. (e) TSS position(s) on different promoters. T and A indicate Sanger sequencing ladders prepared using TTP and ATP analogues as chain terminators, respectively. The inferred TSS positions (in parentheses), the lengths of the corresponding primer–extension products, and the sequence of the rDNA promoter template strand are shown to the left of the gel scan. The free OP018 primer is indicated with dotted line. All samples shown were run on the same denaturing 12% PAGE gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. 
(f) Primer extension gel and its quantification for transcription activity level on Pb2 promoter with different components of the Pol I basal system; transcription activity of Pol I·Rrn3·CF was normalized to 1. All samples shown were run on the same denaturing 12% PAGE gel, which was scanned for Cy5.5 fluorescence in OP003 primer; full gel scan is provided in Supplementary Materials 2. (g) Representative PAGE gel compares the TSS selection by Pol I on the full-length rDNA promoter (span −200/+150) embedded in either linear or negatively supercoiled (plasmid) DNA. Transcription reactions were performed using either the basal Pol I system (Pol I, Rrn3, CF) or the holoenzyme (Pol I, Rrn3). T and C indicate Sanger sequencing ladders prepared using TTP and CTP analogues as chain terminators, respectively. The inferred TSS positions (in parentheses), the lengths of the corresponding primer–extension products, and the terminating base identities in the rDNA promoter template strand are shown to the left of the gel scan. The free OP018 primer is indicated with a dotted line. All samples shown were run on the same denaturing 12% gel, which was scanned for Cy5.5 fluorescence in OP018 primer; full gel scan is provided in Supplementary Materials 2. (h) Kernel density estimation plot of transcription levels by Pol I·Rrn3·CF complex at each TSS position on a negatively supercoiled promoter (n = 8). The y-axis shows relative transcription activity, the x-axis displays TSS positions, and the z-axis (color gradient) indicates TSS density. The calibration bar represents TSS density (probability) in the heatmap from 0 to 0.285, with maximum density 1. Labels (+NTPs) and (−) in panels (e)–(g) refer to transcription reactions performed in the presence or absence of NTPs, respectively. Data are mean and SE from the indicated number (n) of independent experiments. Statistical significance is reported using ref. as the reference sample.

Next, we assessed the TSS on pre-melted promoter scaffolds, which have been used in recent cryo-EM studies to obtain the structure of the Pol I·promoter OC [15, 48]. The minimal rDNA promoter Pb1, spanning region −30/+30 with strand mismatch from −10 to +5, was active (Fig. 6d) and displayed multiple TSSs within the pre-melted region. Most initiation events started from +3 and some from +1, +4, or +5 (Fig. 6e). A minor translocation of the pre-melted region to −9/+2 in Pb2 caused a shift in the TSS pattern, with initiation now taking place almost equally efficiently from positions −1 and +1 (Fig. 6e). These findings support the conclusions that Pol I does not identify specific bases and that Pol I can accommodate differently sized transcription bubbles [16]. The primary factor determining the TSS on bubble promoters is the interaction between the downstream DNA duplex and Pol I, which guides the first two initiating template bases, located near the edge of the duplex, into the Pol I active site. Structural data from mismatch promoters must be interpreted carefully because these promoters can initiate from non-native TSSs. Notably, the transcription rates of pre-melted scaffolds were similar to or slightly exceeded that of the native promoter, suggesting that the rate-limiting step of transcription initiation is not transcription bubble melting (Fig. 6d). The activity on Pb2 was stimulated by CF but not absolutely dependent on it (Fig. 6f), indicating that DNA helix distortion (pre-melting) bypasses key steps of the normal transcription initiation mechanism.

Finally, we determined the TSS of a full-length rDNA promoter (−200/+150) embedded in a plasmid. The data show that the Pol I basal transcription machinery exhibits mTSSs on the plasmid template (Fig. 6g and h). The TSS distribution fluctuated between experimental repeats, probably reflecting variable levels of negative supercoiling in different plasmid promoter preparations. The highest frequency of TSS was found at promoter positions +1, −9, and −15. In contrast, the linear template consistently initiated from +1 (Fig. 6g). Initiation on both the plasmid and the linear promoter is strictly dependent on CF. Because negative supercoiling stimulates processes that require DNA helix opening, such as transcription initiation, the relaxed TSS selectivity on the plasmid template could be rationalized by the inherently flexible interaction/connection between CF and Pol I, which allows Pol I to bind and melt the DNA at different positions near CF.

Structural dynamics of Pol I closed complex

We next investigated the structural dynamics in the Pol I CC by performing sixteen independent 1000 ns MD simulations starting from the CC2 cryo-EM model of Sadian et al. [15]. As a control, protein-free DNA was also simulated under the same conditions. To evaluate structural stability, the root mean square deviation (RMSD) of backbone atoms was calculated for all Pol I, CF, and Rrn3 subunits. In each trajectory, RMSD increased during the initial 200 ns, reflecting structural relaxation, and then stabilized at ~5–6 Å, indicating that the system reached equilibrium (Supplementary Fig. S7). To characterize local flexibility, residue-wise root mean square fluctuations (RMSF) were calculated for protein Cα atoms and DNA phosphate atoms. Values ranged from 0.6 to 10.5 Å (mean: 1.9 Å). Most of the system exhibited moderate rigidity, consistent with a stable core structure (Supplementary Fig. S8). Increased flexibility was observed in peripheral Pol I subunits, Rrn3, and CF. The highest fluctuations, however, were localized to DNA termini, extended surface-exposed loops, and linker regions. Together, these results demonstrate that the simulations are well-equilibrated and structurally stable, providing a reliable basis for subsequent analyses of functionally relevant dynamics.
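For readers unfamiliar with the flexibility metric, the sketch below computes a per-atom RMSF from a small toy trajectory. It is a minimal, dependency-free illustration rather than the analysis pipeline used in this work: frames are assumed to be already superimposed on a common reference, and the coordinates are hypothetical.

```python
import math

def rmsf(frames):
    """Per-atom RMSF (Å) from a list of frames; each frame is a list of (x, y, z) tuples.

    Frames are assumed to be already superimposed on a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [sum(frame[atom][axis] for frame in frames) / n_frames for axis in range(3)]
        # Mean squared displacement from the average position.
        msd = sum(
            sum((frame[atom][axis] - mean[axis]) ** 2 for axis in range(3))
            for frame in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Toy trajectory (hypothetical coordinates in Å): atom 0 is static, atom 1 oscillates along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (11.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (11.0, 0.0, 0.0)],
]
toy_rmsf = rmsf(toy_frames)
print(toy_rmsf)
```

In practice the same quantity would be computed over Cα or phosphate atoms of the full trajectory with a dedicated MD analysis package.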

Previous cryo-EM studies have revealed distinct orientations of CF relative to Pol I in different PIC states [12–16]. To examine whether such variability occurs in our simulations, we calculated the center of mass (COM) for CF and Pol I across all trajectories (Supplementary Fig. S9a and b). The COM coordinates remained stable, with low RMSF values of 0.33 Å for Pol I and 0.92 Å for CF, indicating minimal displacement. The absence of transitions between distinct CF orientations suggests that the starting conformation represents an energetically stable state that does not readily interconvert within the 1000 ns timescale. We next analyzed the behavior of the Rrn7 Zn ribbon, a domain that connects CF to Rrn3, by monitoring the position of its COM and its distance to Rrn3 (Supplementary Fig. S9c). The Zn ribbon exhibited greater mobility than CF as a whole, with a COM RMSF of 3.6 Å, indicating local flexibility (Supplementary Fig. S9d). To assess the stability of its interaction with Rrn3, we measured the distance between N17 in the Zn ribbon and N200 in Rrn3. This distance remained stable at ~6 Å in most trajectories, though a subset (3/16) showed an increase to ~10 Å (Supplementary Fig. S9e). These results suggest that while the Zn ribbon overall maintains its connection to Rrn3, the interaction is flexible. Importantly, this flexibility did not lead to dissociation of CF from the Pol I PIC.

Structural studies of the Pol I CC have revealed an open clamp conformation, which accommodates both the expander helix and the C-terminal domain of Rpa12 (Rpa12CTD) into the DNA-binding cleft [15]. This state is also associated with partial unfolding of the Rpa190 bridge helix (BH). Transition to the OC involves clamp closure, displacement of the expander and Rpa12CTD from the cleft, and refolding of the BH. Interestingly, our MD simulations indicate that the clamp samples a broad continuum of conformations between the previously characterized open and closed states (Supplementary Fig. S10a and b). Individual simulation trajectories reveal dynamic fluctuations of the clamp (Supplementary Fig. S10c), while the expander (Supplementary Fig. S11a–c) and Rpa12CTD (Supplementary Fig. S11d) remain stably bound within the cleft. The central region of the BH was unstable, exhibiting elevated RMSF (2.5–4 Å; Supplementary Fig. S12a) and diverse structural snapshots (Supplementary Fig. S12b), relative to the more stable helical segments (RMSF ~1.5 Å). These findings suggest that both the clamp and BH retain sufficient conformational flexibility to permit the initiation of DNA loading into the cleft, a process that ultimately displaces the expander and Rpa12CTD.

Distortion of promoter DNA in the Pol I closed complex

Promoter DNA is significantly bent in the CC, a feature thought to be critical for transcription initiation [15]. Our simulations confirmed stable bending of the promoter within the Pol I CC, with an average angle of 66 ± 19° (SD) (Fig. 7a and b). In contrast, removal of the protein components led to relaxation of the DNA into a predominantly straight conformation, with an average bending angle of 27 ± 17° (Fig. 7b). Local base-pair step parameters (twist, roll, shift, slide, tilt, rise, and x displacement) showed large SDs and, within those SDs, remained mostly similar between free and protein-bound DNA (Supplementary Fig. S13). These results suggest that the promoter is locally flexible, but protein binding is required to induce and stabilize the globally bent DNA conformation in the CC.
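A bending angle of this kind is, in essence, the angle between the helical-axis vectors of the two DNA arms. The sketch below shows that geometric step only; how the axis vectors were derived from the trajectories is not reproduced here, and the example vectors are hypothetical.

```python
import math

def bending_angle_deg(axis_upstream, axis_downstream):
    """Angle (degrees) between two helical-axis vectors; 0° means a straight helix."""
    dot = sum(a * b for a, b in zip(axis_upstream, axis_downstream))
    norm_u = math.sqrt(sum(a * a for a in axis_upstream))
    norm_d = math.sqrt(sum(b * b for b in axis_downstream))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_d)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical axis vectors: the downstream arm is rotated by 66° in the xy-plane.
upstream = (1.0, 0.0, 0.0)
downstream = (math.cos(math.radians(66.0)), math.sin(math.radians(66.0)), 0.0)
print(f"{bending_angle_deg(upstream, downstream):.1f}°")
```

Applying such a function to every frame yields the angle distributions that are summarized as a mean and SD.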

rDNA promoter bending and deformation in MD simulation trajectories. (a) Snapshots of promoter DNA conformations illustrate the difference in DNA bending between CC and protein-free simulations. The approximate binding sites of Pol I and CF are indicated with gray and green curves, respectively. The template strand is shown in blue and the nontemplate strand in purple. The region of pronounced minor groove widening (bp −18 to −22) is highlighted in yellow. Dashed lines represent approximate helical axis vectors used to measure DNA bending angle (θ). (b) Histogram of DNA bending angles in the CC (protein-bound) or protein-free (free DNA) simulations. (c) Bar graph shows the frequency of DNA base pairing in different promoter positions. (d) Major groove widths of promoter DNA in the CC and protein-free simulations. (e) Minor groove widths of promoter DNA in the CC and protein-free simulations. DNA parameters were calculated using do_x3dna and analyzed using dnaMD Python module. Each plot combines data from 16 independent MD simulations.

The base pairing within the kinked promoter region of the CC was unstable, with disruptions most evident at bps −6 to −8 (Fig. 7c). In the fully flipped out state, the template DNA base −8 interacts with R448 and M451 in the A135 protrusion domain. MD trajectories thus support a model in which, as part of the CC-to-OC transition, DNA bases within the kinked promoter region (−7 to −11) flip out from the double helix and are stabilized by interactions with the surrounding protein, until the expanding region of the single-stranded template DNA leads to its loading into the DNA-binding cleft.

In the free DNA simulations, the major and minor groove widths remained relatively stable across the promoter, averaging approximately 18 and 11 Å, respectively (Fig. 7d and e). In the CC, DNA bending induced notable changes in groove geometry. Specifically, the major groove narrowed to around 16 Å between base pairs −14 and −17, and widened to ~20 Å between −21 and −24, within the CF-binding region of the promoter. The minor groove also exhibited localized widening in the CC, reaching about 14 Å between −17 and −21 (within the CF-binding site) and between −9 and −14 (within the Pol I-binding site). Notably, the minor groove became particularly wide (~17 Å) at base pairs −19 to −20, corresponding to the region of most pronounced CF-induced DNA bending (Fig. 7a). These results highlight that groove deformations are primarily driven by protein-induced DNA bending, rather than being pre-defined by the DNA sequence and recognized by CF and Pol I.

CF interaction with DNA

The interactions between DNA and CF observed during MD simulations closely resemble those in the starting structure. To characterize the overall interactions, we calculated the frequency of contacts (defined as distances <10 Å) between the Cα atoms of CF and the DNA backbone (i.e. the C3′ and C5′ atoms). The most frequent contact sites and their respective frequencies are presented in Fig. 8a–c. To assess more specific interactions, we analyzed hydrogen bond formation between DNA bases and residues of CF subunits Rrn11 and Rrn7 (Fig. 8d). Hydrogen bonds were defined as interactions with a distance <4 Å between heteroatoms. Rrn11 formed infrequent hydrogen bonds with bases at bps −20 and −21, primarily involving R11 (in 8% of trajectory frames) (Fig. 8d and Supplementary Fig. S14a). Rrn7 exhibited two distinct DNA contact regions: N209 and a flexible loop comprising residues R293, H294, and R297 (Fig. 8d and Supplementary Fig. S14a). N209 consistently formed hydrogen bonds (92% of frames), mostly with adenine at bp −20 (nt strand) and thymine at bp −21 (t strand). In contrast, the loop region showed more transient interactions (18%–39% of frames), with hydrogen bonding observed at bps −26 to −29. This variability likely reflects the dynamic nature of the loop. Indeed, we observed the loop alternating between an in-the-groove conformation—engaging the DNA major groove at positions −25 to −29, as also seen in the starting structure—and a retracted state (Supplementary Fig. S14b). Notably, in 6 out of 16 simulations, R293 inserted into the DNA near bp −26. In three of these cases, R293 remained stably positioned between the DNA bases throughout the simulation, suggesting a potentially stable binding conformation (Fig. 8e). 
We performed additional control simulations (3 × 80 ns) using different software and a different force field (Desmond with the OPLS4 force field) and observed similar loop dynamics and R293 insertion into the DNA, ruling out potential force field artifacts (data not shown; available at DOI 10.5281/zenodo.14001802). These findings support a bipartite DNA recognition mechanism by CF: a stable interaction with the DNA minor groove at positions −16 to −21, likely representing the initial binding site, followed by more dynamic and potentially sequence-specific interactions in the major groove at positions −25 to −29. This dual-mode binding may facilitate both specificity and flexibility in CF–DNA recognition.
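
The contact-frequency measure described above (the percentage of trajectory frames in which a CF Cα atom lies within 10 Å of a DNA backbone atom) can be illustrated with a minimal sketch. This is not the authors' analysis script: the function name `contact_frequency`, the array layout, and the toy coordinates are assumptions made for illustration, and a real analysis would load coordinates from the MD trajectories.

```python
import numpy as np

def contact_frequency(protein_xyz, dna_xyz, cutoff=10.0):
    """Percentage of frames in which each protein atom is within `cutoff` Å of any DNA atom.

    protein_xyz: array of shape (n_frames, n_protein_atoms, 3), e.g. CF C-alpha atoms
    dna_xyz:     array of shape (n_frames, n_dna_atoms, 3), e.g. C3' and C5' atoms
    (hypothetical layout; coordinates in Å)
    """
    # Pairwise protein-DNA distances for every frame: (n_frames, n_protein, n_dna)
    diff = protein_xyz[:, :, None, :] - dna_xyz[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # A residue counts as "in contact" in a frame if any DNA atom is closer than the cutoff
    in_contact = (dist < cutoff).any(axis=2)
    return 100.0 * in_contact.mean(axis=0)

# Toy data: 4 frames, 2 protein atoms, 1 DNA atom fixed at the origin
dna = np.zeros((4, 1, 3))
protein = np.zeros((4, 2, 3))
protein[:, 0, 0] = 5.0                      # atom 0 stays 5 Å away in every frame
protein[:, 1, 0] = [8.0, 12.0, 9.0, 15.0]   # atom 1 is within 10 Å in 2 of 4 frames

freq = contact_frequency(protein, dna)
print(freq)
```

The same per-frame boolean mask could be reused for the hydrogen-bond analysis by swapping in heteroatom coordinates and a 4 Å cutoff, although a distance-only criterion is a simplification of any geometric hydrogen-bond definition.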

Interactions of CF with rDNA promoter in MD simulation trajectories. (a) Cartoon putty representation of CF contacts with the rDNA promoter. The thickness of the cartoon reflects contact frequency. The contacts are color-coded by protein region, as detailed in panels (b) and (c). The DNA template strand is shown in blue, and the nontemplate strand in pink. (b) The frequency of Rrn7 contacts with DNA, defined as the percentage of trajectory frames in which a contact occurs between DNA and a residue in the Rrn7 subunit of CF. (c) Frequency of Rrn11 contacts with DNA, calculated as in panel (b). (d) Representation highlights the protein residues from Rrn11 (red) and Rrn7 (orange) that form hydrogen bonds with DNA bases. (e) Representation highlights the insertion of R293 between bps −26 and −27.

Discussion

In this study, we provide new insights into the molecular mechanism of Pol I transcription initiation. Specifically, we found that the essential yeast transcription factor CF, in the first step of PIC formation, identifies its rDNA promoter binding site through a two-step mechanism involving initial binding and isomerization. Our quantitative analysis, the first of its kind, further indicated that the dissociation constant of the CF·promoter complex is 50–100 nM, that the complex has a lifetime of ~90 s, and that its stability is maintained by specific interactions at bps −27 and −28, at the upstream edge of the CF binding site. In the next step of PIC formation, promoter-bound CF recruits the Pol I·Rrn3 complex to the promoter. We found that promoter-bound Pol I does not recognize specific bases near the TSS when it initiates RNA synthesis. Instead, it selects the TSS based on its interaction distance with promoter-bound CF and on the physical properties of the DNA, such as “bendability” near the upstream edge of the transcription bubble. We discuss the reasoning leading to these claims and their broader implications below.
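
To put the reported numbers in perspective, the short sketch below converts the ~90 s lifetime into an apparent dissociation rate constant and half-life, and estimates promoter occupancy for the 50–100 nM range of the dissociation constant. It assumes that the lifetime is the mean lifetime of a single-exponential dissociation (lifetime = 1/k_off) and that CF is in excess over the promoter, so free CF is approximately total CF. The 100 nM CF concentration and the function names are hypothetical choices for illustration, not values or methods from the study.

```python
import math

def off_rate_from_lifetime(lifetime_s):
    """Apparent dissociation rate constant (s^-1), assuming lifetime = 1 / k_off."""
    return 1.0 / lifetime_s

def fraction_bound(cf_nM, kd_nM):
    """Equilibrium promoter occupancy for a simple binding isotherm with CF in excess."""
    return cf_nM / (cf_nM + kd_nM)

lifetime_s = 90.0                          # reported CF-promoter complex lifetime
k_off = off_rate_from_lifetime(lifetime_s)
half_life_s = math.log(2) * lifetime_s     # time for half of the complexes to dissociate

cf_nM = 100.0                              # hypothetical CF concentration
occupancy = {kd: fraction_bound(cf_nM, kd) for kd in (50.0, 100.0)}

print(f"k_off = {k_off:.4f} s^-1, half-life = {half_life_s:.1f} s")
for kd, theta in occupancy.items():
    print(f"Kd = {kd:.0f} nM -> fraction of promoter bound = {theta:.2f}")
```

Under these assumptions, a CF concentration within the reported range of the dissociation constant would leave roughly half to two thirds of promoters occupied at equilibrium.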

CF in yeast and its evolutionarily unrelated functional analogue, TIF-IB from the protist Acanthamoeba castellanii, have been proposed to recognize structural anomalies in the rDNA promoter, such as DNA curvature, bendability, and minor groove width [22, 49, 50]. A recent cryo-EM study showed that CF contacts two widened DNA grooves: the major groove from −21 to −28 and the adjacent minor groove from −19 to −24 [15]. However, it is unclear whether these distortions are inherent to the free rDNA promoter or induced by CF binding. Our data exclude the scenario in which CF binds a uniformly pre-perturbed promoter, because the binding rate would then increase with increasing CF concentration, which we did not observe. A second scenario predicts a biphasic binding process: CF binds the perturbed (e.g. bent) promoters that exist in equilibrium with the nonperturbed (e.g. straight) form, which shifts the equilibrium toward the formation of additional perturbed promoters. However, our experiments revealed only single-exponential binding progress curves. The third scenario, induced-fit binding, predicts single-exponential binding progress curves with a binding rate that remains uniform across different protein concentrations, matching our experimental observations.
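
The concentration dependence that distinguishes these scenarios can be sketched with textbook limiting expressions for the observed binding rate. The code below is an illustration only and not the fitting model used in this study: it assumes CF in excess over DNA and a rapid-equilibrium binding step, and all rate constants and concentrations are hypothetical. In particular, the rapid-equilibrium treatment reduces each mechanism to a single observed rate and therefore does not reproduce the biphasic progress curves discussed above for the pre-existing equilibrium scenario.

```python
import numpy as np

def k_obs_one_step(cf, k_on, k_off):
    """One-step binding to a pre-formed site: k_obs rises linearly with [CF]."""
    return k_on * cf + k_off

def k_obs_induced_fit(cf, K1, k_f, k_r):
    """Rapid encounter (dissociation constant K1) followed by slow isomerization.

    k_obs saturates at k_f + k_r once [CF] is well above K1.
    """
    return k_r + k_f * cf / (K1 + cf)

def k_obs_conformational_selection(cf, K2, k_f, k_r):
    """Slow DNA pre-equilibrium (k_f to the perturbed form, k_r back),
    then rapid CF binding to the perturbed form (dissociation constant K2).

    k_obs decreases with [CF] toward k_f.
    """
    return k_f + k_r * K2 / (K2 + cf)

cf = np.array([50.0, 100.0, 200.0, 400.0, 800.0])   # nM, hypothetical titration

one_step = k_obs_one_step(cf, k_on=1e-4, k_off=0.011)                 # nM^-1 s^-1, s^-1
induced = k_obs_induced_fit(cf, K1=5.0, k_f=0.05, k_r=0.011)          # nM, s^-1, s^-1
selection = k_obs_conformational_selection(cf, K2=5.0, k_f=0.02, k_r=0.05)

for name, values in (("one-step", one_step), ("induced fit", induced),
                     ("conformational selection", selection)):
    print(name, np.round(values, 4))
```

With the encounter step saturated across the titration (here K1 is well below every CF concentration), the induced-fit rate is nearly flat, which is the qualitative signature of a concentration-independent binding rate; the one-step rate rises steadily over the same range.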

Formation of the specific CF·promoter complex by induced fit involves deformation of the DNA major groove (around positions −21 to −29) and subsequent base recognition, as evidenced by the increased transcription initiation upon promoter truncation to −28, which likely reduces the barrier to deformation (Fig. 4a). Substitutions at −27 and −28 further support this model, as they destabilized the CF·promoter complex by accelerating dissociation (Fig. 3d) and decreased transcription activity by more than 90% (Fig. 5b). Consistent with a model in which only the specific CF·promoter complex undergoes the slow conformational transition to the stabilized state, CF bound nonspecific DNA by simple bimolecular association, and these complexes dissociated 40-fold more rapidly than the specific CF·promoter complex (Fig. 3c–e). MD simulations complement the experimental findings by revealing dynamic interactions during CF·promoter engagement, including transient contacts of the R293/H294/R297-containing loop (between α7 and α8 in Rrn7) with the major groove and occasional insertion of R293 into the base stack. Together, these observations point to a two-step process: initial CF binding induces major groove deformation, creating a permissive DNA conformation, followed by base-specific stabilization mediated by flexible loops. Although the α7–α8 loop is dispensable for initial binding [12], it seems to become important for accurate sequence recognition and for orienting CF for Pol I recruitment, as demonstrated by our base-pair substitution data and by prior mutational studies in which alanine replacements or DERH→RHDE (residues 291–294) substitutions in the loop markedly reduced transcription activity [15]. Thus, several residues within this loop, including R293, H294, and R297, likely form the key hydrogen bonds required for sequence verification during the second, isomerization step of specific CF·promoter complex formation.

CF recruits the Pol I·Rrn3 complex to the rDNA promoter [13, 14, 44, 51]. Structural studies have not identified specific promoter bases that are recognized by Pol I [14–16, 44, 48]. However, randomization of the promoter sequence at the Pol I binding site impaired transcription activity, suggesting that Pol I cannot initiate on an arbitrary sequence [13, 42, 43]. We further investigated the potential base specificity of Pol I by studying transcription initiation efficiency and TSS selection using bubble and fork promoters that have regions of exposed single-stranded DNA (Fig. 6). The TSS was determined not by single bases but by the topology of the promoter, i.e. the location of the single-stranded template DNA and downstream duplex. The promoter DNA is kinked between positions −11 and −7 to avoid clashing of the downstream DNA with the Pol I clamp. The kink may also provide essential destabilization of DNA for transcription bubble melting [13–15, 44]. Indeed, our MD simulations showed unstable base-pairing at the kinked promoter region (Fig. 7c). When the CF binding site was moved 1 bp closer to or further from the TSS (+1), the initiation activity was diminished, but the TSS remained unchanged (Fig. 5b and d). Previously, dislodging the CF binding site by ±5 bp was shown to eliminate the activity completely, as did the full substitution of the sequence from −11 to −7 [15]. The correct distance between the CF binding site and the “bendable” sequence motif (to be kinked by Pol I) is thus one critical element of a functional rDNA promoter, facilitating initial DNA melting by Pol I-dependent bending, the expansion of the melted DNA region in the downstream direction [52], and the loading of the template DNA strand into the Pol I active site. However, under negative supercoiling, the reliance on the precise spacing of CF and the “bendable” sequence motif decreases because DNA melting is inherently favored by the torsional strain, resulting in increased TSS promiscuity by Pol I (Fig. 6g and h). These findings highlight that, in addition to CF-dependent Pol I recruitment for promoter engagement, the physical properties of rDNA play a major role in selective transcription initiation by Pol I.

Bacterial RNAP offers a well-characterized model for exploring the shared and distinct features of transcription initiation with Pol I. Most bacterial genes—including rDNA—are transcribed by the RNAP·σ^70^ holoenzyme (holo), while alternative σ factors redirect RNAP to specific gene sets in response to environmental cues. Both holo and Pol I use protein–DNA binding energy to melt the promoter and form the OC, unlike human Pol II, which requires ATP-dependent DNA unwinding by TFIIH’s XPB translocase [53]. The −35 and −10 elements of bacterial promoters guide holo binding and transcription initiation. These sequences are highly conserved across bacteria, unlike the poorly conserved Pol I-dependent rDNA promoters. The region 4.2 of σ^70^ recognizes the −35 element, thereby functionally resembling CF and its promoter recognition role in the Pol I system. The regions 2 and 3 of σ^70^ mediate recognition and melting of the −10 element, where bases at positions −11 and −7 flip into specific pockets within σ^70^ as the transcription bubble expands [54]. Similar protein pocket-stabilized base-flipping appears absent in Pol I, and how it stabilizes the upstream edge of the transcription bubble remains unclear. An optimal 17 bp spacing between the −35 and −10 elements enables efficient holo binding and transcription initiation. A similar spacing requirement exists in the Pol I system: altering the distance between the CF binding site and the TSS by one bp reduced transcription activity by over 70% (Fig. 5a and b). Current models suggest that proper alignment of the −11/−7 region with Pol I is critical, as this region undergoes sharp bending before melting [15]. This melting may also be driven by allosteric duplex destabilization as the Pol I cleft closes around the promoter [12]. 
Because sequence dictates both the energy needed to disrupt base stacking and the balance of unfavorable DNA distortions with favorable protein contacts, the sequence–function relationship of the rDNA promoter likely has further nuances that are currently unknown. In bacteria, even single bp changes can broadly alter holo–DNA interactions and base stacking, allowing a few key positions to control OC stability and transcription rate [55].

OCs on bacterial rDNA promoters are unstable, sensitizing transcription to regulation [47]. OC instability also prevents holo, and perhaps by analogy Pol I, from stalling in abortive cycles caused by stable protein–promoter interactions. Additionally, stable inactive OC conformations have been found to suppress bacterial transcription [55, 56]. OC instability at the E. coli rrnBP1 promoter results from its GC-rich discriminator (−8 to −1) and an extended 9 bp spacing between the −10 element and TSS, which weaken the interaction of σ^70^ with single-stranded ntDNA [47]. Lacking a σ^70^-like domain and strong ntDNA interactions, Pol I may also form inherently unstable OCs; consistently, the ntDNA remains disordered across the bubble in all cryo-EM Pol I OC models [12–15, 57]. Several RNAP cleft loops, including the rudder and lid, help to separate ntDNA and tDNA strands and maintain the transcription bubble [58]. Their conserved positioning in Pol I suggests a similar role. The CF Rrn7 B-reader, like σ^70^ finger 3.2 in bacteria (whose deletion destabilizes the OC [59]), may also support OC stabilization by interacting with tDNA near the Pol I active site. During the CC-to-OC transition, RNAP and Pol I clefts contract around the promoter, aligning protein–DNA interactions. This clamp closure displaces mobile elements (σ^70^ domain 1.1 in RNAP [54], and the DNA-mimicking loop/expander and CTD A12.2 in Pol I), clearing the active site for promoter loading [13, 14]. RNAP clamp dynamics are targeted by transcription regulators and inhibitors. Catabolite activator protein transiently opens the clamp [60], and the antibiotics fidaxomicin [61] and myxopyronin [62] lock it open or closed, respectively. Given the conserved clamp mechanism in bacterial RNAP and Pol I, the clamp may also be a target for rDNA transcription regulation in eukaryotes.

Transcription initiation mechanisms of Pol I are broadly conserved between yeast and humans [63]. Pol I from both species exhibits highly similar structural features, with conserved protein domains mediating interactions with promoter DNA and displaying comparable conformational dynamics, such as clamp closure upon DNA binding [64–66]. Rrn3 is likewise highly conserved and functions as a bridging factor, linking Pol I to the promoter-specific TF complexes—CF in yeast and SL1 in humans [67, 68]. While the 3D structure of SL1 remains unknown, it contains homologues of all three CF subunits, along with additional subunits TAF1D and TBP [69, 70]. Despite the low sequence identity between CF subunits and their SL1 orthologs (8%–16%), several domains of Rrn7—the subunit critical for promoter recognition and interaction with Pol I/Rrn3—can be functionally substituted by their TAF1B counterparts [71]. However, as SL1 contains additional subunits, flexible regions, and integrated TBP, its behavior likely does not fully mirror that of CF; any mechanistic parallels therefore remain tentative but represent an important direction for future research. Also, the DNA-binding affinity and specificity of isolated SL1 may be lower than those of CF; detection of SL1 binding to the rDNA promoter—typically assessed by footprinting or crosslinking—requires the presence of the upstream binding factor UBF, unlike CF which can bind to the promoter independently [72]. UBF binds not only to the promoter of rDNA but also across the coding region, where it induces and maintains an open chromatin structure. In yeast, the upstream region of the rDNA promoter is bound by UAF, a multiprotein complex that, while structurally unrelated to UBF, serves an analogous promoter-specific role in stimulating Pol I transcription, acting in cooperation with TBP, which mediates UAF’s interaction with CF [17, 63]. 
Whether UAF merely facilitates CF and Pol I recruitment or actively modulates PIC transitions and/or promoter escape remains unresolved, largely due to the lack of quantitative, step-specific activity data and the challenges associated with reconstituting a biologically active Pol I initiation complex with UAF. For comparison, the bacterial RNAP holoenzyme is known to bend the upstream promoter region (−40 to −60), coupling this structural distortion to enhanced transcription bubble formation and transcription initiation [45].

In conclusion, we found that CF primes rDNA transcription through an induced-fit-based recognition of its binding site on the promoter, followed by recruitment of the Pol I·Rrn3 complex. Efficient initiation of rDNA transcription then depends on the precise alignment of Pol I with the “bendable” and “meltable” regions of the promoter near the TSS. This alignment triggers DNA bending and melting, coordinating Pol I domain movements with the formation of bubble-stabilizing DNA–protein interactions as the CC isomerizes into a catalytically active OC.

Supplementary Material

gkag153_Supplemental_Files

Bibliography

1. Ferreira R, Schneekloth JS, Panov KI et al. Targeting the RNA polymerase I transcription for cancer therapy comes of age. Cells. 2020;9:266. doi:10.3390/cells9020266. PMID: 31973211; PMCID: PMC7072222.
2. Warner JR. The economics of ribosome biosynthesis in yeast. Trends Biochem Sci. 1999;24:437–40. doi:10.1016/S0968-0004(99)01460-7. PMID: 10542411.
3. Mayer C, Zhao J, Yuan X et al. mTOR-dependent activation of the transcription factor TIF-IA links synthesis to nutrient availability. Genes Dev. 2004;18:423–34. doi:10.1101/gad.285504. PMID: 15004009; PMCID: PMC359396.
4. Bywater MJ, Poortinga G, Sanij E et al. Inhibition of RNA polymerase I as a therapeutic strategy to promote cancer-specific activation of p53. Cancer Cell. 2012;22:51–65. doi:10.1016/j.ccr.2012.05.019. PMID: 22789538; PMCID: PMC3749732.
5. Pelletier J, Thomas G, Volarević S. Ribosome biogenesis in cancer: new players and therapeutic avenues. Nat Rev Cancer. 2018;18:51–63. doi:10.1038/nrc.2017.104. PMID: 29192214.
6. Farley-Barnes KI, Ogawa LM, Baserga SJ. Ribosomopathies: old concepts, new controversies. Trends Genet. 2019;35:754–67. doi:10.1016/j.tig.2019.07.004. PMID: 31376929; PMCID: PMC6852887.
7. Rossetti S, Wierzbicki AJ, Sacchi N. Mammary epithelial morphogenesis and early breast cancer. Evidence of involvement of basal components of the RNA polymerase I transcription machinery. Cell Cycle. 2016;15:2515–26. doi:10.1080/15384101.2016.1215385. PMID: 27485818; PMCID: PMC5026817.
8. Keys DA, Vu L, Steffan JS et al. RRN6 and RRN7 encode subunits of a multiprotein complex essential for the initiation of rDNA transcription by RNA polymerase I in Saccharomyces cerevisiae. Genes Dev. 1994;8:2349–62. doi:10.1101/gad.8.19.2349. PMID: 7958901.