Incorporation of NUPACK-Based Simulation into Classroom and Laboratory Teaching of Nucleic Acids Hybridization for Undergraduate Biochemistry
Jinglin Fu, Anthony Monte Carlo, Doris Zheng

TL;DR
This paper describes using NUPACK to create virtual labs for teaching nucleic acid concepts in biochemistry, especially during the pandemic.
Contribution
The novel use of NUPACK for educational modules on nucleic acid hybridization and structure in undergraduate biochemistry.
Findings
NUPACK-based labs enhanced student understanding of nucleic acid structures and hybridization.
The virtual labs supported remote learning and were extended to high school students.
The approach improved accessibility and flexibility in biochemistry education.
Abstract
The COVID-19 pandemic has accelerated the shift from traditional in-person teaching to remote and online learning, necessitating a more adaptable educational platform to serve the diverse needs of students. Transforming hands-on “wet lab” activities into virtual “dry lab” exercises can promote a more accessible and flexible learning environment, offering innovative methods to improve online teaching outcomes, incorporate interactive components, and provide student support. Here we describe our effort of utilizing NUPACK, a free cloud-based web application, to develop new educational modules on nucleic acids for teaching biochemistry lectures and laboratories. These modules include fundamental topics such as melting temperature, hybridization equilibrium, free energy, secondary folding structures of nucleic acids, and the thermal stability of single-nucleotide polymorphisms. The…
| Best-Fit Values | Poly(A-T)20, 0 M Mg2+ | Poly(A-T)20, 0.01 M Mg2+ | Poly(G10A10) | Poly(A-T)30 |
|---|---|---|---|---|
| Bottom | 0.092 | 0.077 | 0.055 | 0.080 |
| Top | 1.005 | 1.012 | 1.016 | 1.007 |
| V50 (°C) | 40.30 | 51.09 | 56.54 | 50.55 |
| R squared | 0.999 | 0.998 | 0.998 | 0.998 |
| Biochem II Lab | Summer Research Apprenticeship | New Students Training for Research |
|---|---|---|
| 2020–present | 2020–present | 2020–present |
| ∼30–40 undergraduates per year; 222 total | 2–3 high school students per summer; 10 total | 3–4 rotation students per year; 16 total |
1. Introduction
An increasing number of computational tools are being used in research and education in chemistry and biochemistry, and it is crucial to integrate these methods into classroom and laboratory teaching of biochemistry [1,2]. This integration can enhance educational outcomes by bridging the gap between theoretical knowledge and practical applications, offering hands-on experience and enabling complex analyses. Computation-assisted learning is particularly beneficial where laboratory resources are limited, allowing students to engage in intricate tasks and analyze biochemical molecules and processes. It also makes teaching more adaptable to diverse student needs. For instance, the COVID-19 pandemic forced a move to remote and online teaching, which required educational platforms to be modified rapidly into remote learning modules. This rapid shift posed a challenge for many subjects and negatively affected students’ learning outcomes, especially in laboratory-based instruction. Converting “wet lab” practices to “dry lab” exercises can facilitate more accessible and flexible remote learning, offering innovative ways to enhance online teaching outcomes, incorporate interactive elements, and support students in this new learning environment.
Nucleic acids are a fundamental topic in biochemistry. DNA and RNA, as key biomolecules, play crucial roles in storing and transmitting the genetic information of living organisms. In biochemistry education, exploring the molecular structure, function, and applications of nucleic acids is essential. Basic elements include the single-stranded and double-helix structures of DNA, base pairing rules, secondary structures of single-stranded nucleic acids, thermodynamic properties, and hybridization equilibrium. Practical applications such as polymerase chain reaction and DNA sensory circuits should also be introduced [3]. Computational tools can enhance learning outcomes by giving students hands-on experience in investigating the structures, functions, properties, and hybridization of nucleic acids.
NUPACK is an evolving online software suite for simulating and analyzing nucleic acid structures and hybridization and for designing complex nucleic acid systems. The platform was developed by Dr. Niles A. Pierce and his colleagues at the California Institute of Technology [4,5], with support from the U.S. National Science Foundation and the Molecular Programming Project. NUPACK uses the models published by SantaLucia (1998) [6] for DNA molecules and by Serra and Turner (1995) [7] and Mathews et al. (1999) [8] for RNA molecules. It excels at computing and simulating the thermodynamic properties of DNA and RNA strand interactions, such as the ΔG of hybridization, hybridization equilibria, and secondary structures. Widely used by researchers for DNA molecular programming, nucleic acid nanotechnology, and synthetic biology, NUPACK is considered a standard toolbox for analyzing and designing nucleic acid systems. ?,? It is open-access and free for individual users, and the updated NUPACK Cloud server accommodates a high volume of users and tasks, making it well suited to educational and research purposes. Additionally, NUPACK can be integrated with programming languages such as Python and C for customized applications.?
During the COVID-19 pandemic, we developed computational DNA laboratories for remote and online teaching of the biochemistry lectures and laboratories in the Chemistry Department of Rutgers University–Camden. We used NUPACK to create new learning modules on nucleic acids, offering students a visual, hands-on, and interactive experience. This paper summarizes the course development and the instructions for the DNA computation lab, which can benefit education on nucleic acids and biochemistry for a broader audience of teachers and students.
2. Methods
The NUPACK Cloud web app? is used for the analysis and simulation of nucleic acid structures and hybridizations. Before using NUPACK, teachers and students should register for an individual noncommercial academic subscription, which is currently free. Use and modification of the NUPACK Python module and/or source code are also permitted, for noncommercial academic purposes only. This makes it easier to adapt NUPACK to specific research tasks. The software module and source code should be requested from [email protected], and redistribution of the software in source and/or binary form is not permitted.
Other useful tools include an online IC_50_ fitting tool,? Reverse Complement,? and data analysis software such as Excel or Prism (GraphPad). All data fitting can be completed with the free IC_50_ fitting tool and Excel if Prism is not available. Detailed instructions for NUPACK are included in the Supporting Information. No unexpected or unusually high safety hazards were encountered.
3. Simulation and Analysis of Thermal Denaturation of dsDNA
3.1. NUPACK Simulation of Thermal Denaturation of dsDNA
As illustrated in Figure 1, the melting temperature (T_m_) of a DNA duplex is the temperature at which 50% of the DNA in a sample has transitioned from double-stranded DNA (dsDNA) into single-stranded DNA (ssDNA) through heat-induced denaturation of the double-helix structure.? T_m_ indicates dsDNA stability: a higher GC content gives a higher T_m_ because of increased thermal stability. Additionally, magnesium ions play an important role in stabilizing dsDNA by neutralizing the negative charges on the phosphate backbone. Experimentally, the dissociation of dsDNA into ssDNA can be followed through the increased absorbance of DNA solutions at 260 nm, known as the hyperchromic effect. In dsDNA, the hydrogen bonding between base pairs restricts the resonance in aromatic rings, reducing light absorbance in the helical structure.? An ssDNA solution can absorb 37% more light than a dsDNA solution at the same total oligonucleotide concentration. Typically, a UV–vis spectrometer is used to measure the thermal denaturation of dsDNA, with a water or thermal cycler to control the temperature. The quality of the data depends largely on the operational procedure and instrument accuracy, and students may struggle to obtain a good thermal-melting curve using a water cycler/UV–vis instrument.
Figure 1. Thermal denaturation of dsDNA into ssDNA by heating. T_m_ is defined as the temperature at which 50% of DNA has transitioned from dsDNA into ssDNA. T_m_ increases with increasing GC content.
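The absorbance readout described above can be converted into a fraction of melted DNA with a simple two-state normalization. The sketch below is only an illustration: the function name and the dsDNA baseline absorbance of 0.50 are assumptions, and the ssDNA baseline is taken as 1.37 times the dsDNA baseline, following the 37% hyperchromic increase quoted in the text.

```python
def fraction_ssdna(absorbance, a_ds, a_ss):
    """Two-state estimate of the ssDNA fraction from an A260 reading.

    a_ds and a_ss are the absorbances of the fully double-stranded and
    fully single-stranded sample at the same total concentration.
    """
    return (absorbance - a_ds) / (a_ss - a_ds)

A_DS = 0.50         # hypothetical A260 of the fully duplexed sample
A_SS = 1.37 * A_DS  # ssDNA absorbs about 37% more light (value from the text)

# A reading halfway between the two baselines corresponds to the melting point.
halfway = fraction_ssdna(0.5925, A_DS, A_SS)
print(f"fraction ssDNA = {halfway:.2f}")
```

Real melting curves usually need sloping baselines rather than two fixed plateau values, so this normalization is best read as a first approximation.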
NUPACK offers an alternative computation-based “dry lab” for teaching students about the melting temperature of DNA hybridization. As shown in Figure 2, NUPACK is used to simulate the thermal denaturation of dsDNA over a given range of temperatures and salt conditions. Students set up the parameters of “Compute melt” in NUPACK to generate a thermal denaturation graph showing the equilibrium fraction of unpaired ssDNA at different temperatures (Figure 2A,B). The melting temperature can be obtained by fitting the data with the “Boltzmann Sigmoidal Equation” in GraphPad Prism (Figure 3A),? where V_50_ corresponds to the T_m_ value. Another method is to analyze the slope of the thermal denaturation curve by “first-order derivative analysis” in Prism. T_m_ can be defined as the temperature at which the maximum absorbance change (dA/dT) occurs.? Thus, the peak of the slope analysis approximates T_m_ (Figure 3B). While GraphPad Prism is effective for data fitting, the instructor and students may need to purchase licenses. A free online alternative is the “IC_50_/EC_50_” tool, which estimates the melting temperature as the fitted IC_50_ value of the thermal denaturation curve (see the Supporting Information).
Figure 2. NUPACK simulation of the thermal denaturation of dsDNA. (A) Setup for the melting analysis of dsDNA in NUPACK. (B) Calculation of the ssDNA fraction at equilibrium depending on the temperature.
Figure 3. T_m_ fitting for thermal denaturation of dsDNA. (A) T_m_ fitting using the Boltzmann sigmoidal model for V_50_. (B) T_m_ fitting using first-order derivative analysis for the peak temperature.
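For readers without Prism, the two fitting ideas described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Prism procedure: the melt curve is synthetic (generated from a Boltzmann sigmoid with a chosen midpoint, not exported from NUPACK), all function names are hypothetical, and the nonlinear fit is replaced by a simpler straight-line fit to logit-transformed data with the bottom and top plateaus fixed at 0 and 1.

```python
import math

def boltzmann(t, bottom, top, v50, slope):
    """Boltzmann sigmoid: fraction of unpaired DNA at temperature t (deg C)."""
    return bottom + (top - bottom) / (1.0 + math.exp((v50 - t) / slope))

def tm_from_derivative(temps, fractions):
    """Return the midpoint of the temperature interval with the steepest rise."""
    best_slope, best_temp = float("-inf"), None
    for i in range(len(temps) - 1):
        slope = (fractions[i + 1] - fractions[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_temp = 0.5 * (temps[i] + temps[i + 1])
    return best_temp

def tm_from_logit_fit(temps, fractions, bottom=0.0, top=1.0):
    """Estimate V50 by a straight-line fit to logit-transformed fractions.

    With bottom and top held fixed, ln(y / (1 - y)) = (t - v50) / slope,
    so the fitted line crosses zero at t = v50.
    """
    xs, zs = [], []
    for t, f in zip(temps, fractions):
        y = (f - bottom) / (top - bottom)
        if 0.02 < y < 0.98:  # skip the flat tails, where the logit is unstable
            xs.append(t)
            zs.append(math.log(y / (1.0 - y)))
    mean_x = sum(xs) / len(xs)
    mean_z = sum(zs) / len(zs)
    gradient = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_z - gradient * mean_x
    return -intercept / gradient

# Synthetic melt curve (not NUPACK output): V50 = 50.5 deg C, slope = 2.5 deg C
temps = list(range(20, 81))
fractions = [boltzmann(t, 0.0, 1.0, 50.5, 2.5) for t in temps]

tm_derivative = tm_from_derivative(temps, fractions)
tm_logit = tm_from_logit_fit(temps, fractions)
print(f"derivative peak: {tm_derivative:.1f} deg C, logit fit: {tm_logit:.1f} deg C")
```

With exported NUPACK data, `temps` and `fractions` would be replaced by the two columns of the melt table. The derivative estimate is limited by the temperature step, while the logit fit assumes the plateaus are known, which is why a full four-parameter fit is preferable when the tools are available.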
Using NUPACK and fitting analysis, students investigate how the melting temperature of dsDNA hybridization depends on various parameters (Table 1), including the number of base pairs (e.g., poly(A-T)20 vs poly(A-T)30), the GC content (e.g., poly(A-T)20 vs poly(G_10_A_10_)), and the addition of Mg^2+^. Students should conclude that dsDNA hybridization is more stable with an increased number of base pairs, with higher GC content, and with the addition of Mg^2+^, which reduces electrostatic repulsion.
Table 1: Melting Temperatures for dsDNA Analyzed at Various Magnesium Concentrations, GC Contents, and Lengths
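As a worked example of the comparison students are asked to make, the short script below computes the shift in melting temperature relative to poly(A-T)20 without magnesium. The four values are the fitted midpoints reported in the melting-temperature table, read here as temperatures in °C; the dictionary layout and variable names are only illustrative.

```python
# Fitted midpoint (melting temperature) values in deg C, copied from the table
melting_temps = {
    "poly(A-T)20, 0 M Mg2+": 40.30,
    "poly(A-T)20, 0.01 M Mg2+": 51.09,
    "poly(G10A10)": 56.54,
    "poly(A-T)30": 50.55,
}

reference = "poly(A-T)20, 0 M Mg2+"

# Shift of each condition relative to the reference duplex
shifts = {
    name: round(value - melting_temps[reference], 2)
    for name, value in melting_temps.items()
    if name != reference
}

for name, shift in shifts.items():
    print(f"{name}: {shift:+.2f} deg C relative to {reference}")
```

All three shifts are positive, which matches the conclusion that added Mg^2+^, higher GC content, and greater length each stabilize the duplex.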
3.2. Comparison of Melting Temperature Differences for Single-Nucleotide Polymorphism
A single-nucleotide polymorphism (SNP) is a variation found at a specific single nucleotide position in the DNA sequence of the genome among individuals.? For instance, at a specific base position in the human genome, the majority of people may have the nucleotide G, while a minority may have an A. This indicates the presence of an SNP at that position, and the two possible nucleotide variations (G or A) are referred to as alleles for that position. While some SNPs do not lead to disorders, certain SNPs are linked to specific diseases and are relevant to personalized medicine. A base-pairing mismatch at an SNP can decrease the melting temperature. Polymerase chain reaction (PCR) is commonly used to identify SNPs, with the change in T_m_ revealing mismatched hybridization between the target DNA strands and probes.?
In the lab, students are asked to analyze a known SNP with ID “rs762551”, which encodes the CYP1A2*1F allele of the CYP1A2 gene.? CYP1A2 is a cytochrome P450 enzyme that is responsible for the metabolism of caffeine and some drugs. The SNP “CTCTGTGGGC [C/A] CAGGACGCAT” is found within this gene: some individuals carry a “C” base at the bracketed position in the human genome, whereas others carry an “A” base instead. This indicates a C/A SNP at this specific position. As shown in Table 2, two TaqMan probes are designed to detect the “rs762551” SNP: the probe strand corresponding to the “C” allele is labeled with a TET fluorophore (tetrachlorofluorescein), and the probe strand corresponding to the “A” allele is labeled with a FAM fluorophore (carboxyfluorescein). In the TaqMan PCR assay, a probe strand reports higher fluorescence when fully matched with the target sequence, while a probe strand with an SNP mismatch reports weaker fluorescence. The mismatch in DNA hybridization results in a T_m_ lower than that of a fully matched hybridization.
Table 2: Thermal Melting Analysis of SNPs
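To prepare input strands for a simulation, the two allele sequences and their complementary strands can be generated programmatically instead of with an online reverse-complement tool. The sketch below uses only the flanking sequence quoted in the text. The helper name is hypothetical, and the 21-nt strands it prints are illustrative; they are not claimed to be the TaqMan probe sequences used in the course.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

FLANK_5 = "CTCTGTGGGC"  # sequence upstream of the SNP (from the text)
FLANK_3 = "CAGGACGCAT"  # sequence downstream of the SNP (from the text)

# Build one strand per allele, then the strand that pairs with it perfectly
alleles = {base: FLANK_5 + base + FLANK_3 for base in ("C", "A")}
partners = {base: reverse_complement(seq) for base, seq in alleles.items()}

for base in alleles:
    print(f"{base} allele: 5'-{alleles[base]}-3'  partner: 5'-{partners[base]}-3'")
```

Pairing a strand with the partner of the other allele gives the single-mismatch duplex whose lower melting temperature the simulation is meant to reveal.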
To examine the impact of an SNP mismatch on the melting temperature, students use NUPACK to simulate the thermal denaturation of the hybridization between a target and a probe labeled with TET or FAM. By analyzing the thermal denaturation curves, students should select a test temperature at which the fraction of ssDNA differs the most between TET–Target and FAM–Target.
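Choosing the test temperature amounts to finding where two melt curves are furthest apart. The following sketch shows that step on synthetic curves: the two melting temperatures, the transition width, and the function name are all hypothetical stand-ins for curves that would in practice be exported from NUPACK.

```python
import math

def fraction_unpaired(temp, tm, width=2.5):
    """Sigmoidal melt curve: fraction of ssDNA at temperature temp (deg C)."""
    return 1.0 / (1.0 + math.exp((tm - temp) / width))

TM_MATCHED = 60.0     # hypothetical melting temperature of the matched duplex
TM_MISMATCHED = 52.0  # hypothetical melting temperature of the mismatched duplex

temps = list(range(40, 76))

# The mismatched duplex melts first, so its ssDNA fraction is the larger one
gaps = [
    fraction_unpaired(t, TM_MISMATCHED) - fraction_unpaired(t, TM_MATCHED)
    for t in temps
]

best_temp = temps[gaps.index(max(gaps))]
print(f"largest difference in ssDNA fraction at {best_temp} deg C")
```

For two curves of equal width, the largest gap falls midway between the two melting temperatures, which gives students a quick sanity check on the temperature they pick.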
4. Simulation of the Secondary Structure for Nucleic Acid Amplicons
For molecular diagnostic applications, it is necessary to identify an appropriate target sequence. Because of homologies between virulent and benign species, these sequences must be chosen carefully to avoid false positives, which can be an involved bioinformatic process. In nucleic acid diagnosis, viral amplicons are unique and conserved segments (e.g., in the E and N genes of SARS-CoV-2) shorter than a few hundred nucleotides. For example, as shown in Figure 4, the E (envelope) and N (nucleocapsid) genes of the SARS-CoV-2 virus are well conserved across mutations and viral subtypes.? Therefore, amplicons in the E or N gene are used to screen patient samples and confirm SARS-CoV-2 infection. The U.S. Centers for Disease Control and Prevention (CDC) published three amplicons from the N gene for identifying infection by the SARS-CoV-2 virus that caused the COVID-19 pandemic (Table 3).? The sections labeled red are forward primers and antireverse primers for amplification by PCR. The green-labeled sections are the probe sequences for detection.
Figure 4. SARS-CoV-2 genome map. Diagnosis amplicons use the unique and conserved E and N genes, coding for the viral envelope and nucleocapsid. Reproduced from ref . CC BY 4.0.
Table 3: U.S. CDC Published N Gene Amplicons of the SARS-CoV-2 Virus for Diagnosing COVID-19 Infection
In this task, students are instructed to use NUPACK to simulate the secondary structures of DNA amplicons at various temperatures. As shown in Figure 5, at a room temperature of 25 °C, the three amplicons exhibit stable secondary structures with a large negative ΔG. As the temperature increases from 25 °C to 60 and 90 °C, the self-folded secondary structures of the DNA amplicons become less stable, and their ΔG values become less negative. Through simulation, students should understand that the stability of self-folded secondary structures of DNA is highly temperature-dependent.
Figure 5. Simulation of the folding structure and energy for DNA amplicons of the SARS-CoV-2 virus. (A–C) N1 amplicon at (A) 25 °C, (B) 60 °C, and (C) 90 °C. (D–F) N2 amplicon at (D) 25 °C, (E) 60 °C, and (F) 90 °C. (G–I) N3 amplicon at (G) 25 °C, (H) 55 °C, and (I) 90 °C.
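The temperature dependence seen in the simulation follows from the relation ΔG = ΔH − TΔS. The sketch below evaluates it at the three lab temperatures for a single, fixed fold. The enthalpy and entropy values are hypothetical, chosen only to show the trend; NUPACK itself recomputes the structure and its energy at each temperature, so its numbers will differ.

```python
def delta_g(delta_h, delta_s, celsius):
    """Gibbs free energy of folding (kcal/mol) at a temperature in deg C."""
    kelvin = celsius + 273.15
    return delta_h - kelvin * delta_s

DELTA_H = -40.0   # kcal/mol, hypothetical folding enthalpy
DELTA_S = -0.115  # kcal/(mol K), hypothetical folding entropy

energies = {celsius: delta_g(DELTA_H, DELTA_S, celsius) for celsius in (25, 60, 90)}

for celsius, energy in energies.items():
    print(f"{celsius} deg C: dG = {energy:+.2f} kcal/mol")
```

Because folding has a negative ΔS, the −TΔS term grows with temperature and ΔG rises toward zero and beyond, which is the loss of stability that students observe between 25 and 90 °C.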
5. Classroom Implementation
The DNA computation lab has been implemented in the General Biochemistry II Laboratory at Rutgers University–Camden. Each section of the lab has a maximum capacity of 18 students and usually contains 12–18 students. The lab generally offers two sections for 30–40 students each year. Students are asked to complete a prelab quiz covering the lab content and the preparation steps, such as registering for a NUPACK student account, learning about DNA thermal stability, and practicing examples of IC_50_ fitting and first-order derivative analysis.
The DNA computation lab runs in a regular lab period of 3 contact hours. The lab starts with instructions for the DNA lab and a step-by-step NUPACK guide. All required instructions are available in the supporting PPT file. Most students can complete the tasks within a regular lab period, and the NUPACK and fitting analyses are straightforward to implement. The tasks that students complete in the classroom are listed below.
Task 1. Thermal Denaturation and Melting Temperature Fitting. All of the NUPACK experiments are set at 0.137 M Na^+^, and the temperature ranges from 20 to 80 °C. The students perform the following:
- 1.1. NUPACK-generated thermal denaturation graph for poly(A-T)20 and poly(A-T)40.
- 1.2. NUPACK-generated thermal denaturation graph for poly(A-T)20 and poly(C-G)20.
- 1.3. NUPACK-generated thermal denaturation graph for poly(A-T)20 at 0 M Mg^2+^ and poly(A-T)20 at 0.01 M Mg^2+^.
- 1.4. Melting temperature fitting by at least two methods, “Boltzmann Sigmoidal Kinetics” (or “IC_50_”) and “first-order derivative analysis”.
- 1.5. Summary of the melting temperatures for the DNA hybridizations above and a conclusion about which factors affect the melting temperature of dsDNA hybridization.
Task 2. Thermal Denaturation for Single-Nucleotide Polymorphism. All of the NUPACK experiments are set at 0.137 M Na^+^. The students perform the following:
- 2.1. NUPACK-generated thermal denaturation graphs for hybridization of TET with Target and FAM with Target.
- 2.2. Melting temperature fitting by “IC_50_” and “first-order derivative analysis”.
- 2.3. Picking a temperature, based on the melting temperature fitting, with a significant difference in hybridization yield between TET–Target and FAM–Target.
- 2.4. Drawing a conclusion about how the single-nucleotide mismatch affects the melting temperature of the DNA hybridization.
Task 3. Simulation of the Secondary Folding Structures of ssDNA Amplicons. The students perform the following:
- 3.1. Simulation of the secondary folding structures of N1, N2, and N3 ssDNA amplicons at 25 °C and finding ΔG of the structures.
- 3.2. Simulation of the secondary folding structures of N1, N2, and N3 ssDNA amplicons at 60 °C and finding ΔG of the structures.
- 3.3. Simulation of the secondary folding structures of N1, N2, and N3 ssDNA amplicons at 90 °C and finding ΔG of the structures.
- 3.4. Drawing a conclusion about how the secondary folding structures of ssDNA change depending on the temperature and their structural ΔG.
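To help interpret the ΔG values collected in Task 3, a two-state approximation can translate a folding free energy into the fraction of strands that are folded at equilibrium, using K = exp(−ΔG/RT) and fraction = K/(1 + K). This is a simplified sketch: it treats each amplicon as either fully folded or fully unfolded, which NUPACK does not assume, and the ΔG inputs below are hypothetical rather than simulation results.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def folded_fraction(delta_g, celsius):
    """Two-state folded fraction for a folding free energy in kcal/mol."""
    kelvin = celsius + 273.15
    k_fold = math.exp(-delta_g / (R_KCAL * kelvin))
    return k_fold / (1.0 + k_fold)

# Hypothetical (delta_g, temperature) pairs spanning stable to unstable folds
examples = [(-5.0, 25), (-1.0, 60), (0.0, 60), (1.0, 90)]

for delta_g, celsius in examples:
    fraction = folded_fraction(delta_g, celsius)
    print(f"dG = {delta_g:+.1f} kcal/mol at {celsius} deg C: {fraction:.3f} folded")
```

A ΔG of zero corresponds to half of the strands being folded, so structures whose ΔG approaches zero at high temperature are effectively melted.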
6. Student Participation and Feedback
Since the COVID-19 pandemic, the DNA simulation lab has been incorporated into the General Biochemistry Lab, which hundreds of students have taken in the past 10 years at Rutgers University–Camden. The DNA simulation lab has also been introduced to high school students sponsored by the High School Internships Program (HSIP) of the Army Education Outreach Programs (AEOP). Since 2015, HSIP has sponsored two or three local high school students per year for 8–10 week paid summer research experiences at Rutgers University–Camden. Students can apply directly to HSIP through the AEOP website and are interviewed and selected by the research mentor. Research rotation students in the Fu lab are also required to take this DNA simulation training. After completing the training, students can participate in more complex computation-aided designs and experiments. To enhance K–12 education, we also invited teachers from local high schools to participate in the DNA simulation training. Table 4 summarizes the student training and participation in the NUPACK-based DNA simulation lab in the past 5 years.
Table 4: Students’ Participation in the DNA Simulation Lab
We used the “DNA melting temperature” experiment, which was included in both the wet lab and the dry lab, to evaluate the students’ learning outcomes. In this lab, the primary learning goals are (1) understanding the DNA thermal melting process, (2) fitting the melting temperature for DNA hybridization, and (3) investigating the factors that affect the melting temperature of DNA hybridization. The laboratory therefore relies on high-quality data and fitting analysis. While the wet lab offers students real hands-on experience in sample preparation and experimental operation, that is not the primary goal of the “DNA melting temperature” lab. In the wet lab, students often had trouble obtaining data of sufficient quality for fitting because of equipment limitations and incorrect procedures. For example, the traditional water-bath UV spectrometer may not produce good results because of poor temperature control and slow heating and cooling steps. Students may also fail the experiment through mistakes in sample preparation and careless operation. Additionally, because data collection is slow, students can investigate only a limited set of parameters and conditions for DNA hybridization. To improve data quality and throughput, we asked students to use the NUPACK-based simulation to generate data sets of the DNA thermal melting process under varied input parameters and conditions. These simulated data sets are then fitted to calculate the melting temperature.
Students’ performance in the wet and dry labs for the DNA melting temperature is compared in Figure 6. There were more failed or incomplete lab reports for the wet lab (4) than for the dry lab (2). The average score of a lab report improved from 82.5 for the wet lab to 94.5 for the dry lab. As explained above, in the wet lab, the instructor and students may struggle to acquire a good data set for DNA thermal melting, which can be affected by the instrument setup, sample preparation, and operation. The NUPACK-based simulation can substantially improve students’ understanding of DNA melting temperature by providing a hands-on and interactive approach to learning. Students can visualize and analyze the stability of DNA duplexes as they manipulate parameters such as sequence composition, length, and ion concentration. Such a simulation-based approach is particularly useful for a lab that focuses on data analysis and parameter investigation rather than on the wet lab experience. General Biochemistry Lab I at Rutgers University–Camden combines wet and dry laboratory topics, providing students with experience in basic laboratory skills, simulation, and data analysis.
Figure 6. Assessment of student learning outcomes for DNA melting temperature with the wet lab (blue dots) and the NUPACK-based dry lab (orange dots). The number of incomplete lab reports was reduced from four for the wet lab to two for the dry lab. The average score of a lab report improved from 82.5 for the wet lab to 94.5 for the dry lab.
7. Conclusion
In summary, we have demonstrated the use of NUPACK in teaching computation-based DNA laboratories for undergraduate students in General Biochemistry. NUPACK is a user-friendly and free online tool for predicting DNA hybridization equilibrium, simulating DNA/RNA structures, and calculating the hybridization energy. By integrating NUPACK into the learning process, students can explore factors that affect the melting temperature and stability of DNA hybridization, such as magnesium concentration, GC content, and length. Additionally, students practice analyzing the melting temperature differences caused by SNPs, which are commonly exploited in PCR. They also learn about data-fitting tools for analyzing the thermal denaturation process, including fitting models such as IC_50_, Boltzmann sigmoidal kinetics, and first-order derivative analysis. Apart from hybridization studies, students are exposed to simulating the secondary folding structures of ssDNA amplicons from the SARS-CoV-2 viral genome. This hands-on experience of basic DNA computation provides a unique learning opportunity for students in the Biochemistry class, enhancing their understanding of nucleic acid structures, hybridizations, and properties. Students interested in further exploration can use NUPACK to design and simulate complex DNA hybridization systems, such as toehold-mediated strand displacement and sandwich hybridization, and participate in undergraduate research on DNA nanosystems. This computation-based approach to DNA learning has also been introduced to summer research apprenticeship programs for high school students (e.g., High School Apprenticeships by the Army Education Outreach Program).
References
- 1. McDonald, A. R.; Roberts, R.; Koeppe, J. R.; Hall, B. L. Undergraduate structural biology education: A shift from users to developers of computation and simulation tools. Curr. Opin. Struct. Biol. 2022, 72, 39–45. doi: 10.1016/j.sbi.2021.07.012. PMID: 34461592.
- 2. Sung, R. J.; Wilson, A. T.; Lo, S. M.; Crowl, L. M.; Nardi, J.; St. Clair, K.; Liu, J. M. BiochemAR: An Augmented Reality Educational Tool for Teaching Macromolecular Structure and Function. J. Chem. Educ. 2020, 97, 147–153. doi: 10.1021/acs.jchemed.8b00691.
- 3. Santiago-McRae, E.; Oh, S. W.; Carlo, A. M.; Bar, O.; Guan, E.; Zheng, D.; Grgicak, C.; Fu, J. Rapid Nucleic Acid Reaction Circuits for Point-of-care Diagnosis of Diseases. Curr. Top. Med. Chem. 2022, 22, 686–698. doi: 10.2174/1570163819666220207114148. PMID: 35139798.
- 4. Dirks, R. M.; Pierce, N. A. A partition function algorithm for nucleic acid secondary structure including pseudoknots. J. Comput. Chem. 2003, 24, 1664–1677. doi: 10.1002/jcc.10296. PMID: 12926009.
- 5. Zadeh, J. N.; Steenberg, C. D.; Bois, J. S.; Wolfe, B. R.; Pierce, M. B.; Khan, A. R.; Dirks, R. M.; Pierce, N. A. NUPACK: Analysis and design of nucleic acid systems. J. Comput. Chem. 2011, 32, 170–173. doi: 10.1002/jcc.21596. PMID: 20645303.
- 6. SantaLucia, J., Jr. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl. Acad. Sci. U. S. A. 1998, 95, 1460–1465. doi: 10.1073/pnas.95.4.1460. PMID: 9465037. PMCID: PMC19045.
- 7. Serra, M. J.; Turner, D. H. Predicting thermodynamic properties of RNA. Methods Enzymol. 1995, 259, 242–261. doi: 10.1016/0076-6879(95)59047-1. PMID: 8538457.
- 8. Mathews, D. H.; Sabina, J.; Zuker, M.; Turner, D. H. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 1999, 288, 911–940. doi: 10.1006/jmbi.1999.2700. PMID: 10329189.
