Strength of selection potentiates distinct adaptive responses in an evolution experiment with outcrossing yeast

Mark A Phillips; Megan Sandoval-Powers; Rupinderjit K Briar; Marcus Scaffo; Shenghao Zhou; Molly K Burke

PMC · DOI:10.1093/g3journal/jkag009·January 16, 2026


Open Access

TL;DR

This study shows how different levels of stress affect yeast evolution, revealing that stronger selection leads to more distinct genetic changes and adaptation pathways.

Contribution

The study experimentally demonstrates how varying selection intensity shapes the genetic architecture and biological pathways of adaptation in yeast.

Findings

01

Adaptation occurred through many small allele and haplotype frequency shifts, indicating a polygenic response.

02

High stress led to larger allele frequency changes at ethanol-related loci compared to moderate or no stress.

03

Moderate and high stress engaged distinct biological pathways, showing selection intensity influences adaptive targets.

Abstract

Selection intensity is expected to influence the magnitude and genetic architecture of adaptive responses, yet it is rarely evaluated as a standalone variable in experimental evolution studies. Here, we evolved outcrossing populations of Saccharomyces cerevisiae for ∼200 generations across a spectrum of environmental stress from zero to moderate to high ethanol exposure, to examine how genomic responses vary with stress intensity. Across treatments, adaptation proceeded through many subtle allele and haplotype frequency shifts rather than large changes at single loci, consistent with a highly polygenic response. At loci associated with ethanol adaptation, the high stress treatment led to larger allele frequency changes compared with the moderate or no ethanol stress treatments, with the genomic architecture of adaptation becoming increasingly polygenic as selection intensity decreased.…

Linked entities


Species

Saccharomyces cerevisiae

Chemicals

ethanol

Figures

Fig. 1 — Phenotypes of evolved populations. Growth rates and doubling times for the ancestral and experimental populations after 15 cycles of adaptation in control (a and b), moderate ethanol (c and d), and high ethanol conditions (e and f). In each assay, nine technical replicates of the ancestor and 10 randomly-chosen replicates from each treatment were evaluated.

Fig. 2 — Treatment-specific candidate regions. Results from CMH tests comparing SNP frequencies between cycles 1 and 15 in replicate populations selected for (a) high ethanol stress, (b) moderate ethanol stress, and (c) control conditions. Black lines represent significance thresholds, and color coding of significantly differentiated sites corresponds to the response categories described in Table 1.

Fig. 3 — PCA of experimental populations. Principal components analysis of raw allele frequencies (N = 61,281) in the ancestor and replicate populations at each sequenced timepoint reveals distinct clustering by treatment. Early in the experiment (cycle 1 = C1), all populations cluster near to each other, and the ancestor, but by cycle 7 (C7) and cycle 15 (C15) greater separation between treatment groups is evident, with the largest distance between control populations and ethanol stress populations.

Fig. 4 — Correlations between selection coefficients across categories. Pearson correlation coefficients (r) between selection coefficients estimated in C, M, and H populations for (a) all polymorphic sites, and the candidate SNPs for (b) general laboratory selection, (c) general ethanol selection, (d) high ethanol selection, (e) moderate ethanol selection, and (f) control selection.

Fig. 5 — Magnitude of allele frequency change across response categories. Boxplots show the mean absolute change in SNP frequency for each replicate in each treatment for (a) general laboratory adaptation candidates, (b) control-specific candidates, (c) general ethanol adaptation candidates, and (d) moderate ethanol-specific candidates. For this figure and analysis, we combined general ethanol and high ethanol-specific candidates into one group based on the high correlation between high and moderate-ethanol specific candidates revealed by Fig. 4d.

Fig. 6 — Haplotype frequencies under the highest general ethanol selection peak. Haplotype frequencies in (a) C populations, (b) M populations, and (c) H populations in cycles 1, 7 and 15. Frequencies are shown across the 60 kb region around the most significant SNP in the highest general ethanol selection peak (see Fig. 2). Different colors indicate each founder haplotype, and individual lines show estimated frequencies in each experimental replicate population.

Tables

Table 1. The five major response types used to categorize candidate sites. These are defined based on how sites significant in tests of allele frequency differentiation overlap within and between treatment-specific statistical comparisons.

Response type	Definition	Number of candidate SNPs
General laboratory selection	Candidate sites that overlap across all three treatments based on comparisons between cycles 1 and 15 for each.	3,963
General ethanol selection	Candidate sites that overlap across H and M treatments based on comparisons between cycles 1 and 15 for each. Any sites also significant in the C treatment were removed.	1,687
High ethanol-specific	Candidate sites unique to the H treatment based on cycle 1 vs cycle 15 comparisons (i.e., sites significant in other cycle 1 vs 15 comparisons were removed).	2,496
Moderate ethanol-specific	Candidate sites unique to the M treatment based on cycle 1 vs cycle 15 comparison (i.e., sites significant in other cycle 1 vs 15 comparisons were removed).	1,073
Control-specific	Candidate sites unique to the C treatment based on cycle 1 vs cycle 15 comparison (i.e., sites significant in other cycle 1 vs 15 comparisons were removed).	9,896

Funding

—College of Science at Oregon State University
—National Institutes of Health10.13039/100000002
—National Science Foundation Postdoctoral Fellowship
—National Science Foundation Postdoctoral Fellowship

Keywords

experimental evolution; adaptation; genomics; population genomics; fungi


Taxonomy

Topics: Evolution and Genetic Dynamics · Animal Behavior and Reproduction · Fungal and yeast genetics research

Full text

Introduction

Evolve and resequence (“E&R”) experiments are now commonly used to study the genetics of adaptation and complex traits (Long et al. 2015; Schlötterer et al. 2016). In these studies, experimental populations are subjected to carefully designed selective regimes under controlled conditions, and sampled for DNA sequencing over many generations. This experimental framework allows researchers to link observed phenotypic responses, including changes in gene expression, to underlying shifts in genetic variation. In addition to linking genotypes to phenotypes, the ability to control elements of the experimental design, including population size, level of starting genetic variation, and environmental conditions, makes these experiments a powerful tool for studying broad adaptive dynamics. It has been shown that adaptation in E&R studies featuring outcrossing systems is fueled by standing genetic variation (Long et al. 2015; Barghi et al. 2020). However, the specific evolutionary dynamics observed across these studies vary considerably. We believe achieving a better understanding of what causes these differences is a key step toward extending general findings from E&R to real populations (Phillips and Burke 2021).

Major points of discrepancy in the literature include evolutionary repeatability, the role of contingency in the genetics of adaptation, and the relative prevalence of “sweeps vs shifts” (Kessner and Novembre 2015; Stetter et al. 2018; Christodoulaki et al. 2019; Vlachos and Kofler 2019; Hayward and Sella 2022). Conflicting findings around evolutionary repeatability in particular have received substantial attention. While some studies report high levels of evolutionary repeatability, as evidenced by parallel responses to selection across replicate populations (e.g., Linder et al. 2022), others find more idiosyncratic and replicate-specific responses (e.g., Barghi et al. 2020). While this issue might not be entirely resolved, comparing across studies does point to potential explanations. As synthesized by Schlötterer (2023), differences in trait architecture (i.e., simple vs complex), number of founder genotypes, and properties of founder populations (e.g., levels of linkage disequilibrium and polymorphism) are contributing to the observed differences in evolutionary repeatability. More broadly, there is a critical need to better understand how experimental parameters shape evolutionary outcomes. Here we focus on one such parameter that has received comparatively little empirical attention: the intensity of the selective environment.

How a given selective pressure manifests (e.g., intensity, direction, duration, constant vs dynamic, etc.) has clear ecological relevance that likely shapes evolutionary outcomes in both natural populations and laboratory experiments. While this topic has been explored across a number of theoretical studies (Kessner and Novembre 2015; Stetter et al. 2018; Christodoulaki et al. 2019; Vlachos and Kofler 2019; Hayward and Sella 2022), few E&R studies have sought to measure this effect empirically. Recent work suggests the relationship between how selection is imposed and how populations respond is more complex than might have previously been expected. For example, Otte et al. (2021) compared responses to selection in cold- and heat-stressed populations of D. simulans. As these are both forms of temperature stress, one expectation would be that targets of selection are common to both regimes, with allele frequencies moving in opposite directions. However, while they observed a similar number of targets under both selection regimes, overlap was limited and targets with the highest effect sizes were treatment-specific (implying that heat and cold resistance have distinct genetic bases). Here we aim to expand this body of work by testing whether or not varying the intensity of the environmental pressure in a single direction might similarly impact outcomes.

Using populations of outcrossing S. cerevisiae derived from a single ancestral population, we compared genomic responses to moderate and high ethanol exposure. Under a simplified expectation, one might predict that a shared set of alleles contributing to ethanol tolerance segregating in our ancestral population would increase in frequency in both selection treatments, with the magnitude of change scaling with selection intensity. Some theoretical work supports this view; for example, Christodoulaki et al. (2019) found that when new trait optima are distant (i.e., corresponding to stronger selection), the genomic response involves larger sweep-like changes in and around targets of selection compared to more modest shifts when optima are closer. However, if complex traits are shaped by widespread pleiotropy, as some suggest (Visscher and Yang 2016; Boyle et al. 2017), or by epistasis, such proportional responses may not be observed. Under these scenarios, alleles that are advantageous under high ethanol stress may be neutral or deleterious at lower levels of exposure, and vice versa. Consistent with this possibility, previous work with outcrossing S. cerevisiae populations identified an entire class of alleles beneficial at late but not early stages of adaptation, providing some support for the idea that epistatic interactions may be common in this system (Phillips et al. 2020). These examples invoke different types of genotype-by-environment interactions that result from different selection intensities effectively creating distinct environments.

In this study, we address two related questions: (i) does increasing environmental stress intensity lead to larger allele frequency changes in and around targets of selection and (ii) do the genetic targets of adaptation vary with stress intensity? We address these questions using genomic data from outcrossing S. cerevisiae populations subjected to zero, moderate, and high ethanol stress for ∼200 generations. This experiment will contribute to a deeper understanding of how intensity in environmental stress influences trait architecture enabling not only more productive synthesis across E&R work, but also better-informed models of polygenic adaptation and an improved capacity to extrapolate results from specific studies to the real world.

Materials and methods

Experimental populations

All experimental populations used in this study were derived from the “S12” yeast population described in detail in Phillips et al. (2021). Briefly, this population was created by combining 12 haploid strains from the barcoded SGRP collection (Cubillos et al. 2009). As described in Linder et al. (2020), these strains have been modified to enable easy crossing and diploid recovery—these modified strains were kindly provided by Dr. Anthony Long (see Supplementary Table 1). The S12 base population was created by an initial cross of these modified haploid strains, followed by 12 iterations of outcrossing to maximize genetic diversity and allow for some domestication to laboratory handling conditions (Burke et al. 2020; Phillips et al. 2021). Due to the already high species-specific recombination rates (Liu et al. 2019), as well as this extra selection for high rates of outcrossing, regions of linked alleles that may potentially respond to experimental evolution are expected to be relatively small in the population (as small as 5-10 kb; cf. Burke 2023). A total of 60 experimental populations were then derived from samples of the S12 population: 20 control replicates (C_1-20_), 20 moderate ethanol stress replicates (M_1-20_), and 20 high ethanol stress replicates (H_1-20_).

Selection regime and outcrossing

Levels of ethanol used in the two stress treatments were chosen prior to the start of the experiment based on growth rate measurements of the base population under different concentrations of ethanol in YPD medium (see next section for general assay methods). We found that 10% ethanol was close to the limit of what could support population growth for a 48-h period, and we thus chose this as the high stress treatment. As 6% ethanol resulted in a growth rate and doubling time approximately mid-way between the high stress and control conditions, we chose this as the moderate treatment (Supplementary Fig. 1).

The weekly outcrossing protocol described by Burke et al. (2020) was used to maintain sexual reproduction in the 60 replicate populations, with minor modifications to increase throughput (i.e., smaller volumes). This protocol involves batch culture of diploids in 1 mL of liquid medium in alternating wells of 24-well plates (Corning); every other well contained sterile YPD which was monitored for growth (as a check for potential cross-contamination) throughout the experiment. After batch culture of diploids, the entire 1 mL of culture was washed and resuspended in 1 mL minimal sporulation media (1% potassium acetate) and incubated with shaking for 72 h (30 °C/200 rpm). After sporulation, a random spore isolation protocol adopted from Burke et al. (2014) was implemented to disrupt asci and isolate spores. This protocol involves resuspending sporulated cultures in 1 mL Y-PER Yeast Protein Extraction Reagent (Thermo), followed by incubation at 50 °C for 15 min to kill vegetative diploid cells. Cultures were then resuspended in a 1% zymolyase (Zymo Research) solution to weaken ascus walls, and vortexed at maximum speed with 0.5 mm silica beads (BioSpec) to mechanically agitate the asci. Following these steps, spores were transferred to YPD agar plates supplemented with nourseothricin sulfate (100 mg/L), hygromycin B (300 mg/L) and G418 (200 mg/L) and incubated at 30 °C for 48 h, during which time spores mated (due to close proximity on the plate) and diploids germinated. The resulting lawns of new diploid cells were scraped off plates using sterile glass slides and transferred to 10 mL of sterile YPD media (1/10th of this culture was preserved in 15% glycerol and archived at −80 °C for DNA extraction and sequencing). From these super-saturated cultures, 10 μL were transferred into 1 mL of treatment-specific medium and incubated for 48 h (30 °C/200 rpm), with a 1/100 dilution halfway through to increase generational turnover.

For the control treatment, standard YPD (2% yeast extract, 1% peptone, 2% dextrose) was used as the culture medium. For the moderate and high ethanol stress treatments, YPD was supplemented with either 6% or 10% ethanol (by volume), respectively. After this period of batch culture, the next weekly outcrossing iteration would commence (Supplementary Fig. 2). All replicates were handled in parallel for a total of 15 wk, with one cycle of outcrossing per week. Using OD_600_ measurements to estimate cell density, we estimate that ∼10 asexual generations occurred over each 48 h period in liquid culture (selection phase). We infer that an additional ∼4 generations occurred during the noncompetitive phase of the outcrossing protocol (i.e., random mating and diploid recovery on agar plates) based on colony counts of dilution series. Thus, we approximate that 200 generations elapsed in each treatment over the 15 cycles. Although growth was slower in the high ethanol stress treatment compared to the other treatments, this did not lead to dramatically different estimates of generations among the treatments.
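
The generation estimate above is simple arithmetic, and a short sketch makes the rounding explicit. The constant and function names below are illustrative and not taken from the study's scripts. The OD-based helper assumes that optical density scales linearly with cell number, and the upper-bound check assumes cultures regrow to their starting density after each 1/100 dilution.

```python
import math

# Approximate per-cycle values quoted in the text; names are illustrative.
LIQUID_GENERATIONS_PER_CYCLE = 10  # ~10 asexual generations in 48 h of liquid culture
PLATE_GENERATIONS_PER_CYCLE = 4    # ~4 generations during mating and recovery on agar
CYCLES = 15


def total_generations(liquid, plate, cycles):
    """Total generations if every weekly cycle contributes the same number."""
    return (liquid + plate) * cycles


def doublings_from_od(od_start, od_end):
    """Doublings implied by an OD fold-change (assumes OD is proportional to cell number)."""
    return math.log2(od_end / od_start)


total = total_generations(
    LIQUID_GENERATIONS_PER_CYCLE, PLATE_GENERATIONS_PER_CYCLE, CYCLES
)
print(total)  # 210, which the text rounds to roughly 200

# A culture diluted 1/100 that regrows to its starting density completes
# about 6.6 doublings, so two such dilutions per 48 h bound the count near 13.
print(round(2 * doublings_from_od(0.01, 1.0), 1))  # 13.3
```

Under these assumptions the reported figure of about 10 liquid-phase generations sits below the dilution-imposed ceiling, as expected if cultures do not always reach full saturation.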

Growth rate assays

To detect evidence of phenotypic adaptation and potential trade-offs in each environment, we compared growth rates between evolved populations after 15 cycles of outcrossing and the ancestor by tracking optical density (OD) over time across all media types. To do this, replicate populations were first grown overnight in 10 mL YPD in a shaking incubator (30 °C/200 rpm). Cultures were then quantified via absorbance at OD_600_ (values of 1/100 dilutions typically ranged from 0.10 to 0.15 across all treatments and timepoints) and diluted to starting concentration of OD_600_ ∼0.1 using the appropriate media type. These dilutions were then transferred to individual wells of a 96-well plate (Corning) in 200 μL volumes. A Tecan Spark multimodal microplate reader measured absorbance in each well every 30 min over a 48-h period at 30 °C. Measurements were taken at four positions within each well and the average values were used for subsequent data analysis. To avoid edge effects, wells on the outer perimeter of the plates were filled with sterile YPD. In a given plate reader assay, we included 9 technical replicates of the ancestral population and 10 randomly selected replicates from each treatment. Using these protocols, independent plate reader assays were run under control, moderate ethanol stress, and high ethanol stress conditions.

Using these data and the R (R Core Team 2016) package “Growthcurver” (Sprouffske and Wagner 2016), we estimated doubling times and carrying capacity. Estimates were obtained by fitting data to the following logistic equation that gives the number of cells N_t_ (as measured by absorbance) at time t:

$$N_t = \frac{K}{1 + \left(\frac{K - N_0}{N_0}\right) e^{-rt}}$$

where starting population size is represented by N_0_ and carrying capacity by K. Here, carrying capacity is simply defined as the maximum population size in a particular environment. Lastly, r represents the growth rate that would occur if there were no limits on total population size. This value is also used to calculate the doubling time, which is defined as DT = ln(2)/r. From here, statistical comparisons were done on a per-assay basis in R (see Data Availability statement for code). For a given assay, we used Kruskal–Wallis tests to compare mean carrying capacities and doubling times across all groups. To determine which specific groups differed, we performed all pairwise Wilcoxon rank sum tests with Benjamini–Hochberg correction (Benjamini and Hochberg 1995) and used a 0.05 significance threshold. We chose to carry out our analysis in this manner to avoid the confounding effects of run-to-run variation. While qualitative patterns are consistent between runs, estimates like carrying capacity can vary a great deal between runs under the same conditions.
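
As a minimal sketch of the quantities defined above, the snippet below evaluates the standard logistic growth model documented for Growthcurver and converts r into a doubling time. It does not perform the curve fitting itself, which the study ran in R. The function names and parameter values are assumptions chosen for illustration.

```python
import math


def logistic_od(t, n0, k, r):
    """Logistic growth: population size (absorbance) at time t.

    n0 is the starting size, k the carrying capacity, r the intrinsic growth rate.
    """
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


def doubling_time(r):
    """Doubling time DT = ln(2) / r, in the same time units as 1/r."""
    return math.log(2) / r


# Hypothetical parameters: start at OD 0.1, saturate at OD 1.0, r = 0.5 per hour.
curve = [logistic_od(t, n0=0.1, k=1.0, r=0.5) for t in range(0, 49, 12)]
print([round(od, 3) for od in curve])
print(round(doubling_time(0.5), 2))  # 1.39 hours
```

The curve starts at N_0_ when t = 0 and approaches K as t grows, which is why K can be read as the maximum population size in a given medium.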

DNA extraction, sequencing, and read mapping

The 20 replicates of each treatment were sampled for pooled-population genome sequencing at three distinct timepoints: after a single outcrossing cycle, after 7 outcrossing cycles, and after 15 outcrossing cycles. To sample populations for sequencing, 1 mL of freezer stocks were revived on YPD agar plates. After 48 h of growth at 30 °C, the resulting lawns were broadly sampled by wooden applicator (to capture as much genetic diversity as possible) and these cells were inoculated into 10 mL of liquid YPD culture, and grown overnight in the shaking incubator. Genomic DNA was extracted from samples with Qiagen's Yeast/Bact Kit following the manufacturer's protocol. After checking DNA quantity, sequencing libraries were prepared for Illumina sequencing with the Nextera DNA Sample Preparation Kit, implementing some routine modifications to increase throughput (Baym et al. 2015). Libraries were pooled into groups of 48 and run on at least one PE150 lane of the HiSeq3000 housed at OSU's Center for Quantitative Life Sciences (CQLS); samples with lower-than-average coverages were requantified, repooled and resequenced such that high (>50× average genome-wide) coverages were achieved across all experimental replicates.

As in Phillips et al. (2020), we used GATK v4.0 (Van der Auwera and O’Connor 2020) to align raw data to the S. cerevisiae S288C reference genome (R64-2-1) and create a single VCF file for all variants identified across all populations, using standard best practices workflows and default filtering options. This VCF file was converted into a “raw” SNP frequency table by extracting the AD (allele depth) and DP (unfiltered depth) fields for all SNPs passing quality filters; the former field was used as the alternate allele count observed at a presumed SNP, and the latter was used as the total coverage observed at that site. The VCF file was also used as an input for SnpEff v4.3 (Cingolani et al. 2012) to extract potential functional effects of individual SNPs. SnpEff annotates each variant in a VCF file (e.g., by tagging whether it occurs within a protein-coding sequence) and calculates the effect(s) each produces on known genes (e.g., amino acid changes).
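
The conversion from VCF fields to a frequency table can be sketched as follows. This is an illustration only, not the authors' script: it assumes a biallelic site whose AD value is written as "ref,alt", and the FORMAT string and sample value shown are made-up examples.

```python
def alt_count_and_depth(format_field, sample_field):
    """Return (alternate allele count, total depth) for one sample at one site.

    Assumes a biallelic SNP, so AD holds exactly "ref_count,alt_count".
    """
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    alt_count = int(values["AD"].split(",")[1])  # second AD entry = alternate allele
    depth = int(values["DP"])                    # DP = depth at the site
    return alt_count, depth


def alt_frequency(alt_count, depth):
    """Alternate allele frequency, or None when the site has no coverage."""
    return alt_count / depth if depth > 0 else None


# Hypothetical VCF columns for a single pooled sample.
alt, dp = alt_count_and_depth("GT:AD:DP:GQ", "0/1:30,12:42:99")
print(alt, dp)  # 12 42
```

Keeping the count and the depth separate, rather than storing only their ratio, preserves the coverage information needed for the later filtering steps and for tests that model sampling noise.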

Mean genome-wide coverage across the 180 samples sequenced for this study ranged from ∼50× to ∼300× with a median value of ∼80× (Supplementary Table 2). We began our filtering process by only considering sites where sequence coverage exceeded 20× in all populations. Next, we removed ∼13.8K sites that were not expected to be polymorphic based on the previously described sequences of the founder strains used to create the ancestral population (cf. Phillips et al. 2021). We also removed sites that were not polymorphic in the ancestral population itself, as for this study, our objective is to track the evolution of standing genetic variants, and not de novo mutations. Finally, we only considered sites where the alternate nucleotide frequency fell between 0.02 and 0.98 across the entire dataset. This removes sites where the alternate nucleotide is fixed across all populations, and sites where errors in sequencing and/or variant calling may create the appearance of polymorphism. These steps ultimately resulted in a SNP table with 61,281 high-quality SNPs. All scripts used for filtering and subsequent analyses are available on GitHub (see Data Availability statement).
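
The filtering steps can be expressed as a single per-site predicate. The sketch below is hypothetical and is not the code in the authors' repository. In particular, it interprets "across the entire dataset" as the frequency pooled over all samples, which is an assumption, and the function and argument names are invented for illustration.

```python
def passes_filters(counts, ancestor_freq, in_founder_set,
                   min_cov=20, lo=0.02, hi=0.98):
    """Decide whether one site survives the four filters described in the text.

    counts: list of (alt_count, depth) pairs, one per sequenced sample.
    ancestor_freq: alternate allele frequency in the ancestral population.
    in_founder_set: True if founder sequences predict the site is polymorphic.
    """
    # 1. Coverage must exceed the minimum in every sample.
    if any(depth <= min_cov for _, depth in counts):
        return False
    # 2. The site must be expected to vary among the founder strains.
    if not in_founder_set:
        return False
    # 3. The site must be polymorphic in the ancestor (standing variation only).
    if not 0.0 < ancestor_freq < 1.0:
        return False
    # 4. Pooled alternate frequency must fall strictly inside (lo, hi).
    #    Pooling across samples is an assumption about the original procedure.
    pooled = sum(alt for alt, _ in counts) / sum(depth for _, depth in counts)
    return lo < pooled < hi


print(passes_filters([(10, 50), (30, 60)], ancestor_freq=0.2, in_founder_set=True))
```

Ordering the checks from cheapest to most specific lets a site be rejected early without computing the pooled frequency.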

Identifying candidate sites and selection response categories

We assigned each candidate SNP to one of five distinct categories: (i) general adaptation to laboratory handling, (ii) general adaptation to ethanol stress, (iii) specific adaptation to the high ethanol treatment, (iv) specific adaptation to the moderate ethanol treatment, and (v) specific adaptation to the control treatment. The details of these definitions are given in Table 1. Comparing the magnitude of change at sites across these categories allows us to assess whether greater stress intensity results in greater shifts in allele frequencies, and comparing the location of sites observed across these categories allows us to assess whether different stress intensities involve different genomic targets. Notably, the decision to include categories associated with laboratory conditions was made after observing evidence of continued and unique adaptation in our control treatment; this was an unanticipated outcome that has thematically influenced our interpretation of all other results. To identify and differentiate between these types of responses, we identified SNPs showing significant responses to selection in each group of populations (C, M, and H), then looked at patterns of overlap between the three resulting candidate lists.
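
The overlap logic in Table 1 maps directly onto set operations. The sketch below is one reading of those definitions rather than the authors' implementation. The SNP identifiers are invented, and the function name is hypothetical.

```python
def categorize(c_sig, m_sig, h_sig):
    """Split significant SNPs from the C, M, and H tests into Table 1 categories."""
    c, m, h = set(c_sig), set(m_sig), set(h_sig)
    return {
        "general_laboratory": c & m & h,   # significant in all three treatments
        "general_ethanol": (h & m) - c,    # shared by H and M, absent from C
        "high_specific": h - m - c,        # unique to H
        "moderate_specific": m - h - c,    # unique to M
        "control_specific": c - h - m,     # unique to C
    }


# Hypothetical significant-site lists keyed by "chromosome:position".
cats = categorize(
    c_sig={"chrI:100", "chrII:50", "chrV:9"},
    m_sig={"chrI:100", "chrIII:7", "chrIV:12"},
    h_sig={"chrI:100", "chrIII:7", "chrV:9", "chrX:3"},
)
print(sorted(cats["general_ethanol"]))  # ['chrIII:7']
```

Under this reading, a site significant in C and in exactly one ethanol treatment (here "chrV:9") falls into none of the five categories, because Table 1 defines no class for that overlap pattern.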

To identify sites responding to selection in each treatment, we used a modified version of the Cochran-Mantel-Haenszel (CMH) test developed by Spitzer et al. (2020). Unlike the classic CMH test commonly used in E&R studies, this version has been modified to account for the effects of drift and sampling noise associated with pooled sequencing. And while it is primarily used to compare SNP frequencies between two timepoints, it can make use of data from intermediate generations to produce more robust results.

To perform the modified CMH test, we used the “ACER” package in R (Spitzer et al. 2020). Three sets of tests were performed comparing changes in SNP frequencies from cycle 1 to cycle 15 in each treatment. To correct for multiple comparisons and identify candidate sites, we adjusted our P-values using the Benjamini–Hochberg procedure as implemented in the “p.adjust” function in R. Sites with adjusted P-values < 0.005 were deemed to be significant. We used a 0.005 significance threshold instead of the typical 0.05 as this increased stringency has been suggested to improve reproducibility (Benjamin et al. 2018). To generate the effective population size (N_e_) estimates needed to run this version of the CMH test, we used the methods described by Jónás et al. (2016) and implemented in the “poolSeq” package in R developed by Taus et al. (2017). Estimates were generated using SNP frequencies from the first and final timepoint sampled for each population using the “estimateNe” function with “method = P.planII.” This method was designed specifically to account for the two-stage sampling process associated with pool-seq data. We specified that the number of generations between timepoints (t) was 200, and number of individuals sampled (poolsize) was 1 × 10^6^ (note: as this poolsize is only a rough estimate, we tried a range of values and found that differences in either direction by several orders of magnitude did not affect estimates).
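
The multiple-testing step can be illustrated with a small, dependency-free version of the Benjamini–Hochberg adjustment followed by the 0.005 cutoff. The study used R's "p.adjust"; the Python function below is a stand-in written for illustration, and the P-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, smallest P first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so each adjusted value is the minimum
    # of p * n / rank over all ranks at or above its own (enforces monotonicity).
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


THRESHOLD = 0.005  # significance cutoff stated in the text

# Hypothetical raw P-values for four SNPs.
raw = [0.0001, 0.2, 0.001, 0.5]
adj = benjamini_hochberg(raw)
candidates = [i for i, p in enumerate(adj) if p < THRESHOLD]
print(candidates)  # [0, 2]
```

Because the adjustment depends on the number of tests, it has to be applied to the full set of P-values from a given treatment comparison before thresholding.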

Estimating selection coefficients

In an effort to further characterize genomic responses to selection, we estimated selection coefficients (s) across the genome in each treatment based on changes in SNP frequencies over time. These were generated using Bait-ER (https://github.com/mrborges23/Bait-ER). Bait-ER is a Bayesian approach to estimate selection coefficients and identify targets of selection from E&R timeseries data developed by Barata et al. (2023). We chose Bait-ER specifically as findings from Barata et al. (2023) suggest s estimates are accurate even when effective population size (N_e_) is misspecified by orders of magnitude. We believe this is an important consideration for our study as yeast populations feature complex life cycles which violate many of the assumptions underlying standard methods for estimating N_e_ from E&R timeseries data (Jónás et al. 2016). For instance, our populations experience sexual and asexual generations, census size can vary greatly between these phases, and our focal selective pressure is only present during the generations in liquid batch culture.

Using Bait-ER we estimated s based on changes in SNP frequencies over all timepoints sampled for each treatment. See previous section of Materials and Methods for details on how we generated N_e_ estimates. Unlike the modified CMH which allows users to specify N_e_ for each replicate population within a treatment, here we averaged N_e_ estimates across replicate populations to get a single estimate for each treatment (Supplementary Table 3). Ranging from 350 in the C populations to 470 in the H populations, we find that N_e_ estimates are far below what we would expect (in other words, they are unrealistic) and attribute this to the factors described above.

Selection intensity and magnitude of allele frequency change

To assess the prediction that more intense stress should result in greater changes in allele frequencies at target sites, we first calculated the mean changes in SNP frequencies between the C, M, and H populations across response categories. Statistical comparison of means across all groups was performed in R using a Kruskal–Wallis test, and pairwise Wilcoxon rank sum tests were performed to distinguish differences between groups.
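
A minimal sketch of the per-replicate summary follows. It assumes the quantity compared is the mean absolute change in frequency between cycle 1 and cycle 15 over the SNPs in one response category, as the Fig. 5 caption describes. The frequencies and names are invented for illustration.

```python
def mean_abs_change(start_freqs, end_freqs):
    """Mean absolute allele frequency change across a set of SNPs in one replicate."""
    if len(start_freqs) != len(end_freqs) or not start_freqs:
        raise ValueError("need two equal-length, non-empty frequency lists")
    diffs = [abs(end - start) for start, end in zip(start_freqs, end_freqs)]
    return sum(diffs) / len(diffs)


# Hypothetical cycle 1 and cycle 15 frequencies at three candidate SNPs.
cycle1 = [0.10, 0.50, 0.90]
cycle15 = [0.30, 0.40, 0.90]
print(round(mean_abs_change(cycle1, cycle15), 3))  # 0.1
```

Taking absolute values matters here because the alternate allele is arbitrary with respect to fitness: increases and decreases would otherwise cancel and understate the response.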

Gene ontology (GO) term analysis

Once SNPs belonging to each category of selection response listed in Table 1 were identified, we compared these for potential differential enrichment of gene ontology (GO) terms (Ashburner et al. 2000; Aleksander et al. 2023). These lists were created using the SnpEff output: for each response type, we created a list of genes associated with at least one candidate SNP in that category (note: here we consider all effect types as valid associations). Text files with candidate lists and associated SnpEff annotations are available through Dryad for interested individuals (see Data Availability statement for details). GO term enrichment analyses for each gene list were then performed using Metascape (Zhou et al. 2019, accessed April 22nd 2022) with default settings. For our analysis, a gene was only counted once, even if there were multiple candidate SNPs associated with it. Our analysis was agnostic to which SNPs were the true target of selection and multiple candidates associated with a gene were assumed to be a result of linkage.

Haplotype analysis

While our primary analysis focuses on changes in individual SNP frequencies, we wondered whether responses to selection might be driven by the effects of rare variants private to single haplotypes. To test this, haplotype frequencies in all evolved populations were estimated using a sliding-window haplotype caller developed by Linder et al. (2020) that has since been made available for public use: github.com/tdlong/yeast_SNP-HAP. Using the haplotyper.limSolve.code.R script, we generated estimates across 60 kb windows with a 1 kb step size for all populations and timepoints. A full description of the algorithm can be found in Linder et al. (2020). Briefly, for a given window, Gaussian weights are calculated so that the 50 SNPs closest to the center of the window account for 50% of the sum of the weights. The limSolve package (Van den Meersche et al. 2009) in R is then used to identify a set of founder strain mixing proportions that minimizes the sum of the weighted squared difference between founder haplotypes and the observed frequency of each SNP in a given sample.

To identify regions of the genome where there were significant changes in haplotype frequency in each treatment, we used the D statistic described in Burke et al. (2014). D is the average percent distance between haplotypes at a given position and is defined as follows:

[eqn]

where h_O,j_ is the frequency of the jth founder haplotype in a given population in cycle 1, h_Y,j_ is the frequency of that haplotype in the same population in cycle 15, and n is the number of haplotypes estimated at that position. This was done independently for each population in the study. We then looked at D across the genome and averaged values across replicates of each treatment to identify regions of large and consistent changes in haplotype composition.

Results

Phenotypic adaptation to ethanol stress

To characterize the long-term consequences of each experimental treatment, we independently assayed the growth of evolved (cycle 15) replicates in all three media types (control YPD, moderate 6% ethanol, and high 10% ethanol) and compared mean values of growth rate, doubling time, and carrying capacity to the ancestral population and between all populations (Fig. 1). First, in control media, differences in doubling time and carrying capacity are small among groups (Fig. 1a and b, Supplementary Fig. 3a), but significant for both doubling time (Kruskal–Wallis test P = 0.0001) and carrying capacity (P = 0.001). C, M, and H populations double at similar rates to each other in control media, but all exhibit a significantly faster doubling time than the ancestor (Fig. 1b); this implies that replicates in all treatments adapted to laboratory conditions in ways that improved growth in YPD without an obvious trade-off. Evolved populations were also able to maintain a similar carrying capacity to the ancestor in control media, with H replicates showing a slightly lower capacity than the other groups (mean of 0.9 in the H replicates vs ∼1 for the others; note that carrying capacity is represented as log(OD) as shown in Supplementary Fig. 3). While we do find some statistically significant differences here, the estimated effect sizes are small. As such, it appears that adaptation to ethanol stress in the moderate and high ethanol treatments has not had a large negative impact on the ability of replicates in these groups to grow in YPD (i.e., no obvious suggestion of an adaptive trade-off).

Phenotypes of evolved populations. Growth rates and doubling times for the ancestral and experimental populations after 15 cycles of adaptation in control (a and b), moderate ethanol (c and d), and high ethanol conditions (e and f). In each assay, nine technical replicates of the ancestor and 10 randomly-chosen replicates from each treatment were evaluated.

In moderate ethanol, statistically significant differences in growth patterns are evident (Fig. 1c), with respect to both doubling time (Fig. 1d, Kruskal–Wallis test, P = 1.75 × 10^−6^) and carrying capacity (Supplementary Fig. 3b, Kruskal–Wallis test, P = 8.05 × 10^−7^). M replicates have the fastest mean doubling time, and both C and M populations double significantly faster than the ancestor in moderate ethanol, evidencing adaptation in these groups. We also find that carrying capacity is significantly different in all pairwise comparisons except for the ancestor vs C replicates. By contrast, H replicates have a significantly longer doubling time and slower growth rate in moderate ethanol compared to all other groups, which may imply a trade-off under lower concentrations of ethanol. H replicates also have the lowest carrying capacity in this media type (mean = 0.98), followed by the M replicates (mean = 1.07) and the C replicates (mean = 1.1). Thus, we see clear differences in phenotypic responses to moderate ethanol among populations, even between those evolved under ethanol stress (M and H populations).

In populations assayed in high ethanol (Fig. 1e), ethanol adaptation is evident across all evolved populations and we find significant differences in both doubling time (Fig. 1f, Kruskal–Wallis test P = 1.16 × 10^−5^) and carrying capacity (Supplementary Fig. 3c, Kruskal–Wallis test P = 5.46 × 10^−7^) among groups. Pairwise comparisons reveal that M and H populations both have significantly faster doubling times than C replicates and the ancestor, but are not significantly different from each other. Doubling time is slowest in the ancestral population (mean = 3.64 h), followed by the C replicates (mean = 3.26 h), followed by the M and H replicates (mean = 2.64 and 2.57 h, respectively). For carrying capacity, results are consistent with moderate ethanol: H replicates show the lowest capacity, and all pairwise comparisons are significantly different except for the ancestral population vs the C replicates. The ancestral population has the highest carrying capacity (mean = 1.54, same for the C replicates), followed by the M replicates (mean = 1.11) and H replicates (mean = 0.78). So, while populations adapted to ethanol stress treatments generally have faster growth rates under high ethanol stress conditions, they also stop doubling sooner, and adaptation to high ethanol stress specifically is associated with the lowest carrying capacity.
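To put the high-ethanol doubling times on a common scale, the reported means can be expressed as a percent reduction relative to the ancestor. The sketch below uses only the means quoted above; the helper name is hypothetical, and the calculation ignores replicate-level variation, so it is descriptive rather than a substitute for the reported tests.

```python
# Mean doubling times (hours) in high ethanol, as reported in the text.
doubling_time_h = {"ancestor": 3.64, "C": 3.26, "M": 2.64, "H": 2.57}

def percent_reduction(reference, evolved):
    """Percent reduction in doubling time relative to a reference value."""
    return 100.0 * (reference - evolved) / reference

reduction = {
    group: percent_reduction(doubling_time_h["ancestor"], value)
    for group, value in doubling_time_h.items()
    if group != "ancestor"
}
for group, pct in reduction.items():
    print(f"{group}: {pct:.1f}% shorter doubling time than the ancestor")
```

On these means, the C, M, and H replicates double roughly 10%, 27%, and 29% faster than the ancestor, respectively, which makes the small gap between M and H easy to see.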

Evidence for different response categories among candidate SNPs

To assess the genomic response to selection, we first compared SNP frequencies between cycles 1 and 15 for each treatment using a modified version of the CMH test tailored for E&R data (Fig. 2). Here we find clear responses to selection as evidenced by consistent changes in SNP frequencies in each treatment, including the controls. As a number of candidate regions overlap across both ethanol stress treatments and controls, it appears that adaptation to general laboratory handling (i.e., our weekly outcrossing protocol) is a major feature of this study. As a result, we sought to further differentiate responses to ethanol stress and control conditions by comparing SNPs across treatment groups and assigning them to 5 major response categories: (i) general adaptation to laboratory handling, (ii) general adaptation to ethanol stress, (iii) specific adaptation to the high ethanol treatment, (iv) specific adaptation to the moderate ethanol treatment, and (v) specific adaptation to the control treatment (Table 1).
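One way to picture the category assignment is as a lookup on which treatments show a significant CMH result at a site. The mapping below is an assumed reading of the five categories listed above (Table 1 is not reproduced here), and the function name and labels are hypothetical.

```python
def response_category(sig_c, sig_m, sig_h):
    """Assign a SNP to a response category from per-treatment significance.

    sig_c, sig_m, sig_h: whether the site is significant in the control,
    moderate ethanol, and high ethanol treatments (assumed inputs).
    """
    pattern = (bool(sig_c), bool(sig_m), bool(sig_h))
    if not any(pattern):
        return "not significant"
    categories = {
        (True, True, True): "general laboratory",
        (False, True, True): "general ethanol",
        (False, False, True): "high ethanol-specific",
        (False, True, False): "moderate ethanol-specific",
        (True, False, False): "control-specific",
    }
    # Remaining patterns (C and M only, C and H only) fit none of the five.
    return categories.get(pattern, "uncategorized")

labels = [
    response_category(True, True, True),
    response_category(False, True, True),
    response_category(True, True, False),
]
```

Under this reading, sites significant in only two treatments that include the control fall outside the five categories.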

Treatment-specific candidate regions. Results from CMH tests comparing SNP frequencies between cycles 1 and 15 in replicate populations selected for (a) high ethanol stress, (b) moderate ethanol stress, and (c) control conditions. Black lines represent significance thresholds, and color coding of significantly differentiated sites corresponds to the response categories described in Table 1.

As seen in Fig. 2, where these categories are overlaid on the CMH results comparing cycles 1 and 15 for each treatment, we observe some mixing of categories within certain peaks, but most are either made up of a single response type or have a clear majority type. And while there are a number of regions that correspond to our three ethanol response types (general, high, and moderate ethanol responses), our strongest signals are observed in regions associated with general laboratory selection (blue peaks). So, while ethanol exposure clearly imposes some stress, laboratory conditions appear to be major selective pressures on their own, which aligns with growth rate assays showing lower doubling times of C replicates in control media. Here it should be noted that the ancestral population sampled to establish all experimental replicates had already experienced 12 “domestication” cycles with outcrossing prior to this experiment. As such, there was some expectation that these populations would already be adapted to routine culture and outcrossing protocols. A principal component analysis (PCA) of SNP frequencies indicates that despite what appears to be the shared selection response imposed by laboratory conditions, groups are still broadly differentiated by cycle 15 (Fig. 3).

PCA of experimental populations. Principal components analysis of raw allele frequencies (N = 61,281) in the ancestor and replicate populations at each sequenced timepoint reveals distinct clustering by treatment. Early in the experiment (cycle 1 = C1), all populations cluster near to each other, and the ancestor, but by cycle 7 (C7) and cycle 15 (C15) greater separation between treatment groups is evident, with the largest distance between control populations and ethanol stress populations.

Despite shared responses to laboratory handling across all treatment groups, we also observe treatment-specific responses supporting our hypothesis that different selection intensities could lead to different targets of selection. For instance, there are regions of the genome showing significant responses to selection in the H replicates (Fig. 2a) and not the M replicates (Fig. 2b), and vice versa, despite both being exposed to ethanol stress. We also find cases where significant regions are found in the C replicates (Fig. 2c) but not in either the H or M replicates, despite all three treatments adapting to the same general laboratory conditions and maintenance protocols. Of note, these control-specific responses are some of our most significant treatment-specific responses.

Next, while most significant sites fall into the response categories outlined in Table 1, there are a few prominent examples of sites where this is not the case (e.g., the gray portion of the large peak on chromosome 4 in Fig. 2c). Upon further investigation, significant SNPs that do not fit into our major categories are typically those that are shared between the C and M populations but not H (∼2,500 sites), and occur in or near regions otherwise associated with general laboratory selection or control-specific responses. As such, we primarily attribute this to how our statistical approach is interacting with linkage and strength of selection at causative sites. However, it should be noted that there are very few instances where candidates are shared between C and H but not M (∼300 sites). As such, it appears there is potentially a negative relationship between the adaptive benefit of certain alleles in our system and increased levels of ethanol exposure.

An important caveat to consider is that interpreting our results as supportive of our hypothesis assumes that all of our response categories are biologically meaningful; however, this might not necessarily be the case. For instance, some SNPs in the high ethanol-specific category may in fact be better characterized as general ethanol alleles that are too weakly selected in M populations to be detected by our statistical tests. To account for this possibility, we calculated selection coefficient (s) estimates for each set of populations and compared how values varied between treatments and categories (see below).

Selection coefficient estimates and response categories

To assess whether or not response categories are biologically meaningful, we first generated s estimates across the genome for the C, M, and H populations across all polymorphic sites in the dataset (Supplementary Fig. 4). Next, we calculated Pearson correlation coefficients (r) between s estimates from the C, M, and H populations across the genome and for candidate sites in all response categories described in Table 1.
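The correlation step can be sketched as below. This is a minimal pure-Python illustration, not the code used for the analysis: the dictionaries keyed by site, the function names, and the selection coefficient values are all hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def r_for_sites(s_a, s_b, sites):
    """Correlate s estimates from two treatments over a chosen set of sites."""
    return pearson_r([s_a[k] for k in sites], [s_b[k] for k in sites])

# Hypothetical s estimates keyed by site identifier.
s_m = {"snp1": 0.01, "snp2": 0.02, "snp3": 0.03, "snp4": 0.00}
s_h = {"snp1": 0.02, "snp2": 0.04, "snp3": 0.06, "snp4": 0.05}
r_all = r_for_sites(s_m, s_h, ["snp1", "snp2", "snp3", "snp4"])
r_candidates = r_for_sites(s_m, s_h, ["snp1", "snp2", "snp3"])
```

Restricting the same function to a category's candidate sites is what allows genome-wide and category-level correlations to be compared directly.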

In agreement with the whole genome comparisons (Fig. 4a, Supplementary Fig. 5a), we find s estimates for the high ethanol-specific category are more highly correlated between the H and M populations (r = 0.79) than in the H vs C (r = 0.27) or M vs C (r = 0.54) comparisons (Fig. 4d). We find a similar pattern in the moderate ethanol-specific category, where correlations indicate that candidate sites are also highly correlated between H and M populations (r = 0.72) (Fig. 4e, Supplementary Fig. 5e). Results of this analysis, combined with values in the H population being generally higher (Supplementary Fig. 5d), suggest that these sites may be under selection in both the M and H populations, but too weakly selected in the M population to be detected by our statistical tests. Despite this, these findings generally support the idea that selection intensity can shape outcomes through context-dependent effects on how strongly specific alleles are favored.

Correlations between selection coefficients across categories. Pearson correlation coefficients (r) between selection coefficients estimated in C, M, and H populations for (a) all polymorphic sites, and the candidate SNPs for (b) general laboratory selection, (c) general ethanol selection, (d) high ethanol selection, (e) moderate ethanol selection, and (f) control selection.

Perhaps the most surprising outcome of analyzing selection coefficient correlations in this way is the consistently lower correlations when comparing H and C populations than when comparing either to M (Fig. 4, Supplementary Fig. 5). We interpret this as evidence that candidate alleles specifically beneficial in control conditions lose value as ethanol exposure increases (Fig. 4f, Supplementary Fig. 5f), that the inverse is true for alleles specifically beneficial at high levels of ethanol exposure (Fig. 4d, Supplementary Fig. 5d), and that this phenomenon drives most of the change observed in the experiment.

Selection intensity and magnitude of allele frequency change

To address our hypothesis that more intense selection should lead to greater changes in allele frequencies, we compared the mean change in SNP frequencies between the C, M, and H populations for our five major candidate SNP categories (Fig. 5). If this idea is correct, among sites associated with ethanol resistance, we would expect to see significantly greater changes in SNP frequency in the H treatment compared with the M treatment. This should extend to both specific targets of selection and linked sites. Based on Kruskal–Wallis tests, we see significant differences between treatments in all categories (P-values shown in Fig. 5). So, we primarily relied on results from Wilcoxon tests between treatment pairs to assess whether or not greater selection intensity leads to greater changes in SNP frequencies at and around target sites. Note, when performing statistical comparisons, we combine general ethanol stress and high ethanol-specific candidate SNPs into a single category. This was done as a result of our selection coefficient analysis suggesting that the latter are likely also responding in the M population, just to a lesser degree (see above).
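The per-replicate summary used in this comparison, the mean absolute change in SNP frequency across a candidate set, can be sketched as follows. The function name and the frequencies are hypothetical and serve only to show the calculation.

```python
def mean_abs_change(freq_start, freq_end):
    """Mean absolute change in allele frequency across a set of SNPs.

    freq_start, freq_end: frequencies of the same SNPs, in the same order,
    at the first and last sequenced cycles for one replicate population.
    """
    if len(freq_start) != len(freq_end) or not freq_start:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(end - start) for start, end in zip(freq_start, freq_end))
    return total / len(freq_start)

# Hypothetical candidate SNP frequencies for one replicate.
cycle_1 = [0.10, 0.50, 0.90]
cycle_15 = [0.20, 0.40, 0.60]
change = mean_abs_change(cycle_1, cycle_15)
```

Taking the absolute value before averaging keeps increases and decreases from cancelling, so the summary reflects the magnitude of change regardless of which allele is tracked.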

Magnitude of allele frequency change across response categories. Boxplots show the mean absolute change in SNP frequency for each replicate in each treatment for (a) general laboratory adaptation candidates, (b) control-specific candidates, (c) general ethanol adaptation candidates, and (d) moderate ethanol-specific candidates. For this figure and analysis, we combined general ethanol and high ethanol-specific candidates into one group based on the high correlation between high and moderate-ethanol specific candidates revealed by Fig. 4d.

With respect to the general laboratory selection category (Fig. 5a), we find that SNP frequency changes are greatest in the C populations, followed by the M populations, and smallest in the H populations (P = 0.007 for C vs M, P = 2.2 × 10^−7^ for C vs H, and P = 8.2 × 10^−6^ for M vs H). As such, it appears that as ethanol exposure increases, these SNPs become less likely to respond or show a lower rate of response. This pattern is further amplified in the control-specific category (Fig. 5b) with a more extreme gradient following the C > M > H pattern (P = 1.3 × 10^−7^ for C vs M, P = 1.5 × 10^−11^ for C vs H, and P = 0.002 for M vs H).

As expected, for combined general ethanol selection and high ethanol-specific candidate SNPs (Fig. 5c), we find greater frequency changes in the H and M populations compared to C (P = 2.5 × 10^−5^ and P = 0.008, respectively). We also see more change in the H populations than M (P = 3.9 × 10^−8^), indicating that selection intensity impacts the magnitude of change. However, this does not extend to the moderate ethanol-specific candidates (Fig. 5d). Here we do find greater changes in the M populations vs both the C and H populations (P = 0.0003 and P = 3.9 × 10^−9^, respectively).

Lastly, it is worth noting that changes in frequency in general laboratory selection candidates are greater than changes in the general ethanol stress candidates. Like our CMH results (Fig. 2), this further reinforces the idea that continued adaptation to laboratory conditions is a major driving force in this system.

Candidate genes associated with different response categories

To compare potential differences in genes and pathways associated with our major response types, we performed gene enrichment analyses using the candidate genes associated with each category. Here we once again combine high ethanol-specific and general ethanol selection responses into a single category and use all genes as the background. As shown in Supplementary Table 4, we find enriched terms related to metabolism, signaling, and autophagy that are shared across categories. However, we also find some distinct patterns of enrichment within response categories (both broadly and narrowly defined). This lends some credibility to the idea that selection is targeting different biological mechanisms across treatments. For instance, in the general laboratory selection category, we find a number of terms related to abiotic stress (e.g., “response to osmotic stress”) and movement of molecules across cell membranes (e.g., “cation transport”), terms that intuitively might underlie adaptation to the conditions we impose to promote outcrossing. We do not find this pattern in our general ethanol selection category, and instead find terms more associated with replication and cell division (e.g., “mitotic cell cycle,” “mitotic cytokinesis,” and “cell cycle phase transition”). With respect to treatment-specific response categories, we find distinct enrichment for terms related to meiosis (e.g., “meiosis II cell cycle process” and “meiotic sister chromatid segregation”) among control-specific responses (note: some genes are found across multiple term categories). Lastly, while similar terms are sometimes found in other categories, moderate ethanol stress-specific responses are disproportionately associated with metabolism and nutrient processing (e.g., “lipid metabolic processes,” “glutathione metabolic process,” and “response to nutrient levels”).

Haplotype frequency dynamics

Using DNA sequence data from the 12 haploid founder strains crossed to create the ancestor, we estimated haplotype frequencies across the genome for each of our evolved populations to assess whether or not responses to selection were tied to rare variants private to single haplotypes that exhibit large shifts in frequency. While the approach developed by Linder et al. (2020) was specifically intended for this type of study, it is still difficult to accurately estimate haplotypes in many regions of the genome due to high similarity between founders and low SNP densities. After generating haplotype estimates across 60 kb windows with a 1 kb step size for each population and timepoint combination, we filtered the results to only consider loci where the sum of all haplotype frequencies approximated one, and no individual haplotype frequencies were less than zero or greater than one. Next, for windows where say two founders have nearly identical haplotypes, haplotype frequencies were estimated to be one-half the sum of the two founders. This general approach was extended to cases with >2 indistinguishable haplotypes. Ultimately, we generated haplotype frequency estimates for 11,221 loci across all population and timepoint combinations.
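The filtering and tie-handling rules just described can be sketched as below. The tolerance on the sum of frequencies, the function names, and the founder labels and values are assumptions for illustration; the text does not state how closely the sum had to approximate one.

```python
def is_valid_window(freqs, tol=0.05):
    """Keep a window only if every estimate lies in [0, 1] and the estimates
    sum to roughly one (tol is an assumed tolerance, not from the source)."""
    in_range = all(0.0 <= f <= 1.0 for f in freqs.values())
    return in_range and abs(sum(freqs.values()) - 1.0) <= tol

def share_indistinguishable(freqs, group):
    """Give each founder in `group` an equal share of the group's total.

    For two founders this is one-half the sum of the pair; larger groups
    are handled the same way.
    """
    total = sum(freqs[f] for f in group)
    shared = dict(freqs)
    for founder in group:
        shared[founder] = total / len(group)
    return shared

# Hypothetical window in which founders F1 and F2 cannot be told apart.
window = {"F1": 0.30, "F2": 0.10, "F3": 0.60}
cleaned = share_indistinguishable(window, ["F1", "F2"])
keep = is_valid_window(cleaned)
```

Splitting the combined frequency equally preserves the window total, so a window that passed the validity filter before sharing still passes afterwards.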

To identify regions of the genome where specific haplotypes might have shifted significantly in response to selection, we calculated D (the average percent distance between haplotypes at a given locus in cycles 1 and 15) for each population. Next, we looked at how average D varies across the genome for each treatment (Supplementary Fig. 6). As a metric, high average D would indicate large consistent changes in haplotype proportions across replicates in a given treatment. While results from our SNP analyses indicate the response to selection extends across the genome in each treatment, average D is low across most of the genome. Most loci have D values of 10 or less: this is true for 89% of the loci in the C populations, 98% of the loci in the M populations, and 99% of the loci in the H populations (see Supplementary Fig. 7 for density plots for each treatment). It is also worth noting the greater range and overall higher D values in the C populations. Assuming causal SNPs are at least in part tied to specific haplotypes or a small subset of haplotypes, this is consistent with SNP results. Laboratory adaptation accounts for most of the genomic response to selection across our system, and associated candidate SNPs change the most in the C populations (Fig. 5a and b). That being said, there is limited evidence overall that genome-wide shifts in SNP frequencies are tied to large shifts in haplotype proportions or the fixation of favored haplotypes.

To assess haplotype frequency change over time in candidate regions defined by the CMH results, we identified all windows for which we had valid haplotype estimates containing candidate SNPs. We then determined which haplotype had the greatest absolute change in each window between cycles 1 and 15. On average, the most changed haplotype shifted by 0.2 in the C, 0.14 in M, and 0.12 in H populations. As examples, we plotted haplotype frequencies from cycles 1 to 15 for the 60 kb windows surrounding the most significant SNP in the highest general ethanol selection and general laboratory selection peaks from our CMH analysis (Fig. 2). In the highest general ethanol peak, adaptation appears to be primarily reflected in the decrease in the frequency of the B10 haplotype (Fig. 6 cyan lines; average decrease of 0.17 in M and 0.22 in H but 0.08 in C populations). This decrease is also not tied to an increase in a single other haplotype and, despite apparent selection against variants on B10, it still has the highest haplotype frequency in cycle 15 in both the H and M populations. For the highest general laboratory selection peak, the increase of the A7 haplotype across all treatments from cycles 1 to 15 (average of about 0.22 in C, 0.17 in M, and 0.16 in H populations) suggests it may underlie adaptation to population maintenance protocols (Supplementary Fig. 8, orange lines). However, it does not become the sole dominant haplotype in the region and others still hover at appreciable frequencies, most notably the B4 haplotype. Within both peaks, patterns of change are consistent with SNP results (e.g., B10 decreases more in the H populations than the M populations) and we see a high level of parallelism across replicates. So, once again, while there is a relationship between changes in candidate SNPs and haplotype frequency dynamics, results are not consistent with a model where adaptation is dominated by large shifts in haplotypes due to selection for or against rare variants private to those haplotypes.
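Picking the most changed haplotype in a window amounts to taking the largest absolute difference between the two timepoints. The sketch below uses founder labels that appear in the text, but the frequencies are hypothetical and the function name is an assumption.

```python
def most_changed_haplotype(freq_cycle1, freq_cycle15):
    """Return the founder haplotype with the largest absolute frequency
    change between two timepoints, together with its signed change."""
    changes = {
        founder: freq_cycle15[founder] - freq_cycle1[founder]
        for founder in freq_cycle1
    }
    founder = max(changes, key=lambda name: abs(changes[name]))
    return founder, changes[founder]

# Hypothetical frequencies for one window in one replicate population.
cycle_1 = {"B10": 0.50, "A7": 0.10, "B4": 0.40}
cycle_15 = {"B10": 0.30, "A7": 0.22, "B4": 0.48}
founder, shift = most_changed_haplotype(cycle_1, cycle_15)
```

Returning the signed change keeps the direction visible: in this example the largest shift is a decrease, spread across increases in the other two haplotypes rather than matched by a single one.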

Haplotype frequencies under the highest general ethanol selection peak. Haplotype frequencies in (a) C populations, (b) M populations, and (c) H populations in cycles 1, 7 and 15. Frequencies are shown across the 60 kb region around the most significant SNP in the highest general ethanol selection peak (see Fig. 2). Different colors indicate each founder haplotype, and individual lines show estimated frequencies in each experimental replicate population.

Discussion

Our experiment used ethanol stress as a focal selective pressure to test whether stronger stress produces greater allele frequency changes at target loci, and whether selection targets varied among treatments experiencing different levels of environmental stress. Despite prior laboratory domestication of the ancestral population (involving a dozen cycles of outcrossing and hundreds of asexual generations), genomic data reveal that continued adaptation to laboratory conditions remains a major component of the evolutionary response (Fig. 2; Supplementary Fig. 4c). While this complicates interpretation, the inclusion of control populations enables separation of responses associated with laboratory adaptation and those driven by ethanol stress (Table 1). While we cannot directly quantify total selection intensity, we can reasonably infer that stress increased across treatments, from control to moderate to high ethanol. In other words, we have created a spectrum of stressful environments, in which some elements are shared, and others are unique.

Patterns of growth in evolved populations imply different avenues of adaptation

Growth assays highlight that adaptation to ethanol stress involves complex and context-dependent phenotypic changes. After 15 cycles, all evolved populations grew faster than the ancestor in YPD, consistent with ongoing adaptation to laboratory conditions and the outcrossing protocol. Under ethanol stress however, responses diverged. In moderate ethanol, populations from the high ethanol treatment exhibited slower growth and reduced carrying capacities compared to the other treatments (Fig. 1c and d, and Supplementary Fig. 3b). And while they do have faster doubling times than controls and the ancestral population in high ethanol stress conditions (Fig. 1e and f), they again have the lowest carrying capacities (Supplementary Fig. 3c). Moderate ethanol populations maintained faster growth under high ethanol conditions, with less severe reduction in carrying capacity (Supplementary Fig. 3c). These patterns indicate that adaptation to ethanol stress involves mechanisms not directly related to growth rate and may reflect trade-offs between rapid growth and final sustainable population size.

Carrying capacity, as defined here, reflects the point at which growth ceases in the population and likely corresponds to nutrient depletion, rather than intrinsic fitness. Nonetheless, variation in this measure may arise from underlying differentiation in cell morphology, metabolism and/or aggregation, any of which may or may not trade off with growth rate. Such trade-offs are consistent with pleiotropic effects commonly observed in experimental evolution studies; for example, Kawecki et al. (2021) showed that the same loci influencing starvation resistance in D. melanogaster can shift directionally depending on whether selection is imposed at larval or adult life stages due to underlying trade-offs.

While growth rate assays provide practical insight, they capture only part of the adaptive response. Differences in assay format (i.e., growth in 200 µL in 96 well plates vs 1 mL in 24 well plates) likely influence absolute growth parameters. Nonetheless, the complexity observed across our treatments underscores that increased selection intensity does not yield a simple or proportional phenotypic response. Selection intensity must therefore be carefully considered when extrapolating E&R outcomes to specific traits.

The magnitude of allele frequency change at candidate sites is context-dependent

Theoretical and simulation work (Christodoulaki et al. 2019) predicts that stronger selection should produce larger allele frequency shifts at target loci. Consistent with this expectation, changes in general ethanol candidate sites were greatest in the high ethanol treatment and smallest in controls (Fig. 5c), supporting a positive relationship between stress intensity and genomic response. However, this pattern did not extend to candidate sites associated with general laboratory adaptation. For those sites, the greatest frequency changes occurred in controls and the smallest changes in the high ethanol populations (Fig. 5a). This reversal may reflect antagonism between adaptation to lab conditions and ethanol stress, in which alleles beneficial under benign conditions may incur costs under high stress. Moderate ethanol-specific candidates further support this view: allele frequency shifts are smaller in the high ethanol populations, indicating that some alleles advantageous under moderate stress may become neutral or deleterious when stressed. This outcome aligns with Christodoulaki et al. (2019), who simulated allele frequency changes associated with different distances from new trait optima and found smaller shifts at target sites when trait optima are close. Together, these findings suggest that increased stress intensity can amplify or suppress selection responses depending on the adaptive landscape.

We also considered the possibility that in this experiment, adaptation to laboratory outcrossing protocols may have been stressful enough that the inclusion of ethanol exposure had little additional impact. This could explain why sites in the general laboratory selection and control-specific categories show the greatest changes in SNP frequency on average. However, both phenotypic and PCA analyses confirm that ethanol exposure imposes distinct pressures. We therefore interpret our results as partial support for a positive correlation between selection intensity and allele frequency change, while emphasizing that the relationship depends strongly on the selective context and potential antagonistic pleiotropy.

Ethanol selection involves subtle haplotype shifts

To evaluate whether haplotype-level changes paralleled SNP-level dynamics, we tracked changes in haplotype frequencies across treatments. Unlike prior studies showing change dominated by a single beneficial haplotype (Linder et al. 2022), our data reveal only modest haplotype shifts, with average differences ranging from 0.17 to 0.22 in candidate regions. This contrasts with the larger (∼0.65) changes reported by Linder et al. (2022), suggesting that adaptation in our experiment proceeded via small shifts across multiple haplotypes. Consistent with this interpretation, average percent difference between haplotypes at cycles 1 and 15 (%D) was low genome-wide.

Differences in experimental design likely contribute to the contrast with Linder et al. (2022); for instance, our manual transfers versus their automated handling. Additionally, while we characterized responses to selection at different concentrations of ethanol, they studied responses to 17 entirely different chemical stressors and observed limited parallelism across treatments. Their experiment also featured very high selection intensities by design, with dosages that populations could barely survive in early generations. As a direct point of comparison, our high ethanol stress treatment involves 10% ethanol while theirs is 12.5%. Their higher selection intensities likely favored stronger single-haplotype dynamics, whereas our more moderate conditions promoted smaller changes distributed across several haplotypes. Collectively, these observations suggest that the genomic architecture of adaptation becomes increasingly polygenic as selection intensity decreases.

Selection intensity shapes adaptive architectures

We interpret the existence of treatment-specific responses to stress across groups as evidence that the genetic architecture of adaptation varies with selection intensity. While we identified loci responding generally to ethanol exposure, distinct sets of candidates were specific to either high or moderate ethanol exposure (Fig. 2a and b). Correlations in selection coefficients between treatments for these candidates (Fig. 4d and e) indicate that loci strongly selected in one environment often experience weaker but parallel selection in another, revealing a gradient of selective strength across conditions. This finding highlights a limitation in identifying weakly-selected sites in other categories. However, even with this caveat, this observed context-dependence supports the idea that selection intensity, broadly defined, shapes adaptive outcomes. The apparent phenotypic trade-offs observed in the M and H treatments reinforce this interpretation.

Unexpectedly, the clearest treatment-specific adaptive response arose in control populations (Fig. 2c), which showed strong, “pure” peaks containing fewer SNPs from multiple response categories. Thus, while these regions of the genome are strongly associated with the laboratory and outcrossing conditions shared by all experimental treatments, they appear to be, at most, weakly selected under moderate ethanol stress and not selected at all under high ethanol stress (Fig. 4f, Supplementary Fig. 5f). These loci may represent alleles that are beneficial under control conditions but costly under increasing levels of ethanol exposure, consistent with context-dependent pleiotropy or negative epistasis. While we cannot distinguish between these possibilities, both imply that selection intensity reshapes the fitness landscape, favoring distinct genetic solutions across environments.

Selection targets different biological mechanisms across treatments

Gene enrichment analysis reveals distinct biological processes underlying adaptation across treatments. In the moderate and high ethanol populations, genes associated with replication and cell division are enriched, and these genes are subject to stronger selection under high ethanol conditions. In contrast, candidates specific to moderate ethanol stress implicate metabolism and nutrient processing, suggesting a shift from resource-management strategies under mild stress to cell-cycle modulation under severe stress. Control populations implicate genes related to meiosis, reflecting continued adaptation to the outcrossing involved in culturing rather than ethanol resistance. These distinctions provide further evidence that mechanisms of adaptation depend on the strength of selection and likely involve trade-offs among fundamental cellular processes.

Our findings align with prior work showing that ethanol resistance in S. cerevisiae is complex and involves many genetic and physiological mechanisms, including signaling, lipid metabolism, and carbohydrate homeostasis (reviewed in Ding et al. 2009; Ma and Liu 2010). Another study of adaptation to high (12%) ethanol exposure in asexual S. cerevisiae also implicated genetic pathways associated with the cell cycle and DNA replication (Voordeckers et al. 2015). The authors of that study suggest that delayed cell cycle progression allows for greater protection of individual cells. This sort of survival mechanism, which might more accurately be described as stress tolerance rather than stress resistance, may drive the apparently lower carrying capacity observed, in the presence of ethanol, in the populations adapted to high ethanol stress. This stands out as an intuitive example of a potential context-dependent adaptive mechanism associated with ethanol stress.

Conclusion

Our results demonstrate that stress intensity significantly impacts evolutionary dynamics at several levels. Stronger selection generally amplifies allele frequency change at ethanol-associated loci but can suppress responses at loci involved in general laboratory adaptation, revealing pervasive antagonistic pleiotropy. The absence of large haplotype changes and the presence of treatment-specific architectures suggest that adaptation proceeds through many small, context-dependent changes rather than a few dominant loci. These findings underscore the importance of considering selection intensity when synthesizing results across E&R studies, or when attempting to extrapolate findings from E&R studies to natural populations, where environmental conditions are typically less extreme. They further highlight the need for models of polygenic adaptation that incorporate pleiotropy, epistasis, and environmental context. Furthermore, while we chose to focus on stress here, it is likely that other dynamic circumstances of natural environments, such as temporal heterogeneity in selection (e.g., due to seasonality), will also affect outcomes in meaningful ways. As such, we believe our findings illuminate a clear need for future studies that explicitly demonstrate how experimental and population-genetic factors shape evolutionary dynamics in E&R experiments. Finally, our results reinforce the critical role of appropriate controls even when time-series data are available; by distinguishing general laboratory adaptation from ethanol-specific responses, we avoided confounding selection for laboratory adaptation with adaptation to stress.

Supplementary Material

jkag009_Supplementary_Data

Bibliography

1. Aleksander SA et al. 2023. The gene ontology knowledgebase in 2023. Genetics. 224:iyad031. doi:10.1093/genetics/iyad031. PMID: 36866529. PMCID: PMC10158837.
2. Ashburner M et al. 2000. Gene ontology: tool for the unification of biology. Nat Genet. 25:25–29. doi:10.1038/75556. PMID: 10802651. PMCID: PMC3037419.
3. Barata C, Borges R, Kosiol C. 2023. Bait-ER: a Bayesian method to detect targets of selection in Evolve-and-Resequence experiments. J Evol Biol. 36:29–44. doi:10.1111/jeb.14134. PMID: 36544394. PMCID: PMC10108205.
4. Barghi N, Hermisson J, Schlötterer C. 2020. Polygenic adaptation: a unifying framework to understand positive selection. Nat Rev Genet. 21:769–781. doi:10.1038/s41576-020-0250-z. PMID: 32601318.
5. Baym M et al. 2015. Inexpensive multiplexed library preparation for megabase-sized genomes. PLoS One. 10:e0128036. doi:10.1371/journal.pone.0128036. PMID: 26000737. PMCID: PMC4441430.
6. Benjamin DJ et al. 2018. Redefine statistical significance. Nat Hum Behav. 2:6–10. doi:10.1038/s41562-017-0189-z. PMID: 30980045.
7. Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol. 57:289–300. doi:10.1111/j.2517-6161.1995.tb02031.x.
8. Boyle EA, Li YI, Pritchard JK. 2017. An expanded view of complex traits: from polygenic to omnigenic. Cell. 169:1177–1186. doi:10.1016/j.cell.2017.05.038. PMID: 28622505. PMCID: PMC5536862.