Evidence of Hybrid Origin for Domesticated Spondias (Anacardiaceae) Taxa from Northeastern Brazil: A Picture of Ongoing Domestication of Fruit Species
Marlon Câmara Machado, Alessandra Selbach-Schnadelbach, Cássio van den Berg

TL;DR
This study confirms that certain domesticated Spondias fruit species in northeastern Brazil originated from hybridization between native and introduced species.
Contribution
The study provides genetic evidence confirming the hybrid origin of domesticated Spondias taxa and identifies specific parent species involved in hybridization.
Findings
Hybrid taxa like S. bahiensis and 'umbu-cajá' are confirmed to have originated from specific parent species.
Hybrids involving S. purpurea are inferred to be primary (F1) or early-generation hybrids.
Introgression and backcrossing processes are observed in some hybrid lineages.
Abstract
Hybridization is considered an important process in plant evolution, especially in the origins of domesticated plant taxa, with many crop species being the result of interspecific hybridization events. There are several unidentified lineages of Spondias in the northeastern region of Brazil known only by vernacular names such as ‘cajaguela’, ‘umbu-cajá’, and ‘umbuguela’. These taxa are often regarded as being of hybrid origin, based on supposedly intermediate morphological features. However, the morphology-based hypotheses of hybrid origin and parentage of these Spondias taxa remain largely untested experimentally. We collected 355 accessions of Spondias, including S. bahiensis, other putative hybrid taxa, and both native (S. mombin, S. tuberosa, and S. venulosa) and introduced (S. purpurea) species believed to be the parental taxa. We then reconstructed phylogenies of plastid and…
- Fundação de Amparo à Pesquisa do Estado da Bahia (FAPESB)
- Conselho Nacional do Desenvolvimento Científico e Tecnológico
- CvdB
Taxonomy
Topics: Botanical Research and Applications · Plant Diversity and Evolution · Genetic diversity and population structure
1. Introduction
Hybridization is considered an important process in shaping the evolutionary trajectory of angiosperms, to the point that a majority of all plant species have been suggested to derive from past hybridization events (e.g., [1,2,3,4,5,6,7,8]). Hybridization is of particular importance in the creation of domesticated plant taxa, with a number of staple foods, such as wheat [9], being of hybrid origin. Interspecific hybridization is also at the origin of many tree crop species [10] and has been a powerful force in the evolution of domesticated perennials. Examples of hybrid tree crops include apples (Malus domestica Borkh.; [11,12,13]), bananas and plantains (Musa paradisiaca L.; [14,15,16]), kiwi fruit (Actinidia deliciosa (A. Chev.) C. F. Liang and A. R. Ferguson; [17,18,19]), and various Citrus L. crops such as grapefruit (C. paradisi Macfad.), lemon (C. limon (L.) Osbeck), Mexican lime (C. aurantiifolia (Christm.) Swingle), sweet orange (C. sinensis (L.) Osbeck), sour orange (C. aurantium L.), and willowleaf mandarin (C. deliciosa Ten.) [20,21,22,23]. Further examples are listed in [10].
Spondias L. (Anacardiaceae) comprises ca. 18 species native to tropical areas of the Americas, Asia, and Madagascar [24,25]. All Spondias species possess edible fruits, and some of the species are highly valued for the very agreeable taste of their fruits; these species are therefore widely cultivated both on a regional scale (e.g., S. pinnata, S. tuberosa) and pantropically (S. dulcis, S. mombin, S. purpurea). Eleven Spondias species are found in Brazil: three are introduced and cultivated, and eight are native to the country, of which five are Brazilian endemics (Table 1). Hybridization in Spondias has so far only been documented between S. mombin and S. purpurea [26,27], with the hybrid product of this cross described as Spondias × robe [28].
In the northeastern region of Brazil, in addition to native (Spondias mombin, S. tuberosa, S. venulosa) and introduced (S. purpurea, S. dulcis) species, there are some taxa of Spondias which are known only by vernacular names such as ‘cajaguela’, ‘umbu-cajá’, and ‘umbuguela’. These plants appear to occur exclusively in cultivation, being actively maintained and propagated by humans, and can thus be considered domesticated [33]. These taxa are often regarded as being of hybrid origin, on the basis of the seemingly intermediate morphological features displayed by these plants (Supplementary Materials, Table S1). The vernacular names applied to these plants clearly reflect this; hence, ‘cajaguela’ is regarded as a hybrid between ‘cajá’ (S. mombin) and ‘ciriguela’ (S. purpurea), ‘umbu-cajá’ as a hybrid between ‘cajá’ (S. mombin) and ‘umbu’ (S. tuberosa), and ‘umbuguela’ as a hybrid between ‘ciriguela’ (S. purpurea) and ‘umbu’ (S. tuberosa). However, two distinct taxa share the vernacular name ‘umbu-cajá’ [33,34]: a northern ‘umbu-cajá’ taxon whose center of diversity lies within the states of Ceará, Paraíba, Piauí, and Rio Grande do Norte, and a southern ‘umbu-cajá’ taxon whose center of diversity lies within the state of Bahia. The northern ‘umbu-cajá’ taxon is also found in Alagoas, Pernambuco, Maranhão, and Sergipe states, as the result of introduction and cultivation by humans; likewise, the Bahian taxon is also found in northern Minas Gerais, Pernambuco and Sergipe (Figure 1). The two taxa differ considerably in gross morphology, including the dimensions of the plants, leaf size, number and morphology of leaflets, and morphology of inflorescence and fruits [33,34] (Supplementary Materials, Table S1). Because both taxa bear the same vernacular name, there is much confusion in the literature, with authors investigating one taxon often citing studies in which the other taxon was investigated instead.
The southern ‘umbu-cajá’ taxon from Bahia was discovered to be more closely related to S. venulosa and was subsequently described as S. bahiensis [33]. In a karyological study, this taxon was found to be karyotypically homozygous and to display exclusive bands in relation to both putative parents [35]. These findings were considered evidence for its recognition as a distinct species. All species investigated thus far are diploid, with 2n = 32 [25,35,36], and these findings include all species and some of the hybrids investigated here (S. bahiensis, S. dulcis, S. mombin, S. purpurea, S. tuberosa, S. venulosa, S. mombin × S. purpurea [Spondias × robe]).
2. Results
In the phylogenies reconstructed without the inclusion of ‘taxon genetic classes’ (TGCs; see Materials and Methods) from any putative hybrid taxa, both cpDNA and External Transcribed Spacer (ETS) nrDNA data reveal strongly supported clades corresponding to the grouping of TGCs of each of the recognized species, and no significant differences are observed in the topologies between the cpDNA and ETS datasets. Figure 2 shows the resulting cpDNA and ETS consensus trees including all TGCs. Statistics of the matrices and parsimony analyses of the full combined cpDNA and ETS datasets are summarized in Table 2. For all recognized species, every TGC belonging to the species was placed in a well-supported but usually non-exclusive clade that also included TGCs belonging to putative hybrid taxa. None of the putative hybrids had their TGCs resolved in a well-supported clade of their own. In both cpDNA and ETS datasets, no TGCs belonging to one of the recognized species were placed in clades composed of TGCs belonging to a different species. A high degree of incongruence between datasets was observed for the putative hybrid taxa: TGCs belonging to a putative hybrid which were resolved in a clade of one species in the cpDNA dataset were often resolved in the clade of a different species in the ETS dataset (Figure 2).
Figure 3 depicts the haplotype networks reconstructed from the cpDNA and ETS matrices, both with and without the inclusion of accessions belonging to the putatively hybrid Spondias spp. taxa. The cpDNA and ETS haplotypes identified for each accession are listed in Supplementary Materials, Table S2. In the haplotype networks reconstructed excluding the putative hybrid taxa (Figure 3A,C), each of the recognized species possessed a distinct set of haplotypes—no haplotypes were found which were shared between distinct species. In the cpDNA network (Figure 3A), S. dulcis is represented by one haplotype, S. mombin by eight haplotypes, S. purpurea by one haplotype, S. tuberosa by two haplotypes, and S. venulosa by six haplotypes. In the ETS network (Figure 3C), S. dulcis is represented by one haplotype, S. mombin by six haplotypes, S. purpurea by one haplotype, S. tuberosa by one haplotype, and S. venulosa by two haplotypes. In the haplotype networks reconstructed including all accessions (Figure 3B,D), all the putative hybrid taxa share haplotypes with the recognized species, and there are no haplotypes exclusive to the putative hybrid taxa except for haplotypes H_19 and H_20 in the cpDNA network (Figure 3B), which are exclusive to S. bahiensis.
The majority of the individuals sampled for S. bahiensis (123 accessions) shared cpDNA haplotypes with S. venulosa (Figure 3B and Figure 4A) and ETS haplotypes with S. tuberosa (Figure 3D and Figure 4A). Of these accessions, 121 possessed the most common cpDNA haplotype of S. venulosa (H_13) and the most common ETS haplotype of S. tuberosa (H_09). One accession of S. bahiensis (Spo.bahie_2437.03) shared with S. venulosa a different cpDNA haplotype (H_16), but shared the same ETS haplotype with S. tuberosa (H_09) as the other accessions. A single accession of S. bahiensis (Spo.bahie_2263.23) possessed an exclusive cpDNA haplotype, H_20, which is linked by one mutational step to the most common cpDNA haplotype found in S. venulosa (H_13); the ETS haplotype of this accession was also the most common ETS haplotype found in S. tuberosa (H_09).
A total of 25 accessions of S. bahiensis shared both cpDNA and ETS haplotypes with S. tuberosa; of these, 23 accessions shared with S. tuberosa the most common haplotypes found in this species in both the cpDNA (H_11) and ETS (H_09) datasets. Four accessions of S. bahiensis possessed an exclusive cpDNA haplotype, H_20, which is linked by one mutational step to the most common haplotype found in S. tuberosa (H_11); three of these accessions shared the ETS haplotype H_09 with S. tuberosa, while one accession (Spo.bahie_2394.05) shared the ETS haplotype H_10 with S. venulosa (the most common ETS haplotype in this species).
The majority of the individuals sampled for the northern ‘umbu-cajá’ taxon (29 accessions) shared the most common cpDNA haplotype of S. mombin (H_02; Figure 3B and Figure 4A) and the most common ETS haplotype of S. tuberosa (H_09; Figure 3D and Figure 4A). Nine accessions of the northern ‘umbu-cajá’ taxon possess the most common haplotypes of S. tuberosa in both the cpDNA (H_11) and ETS (H_09) datasets. One accession (Spo.xmotu_2446.03) shared both cpDNA and ETS haplotypes with S. mombin: its cpDNA haplotype was the most common found in S. mombin (H_02), while its ETS haplotype was a rare haplotype found in S. mombin (H_07).
The putative hybrids involving S. purpurea (‘cajaguela’ and ‘umbuguela’) shared the cpDNA haplotype of S. purpurea, with ‘cajaguela’ sharing with S. mombin the ETS haplotype H_04, and ‘umbuguela’ sharing with S. tuberosa the most common ETS haplotype found in this species (H_09). The putative hybrid between S. venulosa and S. bahiensis possesses the most common cpDNA haplotype of S. venulosa (H_13) and the most common ETS haplotype of S. tuberosa (H_09). The putative hybrid between S. mombin and S. bahiensis possesses the most common cpDNA haplotype of S. mombin (H_02) and the most common ETS haplotype of S. tuberosa (H_09). Figure 4 summarizes the recognized species with which each accession of the putative hybrid taxa shares haplotypes in the cpDNA and ETS networks.
3. Discussion
3.1. Hybrid Origins of Spondias Taxa Found in Northeastern Brazil
Interspecific hybrids are most commonly identified by topological discordance between phylogenies reconstructed from sequences of nuclear and plastid regions, with relationships to different species recovered for the phylogenies from each data partition, denoting distinct parental contributions to the hybrid genome. Such cytonuclear incongruence is generally taken as evidence confirming morphology-based hypotheses of hybrid origin [47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62]. Nuclear alleles are biparentally inherited, whereas plastids are typically inherited from the female (seed) parent in most families of angiosperms [63,64,65]. Due to the biparental mode of inheritance of nuclear regions, newly formed hybrids potentially possess copies from both parental species, and thus their DNA sequences are expected to display an additive pattern, creating polymorphic sites where the two parental species differ in their sequences [56,66]. No polymorphic sites were found in the ETS sequences from the putative hybrid taxa, which is probably the result of homogenization to one parental ETS type through the widely reported mechanisms of concerted evolution [66]. Homogenization by concerted evolution normally takes many generations to complete, but it can also occur rather quickly [67,68], with the nuclear sequences of one parent becoming fixed [59,66].
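The additivity expectation can be illustrated with a minimal Python sketch. The sequences are toy strings invented for illustration, not ETS data from this study, and the function name `summarize_hybrid` is hypothetical. Under those assumptions, the sketch looks only at sites where two parental sequences differ and counts whether a putative hybrid shows the IUPAC ambiguity code combining both parental bases (an additive site) or a single parental base (as expected after homogenization).

```python
# Two-base IUPAC ambiguity codes and the bases they stand for.
IUPAC = {
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"G", "C"},
    "W": {"A", "T"}, "K": {"G", "T"}, "M": {"A", "C"},
}


def summarize_hybrid(parent_a, parent_b, hybrid):
    """Classify the hybrid's state at every site where the parents differ.

    All three sequences are assumed to be aligned and of equal length.
    """
    if not (len(parent_a) == len(parent_b) == len(hybrid)):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"additive": 0, "parent_a": 0, "parent_b": 0, "other": 0}
    for a, b, h in zip(parent_a, parent_b, hybrid):
        if a == b:
            continue  # site is not informative about parentage
        if IUPAC.get(h) == {a, b}:
            counts["additive"] += 1   # both parental bases present
        elif h == a:
            counts["parent_a"] += 1   # fixed for parent A's base
        elif h == b:
            counts["parent_b"] += 1   # fixed for parent B's base
        else:
            counts["other"] += 1
    return counts


# Toy data (hypothetical): the parents differ at two sites.
parent_a = "ACGTACGT"
parent_b = "ACATACCT"
recent_hybrid = "ACRTACST"   # ambiguity codes at both differing sites
homogenized = "ACGTACGT"     # identical to parent A

print(summarize_hybrid(parent_a, parent_b, recent_hybrid))
print(summarize_hybrid(parent_a, parent_b, homogenized))
```

A hybrid whose informative sites all fall under a single parent, as in the second call, matches the pattern reported here for the ETS sequences of the putative hybrid taxa.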
Topological discordance between the cpDNA and ETS phylogenies was found for most TGCs of the putative hybrid Spondias taxa, with TGCs placed in one clade with one of the recognized species in the cpDNA phylogeny, but clustered with a different species in the ETS phylogeny (Figure 2). Moreover, two of the putative hybrid taxa (the northern ‘umbu-cajá’ taxon and S. bahiensis) displayed TGCs distributed in different clades (and therefore associated with different species) in both the cpDNA and ETS phylogenies (Figure 2), indicating multiple origins for the individuals assigned to these taxa. The haplotype networks (Figure 3) also show that the distribution of haplotypes between the different partitions of the genome is markedly incongruent among the accessions of the putative hybrid Spondias taxa. Individuals belonging to the hybrid taxa usually shared haplotypes with one of the recognized species in one dataset, then with a different species in the other dataset (Figure 3). It is noteworthy that there is no sharing of haplotypes nor topological discordance between the recognized species when the putative hybrid taxa are removed from the analyses (Figure 3A,C). These results lend support to the hypothesis, inferred from the morphological features of these taxa, that they originated from events of hybridization between the recognized species.
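The logic of comparing the two genomic partitions can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis pipeline used in the study: the lookup tables contain only the haplotype labels named in the Results, the function name `classify` is hypothetical, and any haplotype missing from the tables (for example, one exclusive to a hybrid taxon) is reported as unassigned.

```python
# Species associated with each haplotype label named in the Results.
# cpDNA and ETS haplotypes are numbered independently, so they are
# kept in separate tables.
CPDNA_SPECIES = {
    "H_02": "S. mombin",
    "H_11": "S. tuberosa",
    "H_13": "S. venulosa",
    "H_16": "S. venulosa",
}
ETS_SPECIES = {
    "H_04": "S. mombin",
    "H_07": "S. mombin",
    "H_09": "S. tuberosa",
    "H_10": "S. venulosa",
}


def classify(cpdna_hap, ets_hap):
    """Label an accession as concordant, discordant, or unassigned."""
    plastid = CPDNA_SPECIES.get(cpdna_hap)
    nuclear = ETS_SPECIES.get(ets_hap)
    if plastid is None or nuclear is None:
        return "unassigned"
    if plastid == nuclear:
        return f"concordant ({plastid})"
    return f"discordant (cpDNA {plastid} / ETS {nuclear})"


# Haplotype pairs as reported in the Results for two named accessions.
accessions = {
    "Spo.bahie_2437.03": ("H_16", "H_09"),
    "Spo.xmotu_2446.03": ("H_02", "H_07"),
}
for name, (cp, ets) in accessions.items():
    print(name, "->", classify(cp, ets))
```

A discordant label corresponds to the cytonuclear incongruence described above; a concordant label in a putative hybrid is the case that requires an additional explanation, such as backcrossing or biased homogenization.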
Maternal inheritance of cpDNA occurs in about 80% of angiosperms [57,58,59], and Spondias species are assumed to follow this pattern. This assumption is supported by the patterns involving S. purpurea. In natural populations of this species, the individuals are dioecious, with a predominance of male trees, and reproduce sexually, whereas in cultivation all trees are female and exclusively vegetatively propagated, with fruits produced via parthenocarpy [10,69,70]. Hybrids involving cultivated S. purpurea are therefore expected to always have this species as the seed parent. If maternal cpDNA inheritance occurs in Spondias, then all hybrids involving S. purpurea should possess the cpDNA haplotypes of this species. This pattern has been confirmed in the present study: both putative hybrids, ‘cajaguela’ and ‘umbuguela’ (Figure 4B, accessions 42–44), share the cpDNA haplotype with S. purpurea and the nuclear ETS haplotype with S. mombin (‘cajaguela’, which is thus the same taxon as S. × robe Urban) or S. tuberosa (‘umbuguela’). As a result, the putative male parents of these hybrids, previously inferred from morphology, have also been confirmed.
The majority of S. bahiensis accessions (82.55%, 123 of 149 samples) displayed cpDNA haplotypes of S. venulosa and ETS haplotypes of S. tuberosa, and nearly all of these accessions share the most frequent haplotypes of S. venulosa and S. tuberosa in the cpDNA and ETS networks, respectively (Figure 3B,D and Figure 4A). This could be indicative of a single origin for this group of S. bahiensis accessions, or it could simply be the result of recurring hybridization events between the most common lineages of the parental species, related to aspects of their pollination mechanism which require further study. There is evidence for the latter pattern, since two accessions of S. bahiensis did not share the most common cpDNA haplotype of S. venulosa. One of these accessions, which was collected in the southern region of the state of Bahia, had the haplotype found in S. venulosa accessions collected in the same region. The other S. bahiensis accession possesses an exclusive haplotype that is linked by a single mutation to the most common S. venulosa haplotype; this could mean that this S. bahiensis lineage is old enough to have accumulated this mutation, a pattern also compatible with the exclusive bands found in its karyotype [35]. Alternatively, it could simply mean that it has a rare haplotype of S. venulosa that was not sampled in the accessions of that species included in our study.
Interestingly, 16.78% of the S. bahiensis accessions possess both cpDNA and ETS haplotypes coming from S. tuberosa, and most of these accessions share the most frequent haplotypes of S. tuberosa in the cpDNA and ETS networks. The morphology displayed by these accessions does not differ from the majority of S. bahiensis accessions, and does not support the idea that these accessions could represent pure S. tuberosa. These accessions could represent instances of S. bahiensis originated from the reversed parental combination observed in the majority of accessions (that is, having S. tuberosa as the seed parent and S. venulosa as the pollen parent), and homogenization of ETS sequences biased towards the S. tuberosa type. Alternatively, it is possible that these accessions could be the result of introgressive backcrosses between S. bahiensis and S. tuberosa, with the latter as the female parent.
Three S. bahiensis accessions have an exclusive cpDNA haplotype that is linked by a single mutation to the most common S. tuberosa haplotype; the most likely explanation is that these S. bahiensis accessions possess a rare haplotype of S. tuberosa that was not sampled for the species in this study. Since the population size of S. tuberosa is very large, this haplotype would occur at very low frequency in the species, but because the population size of S. bahiensis is much smaller than that of S. tuberosa, the frequency of the haplotype would be higher in S. bahiensis, making it more likely to be sampled in this taxon than in S. tuberosa. Alternatively, this exclusive haplotype could mean that this S. bahiensis lineage diverged from S. tuberosa long enough ago for a mutation to have occurred and become fixed in the cpDNA regions analyzed. Only a single S. bahiensis accession was found to possess the pattern of a cpDNA haplotype associated with S. tuberosa and an ETS haplotype of S. venulosa. This could represent an instance of hybridization between S. tuberosa and S. venulosa, where the former was the seed parent and the latter the pollen parent. However, this accession possesses the cpDNA haplotype which was found exclusively in S. bahiensis; therefore, it is more likely that it represents an instance of introgression between S. bahiensis and S. venulosa with the former as the seed parent.
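The sampling argument for the rare haplotype can be made concrete with a short calculation. If a haplotype occurs at frequency p and n individuals are sampled independently and at random, the probability that it is missed entirely is (1 - p)^n. The Python sketch below uses purely hypothetical frequencies and sample sizes; none of these numbers come from the study, and the function name `prob_unsampled` is an illustrative choice.

```python
def prob_unsampled(frequency, sample_size):
    """Probability that a haplotype at the given frequency is absent
    from a random sample of the given size, assuming independent draws."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    return (1.0 - frequency) ** sample_size


# Hypothetical scenario: the haplotype is rare in a large parental
# population but more frequent in a smaller hybrid-derived population.
rare_in_parent = prob_unsampled(frequency=0.01, sample_size=30)
commoner_in_hybrid = prob_unsampled(frequency=0.03, sample_size=150)

print(f"chance of missing it in the parental sample: {rare_in_parent:.3f}")
print(f"chance of missing it in the hybrid sample:   {commoner_in_hybrid:.3f}")
```

With these assumed values the haplotype is far more likely to go undetected in the parental species than in the hybrid taxon, which is the asymmetry the argument above relies on.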
The majority (74.36%) of the accessions of the northern ‘umbu-cajá’ taxon have cpDNA haplotypes shared with S. mombin and ETS haplotypes shared with S. tuberosa, the haplotypes being the most frequent ones of the respective species (Figure 4B, accessions 3–8, 12–17, 19, 20, 25–37, 39, 41). This result lends support to the hypothesized hybrid origin of this taxon from crosses between S. mombin and S. tuberosa. A smaller number of accessions (23.08%) of the northern ‘umbu-cajá’ taxon share the most frequent haplotypes of S. tuberosa in both the cpDNA and ETS datasets (Figure 4B, accessions 9–11, 18, 21–24, 38). As with S. bahiensis, these ‘all tuberosa’ accessions could represent hybrids originating from crosses with S. tuberosa as the seed parent and S. mombin as the pollen parent, with later homogenization of ETS sequences biased towards the S. tuberosa type. The alternative explanation is that these accessions are the result of introgressive hybridization between individuals of the northern ‘umbu-cajá’ taxon and S. tuberosa, the latter being the female parent. A single accession of the northern ‘umbu-cajá’ taxon has both cpDNA and ETS haplotypes associated with S. mombin, which may indicate introgression with S. mombin or homogenization of ETS sequences biased towards the S. mombin type (Figure 4B, accession 40).
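The percentages quoted above can be recomputed from the accession numbers cited for Figure 4B. The following Python sketch is only an arithmetic check; the helper name `expand` and the category labels are illustrative choices, and the accession ranges are copied from the text.

```python
def expand(ranges):
    """Expand a string such as '3-8, 12, 14-15' into a list of integers."""
    numbers = []
    for part in ranges.replace("\u2013", "-").split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            numbers.extend(range(int(low), int(high) + 1))
        else:
            numbers.append(int(part))
    return numbers


# Accession numbers of the northern 'umbu-caja' taxon in Figure 4B,
# grouped by haplotype pattern as described in the text.
patterns = {
    "cpDNA mombin / ETS tuberosa": expand("3-8, 12-17, 19, 20, 25-37, 39, 41"),
    "cpDNA tuberosa / ETS tuberosa": expand("9-11, 18, 21-24, 38"),
    "cpDNA mombin / ETS mombin": expand("40"),
}

total = sum(len(ids) for ids in patterns.values())
for label, ids in patterns.items():
    share = round(100 * len(ids) / total, 2)
    print(f"{label}: {len(ids)} of {total} accessions ({share}%)")
```

The counts of 29, 9, and 1 accessions out of 39 reproduce the reported 74.36% and 23.08%, with the single ‘all mombin’ accession accounting for the remainder.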
During fieldwork carried out to collect samples for this study, we found a Spondias specimen whose morphological characteristics point to a putative hybrid between S. bahiensis and S. mombin. In our data, this individual presented the most frequent cpDNA haplotype of S. mombin and the most frequent ETS haplotype of S. tuberosa (Figure 4B, accession 1). Although the results clearly indicate the hybrid origin of this specimen, they do not allow the pollen parent to be determined, since S. bahiensis itself has ETS haplotypes from S. tuberosa; however, given the morphology of the hybrid, the pollen parent is assumed to be S. bahiensis.
A second specimen found in the field possesses morphological features very similar to S. venulosa; however, a diagnostic character of S. venulosa (the base of leaflets with margins distinctly revolute towards the abaxial surface and possessing a tuft of flexuous trichomes to 0.6 mm long) was not present in this specimen, even though this character was present in two other specimens of S. venulosa growing at the locality. Cultivated individuals of S. bahiensis were also present at this locality. This suggested that the aberrant ‘venulosa-like’ specimen could actually represent a hybrid between S. bahiensis and S. venulosa. The analyses have shown that the aberrant specimen has the most frequent cpDNA haplotype of S. venulosa and the most frequent ETS haplotype of S. tuberosa (Figure 4B, accession 2). This result supports the hypothesis that the aberrant specimen could be a backcrossed hybrid between S. bahiensis and S. venulosa, since S. bahiensis has ETS haplotypes from S. tuberosa.
3.2. Mode and Time of Origin of the Spondias Hybrids from Northeastern Brazil
The Spondias hybrids found in Northeastern Brazil are spontaneous: they did not arise from intentional crosses between the species. However, we cannot rule out that they were triggered by the cultivation of one or more parental species, which brings pairs of species into closer contact than their original distributions in the wild would allow. But why is Northeastern Brazil a ‘melting pot’ of hybridization in Spondias? Species can be found co-occurring in other regions of the country, for example, S. mombin and S. purpurea in areas of Central–Western and Northern Brazil. Why are there so many different hybrids in Northeastern Brazil? The most plausible explanation is that, due to the climatic conditions found in the region (with a short rainy season), there is more overlap in the flowering periods of the different species [71], allowing interspecific crosses to take place when two or more species are grown at the same place or in close proximity. Species growing in their natural environments are recorded as having different flowering periods, which may relate to the climatic conditions found where the plants occur: flowering of Spondias species generally takes place during the dry season [71], the timing of which varies from place to place. For example, S. mombin has a flowering peak in August and September in Pará [72,73], whereas S. tuberosa has a flowering peak in November and December in Pernambuco and Paraíba [74,75,76].
The question of when the hybrids originated is easier to answer for the hybrids derived from S. purpurea, since the introduction of the species in Brazil is rather recent, only about two centuries ago [77,78], and it did not become widely cultivated in the northeastern region until at least one century later [79]. It is unclear when exactly S. purpurea was introduced in Brazil. It is possible that it was brought to the country sometime in the period 1809–1817, when French Guiana was occupied by Portugal. In this period, many plants were transferred to Brazil from the botanical garden La Gabrielle in Cayenne, French Guiana, which possessed plants from a number of French colonies [80,81]. Spondias purpurea was potentially one of the many tree species grown in the La Gabrielle botanical garden, since it was also grown in the French West Indies [82,83] and other areas of the Antilles [84]. The plants brought from French Guiana to Brazil were first cultivated at the Botanical Garden of Belém, Pará, in the Amazon region. Incidentally, ref. [85] (p. 138) observed cultivated plants that could be either S. mombin or S. purpurea in Belém, Pará, in 1826. D’Orbigny recorded the plants as S. myrobalanus, a name that has been applied to both S. purpurea and S. mombin; he also employed the vernacular name ‘Mombin’, which is again inconclusive, since both species can be referred to by this name.
More mentions of S. purpurea started to appear in the literature in the second half of the 19th century: [78] (pp. 373–374) cited a herbarium specimen, Riedel 103, of a plant of S. purpurea cultivated in Brazil; [86] (p. 18) recommended using S. purpurea as a shade tree in coffee plantations; [77] (p. 237) listed S. purpurea, writing that “it is a Brazilian tree that grows in the Amazon. Its fruit is red, and is known in that region as ‘Mombin’ and ‘Spanish plum’”; [87] (pp. 103–104) also mentioned S. purpurea, although this author confused it to some extent with S. tuberosa, since he listed the vernacular names of the latter as well as the ways in which it is used. Botanist Adolpho Ducke collected S. purpurea in Tefé, Amazonas, in 1912 (the specimen is deposited in the Rio de Janeiro Botanical Garden Herbarium under number RB20625), and the specimen is annotated as probably being an introduction from Eastern Peru. These references to S. purpurea indicate that by the end of the 19th century the species was already cultivated in Brazil, but until the second half of the 20th century the species was probably not widely grown. For instance, refs. [79,88] reported that “… its introduction in Ceará is recent, and even more so in São Paulo, where the first seeds were planted in June 1938 in the Genetic Section of the Agronomic Institute of Campinas”. Spondias purpurea is recorded as being cultivated in Fortaleza, Ceará, around 1920 [89]. Therefore, it is safe to assume that the few known hybrids involving S. purpurea are either primary (F1) or early-generation hybrids.
The timescale for the origin of S. bahiensis and the northern ‘umbu-cajá’ taxon might be slightly more complex to establish. Spondias bahiensis exhibits high levels of phenotypic variability among different accessions [34,41,42,90,91,92,93,94,95,96,97] as well as considerable genetic diversity [98,99,100], and karyotypic differences in relation to both progenitor species [35]. The same applies to the northern ‘umbu-cajá’ taxon, which also displays morphological [100,101,102,103,104,105,106,107,108] and genetic [104,109] variability. If hybridization is recent in these taxa, then it must be recurrent and widespread, and some of the variability might be the result of introgression with the parental species. Alternatively, both taxa might represent lineages created by older hybridization events.
A common statement in the literature is that these hybrids are difficult to propagate sexually because most endocarps lack viable seeds [44,90,92,93,95,96,99,100,102,103,107,110,111,112,113,114]. Some authors do not cite references [44,110,111,113], and the remaining authors cite the works of Souza and collaborators [37,115]. Although the majority of the studies mentioned above were carried out on S. bahiensis, the studies by Souza and collaborators were based on the northern ‘umbu-cajá’ taxon, which illustrates the confusion stemming from both taxa having the same vernacular name. Unlike the northern ‘umbu-cajá’, Spondias bahiensis produces a high percentage of viable seeds [41] which germinate readily (first author, personal observation). Although plants with more desirable features are often vegetatively propagated, volunteer seedlings are tolerated in cultivation gardens and allowed to grow. Given the levels of both genetic and morphological diversity observed in S. bahiensis and the northern ‘umbu-cajá’ taxon, these taxa seem to represent hybrid lineages rather than primary (F1) hybrids. If this is the case, then the key question is whether these hybrids arose through natural contact between the parental species or because humans intentionally or unintentionally brought them together, enabling gene flow to occur between hitherto geographically or ecologically isolated species.
The parental species of S. bahiensis and the northern ‘umbu-cajá’ taxon are S. mombin, S. tuberosa and S. venulosa, with the northern ‘umbu-cajá’ taxon being the result of hybridization between the former two species and S. bahiensis being the result of hybridization between the latter two species. Spondias mombin, S. tuberosa and S. venulosa are ecologically confined to different habitats where they normally do not occur in sympatry under natural conditions. Spondias mombin is adapted to more humid conditions and is widely distributed in the Amazon and Atlantic forests of Brazil; S. tuberosa is restricted to the semiarid region which encompasses all states in Northeastern Brazil and the northern region of Minas Gerais, occurring within the xerophytic vegetation known as Caatinga; S. venulosa occurs in semi-deciduous forests and its distribution encompasses the states of Bahia, Espírito Santo, Minas Gerais and Rio de Janeiro. However, the species could have come in contact in periods of past climatic changes [116,117,118,119], during which the alternation of dry and humid periods caused the vegetation adapted to either more xeric or more mesic habitats to expand and contract.
The probable area of origin for S. bahiensis is the eastern region of Bahia, east of the Chapada Diamantina (a mountain range in the middle of the state); this is the only region where S. tuberosa and S. venulosa could have come into contact in the past. In Bahia, S. venulosa occurs in semi-deciduous forests on the eastern slopes of the Chapada Diamantina and nearer the coast in a strip of semi-deciduous forest that constitutes a transitional zone between the Caatinga vegetation and more humid phases of the coastal Atlantic forest. Spondias mombin also occurs in the Atlantic forest of Bahia but remains ecologically isolated from S. tuberosa. Incidentally, S. bahiensis is most diverse and abundant in Bahia, and its introduction to Minas Gerais and other states in Northeastern Brazil must be somewhat recent. The probable area of origin of the northern ‘umbu-cajá’ taxon is more difficult to determine, but likely areas of prior contact between S. mombin and S. tuberosa are near the coast in Maranhão, Piauí, and Ceará, where the Caatinga vegetation in which S. tuberosa occurs comes very close to the coastal areas where S. mombin is normally found. The central areas of Brazil are covered by fire-prone savanna vegetation that harbors no Spondias species.
An alternative scenario for the origin of S. bahiensis and the northern ‘umbu-cajá’ taxon is that they arose as a product of human activities: there is archaeobotanical evidence that indigenous people consumed fruits of Spondias species. An investigation of archaeological plant remains at two sites in northern Minas Gerais revealed that S. tuberosa was gathered continuously from 150 to 4250 years Before Present (BP) [120,121,122]; investigations at a number of sites in Pernambuco revealed that S. tuberosa was intermittently gathered from 888 to 9150 BP [123,124,125,126]. It is conceivable that the indigenous human populations would not only consume fruits in situ but also transport them as they moved from place to place. These human-mediated plant movements could have promoted contact between hitherto-isolated species via the collection of fruits of a species in its original habitat and the later discard of the seeds in environments where other species occurred. Contact between species may also have occurred more recently, after the colonization of Brazil by the Portuguese; some (potentially unintentional) cultivation of S. tuberosa is recorded as early as 1587 by [127] (p. 172); the authors wrote that “these trees already occur in the farms of the Portuguese, born from seeds”. If the Portuguese settlers discarded seeds of one species in habitats where other species occur, then these seeds could germinate and produce volunteer seedlings that, upon growing to maturity, could potentially cross with the other species, since flowering in Spondias is dictated by water availability [71] and, when growing in the same environment, the different species usually have synchronous or overlapping flowering [71]. This kind of process, whereby hybrids are serendipitously created via plants established in backyard dumps, is well-documented for the genus Leucaena in Mexico [128], and could also be the case in Spondias.
Hybridization is a frequent outcome when otherwise-isolated plant species are brought into sympatry [128,129,130,131,132]. However, it remains uncertain whether the hybridization events that created S. bahiensis and the northern ‘umbu-cajá’ taxon preceded or were precipitated by human activities. References to both taxa are very scant in the literature prior to about 1980, and there are also very few botanical collections in herbaria. Perhaps the earliest mention of these taxa is in [133] (p. 83); the authors realized that the northern ‘umbu-cajá’ taxon was something different from S. tuberosa; they wrote that “certainly one other variety [of S. tuberosa] exist, since besides the common type found in the Caatinga of Bahia, there is another type, larger and with bigger canopy, that has pubescent leaves and is rather common in Piauí”. The earliest collection of S. bahiensis was made by Inacio de Menezes in 1940 (the specimen is deposited in the Rio de Janeiro Botanical Garden Herbarium under number RB42811), and there is a mention of S. bahiensis in a 1949 article about fruit flies [134]. The lack of earlier records makes it impossible to establish how long these plants have been known to people, which could give clues to how old these hybrid lineages are.
The other hybrids investigated (the hybrid between S. bahiensis and S. mombin and the hybrid between S. bahiensis and S. venulosa) are further examples that show how novel combinations between Spondias taxa are continuously being created in Northeastern Brazil; they demonstrate that the evolution of the hybrid Spondias taxa from Northeastern Brazil is a rather complex one and is by no means finished.
The Sanger sequencing markers used here pose an inherent problem for more accurate estimation of timescales because, as in most phylogeographical studies, sequence divergence is very low and haplotypes within species differ by only one or a few mutations. In the future, to better address the mode and tempo of hybridization, more variable markers from genomic data, such as SNPs, should be used.
4. Materials and Methods
4.1. Taxon Sampling
For the molecular analysis, we sampled 355 accessions of Spondias (Supplementary Materials, Table S2), including S. bahiensis (149 accessions), S. dulcis (five accessions), S. mombin (60 accessions), S. purpurea (five accessions), S. tuberosa (46 accessions), S. venulosa (46 accessions), and the putative hybrid taxa ‘umbuguela’ (two accessions) and ‘cajaguela’ (one accession), the northern ‘umbu-cajá’ taxon (39 accessions), a hybrid between S. venulosa and S. bahiensis (one accession), and a hybrid between S. mombin and S. bahiensis (one accession). All samples were collected in the field; for those taxa with a large geographical distribution, we tried to collect samples from throughout their distribution in Brazil. The collection localities, voucher information, and GenBank accession numbers are given in the Supplementary Materials, Table S2.
All the samples used in this study were collected by the senior author and his colleagues or cooperators, and no special permissions were required for collection. The samples were collected from either public or privately owned land, and in the latter case, we had the permission of the landowners to collect the samples. No collections were made in protected areas, and the field studies did not involve endangered or protected species.
4.2. DNA Extraction, Amplification and Sequencing
Genomic DNA was extracted from fresh or silica gel dried leaf tissue using a 2× cetyltrimethylammonium bromide (CTAB) protocol [135], modified for CTAB gel sample storage [136]. For the genetic analyses we amplified the psbA-trnH intergenic spacer [51] and the rps16 intron [137] from the cpDNA genome, and the ETS region from the nuclear encoded small subunit ribosomal DNA (SSU nrDNA) using the primers ETS1F [138] and 18S-IGS [139]. We used the TopTaq Master Mix Kit (Qiagen, Venlo, The Netherlands) to amplify all regions. Polymerase chain reaction (PCR) was performed in a total volume of 10 μL containing 1 μL (ca. 30 ng) of template DNA, 0.2 μL each of forward and reverse primers at 15 μM concentration, 6 μL TopTaq mix, 2 μL TBT [140] and 0.4 μL of water. The thermal profile for amplifying the cpDNA regions consisted of an initial denaturing step of 80 °C for 5 min; followed by 35 cycles of 95 °C for 1 min, 52 °C for 1 min, and 65 °C for 5 min; and a final extension of 65 °C for 5 min. The thermal profile for amplifying the ETS region was that described by [139]. To check amplification success, 1.5 μL of each PCR product was quantified in ethidium-bromide-stained 2% agarose gels. Prior to sequencing, the PCR products were cleaned using polyethylene glycol (PEG)-NaCl 11% precipitation [141]. DNA sequencing was performed with the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, São Paulo, Brazil) with the same primers used for the PCR reactions. The cycle sequencing reaction followed a program of 25 cycles of denaturation at 96 °C for 10 s, annealing at 50 °C for 5 s, and extension at 60 °C for 4 min. Products were then sequenced using an ABI 3130XL genetic analyzer (Applied Biosystems). The amplified ETS sequences did not exhibit sequence heterogeneity and thus cloning was not performed for any of the putative hybrid taxa.
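The reaction recipe above can be scaled to a batch of tubes with a few lines of arithmetic. The following is a minimal sketch, not part of the published protocol: the function name `master_mix` and the 10% pipetting overage are assumptions made for illustration. The per-reaction volumes are taken as listed in the text; note that, as listed, they add up to 9.8 μL.

```python
# Per-reaction volumes (uL) of the shared components, as listed in the text.
PER_REACTION_UL = {
    "TopTaq mix": 6.0,
    "TBT": 2.0,
    "forward primer (15 uM)": 0.2,
    "reverse primer (15 uM)": 0.2,
    "water": 0.4,
}
TEMPLATE_UL = 1.0  # template DNA is added to each tube individually


def master_mix(n_reactions, overage=0.10):
    """Scale shared components for n reactions plus a pipetting overage.

    The 10% default overage is an illustrative assumption, not a value
    reported in the paper.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# Volume of master mix to dispense per tube, and total after adding template.
mix_per_tube = round(sum(PER_REACTION_UL.values()), 2)
total_per_tube = round(mix_per_tube + TEMPLATE_UL, 2)

batch = master_mix(24)
for name, vol in batch.items():
    print(f"{name}: {vol} uL")
print(f"dispense {mix_per_tube} uL mix + {TEMPLATE_UL} uL template per tube")
```

Keeping the template out of the master mix reflects the usual practice of adding a different DNA sample to each tube.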
4.3. Sequence Editing, Alignment and Analysis
Electropherograms were edited and assembled using the Staden Package [142]. Sequences were manually aligned using Seaview 4.2.6 [143,144]. Both ends of the aligned matrices were cropped so that all accessions possessed the same sequence length. The matrices were then saved in FASTA format.
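The cropping step can be illustrated with a short sketch. It assumes one possible reading of the procedure: leading and trailing alignment columns are removed until every accession has a base at both ends. The function name `crop_alignment` and the toy sequences are hypothetical, and the cropping in the study was done manually in an alignment editor rather than with code like this.

```python
def crop_alignment(aligned, gap_chars="-?"):
    """Trim both ends of an alignment so no sequence starts or ends with a gap.

    `aligned` maps accession names to aligned sequences of equal length.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n = lengths.pop()
    start, end = 0, n
    for seq in aligned.values():
        lead = len(seq) - len(seq.lstrip(gap_chars))   # gap run at the 5' end
        trail = len(seq) - len(seq.rstrip(gap_chars))  # gap run at the 3' end
        start = max(start, lead)
        end = min(end, n - trail)
    if start >= end:
        raise ValueError("no shared region remains after cropping")
    # Internal gaps are kept; only the ragged ends are removed.
    return {name: seq[start:end] for name, seq in aligned.items()}


# Toy alignment with ragged ends (invented sequences).
toy = {
    "acc1": "--ACGTTGCA-",
    "acc2": "TTACG-TGCAA",
    "acc3": "-TACGTTGC--",
}
cropped = crop_alignment(toy)
for name, seq in cropped.items():
    print(name, seq)
```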
Phylogenetic analyses were performed on a reduced dataset consisting of “taxon genetic classes”, hereinafter referred to as TGCs (unique combinations of taxa and DNA sequences). In order to identify the TGCs, all DNA regions were manually combined into a single FASTA file (sequences of the psbA-trnH, rps16 and ETS regions were concatenated for each accession). The resulting matrix was entered in DnaSP 5.10 [145], where haplotype data were generated in Nexus format with the option of including sites containing gaps in the analysis. The haplotype identified for each accession (Supplementary Materials, Table S2, column “Combined haplotype”) was then compared to the taxon to which each accession is assigned, and unique combinations of taxa and haplotypes were given a name (Supplementary Materials, Table S2, column “Taxon genetic class”). Accessions assigned to a TGC possess identical DNA sequences. A total of 48 TGCs were identified, and these were used as terminals for the phylogenetic analyses. Two matrices were prepared, one consisting of ETS sequences from each TGC, and a second matrix consisting of the combined psbA-trnH and rps16 intron sequences from each TGC. The two cpDNA regions were combined in tree reconstructions since the entire chloroplast genome is regarded as being a single linkage group and thus all cpDNA regions are expected to exhibit the same phylogenetic pattern [146].
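The logic of grouping accessions into TGCs can be sketched as follows. This is an illustration under simplifying assumptions, not the DnaSP procedure: haplotypes are defined here by exact identity of the three aligned regions (gaps counted as characters), and the taxon labels, sequences and the function name `assign_tgcs` are invented for the example.

```python
from collections import defaultdict


def assign_tgcs(accessions):
    """Group accessions by (taxon, combined haplotype).

    Each accession is a tuple: (id, taxon, psbA_trnH, rps16, ETS).
    Two accessions share a haplotype only if all three regions are identical.
    """
    haplotype_ids = {}
    tgcs = defaultdict(list)
    for acc_id, taxon, psba, rps16, ets in accessions:
        combined = (psba, rps16, ets)  # stands in for the concatenated matrix row
        if combined not in haplotype_ids:
            haplotype_ids[combined] = f"H{len(haplotype_ids) + 1}"
        tgcs[f"{taxon}-{haplotype_ids[combined]}"].append(acc_id)
    return dict(tgcs)


# Toy data: hypothetical taxa and very short made-up sequences.
toy_accessions = [
    ("acc1", "taxon_A", "AAT", "CG", "TTA"),
    ("acc2", "taxon_A", "AAT", "CG", "TTA"),
    ("acc3", "taxon_A", "AAT", "CG", "TTG"),
    ("acc4", "taxon_B", "AAT", "CG", "TTG"),
    ("acc5", "hybrid_X", "AAT", "CG", "TTA"),
]
tgcs = assign_tgcs(toy_accessions)
for name, members in tgcs.items():
    print(name, members)
```

In this toy case, five accessions reduce to four terminals, because the same haplotype found in two different taxa counts as two separate classes.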
Phylogenies were reconstructed both including and excluding TGCs from the putative hybrid taxa in order to assess their influence on tree topologies. The cpDNA and ETS datasets were analyzed in PAUP* 4.0b10 [147] using maximum parsimony as the optimality criterion, because we were interested in detecting discrete mutation patterns in the taxa. The single TGC for the introduced species S. dulcis was used to root the tree; the choice of S. dulcis as an outgroup stems from the fact that this Asian species is sufficiently distant from the ingroup to resolve it [26,148]. Parsimony analysis was performed using a heuristic search to generate 1000 replicates of random taxon addition using equal (Fitch) weights and tree bisection and reconnection (TBR), 10 trees held at each step, MulTrees off, saving only the shortest trees or the shortest from each replicate. The resulting trees were used as starting points in another round of TBR with MulTrees on. In the analyses presented here, gaps were treated as missing data, poly repeats were included, and branches with a minimum length of zero were collapsed. Support for tree topology was evaluated with 1000 bootstrap (BS) replicates in PAUP* 4.0b10 (TBR, 10 trees held at each step, MulTrees on). From the bootstrap analyses of the cpDNA and ETS datasets, we collapsed branches with less than 80% support in the majority rule consensus trees. The trees were plotted facing each other in R version 3.1.1 [149,150] using the function cophyloplot of the package APE version 3.1-4 [151]. The resulting graphic image was saved as a PDF file containing the image in vector format and then edited using InkScape 0.48.4 [152].
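To make the optimality criterion concrete, the sketch below scores a fixed tree under equally weighted (Fitch) parsimony, with gaps treated as missing data as in the analyses described above. It is a textbook illustration only, not PAUP*'s implementation, and it does not perform a heuristic search or bootstrapping. The tip names, sequences and tree shapes are invented.

```python
MISSING = set("-?N")
ALL_STATES = frozenset("ACGT")


def fitch_site(tree, states):
    """Return (possible ancestral states, number of changes) for one site.

    `tree` is a rooted binary tree written as nested 2-tuples of tip names.
    """
    if isinstance(tree, str):
        state = states[tree]
        # A gap or ambiguity is compatible with any state (missing data).
        return (ALL_STATES if state in MISSING else frozenset(state)), 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_cost + right_cost
    # No shared state: one change is required on this pair of branches.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Sum the Fitch cost over all sites, each with equal weight."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {tip: seq[i] for tip, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


toy_alignment = {"t1": "ACG", "t2": "ACG", "t3": "ATG", "t4": "AT-"}
tree_a = (("t1", "t2"), ("t3", "t4"))
tree_b = (("t1", "t3"), ("t2", "t4"))
print(tree_length(tree_a, toy_alignment), tree_length(tree_b, toy_alignment))
```

The tree that groups tips sharing the same substitution needs fewer changes and would therefore be preferred under parsimony.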
In order to visualize the distribution of haplotypes from the putative hybrid taxa relative to the distribution of haplotypes from the parental species, we prepared cpDNA and ETS matrices containing all accessions of the species (without the putative hybrids), and cpDNA and ETS matrices with all accessions included. The resulting matrices were input to DnaSP 5.10 [145], where haplotype data were generated in Roehl Data Format (.rdf) with the option of removing from analysis sites containing gaps. The .rdf files were then input to Network 4.6.1.1 [153] to calculate and draw median-joining networks [154]. The haplotype networks were then assembled and edited in InkScape 0.48.4 [152]. We also prepared a graphic image contrasting cpDNA and ETS haplotypes found for the accessions of the putative hybrid taxa. Haplotypes shared between putative hybrids and recognized species were colored with the color assigned to the corresponding species.
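The comparison of cpDNA and ETS haplotypes in putative hybrids can be sketched as a simple lookup: for each hybrid accession, list the recognized species that carry the same haplotype at each marker. The record layout, haplotype labels, taxon names and function names below are hypothetical and serve only to show the logic; they are not data from the study.

```python
from collections import defaultdict


def species_by_haplotype(records, marker, species):
    """Map each haplotype of `marker` to the recognized species carrying it."""
    owners = defaultdict(set)
    for rec in records:
        if rec["taxon"] in species:
            owners[rec[marker]].add(rec["taxon"])
    return owners


def trace_hybrid_haplotypes(records, species):
    """For every non-species accession, report which species share its
    cpDNA haplotype and which share its ETS haplotype."""
    cp_owners = species_by_haplotype(records, "cpDNA", species)
    ets_owners = species_by_haplotype(records, "ETS", species)
    traced = {}
    for rec in records:
        if rec["taxon"] not in species:
            traced[rec["id"]] = (
                sorted(cp_owners.get(rec["cpDNA"], ())),
                sorted(ets_owners.get(rec["ETS"], ())),
            )
    return traced


# Invented toy records; haplotype labels c1, e1, ... are placeholders.
toy_records = [
    {"id": "p1", "taxon": "parent_A", "cpDNA": "c1", "ETS": "e1"},
    {"id": "p2", "taxon": "parent_B", "cpDNA": "c2", "ETS": "e2"},
    {"id": "p3", "taxon": "parent_B", "cpDNA": "c3", "ETS": "e2"},
    {"id": "h1", "taxon": "hybrid_X", "cpDNA": "c1", "ETS": "e2"},
    {"id": "h2", "taxon": "hybrid_X", "cpDNA": "c9", "ETS": "e1"},
]
traced = trace_hybrid_haplotypes(toy_records, {"parent_A", "parent_B"})
for acc_id, (cp_species, ets_species) in traced.items():
    print(acc_id, "cpDNA:", cp_species, "ETS:", ets_species)
```

An accession whose plastid and nuclear haplotypes point to different species, such as `h1` in the toy data, shows the kind of discordance that the contrasting graphic is meant to reveal; an empty list marks a haplotype not found in any sampled species.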
References
1. Stebbins G.L. The role of hybridization in evolution. Proc. Am. Philos. Soc. 1959, 103, 231–251.
2. Raven P.H. Systematics and plant population biology. Syst. Bot. 1976, 1, 284–316. doi:10.2307/2418721
3. Rieseberg L.H. The role of hybridization in evolution: Old wine in new skins. Am. J. Bot. 1995, 82, 944–953. doi:10.1002/j.1537-2197.1995.tb15711.x
4. Ellstrand N.C., Whitkus R., Rieseberg L.H. Distribution of spontaneous plant hybrids. Proc. Natl. Acad. Sci. USA 1996, 93, 5090–5093. doi:10.1073/pnas.93.10.5090. PMID: 11607681. PMCID: PMC39411
5. Arnold M.L. Natural Hybridization and Evolution. Oxford University Press: New York, NY, USA, 1997.
6. Levin D.A. The Origin, Expansion, and Demise of Plant Species. Oxford University Press: Oxford, UK, 2000.
7. Rieseberg L.H., Raymond O., Rosenthal D.M., Lai Z., Livingstone K., Nakazato T., Durphy T.L., Schwarzbach A.E., Donovan L.A., Lexer C. Major ecological transitions in wild sunflowers facilitated by hybridization. Science 2003, 301, 1211–1216. doi:10.1126/science.1086949. PMID: 12907807
8. Soltis P.S., Soltis D.E. The role of hybridization in plant speciation. Annu. Rev. Plant Biol. 2009, 60, 561–588. doi:10.1146/annurev.arplant.043008.092039. PMID: 19575590
