Ballou’s Ancestral Inbreeding Coefficient: Formulation and New Estimate with Higher Reliability
Tetsuro Nomura

TL;DR
This paper provides a mathematical foundation for Ballou’s ancestral inbreeding coefficient and introduces a more reliable method to estimate it.
Contribution
The paper introduces a new hybrid method combining stochastic and deterministic computations to improve the reliability of FBAL−ANC estimates.
Findings
The new method reduces the variance of FBAL−ANC estimates compared to conventional gene-dropping simulations.
Bypassing part of the stochastic process enhances the reliability of the estimates.
Abstract
Deleterious recessive alleles causing inbreeding depression may be eliminated from populations through purifying selection facilitated by inbreeding. Providing evidence of this phenomenon (i.e., inbreeding purging) is of great interest for conservation biologists and animal breeders. Ballou’s ancestral inbreeding coefficient (FBAL−ANC) is one of the most widely used pedigree-based measurements to detect inbreeding purging, but the theoretical basis has not been fully established. In this report, the author gives a mathematical formulation of FBAL−ANC and proposes a new method for estimation based on the obtained formula. A stochastic simulation suggests that the new method could reduce the variance of estimates, compared to the conventional gene-dropping simulation. Inbreeding is unavoidable in small populations. However, the deleterious effects of inbreeding on fitness-related traits…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5- —Japan Science and Technology Agency.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenetic and phenotypic traits in livestock · Genetic Mapping and Diversity in Plants and Animals · Genetics and Plant Breeding
1. Introduction
Inbreeding is defined as a mating between relatives and is unavoidable in small populations [1]. Inbred individuals show a reduction in phenotypic performance, especially in fitness-related traits, which is a phenomenon known as inbreeding depression [1,2,3]. Inbreeding depression has been documented in various animal and plant species [1,4,5,6]. It has been considered that inbreeding depression is largely caused by partial dominance, i.e., the existence of partially deleterious recessive alleles, although over-dominance and epistasis may also play a role [3,7].
One may intuitively expect that the more inbred a population, the greater the severity of inbreeding depression. However, as more deleterious recessive alleles are exposed to inbreeding, they could be purged from the population by selection, consequently reducing the inbreeding depression [3,8,9]. This phenomenon has been referred to as inbreeding purging [3,9]. Providing evidence of inbreeding purging is not only theoretically but also practically of great interest since populations less affected by inbreeding depression could be established if inbreeding purging exists. To date, many studies intending to detect inbreeding purging have been carried out in captive wild [10,11,12], domesticated animal [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] and human populations [30], some of which suggests the existence of inbreeding purging [10,12,15,17,28,30]. However, it has been also shown that inbreeding purging is not a ubiquitous phenomenon [10,12,31].
In these studies, various pedigree-based measurements of purging opportunity induced through inbreeding in ancestors have been estimated (for a review, see [31]). One of the most widely used measurements is Ballou’s ancestral inbreeding coefficient ( ), which is defined as the cumulative proportion of an individual’s genome that has been previously exposed to inbreeding in its ancestors [10]. Irrespective of the wide use of [12,13,14,15,16,17,18,19,20,21,23,24,25,26,28,30,31], the theoretical basis has not been fully established; in the pioneering work of Ballou [10], a recurrence equation to compute from pedigree was given, but later it was shown that computed from the equation overestimates the values obtained from stochastic simulation (gene-dropping simulation) [32]. To date, however, the exact computational procedure remains unknown. In all the published works, estimated values from gene-dropping simulation [12,13,14,15,16,17,18,19,20,21,23,24,25,26,28] or, as an approximation, values computed from Ballou’s equation have been used [12,30,31,33].
In the present report, I translate Ballou’s definition of into a mathematical expression. Although the deterministic computation of from the expression is limited to simple cases, from the expression a new stochastic method for estimating with a higher reliability is proposed. The performance of the new method is evaluated by stochastic simulation. Finally, I discuss some theoretical aspects of and the relation to other pedigree-based parameters to detect inbreeding purging.
2. Notation
Consider the pedigree of an individual X originated from N_A_ founders (ancestors with unknown parents), each with unique alleles. Founders are described as A1, A2, …, , and two alleles of founder A_i_ are denoted as and . We attach superscripts (0, N, 1) to , according to its history of autozygosity:
: the allele has experienced no autozygous state in the past,
: the allele experienced an autozygous state for the first time in a given individual,
: the allele has already experienced an autozygous state at least once.
Note that is a transient notation; it is immediately replaced by when transmitted to the next generation. The number of inbred ancestors in the pedigree is denoted as , and the inbred ancestors are described as , ,…, .
The partial inbreeding coefficient [34] of individual k for founder allele is denoted as , which expresses the probability that individual k is autozygous for , i.e., the genotype of k is . With the autozygous history notations, is expressed by the sum of the three probabilities:
where is the probability that the individual k has the genotype . Due to symmetry in pedigree, . Thus, the partial inbreeding coefficient ( ) of individual k for founder i is
By the definition of the partial inbreeding coefficient [34], Wright’s [35] classical inbreeding coefficient ( ) of individual k is
For the sake of simplicity, we write the above expression as
3. Derivation of Expression
Denoting male and female parents of individual X as S and D, respectively, the recurrence equation given by Ballou [10] is
However, using a gene-dropping simulation, Suwanlee et al. [32] showed that Equation (2) overestimates . As a revised version of Equation (2), they proposed an equation [32]:
where is the proportion of genome of S that has been exposed to autozygosity at least once in the past, given that the genome is in an autozygous state in S (or equivalently, the conditional probability that an allele of S on an arbitrary locus has been exposed to autozygosity at least once in the past, given that the allele is in an autozygous state in S) and is the similar proportion in D.
The term in Equation (3) implies the proportion of genome newly exposed to inbreeding in S, and is the similar proportion in D. Denoting these terms as and , respectively, we have
Here we consider the simple pedigree shown in Figure 1. With Equation (4), of individual X in the pedigree can be expanded to
In this expression, is the genetic contribution [36] of ancestor (K, L, O, S, D) to X.
Analogously in any pedigree, we can expand of an individual and can express it with and of all ancestors in the pedigree. But we should note that, by the definition of , an individual without inbred ancestors should have , and a non-inbred individual should have . This leads to a considerable simplification of the expanded form. In general, of individual X can be expressed as
where is the genetic contribution of inbred ancestor B_k_ to X. This is an explicit expression of Ballou’s definition of the ancestral inbreeding coefficient. Note that is expressed only with of inbred ancestors and their genetic contributions.
As an application of Equation (5), consider the real pedigree shown in Figure 2, which is a part of the pedigree of a mare (individual X) in a captive population of Przewalski’s horse [37]. Such a pedigree will be typically found in the early history of a captive population expanded from a limited number of wild-caught founders. From Equation (5), of individual X is expanded as
To complete the computation of , we need to obtain values of For the pedigree shown in Figure 2, it is trivial that and , since these ancestors have no inbred ancestors. But the computation of is complicated. Prior to the computation, we generally consider the allele-based expression of . Since can be alternatively viewed as the expected frequency of the founder allele that is newly exposed to inbreeding in B_k_, for a founder allele is written, with an analogy to the partial inbreeding coefficient , as
Summing this expression over all founders and alleles within each founder and applying an analogy to Equation (1), we obtain the allele-based expression of as
To apply Equation (7) to the computation of in Figure 2, we introduce the nine condensed identity states (S_1_–S_9_ shown in Figure 3) [38] between the parents (B1 and B2) and their probabilities of occurrence ( ), that is, the condensed identity coefficients [38]. Four identity states (S_3_, S_5_, S_7_ and S_8_) are relevant to the computation of : from S_3_ and S_5_, a child with genotype will be born with the probability 1/2, and from S_7_ and S_8_, a child with genotype will be born with the probabilities 1/2 and 1/4, respectively (Figure 3). I obtained , , and from the ribd package [39] in R [40]. Thus, and . Substituting these values into Equation (7) gives . Finally, substituting the obtained values of (k = 1, 2, 3) into Equation (6), we get . The computational process is summarized in Table 1. The estimated from a gene-dropping simulation with 10^6^ replicates using GRain (v2.2) [41,42] was 0.2263, while Ballou’s original Equation (2) gave an overestimated value as
4. New Estimate and Its Performance
4.1. New Estimate of FBAL−ANC
If of an individual with multiple ( ) inbred ancestors is considered, the deterministic computation is complicated since it requires the condensed identity coefficients among the multiple inbred ancestors. In theory, computation of the condensed identity coefficients among multiple individuals will be possible [43], but the possible number ( ) of the condensed identity states exponentially increases as the number ( ) of involved individuals increases; e.g., for = 3, = 66 and for = 4, = 712 [44]. In addition, we must trace the autozygous history of the alleles involved in each condensed identity state. These make the generalized computation of intractable for a pedigree with multiple inbred ancestors. In fact, for the repeated full-sib mating, I found a compact set of recurrence equations which gives and at any generation. But unfortunately, the generalization seems to be hopeless.
As an alternative to the deterministic computation of , I propose the use of estimated from a gene-dropping simulation. As shown later, estimates of from this method have a favorable property. The simulation process is essentially same as the ordinary gene-dropping simulation [41,42]. In the simulation, alleles are flagged when they experience an autozygous state, and for an individual (necessarily, an inbred individual), newly flagged alleles ( in our notation) are counted over replicates of the simulation. The total number of counts divided by the number of replicates of the simulation gives a stochastic estimate of of the individual. Substituting the estimates of for all inbred ancestors into Equation (5) gives an estimate of . This method is referred to as the ‘hybrid method’, in a sense that it consists of a stochastic process (the gene-dropping simulation) and a deterministic process (the computation of genetic contributions from the inbred ancestors).
As an illustrative example, we apply the hybrid method to the pedigree of Przewalski’s horse shown in Figure 2. Estimates of for the inbred ancestors (B1, B2, B3) obtained from the gene-dropping simulation with 10^6^ replicates were 0.1250, 0.1243 and 0.2027, respectively. Weighting these estimates by the genetic contributions and summing them over all inbred ancestors (c.f., Equation (5)) gives as a hybrid estimate. The computational process is summarized in Table 1.
4.2. Performance of Hybrid Estimate
The performance of the hybrid estimate was evaluated with a more complicated pedigree (Figure 4), which is a part of the pedigree of the Spanish Habsburg dynasty [30,45]. Pedigree analysis with has suggested that inbreeding purging had been acting in the early history of this dynasty [30]. I estimated of King X (Charles II) from the hybrid method and evaluated the performance of the estimate by comparing with the estimate from the ordinary gene-dropping simulation. For both methods, 100 trials each with (=5000, 10,000, 50,000 and 100,000) replicates were carried out. To fairly compare the two methods, I used the same simulation program originally coded in Fortran95. The estimated from the gene-dropping simulation with 10^6^ replicates using GRain (v2.2) [41,42] was 0.2415.
The results of the simulation are summarized in Table 2, and for visualization, estimates from 100 trials of each method are plotted in Figure 5, for the case of = 10,000. For a given number of replicates, the hybrid method reduced the variance of estimates to 40–60% of those from the ordinary gene-dropping simulation, implying that, by the use of the hybrid method, the reliability of the estimate could be enhanced.
5. Discussion
Equations (5) and (7) indicate that of an individual is defined with ‘allele frequency’, but not directly with ‘autozygosity’ in the individual. Thus, as mentioned by several authors [18,22,33], a direct relationship between and Wright’s classical inbreeding coefficient (F) is not expected. In many works [13,14,15,16,17,18,19,20], the correlation between and F has been reported, ranging from 0.31 [20] to 0.95 [13]. The wide range of variation can be viewed as a reflection of the definition of .
In some cases, and F will show quite different values; for example, if a population is subdivided into several isolated lines and mating between two lines occurs, the offspring will have F = 0, while may show a positive value because of the accumulated within the parental lines. Similarly, when a wild-caught animal with unknown parents is introduced into a captive population, a child of the introduced animal is expected to show F = 0 and due to the accumulated in the native parent of the child. In this context, could be one criterion for selecting breeding animals. If other conditions are the same, an animal with a higher should be preferred as a breeding animal to an animal with a lower , because the former is expected to have a smaller number of deleterious recessive alleles that may cause inbreeding depression in the descendants.
Kalinowski’s ancestral inbreeding coefficient ( ) is another measurement of purging opportunity induced by ancestral inbreeding [11]. This coefficient is based on a decomposition of F as , where is Kalinowski’s new inbreeding coefficient [11]. is the probability that alleles are in an autozygous state in the individual while they have been in an autozygous state at least once in the past, and is the probability that alleles are in an autozygous state for the first time in the individual [11,42]. In our notation,
Of course, as verified from Equation (1), . Unlike , has a direct relation to F. Thus, it is natural that a higher correlation has been found between and F than between and F in many works [14,15,16,17,18,19].
For the pedigree in Figure 2, is computed from Figure 3 as
Since , we get
The corresponding estimates from GRain (v2.2) [41,42] with 10^6^ replicates are and . However, the exact computation of and for an individual with multiple inbred ancestors will be intractable for the same reason as the difficulty in generalized computation of and .
Gulsija and Crow [46] gave parameters to evaluate the potential reduction in an individual’s inbreeding load from pedigree data. Assuming that in the same path in a pedigree no two ancestors are autozygous for the same founder allele, they derived a parameter (opportunity for purging) to measure the opportunity for purging by the expected contribution of alleles from inbred ancestors to individual X [46]. The derived expression of is in our notation:
On the above assumption, it should be that and . Thus, from Equation (5) we have
For a complex pedigree involving several inbred ancestors in the same path, an inbred individual descending from inbred ancestors will be less likely to carry deleterious recessive alleles than when their ancestors have not been inbred [47]. To remove this remote ancestral effect from , Gulsija and Crow [46] showed an equation. However, it seems to be too complex to implement in practice [47]. Note that Equation (5) contains only the terms with , implying that the remote ancestral effects are completely excluded from .
In the present study, the hybrid method was proposed for estimating from pedigree data. Although the examined cases are limited, it was suggested that the method could enhance the reliability of the estimate, compared to the ordinary gene-dropping simulation [41,42]. This favorable property results from the bypass of a part of the stochastic process in the ordinary gene-dropping simulation; in the hybrid method, contributions of the estimated to individual X are fully deterministically computed with a genetic contribution, whereas in the ordinary gene-dropping simulation, whole transmission of alleles from founders to individual X are subject to stochastic events (Mendelian segregations), which inevitably inflates sampling variance of the estimates.
Prior to the implementation of the hybrid method, finding inbred ancestors and computing their genetic contributions are required. The extra task can be easily overcome. Rapid algorithms are now available for computing the inbreeding coefficients [48,49], which allows us to find inbred ancestors in a large pedigree efficiently [50]. Genetic contributions can be systematically obtained by computing a lower triangle matrix , where is the genetic contribution of i to j [51]. There is a simple algorithm for computing L, applicable to a large pedigree [52].
Although the exact computation of with the obtained expression is limited to small and simple pedigrees, the expression deepens our understanding of . A useful outcome from the expression will be the hybrid method for estimating . The performance should be intensively evaluated under various scenarios in future studies.
6. Conclusions
In this article, the author provided a mathematical basis for and proposed a new method for estimating . A stochastic simulation suggested that the new method could remarkably enhance the reliability of estimates, compared to the conventional gene-dropping method.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Falconer D.S. Mackay T.F.C. Introduction to Quantitative Genetics 4th ed.Longman London, UK 1996247252
- 2Charlesworth D. Wills J.H. The genetics of inbreeding depression Nat. Rev. Genet.20091078379610.1038/nrg 266419834483 · doi ↗ · pubmed ↗
- 3Hedrick P.W. Gurcía-Dorado A. Understanding inbreeding depression, purging, and genetic rescue Trends Ecol. Evol.20163194095210.1016/j.tree.2016.09.00527743611 · doi ↗ · pubmed ↗
- 4Crnokrak P. Roff D.A. Inbreeding depression in the wild Heredity 19998326027010.1038/sj.hdy.688553010504423 · doi ↗ · pubmed ↗
- 5Keller L.F. Waller D.M. Inbreeding effects in wild populations Trends Ecol. Evol.20021723024110.1016/S 0169-5347(02)02489-8 · doi ↗
- 6Lande R. Schmske D.W. The evolution of self-fertilization and inbreeding in plants. I. Genetic models Evolution 19853924402856365510.1111/j.1558-5646.1985.tb 04077.x · doi ↗ · pubmed ↗
- 7Simmons M.J. Crow J.F. Mutations affecting fitness in Drosophila populations Ann. Rev. Genet.197711497810.1146/annurev.ge.11.120177.000405413473 · doi ↗ · pubmed ↗
- 8Hedrick P.W. Purging inbreeding depression and the possibility of extinction: Full-sib mating Heredity 19947336337210.1038/hdy.1994.1837989216 · doi ↗ · pubmed ↗
