Four simulated scenarios combining population size (n) and migration rate (m).
Abstract
Genetic diversity comprises the total genetic variability contained in a population and represents the fundamental component of evolutionary change, since it determines the microevolutionary potential of populations. There are several measures for quantifying genetic diversity, most notably measures based on heterozygosity and measures based on allelic richness, i.e., the expected number of alleles in populations of the same size. These measures differ in their theoretical background and, in consequence, in their ecological and evolutionary interpretations. Therefore, in the present chapter these measures of genetic diversity were jointly analyzed, highlighting the changes expected as a consequence of gene flow and genetic drift. To develop this analysis, computational simulations of extreme scenarios combining changes in the levels of gene flow and population size were performed.
Keywords
- allelic richness
- computational simulations
- gene diversity
- molecular markers
- population genetics
1. Introduction
Genetic diversity comprises the total genetic variability contained in a population and represents the raw material for evolutionary change, since it determines the microevolutionary potential of populations.
The most popular measure of genetic variation is the average heterozygosity expected under Hardy–Weinberg equilibrium. Nei [1] called this measure the gene diversity index and defined it as either the average proportion of heterozygotes per locus in a randomly mating population or the probability that two alleles randomly and independently selected from a gene pool will represent different alleles. Expected heterozygosity at
Being
The total number of alleles at a locus has also been used as a measure of genetic variation and is an important measure of the long-term evolutionary potential of populations [3]. The major drawback of the number of alleles is that, unlike heterozygosity, it is highly dependent on sample size. Because natural populations contain many alleles at low frequencies, sample sizes must be equal in order to obtain meaningful comparisons between samples. In this way, the allelic richness estimator (
1.1 Loss of genetic diversity in reduced sized populations
The starting question for analyzing the effect of reduced population size on genetic diversity levels is how population size (N) influences allele and genotype frequencies. When the Hardy–Weinberg assumption of infinite population size is violated, genetic drift occurs in populations. Genetic drift is a stochastic sampling process that determines which alleles will constitute the gene pool in the next generation. Fragmentation and isolation due to habitat loss and landscape modification can reduce the population size of many species of plants and animals throughout the world; hence, understanding genetic drift and its effects is extremely important for biodiversity conservation [3].
The implementation of molecular biology techniques that differentiate individuals directly at the DNA level allows genetic diversity parameters to be inferred in real populations, even though these parameters were defined prior to the development of DNA-based molecular markers. In addition, the technological development of capillary electrophoresis has improved the resolution power for allele identification, and advances in computing power have allowed the simultaneous analysis of a huge number of highly polymorphic loci in a simple and quick manner.
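The sampling process described above can be illustrated with a minimal Wright-Fisher sketch. It is not part of the original analysis: the function names, the diallelic locus, the starting frequency, and the number of generations are assumptions chosen for illustration. In this idealized model, expected heterozygosity decays by a factor of 1 - 1/(2N) per generation, so smaller populations are expected to lose diversity faster.

```python
import random


def expected_heterozygosity_decay(h0, n_individuals, generations):
    """Expected heterozygosity after pure drift in an idealized diploid population."""
    return h0 * (1.0 - 1.0 / (2 * n_individuals)) ** generations


def simulate_drift(p0, n_individuals, generations, rng):
    """Follow the frequency of one allele at a diallelic locus under drift.

    Each generation, 2N gene copies are drawn at random from the
    previous gene pool (binomial sampling). Illustrative assumption:
    no mutation, migration, or selection.
    """
    copies = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _ in range(generations):
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


rng = random.Random(1)  # fixed seed so the run is reproducible
for n in (100, 20):  # the two population sizes used in this chapter
    final_p = simulate_drift(0.5, n, 50, rng)[-1]
    simulated_h = 2 * final_p * (1 - final_p)
    expected_h = expected_heterozygosity_decay(0.5, n, 50)
    print(f"N={n}: one simulated H={simulated_h:.3f}, expected H={expected_h:.3f}")
```

A single simulated trajectory can deviate widely from the expectation, which is why many replicates per scenario are needed before comparing estimators.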
1.2 Molecular markers as workhorses for genetic diversity studies
A molecular marker is any specific DNA fragment that may or may not correspond to coding regions of the genome [6] and is representative of differences at the genomic level [7]. When a molecular marker segregates according to the Mendelian laws of inheritance, it can also be defined as a genetic marker, and it provides genetic information [6]. Molecular markers offer advantages over conventional alternatives based on phenotype: contrary to morphological data, molecular data are stable and detectable in all tissues regardless of the development, differentiation, growth, or defense state of the cell, and they are not influenced by environmental effects [7, 8].
Although there are several types of molecular markers, the ideal genetic marker must be reliably measurable, exhibit highly variable loci, be codominant, and be densely distributed throughout the genome. Microsatellite markers, also called Simple Sequence Repeats (SSRs), meet all these requirements [9]. SSRs are monotonous repeats of short nucleotide motifs of 1 to 6 base pairs (e.g., cgtcgtcgtcgtcgt, which can be represented by (cgt)n where n = 5). These repetitive elements can be found interspersed in the three eukaryotic genomes: nucleus (SSRs), mitochondria (mtSSRs) and chloroplasts (cpSSRs) [10]. The different SSR alleles are mainly generated through simple repeat addition and subtraction mechanisms that occur with equal probability [11], and they are rarely found in coding regions [9]. SSRs are informative and practical markers because they provide information about the amount and distribution of genetic diversity and about the processes that determine the genetic structure and variation within and between natural populations [12]. Regarding methodological concerns, they are highly stable, with high intra- and inter-laboratory repeatability, and they can be implemented in low-complexity laboratories using external sequencing services. A limitation of SSRs is that the sequence of the regions flanking the repeat is required for the development of specific primers, although the transfer of primers between closely related species is usually successful. SSRs have become the most widely used DNA markers in population genetics for genome mapping, molecular ecology, and conservation studies [3]. Although massive sequencing methods to identify single nucleotide polymorphisms (SNPs) have gained prominence, microsatellites continue to be a widely used tool because the analysis of the generated data is simple and easily comparable with previous studies.
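The (cgt)n notation can be made concrete with a short sketch that scans a sequence for perfect tandem repeats of motifs of 1 to 6 base pairs. The function name `find_ssrs` and the threshold of five repeats are illustrative assumptions, not a published SSR-detection tool; real pipelines also handle imperfect repeats and motif-specific thresholds.

```python
import re


def find_ssrs(sequence, min_repeats=5, max_motif=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    The lazy group tries the shortest motif first, and the backreference
    requires at least min_repeats - 1 further copies of that motif.
    """
    pattern = re.compile(
        r"([acgt]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1),
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif.lower(), repeats))
    return hits


# The example motif from the text, embedded between two short flanks.
for start, motif, repeats in find_ssrs("aacgtcgtcgtcgtcgtaa"):
    print(f"position {start}: ({motif}){repeats}")
```

Alleles at such a locus differ in the repeat count, which is what fragment-length genotyping measures.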
1.3 Simulations as a tool for predicting what is expected under certain conditions
Simulations help to recreate the stochastic process that accompanies the transmission of genes from parents to offspring because they reproduce the movement of alleles several times under a model with the same conditions. In addition, using different model conditions can help to disentangle sampling effects and scale dependencies, as well as historical influences of gene flow.
Any model (analytical, simulation, or otherwise) makes simplifying assumptions, unless it is “an entire reconstruction of the actual system—whereupon it ceases to be a model” [13].
The focus of this chapter is to define the simplest model that shows the effects of population size and gene flow on contemporary levels of genetic diversity, paying attention to the roles that multiplicity and abundance play in the classic genetic diversity estimators.
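To make the roles of multiplicity (how many alleles) and abundance (how evenly they are represented) concrete, the following sketch computes a heterozygosity-based estimator and a rarefied allele count from a list of sampled gene copies. It is an assumption-laden illustration, not the software used in this chapter: the sample-size correction n/(n - 1) and the rarefaction formula are the commonly used forms, and the function names and toy samples are hypothetical.

```python
from collections import Counter
from math import comb


def expected_heterozygosity(alleles):
    """Gene diversity with the n/(n - 1) small-sample correction."""
    n = len(alleles)
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def allelic_richness(alleles, g):
    """Expected number of distinct alleles in a random subsample of g gene copies."""
    n = len(alleles)
    counts = Counter(alleles)
    # For each allele: 1 minus the probability that the subsample misses it.
    return sum(1.0 - comb(n - c, g) / comb(n, g) for c in counts.values())


# Two toy samples with the same four alleles but different evenness.
even = ["a"] * 5 + ["b"] * 5 + ["c"] * 5 + ["d"] * 5
skewed = ["a"] * 17 + ["b", "c", "d"]

for name, sample in (("even", even), ("skewed", skewed)):
    he = expected_heterozygosity(sample)
    ar = allelic_richness(sample, 10)
    print(f"{name}: He = {he:.3f}, allelic richness (g = 10) = {ar:.3f}")
```

Both samples carry four alleles, yet the skewed sample scores lower on both estimators once rare alleles are discounted by frequency or by rarefaction to a common sample size.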
2. Materials and methods
2.1 Simulations
In order to test the effect of population size and gene flow on the magnitude of genetic diversity parameters, simulated genetic data were obtained using the IBDsim program [14]. This program simulates genetic data under an isolation-by-distance model using a backward simulation strategy at the population level. A stepping stone model was considered, which assumes discrete populations, discrete generations, genetic drift within each population, and migration between adjacent or spatially proximal populations [15, 16, 17] being
Population size | Migration rate 0.5 | Migration rate 0.005 |
---|---|---|
100 | A-C | A-D |
20 | B-C | B-D |
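The four scenarios can also be explored with a small forward-time sketch of a stepping stone model. This is a deliberately simplified substitute and not IBDsim, which works backward in time: the ring of five demes, the ten starting alleles, the 100 generations, the absence of mutation, and all function names are assumptions made only for illustration. The population sizes and migration rates follow the scenario table.

```python
import random
from collections import Counter


def next_generation(demes, m, rng):
    """Build one new generation of gene copies for every deme.

    With probability m a copy descends from a neighbouring deme
    (m/2 for each side of the ring); otherwise from its own deme.
    """
    n_demes = len(demes)
    new_demes = []
    for i, deme in enumerate(demes):
        offspring = []
        for _ in range(len(deme)):
            u = rng.random()
            if u < m / 2:
                source = demes[(i - 1) % n_demes]
            elif u < m:
                source = demes[(i + 1) % n_demes]
            else:
                source = deme
            offspring.append(rng.choice(source))
        new_demes.append(offspring)
    return new_demes


def simulate(n_individuals, m, n_demes=5, n_alleles=10, generations=100, seed=0):
    """Run the forward simulation and return the final gene copies per deme."""
    rng = random.Random(seed)
    demes = [[rng.randrange(n_alleles) for _ in range(2 * n_individuals)]
             for _ in range(n_demes)]
    for _ in range(generations):
        demes = next_generation(demes, m, rng)
    return demes


def summarize(demes):
    """Mean number of alleles and mean gene diversity per deme."""
    allele_counts, diversities = [], []
    for deme in demes:
        n = len(deme)
        counts = Counter(deme)
        allele_counts.append(len(counts))
        diversities.append(1.0 - sum((c / n) ** 2 for c in counts.values()))
    return sum(allele_counts) / len(demes), sum(diversities) / len(demes)


scenarios = {"A-C": (100, 0.5), "A-D": (100, 0.005),
             "B-C": (20, 0.5), "B-D": (20, 0.005)}
for label, (n, m) in scenarios.items():
    mean_alleles, mean_he = summarize(simulate(n, m))
    print(f"{label}: mean alleles per deme = {mean_alleles:.2f}, mean He = {mean_he:.3f}")
```

One run per scenario is only a single draw from a stochastic process; repeating each scenario with different seeds gives the distributions that the estimators are compared on.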
2.2 Analysis of simulated data
Expected heterozygosity (
In addition, the spread and skew of both estimated parameters across all simulations in each scenario were shown using box-and-whisker plots, which display a five-number summary: minimum, lower quartile, median, upper quartile, and maximum. The central rectangle spans the first quartile to the third quartile, i.e., the interquartile range (IQR). A segment inside the rectangle shows the median, while the whiskers to the left and to the right show the locations of the minimum and maximum. These estimates were calculated using Microsoft Excel.
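The five-number summary behind each box plot can be sketched in a few lines. The chapter used Microsoft Excel; the Python version below is only an illustrative equivalent, and the choice of the "inclusive" quartile method is an assumption (spreadsheet functions and other tools may interpolate quartiles differently). The input values are made up.

```python
import statistics


def five_number_summary(values):
    """Minimum, quartiles, and maximum of a sample, plus the IQR."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
    }


# Hypothetical estimates from replicate simulations of one scenario.
estimates = [0.71, 0.78, 0.74, 0.69, 0.80, 0.76, 0.73]
print(five_number_summary(estimates))
```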
3. Results
Combination of
Scenario | A-C | A-D | B-C | B-D |
---|---|---|---|---|
A-C | — | 9.05511E-11 | 2.27959E-75 | 2.20212E-68 |
A-D | 6.23453E-15 | — | 3.10501E-68 | 3.01124E-66 |
B-C | 4.77563E-87 | 1.35851E-69 | — | 8.60895E-19 |
B-D | 9.19086E-97 | 1.10061E-81 | 4.24449E-15 | — |
Parameter | Statistic | A-C vs B-C | A-D vs B-D |
---|---|---|---|
Allelic richness | Mean | 2.769 (42.24%) | 2.575 (43.69%) |
Allelic richness | Median | 2.900 (43.94%) | 2.600 (54.93%) |
Expected heterozygosity | Mean | 0.201 (25.77%) | 0.246 (32.98%) |
Expected heterozygosity | Median | 0.202 (26.77%) | 0.248 (33.20%) |
Parameter | Statistic | A-C vs A-D | B-C vs B-D |
---|---|---|---|
Allelic richness | Mean | 0.662 (10.10%) | 0.468 (12.36%) |
Allelic richness | Median | 0.700 (11.31%) | 0.400 (10.81%) |
Expected heterozygosity | Mean | 0.034 (4.35%) | 0.079 (13.64%) |
Expected heterozygosity | Median | 0.037 (4.72%) | 0.083 (14.26%) |
4. Discussion
Genetic diversity is a prerequisite for population adaptation to environmental changes [12]. Large populations of naturally outbreeding species usually have extensive genetic diversity, but genetic diversity is usually reduced in populations and species of conservation concern [12]. Theoretical analyses based on simulations provide information for understanding empirical results.
The total number of alleles per locus is a complementary measure of genetic diversity because it is more sensitive than heterozygosity to the loss of genetic variation caused by small population size. In this way,
The effects of changes in population size on genetic diversity estimators considering different gene flow levels were studied in the present chapter by means of simulations (A-C vs. B-C and A-D vs. B-D, respectively). As expected, reductions in
The effects of gene flow levels on genetic diversity estimators considering different population sizes were studied in the present chapter by means of simulations (A-C vs. A-D and B-C vs. B-D, respectively). In large populations,
Gene flow is a microevolutionary process that maintains genetic exchange among local populations, increasing population genetic diversity [21]. Gene flow can be quantified by the parameter
5. Conclusion
The comprehensive quantification of genetic diversity levels demands the estimation of
Acknowledgments
The authors wish to thank the National Council of Scientific and Technical Research (CONICET, Argentina).
References
- 1. Nei M. Analysis of Gene Diversity in Subdivided Populations. Proceedings of the National Academy of Sciences. 1973; 70(12): 3321-3323. DOI: 10.1073/pnas.70.12.3321
- 2. Nagylaki T. The expected number of heterozygous sites in a subdivided population. Genetics. 1998; 149: 1599-1604
- 3. Allendorf FW, Luikart GH. Conservation and the Genetics of Populations. Blackwell Publishing; 2007. 642 p
- 4. El Mousadik A, Petit RJ. High level of genetic differentiation for allelic richness among populations of the argan tree [Argania spinosa (L.) Skeels] endemic to Morocco. Theoretical and Applied Genetics. 1996; 92: 832-839. DOI: 10.1007/BF00221895
- 5. Petit R, El Mousadik A, Pons O. Identifying Populations for Conservation on the Basis of Genetic Markers. Conservation Biology. 1998; 12(4): 844-855
- 6. Ferreira M, Grattapaglia D. Introducao ao uso de marcadores moleculares em análise genética. EMBRAPA-CENARGEN; 1996. 220 p
- 7. Agarwal M, Shrivastava N, Padh H. Advances in molecular marker techniques and their applications in plant sciences. Plant Cell Reports. 2008; 27: 617-631
- 8. Marcucci Poltri S. Marcadores Moleculares aplicados a Programas de Mejoramiento Genético de Eucalyptus. In: Secretaría de Agricultura, Ganadería, Pesca y Alimentos, editors. Mejores árboles para más forestadores; 2005. 241 p
- 9. Karhu A. Evolution and applications of pine microsatellites [thesis]. Faculty of Science, University of Oulu. Oulu. 52 p
- 10. Tautz D, Renz M. Simple sequences are ubiquitous repetitive components of eukaryotic genomes. Nucleic Acids Research. 1984; 12(10): 4127-4138
- 11. Schlötterer C, Tautz D. Slippage synthesis of simple sequence DNA. Nucleic Acids Research. 1992; 20: 211-215
- 12. Frankham R, Ballou JD, Briscoe DA. Introduction to Conservation Genetics. Cambridge University Press; 2002. 617 p
- 13. Epperson BK, Mcrae BH, Scribner K, Cushman SA, Rosenberg MS, Fortin MJ, James PM, Murphy M, Manel S, Legendre P, Dale MR. Utility of computer simulations in landscape genetics. Molecular Ecology. 2010; 19: 3549-3564
- 14. Leblois R, Estoup A, Rousset F. IBDSim: a computer program to simulate genotypic data under isolation by distance. Molecular Ecology Resources. 2008; 9(1): 107-109. DOI: 10.1111/j.1755-0998.2008.02417.x
- 15. Kimura M. “Stepping stone” model of population. Annu Rep Natio Inst Genet. 1953; 3: 62-63
- 16. Kimura M, Weiss GH. The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics. 1964; 49: 561-576
- 17. Weiss GH, Kimura M. A mathematical analysis of the stepping stone model of genetic correlation. Appl Probab. 1965; 2: 129-149. DOI: 10.2307/3211879
- 18. Leblois R, Beeravolu CR, Rousset F. IBDSim version 2.0 User manual
- 19. Leblois R, Estoup A, Rousset F. Influence of mutational and sampling factors on the estimation of demographic parameters in a “continuous” population under isolation by distance. Mol Biol Evol. 2003; 20(4): 491-502. DOI: 10.1093/molbev/msg034
- 20. Goudet J. FSTAT (vers.2.9.3.2): a computer program to calculate F statistics. Heredity. 1995; 86: 485-486
- 21. Hartl DL, Clark AG. Principles of population genetics. Sinauer Associates, Inc Publishers; 2007. 652 p
- 22. Slatkin M, Barton NH. A comparison of three indirect methods for estimating average levels of gene flow. Evolution. 1989; 43(7): 1349-1368
- 23. McNeely JA, Miller KR, Reid WV, Mittermeier RA, Werner TB. Conserving the world’s biological diversity. IUCN, World Resources Institute, Conservation International, WWF-US, and the World Bank; 1990
- 24. Garner BA, Hoban S, Luikart G. IUCN Red List and the value of integrating genetics. Conservation Genetics. 2020; 21: 795-801. DOI: 10.1007/s10592-020-01301-6