Microsatellites as a Tool for the Study of Microevolutionary Process in Native Forest Trees

The main aim of this work is to help researchers who use microsatellite markers to analyze microevolutionary forces in natural populations of native forest species. Studies of this kind lead researchers to make decisions regarding the management or conservation of such species. This chapter covers the entire process, from the development of microsatellite markers through data analysis to the interpretation of results. It is intended to help researchers who are not familiar with the methods and population genetic theory needed to analyze nuclear and chloroplast microsatellite data. These methods allow quantification of genetic variation and genetic structure in native forest species, and the theoretical content provides knowledge about the past and present genetic states of populations for making inferences about their future.


Introduction
Patterns of distribution of genetic variation in the landscape reflect the responses of species to evolutionary forces operating within current and past environments, and they can tell us much about how species have evolved and may continue to evolve in the future [1]. Most studies on genetic variation patterns within tree species were primarily motivated by attempts to improve our understanding of biodiversity at the intraspecific level or of the evolutionary dynamics within plant species in an early stage of domestication [2]. However, forest tree species offer many valuable subjects to explore, and many problems could be solved using microsatellite markers in combination with appropriate statistical analyses: making recommendations for the conservation of forest genetic resources [3], inferring the origin of forest plants and woods [2], and conducting molecular tree improvement [4].
The main aim of this work is to help researchers who use microsatellite markers to analyze microevolutionary forces in natural populations of native forest species. Studies of this kind lead researchers to make decisions regarding the management or conservation of such species.

The challenge of working with microsatellite markers in species without economic interest
Native forest species are interesting models for biodiversity studies because they give valuable information about current and past conditions that could have influenced the amount and distribution of genetic variation in natural populations. Long-lived tree species have witnessed climatic, demographic, and/or ecological changes, and all these changes have left genetic traces that can be studied using microsatellite markers (simple sequence repeats, SSRs). However, every advantage of working with native species has an unfavorable counterpart that stems from the low economic value of native forest species. One of the limitations is the lack of the DNA sequence information needed to develop and use SSRs. Unfortunately, SSRs are not universal markers, and the species specificity of SSR loci in plants is a major constraint to their ubiquitous adoption [5], although limited cross-species transferability of SSR loci among closely related taxa is possible.
The starting point of a genetic study using SSRs in native forest trees is obtaining species-specific SSRs, e.g., by searching the nucleotide section of the GenBank public database. When species-specific SSRs are unavailable, an alternative is to search for SSR primers developed for a phylogenetically closely related species because, as mentioned above, empirical studies have demonstrated that cross-species transfer of nuclear microsatellite markers is possible [6]. With the latter approach, SSRs developed for one species can be used to detect polymorphism at homologous sites in related species. However, the repeat sequence and the flanking regions containing the primer binding sites must be conserved across taxa for this to work [5].
The success of heterologous PCR amplification depends on the evolutionary distance between the source and the target species: empirical studies have shown an inverse relationship between primer site conservation and the evolutionary distance between the tested taxa [5]. Cross-species transfer of polymorphic markers in plants is mainly successful within genera (success rate close to 60% in eudicots and close to 40% in the reviewed monocots), whereas between genera, transfer rates are approximately 10% for eudicots [6]. Cross-species transfer has been successful in studies of native forest tree species, e.g., Quercus [7][8][9], Prosopis [10], Eucalyptus [11], Enterolobium [12], Pithecellobium [13], Araucaria [14], and Taxus [15].
In the worst case, cross-species transfer of SSRs may not work. We therefore propose two alternatives that are not mutually exclusive: the development of species-specific microsatellites for the nuclear genome (nuclear simple sequence repeats, nuSSRs) and the use of chloroplast microsatellite markers (chloroplast simple sequence repeats, cpSSRs). These alternatives differ greatly in the genetic information they provide and in their cost in terms of time and money. Many laboratories have enough resources and expertise for conducting SSR-based research but not for characterizing new loci [5].
Microsatellite markers are also present in the chloroplast genome, but the particular traits of this genome provide different population genetic information than nuSSRs. Organelle genomes are typically nonrecombinant, uniparentally inherited, and effectively haploid [16]. Unlike nuclear microsatellites obtained by the conventional approach, cpSSR primers designed for one species can regularly cross-amplify in related species, giving an opportunity to develop efficient "universal" SSR primers that show widespread intraspecific polymorphism. The chloroplast SSR primers developed by Weising and Gardner [17] are the most popular in angiosperms. However, the low mutation rates associated with the chloroplast genome mean that detecting enough variation is a major technical barrier to the widespread application of a particular marker [16].

Development of species-specific nuclear microsatellite markers
Population geneticists, forest breeders, and ecologists starting new research must contend with a dichotomous decision: the isolation of species-specific microsatellite markers or the application of multilocus fingerprinting approaches. The advantages of hypervariable, codominant markers such as SSRs are well documented [18], but in many cases, the perceived difficulty of SSR isolation acts as a deterrent to the use of this class of markers [19].
In recent years, new species-specific nuSSRs for forest tree species and other plant taxa have been published frequently in many journals. However, the development of species-specific nuSSRs is time consuming and costly. Specific laboratory and technical conditions are also needed, and costly and laborious cloning and screening procedures limit the number of species that can be studied.
Because publications and techniques for nuSSR development are so diverse, this section focuses exclusively on the SSR development procedure in plants and describes the typical situation that researchers must consider when working with nonmodel organisms. Our own experience comes from the development of specific nuSSRs for Anadenanthera colubrina var. cebil (Leguminosae, Mimosoideae), a native forest tree species from South America [20]. Laboratory work started with the cross-species transfer of nuSSRs developed for other legume tree species because SSR primers from species of the same genus were not available. Eighteen primer pairs from six different species were tested: Koompassia malaccensis, Acacia nilotica, Geoffroea spinosa, Prosopis sp., Dinizia excelsa, and Parkia panurensis. The results of cross-species transfer were unsatisfactory, and the development of species-specific nuSSRs was necessary.
The successful isolation of SSRs involves several steps: (1) preparation of a microsatellite-enriched genomic library, (2) cloning and sequencing of fragments containing microsatellites, (3) primer design, and (4) testing the functionality of SSR primers and their polymorphism in a set of genotypes. There is a potential loss of loci at each stage. The number of loci that finally constitute the working primer set is a fraction of the original number of sequenced clones; this loss is called the attrition rate [19]. In our work, microsatellite markers were developed from two microsatellite-enriched genomic libraries in order to increase the variability of the microsatellite motifs screened. The results were markedly different, as only one library gave positive results. The libraries were developed using the enrichment procedure proposed by Fischer and Bachmann [21] and modified by Prinz et al. [22]. Table 1 shows the attrition rates for this work. The development of specific nuSSRs for A. colubrina var. cebil demanded 3 months of work in a fully equipped laboratory. The human resources involved included a technical assistant, a doctoral student, and an experienced researcher. From our experience, we suggest paying attention to the following: starting the process with DNA of good quality and in sufficient quantity; ensuring sterile conditions during the enrichment procedure and throughout the whole process; making two simultaneous libraries using different sets of repeat motifs for enrichment; avoiding repetitive bases in primer sequences; checking the primer sequences directly in the electropherograms to ensure that primers were designed on good-quality sequence with high peaks; and resequencing the amplified products after functionality tests and aligning them with the original fragment obtained from the enrichment procedure.
Next-generation sequencing (NGS) methods are extremely high-throughput technologies that produce thousands or millions of sequences at once at a fraction of the cost of traditional Sanger methods [23]. A specific application of this technology in plants is the rapid and cost-effective discovery of microsatellite loci [23]. Although this technology is more cost effective than traditional enrichment procedures, it is not yet widely used for nuSSR development in plant species. A commonly cited weakness of microsatellites is their high development cost and relatively low throughput when compared to SNPs, but the same technologies that have widened the use of SNPs have also benefited microsatellite development [24].
Once a set of nuSSR primers has been developed, the final step is the statistical analysis of the data to confirm the utility of these markers for population genetic studies. These analyses consist of (1) estimation of observed and expected heterozygosity, (2) a test of Hardy-Weinberg equilibrium, (3) a test of genotypic linkage disequilibrium, (4) a test for null alleles and genotyping errors, and (5) a neutrality test. Free software is available for these analyses, e.g., Genalex [25], Genepop [26], and/or Microchecker [27]. Good results for these analyses include high heterozygosity, a high number of loci in Hardy-Weinberg equilibrium and in linkage equilibrium, a low number of loci with null alleles, absence of genotyping errors, and no traces of selection.
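The first two checks listed above can be illustrated with a minimal Python sketch. It assumes that genotypes at one locus are given as pairs of allele sizes; the function names are hypothetical and are not part of Genalex, Genepop, or Microchecker. The chi-square statistic is shown only to make the logic of a Hardy-Weinberg test visible; for highly polymorphic loci with small expected counts, exact tests such as those offered by dedicated software are generally preferable.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He) for one locus.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: proportion of heterozygous individuals.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity: 1 minus the sum of squared allele frequencies.
    counts = Counter(allele for g in genotypes for allele in g)
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    return ho, he


def hwe_chi_square(genotypes):
    """Return (chi2, df) comparing genotype counts with Hardy-Weinberg expectations."""
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = {a: c / (2 * n) for a, c in counts.items()}
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    alleles = sorted(freqs)
    chi2 = 0.0
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            # Expected count: n * p^2 for homozygotes, n * 2pq for heterozygotes.
            if a == b:
                expected = n * freqs[a] ** 2
            else:
                expected = n * 2 * freqs[a] * freqs[b]
            chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    k = len(alleles)
    df = k * (k - 1) // 2  # genotype classes minus alleles
    return chi2, df


# Illustrative data (not from the chapter): a sample in exact HW proportions.
sample = [(100, 100)] * 25 + [(100, 102)] * 50 + [(102, 102)] * 25
ho, he = heterozygosity(sample)
chi2, df = hwe_chi_square(sample)
```

A large excess of homozygotes (Ho well below He) at a single locus, but not at the others, is the typical signature that prompts a check for null alleles.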

What do microsatellite markers tell us about natural populations of forest tree species?
The development of molecular genetic markers has had a great impact on our understanding of the processes that determine structure and variation within and among natural populations [16]. Microsatellites, like other molecular markers, are particular features of the DNA molecule that enable the identification of individuals at the DNA level. However, a molecular marker can be considered a genetic marker only when its particular genetic features are known. Knowledge of the precise molecular basis and mode of inheritance of a genetic polymorphism is crucial for the appropriate interpretation of molecular marker data in a population context [28].
Plants show a remarkable variety of inheritance modes, and some of their reproductive patterns permit genetic studies by means not available in other types of organisms [29]. The mitochondrial genome in plants shows a large size, slow nucleotide substitution rates, and extensive levels of intramolecular recombination, and it has been of limited use in genetic diversity studies. The chloroplast genome shows a conserved gene order and a general lack of heteroplasmy and recombination, and it is an attractive tool for demographic and phylogenetic studies [16]. There is considerable potential for hypervariable chloroplast microsatellites to provide markers with uniparental inheritance for indirect measures of seed or pollen gene flow. Studies of angiosperms, where chloroplast DNA (cpDNA) is predominantly maternally inherited, might offer further insights and provide information on the patterns and extent of localized seed dispersal [16]. Furthermore, the uniparental mode of inheritance of cpDNA makes it possible to elucidate the relative contributions of seed and pollen gene flow to the genetic structure of natural populations by comparing nuclear and chloroplast markers [16].
The use of genetic markers with uni- and biparental inheritance (i.e., cpSSRs and nuSSRs) makes it possible to distinguish the historical contributions of seed and pollen movement to gene flow. This information is relevant for distinguishing between the genetic consequences of colonization by seed and the exchange of genes through pollen between established populations [30,31]. Given the haploid genome of organelles, their effective population size in hermaphrodite outcrossing plants is half that of the diploid nuclear genome, and as a result, chloroplast-specific markers should be good indicators of historical bottlenecks, founder effects, and genetic drift [16].
Differences in mutation rates, ploidy levels, and the presence or absence of recombination between the nuclear and chloroplast genomes make cpSSRs and nuSSRs valuable tools for studying the effects of historical and recent fragmentation on contemporary genetic variation and current population genetic structure. This allows the relative roles of genetic drift and gene flow to be contrasted as microevolutionary processes that shape population genetic structure [32].
In addition, due to their high rate of polymorphism, nuclear microsatellites are often cited as being very useful for studying recent evolutionary events among subpopulations within an individual species [24].
Table 2. Microevolutionary processes and demographic events that can be studied by microsatellite markers.
The movement of alleles within and between natural populations and its interaction with genetic drift, mutation, and natural selection determine the genetic composition of a population, including its genetic diversity and genetic structure [28]. Microsatellites provide a high-resolution snapshot of the allelic composition at certain loci at a given time, and population genetic theory and methods make it possible to study the mechanisms that generate and maintain genetic variability.
Great potential exists for the application of coalescent-based models to cpSSRs [16]. Coalescent approaches can be extremely useful in assessing a range of demographic histories, but their application to intraspecific studies in plants has been hampered by the slow mutation rate of nonrecombinant genomes such as chloroplast DNA. Although limitations exist, cpSSRs represent a potentially informative data source with which coalescent-based approaches can be explored [16]. Microevolutionary processes and demographic events that can be studied by microsatellite markers are shown in Table 2.

Population genetic data analysis
This section aims to guide researchers in making decisions regarding the statistical analysis of nuclear and chloroplast microsatellite data. Nevertheless, those attempting to use these analyses for the first time will need to read the bibliography cited here for each particular analysis. Advances in computing technology have inspired the use of intensive statistical approaches such as maximum likelihood, Bayesian probability theory, and Markov chain Monte Carlo simulation, contributing to the recent technical advancements of molecular ecology [33]. Four extreme and simple states of allelic configuration of a theoretical population made up of three subpopulations are shown in Figure 1. Each state is defined by the relationship between the levels of genetic diversity and the genetic structure of the theoretical population. In nature, these extremes are exceptions, while complex allelic configurations are the usual situation. Statistical analyses are the most appropriate tools for inferring levels and distribution patterns of genetic variation, while population genetic theory provides the knowledge needed to interpret the results of those analyses.

Genetic diversity
A prerequisite for population studies is to detect the genetic diversity underlying phenotypic variation, where genetic diversity is understood as the total genetic variation among individuals within a population. Several measures of genetic diversity have been developed over the years. The simplest measure of genetic diversity from molecular data is the number of alleles at a given locus ($N_A$), which is also known as gene multiplicity [34]. At the same time, these alleles have their own frequencies in each population, which represent their abundance.
As multiplicity and abundance vary independently, genetic diversity can be expressed as the effective number of alleles ($N_E$) [35]. $N_E$ is equal to $N_A$ if all alleles have the same frequency. If the allele frequency distribution is not uniform, $N_E$ is lower than $N_A$. Alleles that are found in only one population are defined as private alleles, and their number is denoted $N_P$ [36]. In this way, $N_P$ is a simple measure of genetic distinctiveness. Private alleles often have low frequencies, in which case they can also be called rare alleles. These kinds of alleles are very informative because their presence and frequency allow quantification of gene flow levels.
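As a minimal sketch of these three descriptive measures, the Python fragment below computes $N_A$, $N_E$, and the private alleles of each population for a single locus. It assumes that $N_E$ is taken as the reciprocal of the sum of squared allele frequencies, which is the usual definition; the function names and the allele sizes are hypothetical and purely illustrative.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Map each allele to its relative frequency in a list of gene copies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def number_of_alleles(freqs):
    """N_A: the number of distinct alleles observed at the locus."""
    return len(freqs)


def effective_number_of_alleles(freqs):
    """N_E: 1 / sum(p_i^2); equals N_A only when all alleles are equally frequent."""
    return 1.0 / sum(p * p for p in freqs.values())


def private_alleles(pop_freqs):
    """Return, for each population, the set of alleles found nowhere else.

    pop_freqs: dict mapping population name -> allele frequency dict.
    """
    private = {}
    for pop, freqs in pop_freqs.items():
        elsewhere = set()
        for other, other_freqs in pop_freqs.items():
            if other != pop:
                elsewhere.update(other_freqs)
        private[pop] = set(freqs) - elsewhere
    return private


# Illustrative gene copies (allele sizes in bp) for two populations.
pops = {
    "A": allele_frequencies([100, 100, 102, 104]),
    "B": allele_frequencies([100, 102, 102, 106]),
}
n_a = number_of_alleles(pops["A"])            # 3 distinct alleles
n_e = effective_number_of_alleles(pops["A"])  # lower than 3: frequencies are uneven
n_p = private_alleles(pops)                   # allele 104 only in A, 106 only in B
```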
Since the number of alleles detected in a population depends on the size of the sample, it is not advisable to directly compare genetic diversity parameters among subpopulations sampled at different sizes. A useful parameter for comparing the number of alleles between samples that differ in size is allelic richness (R). This parameter predicts the expected number of alleles if samples had the same size, using the rarefaction method [37]. However, the original and most important measure of genetic diversity is Nei's gene diversity index (h), estimated as $h = 1 - \sum_i x_i^2$, where $x_i$ is the frequency of the i-th allele [38]. This parameter represents the probability that two alleles randomly and independently drawn from a gene pool are different. The index analyzes allele frequency variation directly in terms of heterozygosity, without consideration of the number of alleles at a given locus or the pattern of evolutionary forces [38].
In this way, the treatment of this index is biologically most appropriate because it has been formulated entirely in terms of allele and genotype frequencies [39].
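Both quantities are easy to compute from a list of gene copies, as the following Python sketch suggests. It assumes the gene diversity $h = 1 - \sum_i x_i^2$ and the standard rarefaction formula, in which the expected number of alleles in a subsample of g gene copies is $\sum_i \left[1 - \binom{N - N_i}{g} / \binom{N}{g}\right]$, with N the total number of gene copies and $N_i$ the count of allele i. The function names are hypothetical and do not correspond to ADZE or FSTAT routines.

```python
from collections import Counter
from math import comb


def gene_diversity(alleles):
    """Gene diversity h = 1 - sum(x_i^2) from a list of gene copies."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def rarefied_allelic_richness(alleles, g):
    """Expected number of distinct alleles in a random subsample of g gene copies."""
    n = len(alleles)
    if not 1 <= g <= n:
        raise ValueError("g must be between 1 and the number of gene copies")
    counts = Counter(alleles)
    # For each allele, 1 - P(allele is absent from the subsample).
    return sum(1.0 - comb(n - c, g) / comb(n, g) for c in counts.values())


# Illustrative samples of different size, rarefied to the smaller one.
large = [100] * 10 + [102] * 6 + [104] * 3 + [106]
small = [100] * 4 + [102] * 4
g = len(small)
r_large = rarefied_allelic_richness(large, g)
r_small = rarefied_allelic_richness(small, g)
```

Standardizing both samples to the size of the smallest one keeps the comparison from being driven by sampling effort alone.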
Because of the particular genetic features of the chloroplast genome, combining cpSSR alleles from different loci makes it possible to define chloroplast haplotypes. Hence, haplotype genetic multiplicity can be characterized from the number of haplotypes, genetic abundance from haplotype frequencies, genetic distinctiveness from the number of private haplotypes, and genetic diversity from Nei's haplotypic diversity index (H), estimated as $H = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$, where n is the number of analyzed individuals and $p_i$ is the frequency of the i-th haplotype in the population [40,41] (Table 3).

Table 3. Classification of methods for analysis of microevolutionary processes and demographic events and suggested software. [Table body not reproduced. Recoverable entries: Genalex [25]; allelic richness (R): ADZE [60], FSTAT [61]. Footnotes: a, freely available software; b, for angiosperm species.]
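The construction of chloroplast haplotypes and the haplotypic diversity index can be sketched in a few lines of Python. The sketch assumes the commonly used unbiased form $H = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$ and treats each individual as a tuple of allele sizes, one per cpSSR locus; the function name and the data are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotypic diversity H = n/(n-1) * (1 - sum(p_i^2)).

    haplotypes: list with one hashable haplotype per individual, e.g. a
    tuple holding the allele size observed at each cpSSR locus.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Illustrative data: allele sizes at two cpSSR loci for four individuals.
individuals = [(101, 95), (101, 95), (102, 95), (102, 96)]
n_haplotypes = len(set(individuals))   # haplotype multiplicity
h_div = haplotype_diversity(individuals)
```

Because the chloroplast genome does not recombine, the alleles at all loci are inherited together, which is why the tuple as a whole, rather than each locus, is the unit of analysis.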
Differences in genetic diversity parameters among populations must be statistically significant before concluding which population is the most diverse. These differences can be tested by permutation, a nonparametric procedure.
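A permutation test of this kind can be sketched as follows. The example assumes that the statistic of interest is the absolute difference in gene diversity between two populations at one locus and that individuals (not single gene copies) are shuffled between populations; the function names, the number of permutations, and the seed are illustrative choices, not a prescription from the cited software.

```python
import random
from collections import Counter


def gene_diversity(genotypes):
    """Gene diversity (1 - sum of squared allele frequencies) from diploid genotypes."""
    alleles = [allele for g in genotypes for allele in g]
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def permutation_test(pop1, pop2, n_perm=999, seed=0):
    """Two-sided permutation test for a difference in gene diversity.

    Returns (observed absolute difference, p-value).
    """
    rng = random.Random(seed)
    observed = abs(gene_diversity(pop1) - gene_diversity(pop2))
    pooled = list(pop1) + list(pop2)
    n1 = len(pop1)
    hits = 0
    for _ in range(n_perm):
        # Reassign individuals to the two populations at random.
        rng.shuffle(pooled)
        diff = abs(gene_diversity(pooled[:n1]) - gene_diversity(pooled[n1:]))
        if diff >= observed - 1e-12:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Illustrative data: a monomorphic population versus a highly variable one.
pop_a = [(100, 100)] * 10
pop_b = [(100 + 2 * i, 102 + 2 * i) for i in range(10)]
diff, p_value = permutation_test(pop_a, pop_b)
```

In practice the statistic would be averaged over loci, but the shuffling logic stays the same.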

Genetic structure
Population genetic structure is the amount of genetic variability and its distribution within and among local populations and individuals within a species [42]. Given the central role of population genetic structure in microevolutionary processes, tools for its measurement and quantification are necessary. Here, we classify the statistical methods available to study population genetic structure. However, researchers must keep in mind the scope and aims of their study when choosing the analyses for their data.

Individual-based methods
The starting points for these analyses are individual microsatellite genotypes. The simplest methods are those based on distances, e.g., median joining trees (MJ) and networks (NWK) [43]. These methods are graphical representations of genetic distances among multilocus genotypes (nuSSR data) or among haplotypes (cpSSR data). Distance-based methods are usually easy to apply and are often visually appealing. However, the clusters identified may depend on both the distance measure and the graphical representation chosen, which makes it difficult to assess confidence in the clusters obtained [44].
Nowadays, the most popular individual-based methods are model-based, such as Bayesian admixture analysis for nuSSR data [44] and Bayesian mixture analysis for linked loci for cpSSR data [45]. Methods based on Bayesian theory are extensively used because they give information about the genetic origin of individuals: they define clusters and assign individuals to them on a probabilistic criterion in order to infer population structure. In addition, these methods can include a priori information on the geographic origin of individuals to help determine population genetic structure [44,45] and to identify migrants or descendants of recent immigrants [44]. The results of Bayesian admixture analyses must then be examined with the Evanno method [46] to determine the most likely level of population subdivision.

Subpopulation-based methods
The starting points of these analyses are groups of individuals, here called subpopulations. Different criteria can be used to group individuals: a nongenetic criterion (e.g., geographical groups of individuals, cohorts, etc.) or a genetic criterion (e.g., previously identified Bayesian clusters). Subpopulation-based methods rely on the analysis of molecular variance (AMOVA) [47]. This method analyzes the distribution of molecular genetic variation across previously established hierarchical levels.
Once genetic structure has been determined, its strength must be quantified. The most appropriate way is to estimate Wright's fixation index ($F_{ST}$). Sewall Wright [48] devised the fixation index to describe correlations among alleles sampled at hierarchically organized levels of a population. This index can be estimated from the AMOVA as $F_{ST} = \sigma_a^2 / \sigma_T^2$, where $\sigma_a^2$ is the variance among subpopulations and $\sigma_T^2$ is the total population variance [49]. The statistical significance of this index can be assessed using permutations. $F_{ST}$ can also be estimated between pairs of subpopulations to determine patterns of genetic differentiation (Table 3).

Gene flow
Genetic exchange between local populations is called gene flow, and it is an evolutionary force that occurs between populations with distinct gene pools [42]. We introduce two ways to estimate levels of gene flow among subpopulations from microsatellite data. The first is the indirect method based on genetic differentiation among populations [50]. After Wright [48] developed the fixation index, he went on to demonstrate that there is a simple relationship between the genetic divergence of two populations, measured as $F_{ST}$, and the amount of gene flow between them, which is given as $F_{ST} = \frac{1}{4 N_e m + 1}$, where $N_e$ is the effective size of each population and m is the migration rate between populations, and therefore $N_e m$ is the number of breeding adults that are migrants. Hence, for nuSSR data, the number of migrants can be estimated as $N_e m = \left(\frac{1}{F_{ST}} - 1\right)/4$, and it quantifies the historical gene flow by pollen and seed, while for cpSSR data, the number of migrants can be estimated as $N_e m = \left(\frac{1}{F_{ST}} - 1\right)/2$, and it quantifies the historical gene flow by seeds in angiosperm species [42]. The second way to estimate gene flow is the method of rare alleles described by Slatkin [51], who proposed estimating Nm from the spatial distribution of rare alleles. He demonstrated that $\log_{10}[\bar{p}(1)]$, where $\bar{p}(1)$ indicates the average frequency of private alleles, is approximately linear in $\log_{10}(Nm)$.
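The indirect method reduces to a one-line rearrangement of Wright's relationship, as the following Python sketch shows. It assumes $F_{ST} = 1/(4 N_e m + 1)$ for biparentally inherited nuclear markers and the analogous expression with a factor of 2 for maternally inherited chloroplast markers in angiosperms; the function name and the $F_{ST}$ values are hypothetical.

```python
def migrants_per_generation(fst, marker="nuclear"):
    """Indirect estimate of N_e*m from F_ST under Wright's island model.

    marker: "nuclear" uses (1/F_ST - 1) / 4; "chloroplast" uses (1/F_ST - 1) / 2.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must be in the interval (0, 1]")
    divisor = {"nuclear": 4.0, "chloroplast": 2.0}[marker]
    return (1.0 / fst - 1.0) / divisor


# Illustrative values only: weak nuclear and strong chloroplast differentiation.
nm_nuclear = migrants_per_generation(0.05, marker="nuclear")
nm_chloroplast = migrants_per_generation(0.50, marker="chloroplast")
```

These estimates inherit all the assumptions of the island model (equal population sizes, symmetric migration, drift-migration equilibrium), so they are best read as an order of magnitude for historical gene flow rather than as a count of current migrants.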
The relative rates of pollen and seed gene flow can be estimated with an estimator proposed by Ennos [31], which is based on the idea that the effectiveness of pollen and seed in bringing about gene flow depends on the mode of inheritance of the genetic marker. In most angiosperms, gene flow occurs by pollen and seed for nuclear and paternally inherited markers, whereas it occurs only by seed for maternally inherited markers. Consequently, different levels of population differentiation are expected for markers with contrasting modes of inheritance. The relative levels of pollen versus seed gene flow among populations can be estimated as $\frac{\text{pollen flow}}{\text{seed flow}} = \frac{\left(\frac{1}{F_{STb}} - 1\right) - 2\left(\frac{1}{F_{STm}} - 1\right)}{\frac{1}{F_{STm}} - 1}$, where $F_{STb}$ and $F_{STm}$ are the fixation indices for nuclear and chloroplast markers of an angiosperm species, respectively.
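A minimal Python sketch of this comparison is given below. It assumes the simplified form of the estimator that follows from the two island-model expressions for nuclear and maternally inherited markers under random mating, namely $\left[(1/F_{STb} - 1) - 2(1/F_{STm} - 1)\right] / (1/F_{STm} - 1)$; this form is an assumption of the sketch and should be checked against Ennos [31] before use. The function name and the input values are hypothetical.

```python
def pollen_to_seed_ratio(fst_nuclear, fst_chloroplast):
    """Ratio of pollen to seed gene flow from nuclear and maternal F_ST values.

    Assumes random mating and a maternally inherited chloroplast genome.
    """
    for value in (fst_nuclear, fst_chloroplast):
        if not 0.0 < value < 1.0:
            raise ValueError("F_ST values must be strictly between 0 and 1")
    a = 1.0 / fst_nuclear - 1.0      # nuclear term: seed plus pollen migration
    c = 1.0 / fst_chloroplast - 1.0  # chloroplast term: seed migration only
    return (a - 2.0 * c) / c


# Illustrative values only.
ratio = pollen_to_seed_ratio(fst_nuclear=0.05, fst_chloroplast=0.50)
```

A ratio well above 1 indicates that pollen moves genes among populations far more effectively than seed, which is the pattern usually expected when chloroplast markers are much more strongly structured than nuclear ones.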
Finally, Pritchard et al. [43] extended their approach to inferring genetic structure by including the geographic position of individuals in the algorithm. In essence, the model assumes that each individual originated, with high probability, in the geographical region in which it was sampled, but allows some small probability that it is an immigrant (or has immigrant ancestry). Immigrants are individuals whose genetic makeup suggests they were misclassified, and in this way it is possible to quantify recent gene flow (Table 3).

Inbreeding
In its most basic sense, inbreeding is mating between biological relatives [42]. It is not a microevolutionary process because it does not change the allele frequencies of populations. However, it is important because it can determine the genotypic composition of populations. The presence of inbreeding informs us about the reproductive dynamics of the species. The inbreeding coefficient ($F_{IS}$) can be estimated from AMOVA if the hierarchical level "within individuals" is included. Inbreeding can also be estimated using a Bayesian approach (f) [52].
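For orientation, the inbreeding coefficient can also be approximated directly from heterozygosities. The Python sketch below uses the simple estimator $F_{IS} = 1 - H_O / H_E$ for a single locus in a single population, which is a swapped-in simplification and not the AMOVA-based or Bayesian estimators cited above; the function name and data are hypothetical.

```python
from collections import Counter


def inbreeding_coefficient(genotypes):
    """Single-locus F_IS = 1 - Ho/He from a list of diploid genotypes."""
    n = len(genotypes)
    # Observed heterozygosity: proportion of heterozygous individuals.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under random mating.
    counts = Counter(allele for g in genotypes for allele in g)
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    if he == 0.0:
        raise ValueError("F_IS is undefined for a monomorphic locus")
    return 1.0 - ho / he


# Illustrative data: a clear deficit of heterozygotes.
sample = [(100, 100)] * 4 + [(100, 102)] * 2 + [(102, 102)] * 4
f_is = inbreeding_coefficient(sample)
```

Positive values indicate a heterozygote deficit, which may reflect inbreeding but also null alleles or hidden substructure, so the value should not be interpreted on its own.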
Although microsatellites are a very efficient tool for many population genetics applications, they may occasionally produce null alleles; when these are present in high proportion at a particular locus, the observed heterozygosity is underestimated. As a consequence, population parameter estimates based on the proportion of heterozygotes can be affected by null alleles. It may therefore be unclear to what extent an estimate of Wright's inbreeding coefficient $F_{IS}$ based on microsatellite data reflects the actual level of inbreeding in the studied population and to what extent it is affected by the presence of null alleles. A population inbreeding model can be applied to estimate null allele frequencies and the inbreeding coefficient simultaneously, the latter as a multilocus parameter [53] (Table 3).

Demographic events
There has been little focus on the potential of chloroplast microsatellites for demographic inference. Navascués et al. [54] investigated the utility of cpSSR data for the detection of demographic expansions. The study of historical demography by means of genetic information is based on coalescent theory [55]. One alternative is the $F_S$ neutrality test for detecting population expansion events [56]. This test is based on the different expectations for the number of haplotypes under stationary and expanding demography [56]. Another alternative is Tajima's D index, estimated as $D = \frac{\pi - \theta}{\sqrt{V(\pi - \theta)}}$, where $\pi$ is the mean number of differing sites between pairs of sequences, V is the variance of the numerator, and $\theta$ is estimated as $\theta = S/a$, where S is the number of polymorphic sites and a is calculated as $a = \sum_{i=1}^{n-1} \frac{1}{i}$, where n is the number of analyzed sequences [57]. Both parameters should be estimated from the distribution of the differences between individuals within a population; these differences are taken as allelic differences between cpSSR haplotypes, and the data are treated as binary [54].
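Tajima's D can be computed from binary-coded haplotypes with the short Python sketch below. It assumes that the cpSSR haplotypes have already been recoded as equal-length strings of 0 and 1 (the recoding itself is not shown) and uses the standard variance constants of Tajima's estimator; the function name and the example haplotypes are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D from equal-length binary haplotypes (strings or tuples)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(haplotypes[0])
    # S: number of polymorphic (segregating) sites.
    s = sum(1 for j in range(length) if len({h[j] for h in haplotypes}) > 1)
    if s == 0:
        raise ValueError("D is undefined without polymorphic sites")
    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(x != y for x, y in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    # Constants for theta = S / a1 and for the variance of (pi - theta).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    return (pi - s / a1) / sqrt(variance)


# Illustrative data: an excess of low-frequency variants gives a negative D.
d_singletons = tajimas_d(["1000", "0100", "0010", "0001"])
# Variants at intermediate frequency give a positive D.
d_balanced = tajimas_d(["11", "11", "00", "00"])
```

Negative values are the pattern expected after a demographic expansion, although selection can produce the same signal, so the index is usually interpreted alongside other evidence.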
A new and robust method to examine a species' phylogeography using microsatellite markers is the approximate Bayesian computation (ABC).This model-based method is useful to infer parameters and compare models in population genetics [58] (Table 3).

Understanding population genetic data analysis results
The challenge in the study of population genetic events based on microsatellite markers is the interpretation of the statistical results from a biological point of view. The aim of this section is to serve as a guide to interpreting these results in a forest tree species in order to infer which forces determine the current distribution of genetic variation at nuSSRs and cpSSRs.
In the field of population genetics, it is becoming increasingly necessary to focus more attention on understanding the practical limitations of the various analyses and to apply increased caution when interpreting results generated by molecular markers [24]. Regardless of the question, a molecular marker must fundamentally be selectively neutral and follow Mendelian inheritance in order to be used as a tool for detecting demographic patterns and microevolutionary forces such as genetic drift and gene flow [33].
Recombination, selection, and genetic drift affect different genes and regions of the genome in different ways. Consequently, sampling the genome at multiple points by combining the results from many loci provides a precise and statistically powerful way of comparing populations and individuals [33]. Microsatellites have high mutation rates that generate the high levels of allelic diversity necessary for genetic studies of processes acting on ecological time scales [67,68].
The new approaches use more of the information in a data set than the summary statistics of traditional approaches (e.g., F ST ).
Nowadays, the demography and history of populations and the relationships among individuals can be described in detail because the typical data set contains a high number of individuals sampled at many loci. Hence, genetic tools make it possible to address many basic ecological questions for the first time or in new ways [33].
Genetic diversity is essential for the long-term survival of species; without it, species cannot adapt to environmental changes and are more susceptible to extinction. Measuring levels of genetic variation within and among populations is an important first step in evaluating the evolutionary biology and tree improvement potential of a species [1]. Most forest tree species possess considerable genetic variation, much of which can be found within populations, and the expected heterozygosity in a population of forest trees is approximately 50% higher than the average expected heterozygosity in populations of annuals and short-lived perennials [1].
Factors that contribute to the high levels of genetic diversity typically found in forest tree populations include large population size, longevity, high levels of outcrossing, strong gene flow by pollen and seed between populations, and balancing selection [68]. Nuclear DNA is often highly variable, and it is biparentally inherited. Efficient gene flow, in particular via pollen, is the main factor contributing to the high diversity within populations of trees and the low differentiation among spatially separated populations [69].
The gene diversity index (h) is the expected heterozygosity averaged over all loci sampled and is the most widely used measure of genetic variation employing genetic markers. Because low-frequency alleles contribute very little to h, it is relatively insensitive to sample size. However, when sampled populations differ markedly in size, allelic richness (R) could additionally be reported. The observation of variation at nonrecombining chloroplast DNA (cpDNA) is of particular importance for plants. Low mutation rates of cpDNA are responsible for the generally low variation within species [69]. As a consequence, higher levels of genetic diversity are expected with nuSSRs than with cpSSRs.
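The two diversity measures mentioned above can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular software package: it assumes the usual small-sample correction for gene diversity and a rarefaction estimate of allelic richness, and the function names and the example allele sizes are hypothetical.

```python
from collections import Counter
from math import comb


def gene_diversity(allele_copies):
    """Unbiased gene diversity (expected heterozygosity) at one locus.

    allele_copies: list with one entry per sampled gene copy
    (two entries per diploid tree for nuSSRs, one per tree for cpSSRs).
    """
    n = len(allele_copies)
    if n < 2:
        raise ValueError("at least two gene copies are needed")
    counts = Counter(allele_copies)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # n / (n - 1) is the usual small-sample correction.
    return n / (n - 1) * (1.0 - sum_sq)


def mean_gene_diversity(loci):
    """Average the per-locus values over all loci sampled."""
    return sum(gene_diversity(copies) for copies in loci) / len(loci)


def rarefied_allelic_richness(allele_copies, g):
    """Expected number of distinct alleles in a random subsample of g gene copies."""
    n = len(allele_copies)
    if not 1 <= g <= n:
        raise ValueError("g must be between 1 and the sample size")
    counts = Counter(allele_copies)
    # For each allele: probability that it appears at least once in the subsample.
    return sum(1.0 - comb(n - c, g) / comb(n, g) for c in counts.values())


# Hypothetical allele sizes (bp) at one nuSSR locus, eight gene copies.
locus = [150, 150, 152, 152, 154, 154, 156, 156]
print(round(gene_diversity(locus), 4))
print(round(rarefied_allelic_richness(locus, 4), 4))
```

Rarefying to a common subsample size g is what makes richness comparable between populations sampled with different intensity, which is the situation described in the text.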
Populations of forest trees often differ in allele frequencies (especially when they are separated geographically), and it is often of interest to determine the degree to which genetic variation in a region is distributed within and among populations. This information is useful for understanding the degree to which gene flow by pollen and seed counters population subdivision due to selection or genetic drift [70]. It also has practical value when planning seed collections for breeding or gene conservation purposes. Therefore, knowledge of natural patterns of genetic variation and their evolutionary bases is also of great practical significance [1]. The pattern of genotypic variation (heterozygosity vs. homozygosity) among individuals within a subpopulation is highly dependent upon the mating system, whereas the distribution of allelic variation within and among subpopulations is influenced by both gene flow and genetic drift. Because of the opposite effects of gene flow and genetic drift, the balance between them is a primary determinant of the genetic population structure of a species [42]. Total diversity in forest trees is also generally higher than that found in other plants. However, only a small proportion of the total gene diversity in trees is due to differences among populations [1].
Woody species contain more variation within populations but have less variation among them than species with other life forms. Woody species that have large geographic ranges, outcrossing breeding systems, and wind or animal-ingested seed dispersal are more genetically diverse than woody species with other combinations of traits [68]. Hence, in AMOVA results for forest tree species, a higher proportion of genetic variation is expected at the within-population hierarchical level than at the among-population level. Genetic differentiation among populations estimated by F ST also varies widely among tree species, ranging from low values in species that have more or less continuous distributions to high values in species with disjunct population distributions [1]. For the interpretation of F ST , the value scale suggested by Wright [71] is a useful tool. The lower F ST in trees than in annuals or other herbaceous species is most likely because most tree species are outcrossing, whereas in a large proportion of annuals and herbaceous plants self-pollination is either exclusive or features prominently in the mating system. High levels of self-pollination not only promote inbreeding but also limit pollen gene flow between populations. Both pollen and seed gene flow between populations of forest trees can be extensive [1].
Patterns of seed dispersal shape the composition and genetic structure of plant populations. Species with low levels of gene flow by seeds are likely to show genetic heterogeneity among subpopulations, whereas species with high levels of gene flow by seeds have low levels of genetic structure [72].
Compared to biparentally inherited and paternally inherited markers, maternally inherited markers detect strong genetic differentiation between populations [69] because seeds are normally dispersed over shorter distances than pollen [73]. Moreover, because the chloroplast genome is haploid, its effective population size in hermaphroditic outcrossing plants is half that of the corresponding diploid nuclear genome [16]. Hence, gene flow between populations of small size is less effective at counteracting the effects of genetic drift at maternally transmitted loci [74]. When this occurs, F ST values for chloroplast DNA can be markedly higher than those for nuclear genes [75]. The genetic structure of chloroplast variation is also affected by the interaction of seed dispersal with other ecological and genetic processes. Deposition patterns of seeds, pollen dispersal, density of adults, microhabitat selection, and several aspects of the ecology of the species could have significant effects on patterns of genetic variation within species [72]. While both pollen and seed dispersal determine gene flow in plants, seed dispersal is the more important because it allows species to colonize habitats and therefore influences the dynamics of populations [76].
Distance-based methods allow individuals to be grouped according to genetic distance, and their graphical representation allows these groups to be related to other information, e.g., the geographical origin of individuals or phenotypic traits [44]. Even though these methods are statistically weak, they still represent a first approach to analyzing population genetic structure. Conversely, model-based clustering methods are statistically powerful: they determine the number of clusters from the genetic information and assign individuals probabilistically to these clusters, although they require model assumptions (e.g., Hardy-Weinberg equilibrium within populations and complete linkage equilibrium between loci within populations) [44].
Species can become subdivided into genetically distinct subpopulations when gene flow is restricted, leading to variation in the frequency of a gene over space [42]. The number of migrants N e m is an estimate of gene flow, and it is important to keep in mind that m is defined in terms of gene pools; therefore m represents the amount of exchange of gametes between subpopulations and not necessarily of individuals [42]. Most tree species are wind pollinated, and the pollen can be blown for hundreds of miles by the wind. Hence, tree populations that are quite distant can still experience gene flow. Because gene flow requires both movement and reproduction, m is not just the amount of dispersal of individuals between subpopulations; instead, m represents a complex interaction between the pattern of dispersal and the mating system [42]. Consequently, in forest tree species, it is essential to know the pollen and seed dispersal mechanisms and the mating system of the species in order to interpret the levels of gene flow estimated from microsatellite data.
The effects of gene flow on genetic variation among and within subpopulations can be summarized as follows: gene flow decreases genetic variation among subpopulations and increases genetic variation within subpopulations. Genetic drift causes an increase in genetic variation among subpopulations and a decrease in genetic variation within a subpopulation. Hence, the effects of gene flow on genetic variation within and among subpopulations are the opposite of those of genetic drift [42].
For neutral alleles, when gene flow is interrupted, genetic drift is more effective than mutation in producing genetic differentiation among subpopulations [77]. Thereby, gene flow can be a force that keeps species integrated and influences ecological processes, e.g., by determining the persistence and adaptation of local populations and the pattern of distribution of species [78]. In this way, studies of gene flow become relevant for the interpretation of microevolutionary patterns and the genetic structure of populations [80]. Even a small amount of gene flow can cause two populations to behave effectively as a single evolutionary lineage.
One "effective" migrant per generation (N e m = 1) defines an inflection point in the relation between F ST and N e m: as the effective number of migrants increases above one (N e m ≥ 1), F ST declines only very slowly, whereas as it decreases below one (N e m ≤ 1), F ST rises very rapidly. Consequently, N e m = 1 marks a transition in the relative evolutionary importance of gene flow and drift. It is striking that only one or more effective migrants per generation on average are needed for gene flow to dominate over genetic drift, leading to great genetic homogeneity among subpopulations [42].
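The shape of the relation between F ST and N e m can be made concrete with a short numerical sketch. It assumes the classical island-model approximation for diploid nuclear loci at drift-migration equilibrium, F ST ≈ 1/(4 N e m + 1), which the text does not state explicitly; the function names are illustrative.

```python
def fst_island(nem):
    """Expected F_ST under the island-model approximation (assumed here)."""
    if nem < 0:
        raise ValueError("N_e m cannot be negative")
    return 1.0 / (4.0 * nem + 1.0)


def nem_from_fst(fst):
    """Invert the same approximation to obtain N_e m from an F_ST estimate."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


# F_ST changes steeply below N_e m = 1 and only slowly above it.
for nem in (0.1, 0.25, 0.5, 1, 2, 5, 10):
    print(f"N_e m = {nem:>5}: F_ST = {fst_island(nem):.3f}")
```

Estimates of N e m obtained by inverting this approximation inherit all the assumptions of the island model (equal subpopulation sizes, symmetric migration, equilibrium), so they are best read as orders of magnitude.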
Ennos [31] demonstrated that the relative rates of pollen and seed migration among plant populations can be estimated from a simple comparison of F ST values for nuclear and maternally inherited organelle genetic markers. Estimated rates of pollen migration are greater than rates of seed migration for all six species investigated by Ennos [31]; however, differences among species were substantial. The greatest contrast between pollen and seed migration rates was found for oak species, where interpopulation pollen flow is estimated to be 200 times greater than interpopulation seed flow. This result was interpreted in terms of the reproductive system of the species. Oaks show high rates of interpopulation pollen dispersal because they are outbreeding, wind-pollinated, and disperse pollen from a substantial height. Also, dispersal of acorns by birds and rodents is likely to be restricted [31,79]. In contrast, smaller differences between pollen and seed migration rates were found for wild barley. Gene flow by pollen is estimated to be only four times greater than interpopulation gene flow by seeds. Opportunities for interpopulation pollen dispersal in such a highly self-pollinating species are expected to be rare, and it is not surprising that pollen and seed flow should be of the same order of magnitude for this species [31,81]. Forest tree species are generally outbreeding, so the ratio of pollen flow to seed flow (r) may vary according to the potential dispersal distances associated with the mechanisms of pollen and seed dispersal.
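The comparison of nuclear and maternally inherited F ST values can be sketched as a one-line calculation. The expression below is the form in which the pollen-to-seed migration ratio is commonly written, r = [A(1 + F IS) − 2C]/C with A = 1/F ST(nuclear) − 1 and C = 1/F ST(maternal) − 1; it is reproduced here as an assumption and should be checked against [31] before use. The function name and the input values are hypothetical.

```python
def pollen_to_seed_ratio(fst_nuclear, fst_maternal, fis=0.0):
    """Ratio of pollen to seed migration from nuclear and maternal F_ST.

    Assumed form: r = [A * (1 + F_IS) - 2 * C] / C,
    where A = 1/F_ST(nuclear) - 1 and C = 1/F_ST(maternal) - 1.
    """
    for name, value in (("fst_nuclear", fst_nuclear), ("fst_maternal", fst_maternal)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    a = 1.0 / fst_nuclear - 1.0
    c = 1.0 / fst_maternal - 1.0
    return (a * (1.0 + fis) - 2.0 * c) / c


# Hypothetical values: weak nuclear and strong chloroplast differentiation.
r = pollen_to_seed_ratio(fst_nuclear=0.05, fst_maternal=0.5, fis=0.0)
print(f"pollen flow / seed flow = {r:.1f}")
```

A negative result indicates that the input values are inconsistent with the assumptions of the model rather than a meaningful ratio.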
By itself, the mating system does not alter allele frequencies but does affect the relative proportions of different genotypes in populations, which under some circumstances deeply influences the viability and vigor of offspring [1]. The inbreeding coefficient (F IS ) measures the fractional reduction in heterozygosity relative to a random mating population with the same allele frequencies. Even though genotypic frequencies in natural populations of forest trees often approximate those expected under random mating, mating systems that depart from random mating do occur and have important implications. Individuals of most temperate forest trees are bisexual and have the capacity for self-fertilization. In addition, nearby trees may be related (e.g., siblings originating from seeds of the same mother tree), providing opportunity for mating between relatives. Therefore, forest trees typically have mixed mating systems, whereby many and perhaps most mates are paired essentially at random. There is also some mating between genetically related individuals, which occurs more often than expected from random pairings [1]. Inbreeding is of great significance to the genetic makeup of both natural populations and breeding populations of forest trees because it has two major consequences: (1) in comparison to random mating, inbreeding increases the frequency of homozygous offspring at the expense of heterozygotes and (2) mating between close relatives is usually detrimental to the survival and growth of offspring, an effect called inbreeding depression. Therefore, the magnitude of inbreeding among parent trees used to produce seed for reforestation, such as in seed production areas or seed orchards, is of great practical concern [1].
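The definition of F IS as a fractional reduction in heterozygosity can be illustrated with a minimal single-locus sketch. It assumes the simple moment estimator F IS = 1 − H o / H e without small-sample correction; the function name and the genotypes are hypothetical.

```python
from collections import Counter


def inbreeding_coefficient(genotypes):
    """F_IS = 1 - Ho / He for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Observed heterozygosity: proportion of individuals with two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity under random mating with the same allele frequencies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    h_exp = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    if h_exp == 0.0:
        raise ValueError("locus is monomorphic; F_IS is undefined")
    return 1.0 - h_obs / h_exp


# Hypothetical sample with a deficit of heterozygotes.
sample = [(150, 150)] * 4 + [(150, 152)] * 2 + [(152, 152)] * 4
print(round(inbreeding_coefficient(sample), 3))
```

A positive value indicates fewer heterozygotes than expected under random mating; with microsatellites, null alleles can produce the same signal, so the value alone does not establish inbreeding.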
In the previous section, we presented three ways to estimate the inbreeding coefficient, which differ in statistical power and assumptions: (1) F IS estimated from AMOVA can be used as a first measure of inbreeding for a given hierarchical structure; (2) F IS estimated by a Bayesian approximation is a statistically more powerful measure of the level of inbreeding in a population; and (3) F IS can be estimated while accounting for null alleles, when null alleles have been detected at the microsatellite loci considered, in order to distinguish the proportion of homozygous genotypes that results from inbreeding from the homozygotes caused by null alleles.
The current distribution, population structure, and potential future fate of a species are better understood from knowledge of its historical distribution, postglacial phylogeography, and evolution [82]. Regarding the assessment of demographic history using the F S neutrality test, the F S statistic takes a large negative value in a population affected by expansion because of an excess of rare haplotypes (recent mutations). The significance of the test must be calculated with bootstrapped data. An F S statistic with p(F S ) < 0.02 (equivalent to α = 0.05 because of the particular behavior of this statistic [56]) is considered evidence of population expansion.
With the D Tajima neutrality test, the D Tajima statistic is expected to be close to zero in a population of constant size, whereas statistically significant negative values indicate a sudden expansion of population size and positive values indicate population subdivision or recent bottlenecks. The statistical significance of D Tajima is tested by generating random samples with a coalescent simulation algorithm under the hypothesis of population equilibrium. The p-value for D Tajima is obtained as the proportion of random D Tajima values that are less than or equal to the observed D Tajima .
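The p-value described above is an empirical lower-tail proportion, which can be sketched as follows. The simulated values in the example are invented placeholders standing in for the output of a coalescent simulator, not results; the function name is illustrative.

```python
def lower_tail_p_value(observed, simulated):
    """Proportion of simulated statistics less than or equal to the observed one."""
    if not simulated:
        raise ValueError("at least one simulated value is required")
    return sum(1 for value in simulated if value <= observed) / len(simulated)


# Placeholder values only; in practice these would come from coalescent simulations.
simulated_d = [-2.1, -1.5, -0.8, -0.3, 0.0, 0.2, 0.6, 1.1, 1.4, 1.9]
observed_d = -1.5

p = lower_tail_p_value(observed_d, simulated_d)
print(f"p(D_sim <= D_obs) = {p:.2f}")
```

In a real analysis the number of simulated samples would be in the thousands, since the resolution of the p-value is limited to one over the number of simulations.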
Computer-intensive statistical methods have been developed to extract as much information from the data as possible and to provide a flexible framework within which complex models of population history can be handled [83].
Approximate Bayesian computation (ABC) is a computer-intensive method with wide applicability. Where populations diverge genetically through time under the influence of random genetic drift and migration, ABC uses summary statistics measured from microsatellite loci to make inferences about demographic parameters in different population models. The method can be used to infer effective sizes of current and ancestral populations, immigration rates, splitting times, and tree topology [83].
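The logic of ABC can be illustrated with a deliberately simplified rejection sampler. This toy sketch is not the method of [83]: it assumes a single isolated population, biallelic loci starting at frequency 0.5, pure Wright-Fisher drift, one summary statistic (mean heterozygosity), and a uniform prior on effective size. All names, parameter values, and the pseudo-observed data set are hypothetical.

```python
import random


def simulate_mean_heterozygosity(n_e, generations, n_loci, rng):
    """Mean 2p(1-p) over loci after drift in a Wright-Fisher population of size n_e."""
    copies = 2 * n_e
    total = 0.0
    for _ in range(n_loci):
        p = 0.5
        for _ in range(generations):
            # Binomial sampling of gene copies for the next generation.
            p = sum(rng.random() < p for _ in range(copies)) / copies
        total += 2.0 * p * (1.0 - p)
    return total / n_loci


def abc_rejection(observed, n_draws, tolerance, prior_range, generations, n_loci, rng):
    """Keep prior draws whose simulated summary lies within tolerance of the observed one."""
    low, high = prior_range
    accepted = []
    for _ in range(n_draws):
        n_e = rng.randint(low, high)  # draw from the uniform prior
        simulated = simulate_mean_heterozygosity(n_e, generations, n_loci, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(n_e)
    return accepted


rng = random.Random(1)
# Pseudo-observed summary generated with a known effective size of 30.
observed = simulate_mean_heterozygosity(30, generations=10, n_loci=20, rng=rng)
posterior_sample = abc_rejection(observed, n_draws=200, tolerance=0.05,
                                 prior_range=(5, 60), generations=10,
                                 n_loci=20, rng=rng)
if posterior_sample:
    mean_ne = sum(posterior_sample) / len(posterior_sample)
    print(f"accepted {len(posterior_sample)} draws, mean N_e = {mean_ne:.1f}")
```

The accepted draws approximate the posterior distribution of the parameter. With a single summary statistic and few loci the posterior is broad, which is why practical applications combine several statistics and many loci.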
As a final recommendation, before starting a study with molecular markers in a native forest tree species, researchers must define which problems and questions they intend to resolve. This is a fundamental prerequisite for determining the sampling design, for deciding which molecular markers to use (keeping in mind the information required and the laboratory work needed to obtain the molecular data), and for choosing the statistical analyses appropriate to obtain the required information.
Of course, researchers must pay attention to the biological features of the studied species when designing the study and return to these features when interpreting the results of statistical analyses in a biological context.

Conclusion
This chapter helps researchers who are not familiar with statistical methods and population genetics theory to analyze nuclear and chloroplast microsatellite data. The methods allow quantification of genetic variation and genetic structure in native forest species, while the theory provides knowledge about the past and present genetic states of populations for making inferences about the future of these populations.

Figure 1. Extreme states of allelic configuration of one theoretical population composed of three subpopulations. Circles: subpopulations; squares: nuSSR genotypes or cpSSR haplotypes. Colors show differences in allelic composition.
Microsatellites as a Tool for the Study of Microevolutionary Process in Native Forest Trees http://dx.doi.org/10.5772/65042
The four value ranges of the scale suggested by Wright [71] for interpreting F ST are as follows: (1) 0-0.05 indicates little genetic differentiation, (2) 0.05-0.15 indicates moderate genetic differentiation, (3) 0.15-0.25 indicates great genetic differentiation, and (4) values above 0.25 indicate very great genetic differentiation. Nuclear F ST in trees is frequently 10% or lower, which is only one-half to one-quarter of the F ST estimates typically found in annuals or other herbaceous species.
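The four ranges listed above can be encoded as a small lookup, which is convenient when many pairwise F ST values have to be labeled. Because the ranges share their end points, the handling of the boundaries below (lower bound inclusive for the first three classes, 0.25 assigned to "great") is an assumption; the function name is illustrative.

```python
def classify_fst(fst):
    """Label an F_ST value according to the four-class scale described in the text."""
    if not 0.0 <= fst <= 1.0:
        raise ValueError("F_ST must lie between 0 and 1")
    if fst < 0.05:
        return "little"
    if fst < 0.15:
        return "moderate"
    if fst <= 0.25:
        return "great"
    return "very great"


# Hypothetical F_ST values for illustration.
for value in (0.03, 0.10, 0.20, 0.40):
    print(f"F_ST = {value:.2f}: {classify_fst(value)} genetic differentiation")
```

Estimators of F ST can return slightly negative values for undifferentiated samples; such values would need to be set to zero before being passed to this function.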