Open access peer-reviewed chapter

Research Advances and Perspectives of Conservation Genomics of Endangered Plants

Written By

Qing Ma, Gang Wu, Wenjie Li, Seyit Yuzuak, Fachun Guan and Yin Lu

Submitted: 03 June 2023 Reviewed: 21 June 2023 Published: 30 August 2023

DOI: 10.5772/intechopen.112281

From the Edited Volume

Endangered Species - Present Status

Edited by Mohammad Manjur Shah

Abstract

Understanding evolutionary processes, the mechanisms of endangerment, and adaptive evolutionary history is a key scientific issue in conservation biology. Over the past decades, advances in high-throughput sequencing and multidisciplinary collaboration have driven the development of conservation genomics, which applies new genomic technologies and genomic information to existing problems in conservation biology. Conservation genomics focuses mainly on the mechanisms of endangerment and on conservation strategies that protect the survivability and diversity of endangered species. Its application to endangered plant species has given biologists innovative conservation concepts and promoted the development of population-based conservation strategies. This chapter summarizes population genomics studies of threatened and endangered plants of agronomic and commercial importance, and discusses the advantages of conservation genomics for analyzing genetic diversity, inferring the history of population dynamics, evaluating the natural forces acting on wild plant populations, and establishing effective conservation strategies. It also presents development trends in genomics for the conservation of endangered plant species.

Keywords

  • endangered species
  • genomics
  • conservation strategy
  • genetic diversity
  • conservation biology

1. Introduction

In modern times, frequent and intense human activities and habitat degradation have posed serious threats to plant biodiversity [1, 2]. Global energy consumption is intensifying, and the global climate and environment are undergoing drastic changes. Existing populations are shrinking rapidly, and many species are on the brink of extinction. The resulting population bottlenecks further reduce the adaptive potential and functional diversity of species and make it difficult for populations to recover under natural conditions. Many wild animals and plants are facing an unprecedented survival crisis [3, 4, 5, 6, 7].

Conservation biology emerged in response to this crisis. After years of development, the field has reached a consensus on preserving biodiversity in the context of ecosystems, and the mechanisms and principal factors responsible for biodiversity loss have become a central topic. The need for better evaluation of how endangered a species is, more accurate estimation of the habitats of endangered species, and more suitable conservation strategies has led to the development and application of multiple related technologies, such as global positioning system (GPS) technology, mathematical methods, and genomics [8]. In recent decades, the rapid development of high-throughput technology has greatly increased the availability of genomic data and has helped solve management problems for many endangered species and key protected regions.

2. An overview of conservation genomics

2.1 Main features

As a branch of conservation biology, conservation genomics studies the mechanisms of endangerment and the conservation strategies for endangered species, with the aim of protecting species' ability to survive and reducing the risk of extinction. The field integrates the theoretical ideas of genetics with the analytical methods of genomics.

Traditional conservation genetics research often relies on only a few molecular markers, such as isozymes, microsatellites, or mitochondrial and chloroplast genomes. Since most of these markers are anchored in a few neutral regions of the genome, they are of limited use in tackling questions in species conservation [9, 10]. Compared with conservation genetics, conservation genomics is more direct and effective in addressing some important issues in conservation biology.

For example, when estimating effective population size (Ne), genome data provide a large number of genetic markers that can be used to reconstruct pedigrees and extract haplotype information. With this information, Ne can be monitored, the direction of population migration predicted, and migrating individuals identified. When inferring the adverse effects of inbreeding depression, whole genome information can directly locate the key sites, accurately predict the population's ability to purge harmful mutations, and help screen the initial founders of captive populations based on each individual's genotype at inbreeding-related sites.

Furthermore, when estimating the impact of climate change and human activities on wild populations, the large amount of genetic variation information obtained through genomic sequencing can help evaluate the responsiveness of different individuals accurately and help find individuals suitable for ex situ conservation [11]. These issues have long been recognized in conservation biology research, but for lack of key genomic information little progress was made until the advent of next-generation sequencing, which opened up new research perspectives.

2.2 Significance in conservation biology

In the past 200 million years, an average of 900,000 species of animals have become extinct per million years, with a “background rate” of approximately 90 species per century, including approximately four species of higher plants [12]. In history, the five mass extinctions that have occurred to date have generally occurred as a result of significant geological events or rapid environmental changes. Studies on general biodiversity loss indicate that we are now in the sixth mass extinction [13]. It is conservatively estimated that the extinction rate of species in the past century has been 22 times faster than historical benchmarks [14].
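The background rate quoted above is a unit conversion of the per-million-year figure, which can be checked directly:

```latex
\[
\frac{900{,}000\ \text{species}}{10^{6}\ \text{yr}}
= 0.9\ \text{species yr}^{-1}
= 90\ \text{species per century}.
\]
```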

In 2022, 11,538 plant species were included in the Red List updated by the International Union for Conservation of Nature and Natural Resources (IUCN), of which about 5336 (46.2%) were critically endangered, 10,202 (88.4%) were endangered, 9376 (81.2%) were vulnerable, 134 (1.2%) were conservation dependent, and 3761 (32.6%) were near threatened [15]. China is one of the countries with the richest biodiversity in the world, with about 30,000 species of higher plants, more than 5% of which are endemic, and ranks third in the world. According to statistics reported so far, 4000 to 5000 plant species in China are endangered or threatened, accounting for 15–20% of all plant species, and nearly 100 species are facing extinction. The application of modern multidisciplinary research methods is therefore of great importance for protecting and strengthening plant germplasm resources and maintaining plant diversity.

Conservation genomics not only provides more comprehensive, accurate, and reliable results on species classification, genetic diversity, and population genetic structure, but also provides insights into the historical processes of species origin and differentiation, changes in population size, the molecular mechanisms of local adaptation, inbreeding status, and the genetic basis of inbreeding depression [11, 16, 17]. Various models based on conservation genomics have been developed to identify biodiversity hotspots directly so that they can be given priority protection.

In addition, new classes of markers obtained from whole genome sequence can be used to screen for functional genes and key adaptive sites contributing to important ecological adaptations, such as the stresses of climate change, resistance to herbivory, and disease, making it possible to predict species’ response to the environment in the past and future [18].

Beginning from the studies of traditional genetic diversity, genetic structure, and population dynamics, conservation genomics allows us to delve into the reconstruction of evolutionary history and species adaptive evolution and explore the process, causes, and evolutionary potential of endangerment. Application of genomics study in conservation biology can guide practical management actions of endangered species from both the spatial and temporal perspectives.

2.3 Research techniques and strategies

2.3.1 Development of genome sequencing technology

Genome sequencing technology has developed from first-generation to third-generation sequencing. First-generation sequencing is also known as Sanger sequencing. Its core principle is to separate, by gel electrophoresis, DNA fragments of different lengths that carry isotope markers incorporated during DNA synthesis, and to identify the DNA base at each position from the resulting image [19]. First-generation sequencing has the advantages of long read length and high accuracy, but its low throughput, long sequencing time, and high cost make it unsuitable for large-scale genome sequencing.

Second-generation sequencing, or next-generation high-throughput sequencing, determines the DNA sequence by capturing the special markers (usually fluorescent molecular markers) carried by the newly added bases during DNA replication [20]. It offers high throughput, fast speed, and low cost, but has shortcomings such as limited read length and fragmented assemblies.

Third-generation sequencing technology, also known as single molecule real-time sequencing technology, obtains DNA base information in real time from captured optical or electrical signals [21, 22]. Its defining feature is single molecule sequencing, which does not require PCR amplification during the sequencing process. Together with its long read length (10–150 kb), fast speed, and lack of GC bias, this greatly improves the completeness of genome assemblies [23]. A disadvantage of the technology is the relatively high per-base error rate, which can reach 15%, so second-generation sequencing data may be needed to correct the sequenced bases.

2.3.2 Simplified genome sequencing

At present, two main categories of research techniques are widely used in conservation genomics: simplified genome sequencing and whole genome sequencing [16].

Simplified genome sequencing is one of the most commonly used techniques. It sequences only part of the genome, which greatly reduces genomic complexity and thereby lowers the cost and computational burden of sequencing. Compared with whole genome sequencing, it is cost-effective and stable, requires less experimental time and a simpler library construction procedure, yields a large number of SNPs (single nucleotide polymorphisms), and does not depend on a reference genome. Hence, this technology is widely used in conservation genomics studies of endangered animals and plants [16, 24, 25, 26]. Simplified genome sequencing can be divided into restriction-site-associated DNA sequencing (RAD-seq) [27], RNA transcriptome sequencing (RNA-seq) [28], and whole exome sequencing (WES) [29]. What these three methods have in common is that they normally evaluate only a small portion of the genome.

However, because of incomplete genome coverage and missing data, simplified genome data pose challenges for subsequent population genetic analysis, such as the inference of phylogenetic relationships among populations. First, when polymorphisms or sequencing errors are present, it is difficult to cluster reads from the same restriction locus precisely. Second, assembling each cluster into a unique locus and ultimately constructing phylogenetic relationships is still quite complicated. Finally, the amount of genetic variation obtained from RAD-seq, and the usefulness of RAD-seq data for phylogenetic inference, are influenced by factors such as the restriction enzymes used, the size of the selected fragments, and the sequencing coverage of different samples [27, 30, 31]. By contrast, whole genome re-sequencing, which relies on a reference genome, can significantly improve the quantity and quality of detected genetic markers [27, 32].
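To illustrate why RAD-seq samples only a small, enzyme-dependent portion of the genome, the following is a minimal in-silico sketch. It assumes a toy sequence, the EcoRI recognition site GAATTC, and a simplification in which only the bases downstream of each site are read; the function names `rad_loci` and `sampled_fraction` are hypothetical and are not taken from any RAD-seq pipeline.

```python
import re


def rad_loci(sequence, site="GAATTC", flank=10):
    """Return (site_position, downstream_tag) for every restriction site.

    Simplification: real RAD-seq reads extend from both sides of the cut
    site; only the downstream flank is kept here.
    """
    sequence = sequence.upper()
    loci = []
    # A zero-width lookahead also finds overlapping occurrences of the site.
    for match in re.finditer(f"(?={re.escape(site)})", sequence):
        start = match.start()
        end = start + len(site)
        loci.append((start, sequence[end:end + flank]))
    return loci


def sampled_fraction(sequence, site="GAATTC", flank=10):
    """Fraction of the sequence covered by the downstream tags."""
    covered = set()
    for start, _ in rad_loci(sequence, site, flank):
        end = start + len(site)
        covered.update(range(end, min(end + flank, len(sequence))))
    return len(covered) / len(sequence)


# Toy "genome" (hypothetical) containing two EcoRI sites.
toy_genome = "ACGTGAATTCTTGACCATGGCATCGATCGGAATTCAAGGCTTAGCTAGGCTAACGT"
print(rad_loci(toy_genome, flank=8))
print(round(sampled_fraction(toy_genome, flank=8), 3))
```

Changing `site` or `flank` changes which loci are recovered and how much of the sequence is sampled, which mirrors the dependence on enzyme choice and fragment size described above.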

2.3.3 Whole genome sequencing

Whole genome sequencing can be divided into two categories: de novo whole genome sequencing and whole genome re-sequencing. De novo sequencing refers to the first assembly of a new genome sequence. The difficulty and quality of genome assembly depend on genome size, complexity, computational resources, and bioinformatics techniques. Currently, de novo whole genome sequencing mainly uses third-generation sequencing techniques, including single molecule nanopore DNA sequencing from Oxford Nanopore Technologies (ONT), single molecule real-time sequencing (SMRT) from Pacific Biosciences (PacBio), and true single molecule sequencing (tSMS) from Helicos Biosciences.

In comparison, whole genome re-sequencing compares genomic variation among individuals and populations by sequencing different individuals of a species whose genome sequence is already known. It mainly uses second-generation sequencing platforms, such as the 454 GS FLX Titanium platform supplied by Roche, the HiSeq 2000 platform supplied by Illumina, and the ABI SOLiD platform, to obtain a large number of short reads. After the reads are aligned to the reference genome, population-level single nucleotide polymorphism (SNP) data can be obtained and used for population genetic analyses.
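The step from aligned short reads to SNPs can be sketched as a simple pileup. This is a toy illustration only: it assumes reads that are already aligned without gaps, a haploid-style majority call, and hypothetical thresholds (`min_depth`, `min_alt_fraction`); production variant callers use statistical genotype models and base qualities.

```python
from collections import Counter, defaultdict


def call_snps(reference, aligned_reads, min_depth=3, min_alt_fraction=0.8):
    """Call SNPs from gap-free alignments.

    aligned_reads: iterable of (start_position, read_sequence) pairs,
    with 0-based start positions on the reference.
    Returns a list of (position, reference_base, alternative_base).
    """
    pileup = defaultdict(Counter)
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            position = start + offset
            if position < len(reference):
                pileup[position][base] += 1

    snps = []
    for position in sorted(pileup):
        counts = pileup[position]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to trust a call
        base, support = counts.most_common(1)[0]
        if base != reference[position] and support / depth >= min_alt_fraction:
            snps.append((position, reference[position], base))
    return snps


# Hypothetical reference and reads; three reads support a T at position 4.
reference = "ACGTACGTAC"
reads = [(0, "ACGT"), (2, "GTTCGT"), (3, "TTCGTA"), (4, "TCGTAC")]
print(call_snps(reference, reads))
```

Positions covered by fewer than `min_depth` reads are skipped, which is one reason sequencing coverage matters for the number of markers recovered.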

Whole genome re-sequencing requires high-quality reference genomes for read alignment and variant detection. The lack of such reference genomes is the main obstacle to using whole genome re-sequencing in conservation biology. Despite the rapid development of sequencing technology, whole genome data are still unavailable for many endangered species. A statistical analysis has shown that, among all plant species with whole genome data, only 3.25% were threatened species included in the IUCN Red List and only 5.34% were included in the List of Wild Plants Under State Protection in China [33]. Nevertheless, with increasing awareness of conservation and the launch of some important projects, more and more endangered plant genomes are expected to be analyzed in the future. For example, the Earth BioGenome Project proposed to give priority to sequencing the genomes of more than 23,000 endangered species included in the IUCN Red List [34]. The implementation of this project will support conservation genomics research on endangered species.

3. Application of conservation genomics

3.1 Determination of conservation units

In conservation biology, 'species' is commonly used to express the concept of conservation units, including taxonomic levels below species, such as subspecies and populations. For biodiversity conservation, it is very important to determine population units that meet the requirements of taxonomy so that they can be managed in a standardized way. The successful implementation of a conservation plan largely depends on correctly identifying the taxonomic status of the conservation targets [35]. Researchers in conservation biology tend to define species based on phylogenetic analysis, but with this approach different subspecies may be mistakenly treated as different species. Other widely accepted definitions require calculating genetic distance or proving reproductive isolation. In certain cases, threatened species are described using minor morphological or distributional features, which leads to controversy over taxonomic boundaries and the necessity of conservation.

The whole genome records the entire evolutionary history of each species. By comparing genomic data rather than a few genes, as in traditional genetic analysis, more robust phylogenetic relationships can be constructed, providing new solutions for identifying closely related species and discovering hidden species [16].

For example, the rare and endangered plant Buddleja alternifolia (Scrophulariaceae) in Inner Mongolia is mainly distributed in three major regions: the Himalayas, the Hengduan Mountains, and the Loess Plateau. Significant morphological differences were detected between the populations in the Loess Plateau region and those in the other two regions, whereas the populations in the Himalayas and the Hengduan Mountains show no evident morphological differences. This raised the question of whether the populations in the three regions belong to one species. Ma et al. first assembled a high-quality genome of B. alternifolia and obtained re-sequencing data from samples of 48 populations distributed across the three regions. They found that B. alternifolia formed three independent and distinct branches consistent with geographical distribution, with a population differentiation coefficient FST greater than 0.5, and speculated that B. alternifolia in these three regions should be defined as three different species. Given that B. alternifolia is already an endangered species, the genomic results imply that each newly defined species has fewer individuals and a narrower distribution range than previously predicted, so the actual degree of endangerment may be higher. In conservation management, these three possible species should be managed separately [36]. Similarly, for some endangered plants whose morphology does not reveal whether they belong to different species, genetic differentiation is already so large that they can in fact be treated as different species.
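As a rough illustration of what an FST value above 0.5 means, the sketch below computes a heterozygosity-based FST, (HT - HS) / HT, from per-population allele frequencies at biallelic loci. The estimator used in the cited study is not stated in the text, so this definition, the function name `fst_from_frequencies`, the equal weighting of populations, and the toy frequencies are all assumptions made for illustration.

```python
def fst_from_frequencies(allele_freqs_by_pop):
    """Heterozygosity-based FST across biallelic loci.

    allele_freqs_by_pop: one list per population, each holding the
    frequency of the same allele at every locus.
    Populations are weighted equally; sample-size corrections are ignored.
    """
    n_pops = len(allele_freqs_by_pop)
    n_loci = len(allele_freqs_by_pop[0])
    hs_total = 0.0  # mean expected heterozygosity within populations
    ht_total = 0.0  # expected heterozygosity of the pooled population
    for locus in range(n_loci):
        freqs = [pop[locus] for pop in allele_freqs_by_pop]
        p_mean = sum(freqs) / n_pops
        hs_total += sum(2 * p * (1 - p) for p in freqs) / n_pops
        ht_total += 2 * p_mean * (1 - p_mean)
    if ht_total == 0:
        return 0.0  # no variation at all, so no differentiation to measure
    return (ht_total - hs_total) / ht_total


# Hypothetical frequencies for two populations at one locus.
print(fst_from_frequencies([[0.9], [0.1]]))   # strongly differentiated
print(fst_from_frequencies([[0.5], [0.5]]))   # identical frequencies
```

Values near 0 indicate that populations share similar allele frequencies, while values near 1 indicate that most variation lies between populations rather than within them.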

Conversely, misidentification of taxonomic status may cause unnecessary conservation actions and costs. Banksia vincentia was originally identified as a member of a species complex consisting of Banksia collina, B. cunninghamii, and B. neoanglica, and was treated as a critically endangered species in New South Wales, Australia. However, nuclear genomic data did not support 'B. vincentia' as a distinct species and showed that it is nested within B. neoanglica. The conservation value of 'B. vincentia' needs to be evaluated further [37].

Genetic differentiation often exists between different populations of the same species. One of the goals of conservation biology is to protect the genetic diversity of vulnerable species as much as possible. The most important step in population management is to determine and delineate the boundaries of conservation units (CUs) within species, such as evolutionarily significant units (ESUs). If a population is essentially reproductively isolated from other populations of the same species and represents an important evolutionary component of the species, it can be regarded as an ESU [38]. ESUs represent the vast majority of the genetic diversity between populations within a species [39, 40]. In addition to ESUs, there are also management units (MUs) and adaptive units (AUs). By dividing a species into these population units, each population can receive targeted and efficient supervision, including reasonable planning of harvest yields to avoid overharvesting, careful introduction of new individuals so that populations with different adaptations are not mixed, and prioritized protection of certain population units to save budget.

Within the genomic framework, it has been suggested that genetic structure analysis of all available loci should be used to identify ESUs, neutral loci should be used to identify MUs, and adaptive loci should be used to identify AUs [39]. At present, combining whole genome sequencing with re-sequencing data for population genetic structure analysis is routine in population genomics [36, 41, 42], and the results can serve as a basis for classifying the ESUs of endangered plants. Using whole genome de novo sequencing, chromosome-level assembly, and population re-sequencing of the endangered species Tetracentron sinense, biologists found that 55 individuals sampled from representative regions of China carried four ancestral genetic components corresponding to four different ESUs [42].

In biological systems with simple evolutionary processes, traditional genetic techniques can delineate conservation units directly, but it is difficult to determine clear conservation units for complex evolutionary systems, such as populations with hybridization and introgression in their evolutionary history [43]. Hybridization refers to interbreeding between individuals from different species or populations, such as crossing between two closely related species with bidirectional gene exchange. Introgression refers to the transfer of alleles from one species to another, with unidirectional gene exchange. Hybridization and introgression make it more difficult to delineate conservation units, because analyzing different parts of the genome is likely to yield different results. Under the impact of human activities, the displacement of organisms and the shifting of habitats have significantly increased hybridization and introgression rates among species worldwide, further increasing the threat to existing species. Genomics technology can not only distinguish effectively between natural and artificial hybridization, but also predict the impact of hybridization on species fitness (heterosis or outbreeding depression) [44].

Overall, conservation genomics serves as an efficient tool for resolving phylogenetic relationships and elucidating population genetic structure and population evolutionary history. It can help define the taxonomic status of an endangered species correctly, evaluate hybridization risk and genetic diversity precisely, and provide valuable management information for species conservation.

3.2 Analysis of genetic diversity

Genetic diversity is closely related to the adaptive evolution and evolutionary potential of species. Traditional assessments use mitochondrial genes, microsatellites, and other molecular genetic markers to calculate the genetic diversity of different populations. However, these calculations are based only on allele frequencies at the particular loci represented by the markers, and cannot comprehensively reflect the level and overall pattern of genetic diversity in the key coding regions of a species. With the rapid development of genomics, it has become possible to assess genetic diversity at the whole genome level. Genomic diversity refers to the overall genetic diversity of a species or population based on variant loci across the whole genome. In recent years, as the genome sequences of endangered species have continued to be decoded and population genome data from re-sequencing have accumulated, the genomic diversity of more and more endangered species has been evaluated.

Using genome re-sequencing, the genetic diversity of some plant species listed by the IUCN as critically endangered, endangered, or vulnerable has been studied. Generally, critically endangered species display lower genetic diversity than endangered species: the genetic diversity of critically endangered species endemic to China ranged from 0.0016 to 0.0030, while that of endangered species ranged from 0.0031 to 0.0038 [33].
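Diversity values of this magnitude are commonly reported as per-site nucleotide diversity (π), the average proportion of sites that differ between two sequences drawn from the population. The text does not state which statistic the cited values use, so treating them as π is an assumption. A minimal sketch, using a hypothetical function name and toy haplotypes:

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site among aligned haplotypes."""
    if len(haplotypes) < 2:
        raise ValueError("at least two haplotypes are required")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")

    pairs = list(combinations(haplotypes, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Four hypothetical aligned haplotypes, 10 sites each.
sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGTAA"]
print(nucleotide_diversity(sample))
```

A value of 0.0019 under this definition would mean that two randomly chosen sequences differ at roughly 2 sites per 1000 on average.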

Huang et al. (2012) sampled 446 diverse individuals of the critically endangered wild rice Oryza rufipogon across Asia and Oceania and sequenced them at twofold genome coverage. The sequence diversity was estimated at 0.003, which agrees with previous suggestions that part of the genetic diversity of O. rufipogon, the immediate progenitor of cultivated rice, was lost during domestication [45]. Ma et al. reported the first chromosome-level genome of the critically endangered species Rhododendron griersonianum, which has contributed to about 10% of horticultural rhododendron varieties. The genetic diversity of R. griersonianum (0.0019) is lower than that of related species in the same genus and of most other woody plants, suggesting that R. griersonianum could face large challenges to its future survival. Ex situ conservation and artificial supplementary pollination should therefore be carried out as a priority [36]. It was also found that some living fossil plants, such as Ginkgo biloba, have high genetic diversity despite relatively little phenotypic variation, supporting the hypothesis of evolutionary capacitance [41].

It is worth noting that some research populations have limited sampling and insufficient representativeness, which may also lead to differences in the levels of genetic diversity reported in different endangered plants. The conservation genomics research of endangered plants in the future needs to ensure sufficient sample size in order to comprehensively evaluate the genomic diversity at the species or population level.

3.3 Inference of population dynamics history

The history of population dynamics is an important research topic in conservation genomics and population genetics [46]. Climate change and geological events are important external forces in population evolution and can drive changes in population dynamics. In addition, gene mutation, genetic drift, natural selection, and other forces act as internal driving forces, causing a population to deviate gradually from the "ideal population" and form a new pattern of genetic diversity. Studying the history of population dynamics can therefore help us understand how climate change, geological events, and human activities have shaped population structure, and develop reasonable management plans to cope with environmental change [47, 48, 49]. In traditional genetics, the assessment of population dynamics is based mainly on fossil records and the history of climatic or geological change. With the development of molecular genetics, the widespread use of genetic markers has given us a more comprehensive and accurate understanding of species' evolutionary history. Molecular genetic studies based on mitochondrial, chloroplast, or microsatellite markers can be used to analyze differences in polymorphism among populations, or to infer population history by comparison with the theoretical values expected for populations evolving normally. However, these methods require a large number of samples at the population level, and they can only trace the most recent demographic event rather than reflect the overall historical dynamics of a species.

In inferring population dynamic history, the effective population size (Ne) is a key parameter that reflects the historical status of a population and is an important guide for species conservation [50]. Ne is the size of an idealized, randomly mating population that would experience the same rate of genetic drift as the actual population [51, 52]. It is generally smaller than the actual population size. From Ne we can directly estimate the rate of genetic drift in a population and infer its genetic diversity and inbreeding level [53]. Keeping the population large enough to minimize the impact of genetic drift and inbreeding has always been an important goal of endangered species protection [54]. The "50/500" rule in conservation biology states that populations with Ne below 50 are vulnerable to inbreeding depression, so when isolated populations are protected on their own, Ne should be kept above 50. Although an occasional decrease in population size to this order of magnitude will not have an immediate adverse impact on the population, Ne must be kept above 500 in order to maintain genetic variation in the long term [55, 56].
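The reasoning behind the two thresholds can be made concrete with the standard results for an idealized, randomly mating population: inbreeding increases by about 1/(2Ne) per generation, and the expected heterozygosity retained after t generations is (1 - 1/(2Ne))^t. The sketch below applies these textbook formulas; the function names and the choice of 100 generations are illustrative assumptions, and real populations depart from the idealized model.

```python
def inbreeding_increment(ne):
    """Expected increase in the inbreeding coefficient per generation."""
    return 1 / (2 * ne)


def heterozygosity_retained(ne, generations):
    """Expected fraction of initial heterozygosity left after drift."""
    return (1 - 1 / (2 * ne)) ** generations


for ne in (50, 500):
    delta_f = inbreeding_increment(ne)
    retained = heterozygosity_retained(ne, generations=100)
    print(f"Ne={ne}: inbreeding +{delta_f:.1%} per generation, "
          f"{retained:.1%} of heterozygosity retained after 100 generations")
```

Under these assumptions, Ne = 50 corresponds to an inbreeding increase of 1% per generation, while Ne = 500 retains roughly 90% of heterozygosity over 100 generations, compared with roughly 37% at Ne = 50.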

Ne is thus an important guide for conservation biology. Advanced genome technologies not only allow more accurate estimation of Ne and reconstruction of the changes in a species' population size over time, but also serve as an important supplement to the field of conservation genomics.

Davidia involucrata is a rare plant endemic to China, known as the "panda among the trees". A high-quality genome of D. involucrata was constructed using PacBio and Hi-C technologies. Re-sequencing analysis showed that the effective population size of D. involucrata has been decreasing since the Quaternary Ice Age, indicating that climate change may be one of the main reasons why the species has become endangered. The sensitivity of D. involucrata to climate change should therefore be fully considered in the genetic conservation of the species [57].

The effect of ice ages on population decline can also be observed in another threatened species, Cercidiphyllum japonicum, one of the most widely distributed forest species in the Tertiary relict leaf forest of East Asia. It is listed as a class II protected species in the List of Wild Plants Under State Protection in China. Comparative genomics and population genomics research based on high-throughput sequencing has reconstructed the evolutionary history of the Tertiary relict genus Cercidiphyllum in East Asia at the whole genome level. This work found that the dry and cold environment of the middle Miocene (c. 10 Ma) led to the divergence of C. japonicum and C. magnificum. During the Pliocene–Pleistocene transition and the last glacial maximum, the populations of C. japonicum and C. magnificum contracted sharply. However, unlike in D. involucrata, selective sweeps and balancing selection related to local adaptation jointly maintained genetic variation in genes involved in key physiological processes, thereby improving the ability of C. japonicum to adapt to different environments [58].

On the other hand, the recovery of temperature during interglacial periods may favor population expansion. A genome-based study of the population dynamics history of Liriodendron chinense and Liriodendron tulipifera showed that, throughout the Quaternary ice age, the population of L. tulipifera continued to decrease, while the population of L. chinense recovered and reached its peak at approximately 0.4 million years ago, during the interglacial period between the Nie Nie Xiongla Ice Age (0.72–0.5 million years ago) and the Guxiang Ice Age (0.3–0.13 million years ago). This may explain why L. chinense has higher genetic diversity than L. tulipifera [59].

Besides climatic factors, geological historical events and human interference are also important causes of population size fluctuations. A conservation genomics study of Rhododendron griersonianum suggested that habitat loss caused by human activities, the extremely low genetic diversity of Rhododendron rubra, and the genetic bottlenecks, inbreeding, and harmful mutations related to heat adaptation caused by geological historical events are the main reasons for the formation and maintenance of the 'extremely small population' of R. griersonianum [36].

In summary, whole genome sequencing can help infer fluctuations in effective populations and track historical events of population dynamics, as well as infer the impact of past geological and climatic events on the number and genetic composition of contemporary effective populations. As shown in the above research, extreme geological and climatic changes and human activities are important reasons for the rapid decline in effective population size and genetic diversity of endangered species. Linking population size changes with historical environmental changes can also help predict the impact of future environmental changes on population distribution and genetic diversity.

3.4 Inbreeding and asexual reproduction

According to the Hardy–Weinberg principle, in an idealized population individuals mate randomly and allele frequencies remain stable across generations. However, natural populations generally show some non-random mating patterns, the most common of which are inbreeding and asexual reproduction, and both affect the homozygosity of the genome. Genetic differences among individual genomes within a population are the key to population adaptation and evolution. Genetic recombination generates new gene combinations, so outbreeding between unrelated individuals can effectively increase the genetic diversity of a population. In contrast, selfing or mating between close relatives can significantly reduce the level of genetic variation within the population. Compared with outbred individuals, the fitness of inbred progeny is significantly reduced because harmful mutations accumulate in the homozygous state. This is called inbreeding depression [60]. The degree of inbreeding depression depends on the genetic load of the population. In nature, inbreeding is often the result of limited population size, so rare species are frequently more vulnerable to threats because their populations are small and isolated. Most individuals in such populations carry genotypes consistent with those of their ancestors [61]. Clonal reproduction lacks genetic recombination, resulting in extremely low population genetic diversity [62]. In addition, the accumulation of harmful mutations from generation to generation further reduces the adaptive potential of the species [62], leading to a “dead end” in species evolution [63].
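The departure from Hardy–Weinberg expectations described above can be quantified at a single locus. The following is a minimal sketch, not taken from the studies cited here, that estimates an inbreeding coefficient as F = 1 − H_obs / H_exp, where H_exp = 2pq is the heterozygosity expected under random mating. The function name and the 0/1/2 genotype coding are assumptions made for illustration.

```python
import math
from collections import Counter


def inbreeding_coefficient(genotypes):
    """Estimate F = 1 - H_obs / H_exp at one biallelic locus.

    genotypes: list of alternate-allele counts per individual
    (0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate).
    This coding is an assumption of the sketch.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    counts = Counter(genotypes)
    # Frequency of the reference allele: two copies per 0, one copy per 1.
    p = (2 * counts[0] + counts[1]) / (2 * n)
    q = 1.0 - p
    h_exp = 2 * p * q          # heterozygosity expected under Hardy-Weinberg
    h_obs = counts[1] / n      # observed proportion of heterozygotes
    if h_exp == 0:
        return math.nan        # monomorphic locus: F is undefined
    return 1.0 - h_obs / h_exp


# A randomly mating sample (25 : 50 : 25) gives F = 0;
# a sample with no heterozygotes at p = 0.5 gives F = 1.
print(inbreeding_coefficient([0] * 25 + [1] * 50 + [2] * 25))
print(inbreeding_coefficient([0] * 50 + [2] * 50))
```

In practice F is averaged over many loci, and a positive value indicates a deficit of heterozygotes relative to random mating, as expected under selfing or mating between relatives.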

Whole-genome sequencing provides genome-wide genetic markers for accurately estimating the inbreeding level of a population. Because SNPs cover the genome densely and are computationally inexpensive to analyze, they are commonly used in kinship analysis and runs of homozygosity (ROH) analysis to evaluate inbreeding levels. Kinship analysis calculates the relative genetic similarity between individuals, which can be inferred from genotype markers. Common parameters include the coefficient of kinship, co-ancestry, and identity by descent (IBD) [64]. Continuous homozygous regions in an individual genome are the result of inbreeding: when related parents mate, they pass identical haplotype segments to their offspring, generating continuous homozygous segments. Therefore, the degree of inbreeding can be reflected by the proportion of the genome that lies in ROH (FROH) [65, 66]. Over generations these segments are broken up by recombination, so the length of ROH can also be used to estimate the time of inbreeding events [66].
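The idea behind FROH can be illustrated with a deliberately simplified sketch: scan the SNPs along one chromosome, collect stretches of consecutive homozygous calls, and divide their total length by the length of the region analyzed. The function names, the 0/1/2 genotype coding, and the thresholds (`min_snps`, `min_length`) are hypothetical choices for illustration. Dedicated ROH software typically uses sliding windows and tolerates a few heterozygous or missing calls, which this sketch does not.

```python
def _close_run(current, runs, min_snps, min_length):
    """Store the current run if it is long enough to count as an ROH."""
    if len(current) >= min_snps and current[-1] - current[0] >= min_length:
        runs.append((current[0], current[-1]))


def find_roh(positions, genotypes, min_snps=3, min_length=1000):
    """Return (start, end) intervals of consecutive homozygous SNPs.

    positions: sorted base-pair coordinates of SNPs on one chromosome.
    genotypes: alternate-allele counts (0 or 2 = homozygous, 1 = heterozygous).
    """
    runs, current = [], []
    for pos, g in zip(positions, genotypes):
        if g in (0, 2):
            current.append(pos)
        else:
            # A heterozygous call ends the current run.
            _close_run(current, runs, min_snps, min_length)
            current = []
    _close_run(current, runs, min_snps, min_length)
    return runs


def f_roh(runs, genome_length):
    """Proportion of the analyzed genome length covered by ROH."""
    return sum(end - start for start, end in runs) / genome_length


positions = [100, 600, 1200, 2500, 3000, 3400, 5000, 9000]
genotypes = [0, 0, 2, 1, 2, 2, 0, 0]
runs = find_roh(positions, genotypes)
print(runs, f_roh(runs, genome_length=10_000))
```

Because recombination shortens ROH over generations, the distribution of run lengths returned by such a scan is what allows recent inbreeding (long runs) to be distinguished from older inbreeding (short runs).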

At present, various software tools have been developed to detect harmful mutations in populations from genomic data. For example, SIFT, which is based on sequence homology, helps predict whether newly arisen non-synonymous mutations are harmful [67]. PolyPhen-2, which is based on sequence homology and protein structure, helps predict harmful missense mutations in populations [68]. These tools have been applied to detect inbreeding depression and harmful mutations in several endangered plant populations. For example, a genomic study of two closely related species of Ostrya found that the effective population sizes of the critically endangered Ostrya rehderiana and the widespread Ostrya polynervis both declined during the Quaternary glaciations. The effective population size of Ostrya polynervis rose rapidly after the end of the glacial period, while that of Ostrya rehderiana has continued to decline through the Holocene, accompanied by the accumulation of harmful mutations in the genome and occasional failure of natural fruiting. Surprisingly, Ostrya rehderiana carried significantly fewer extremely harmful mutations than the widely distributed Ostrya polynervis. The prolonged decline in effective population size therefore appears to have weakened inbreeding depression through a reduction in extremely harmful mutations, allowing the endangered Ostrya rehderiana to persist [7]. Rather than simply increasing the number of surviving individuals by collecting inbred seeds or propagating cuttings of endangered trees, future conservation should design artificial hybridization strategies that reduce the risk of inbreeding and the loss of diversity caused by genetic drift [7].

Based on this idea, Ma et al. analyzed the patterns of inbreeding and harmful mutations in different populations of the endangered species Acer yangbiense and developed population-specific genetic rescue schemes. For example, genetic rescue is recommended for populations with high FROH and many homozygous harmful mutations. Pollinating female flowers in the populations with the most harmful mutations using pollen from the populations with the fewest homozygous harmful mutations can both avoid introducing more harmful mutations and convert homozygous harmful mutations to the heterozygous state in the offspring [69].
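The logic of choosing pollen donors can be sketched in a few lines. This is an illustrative assumption about how such a ranking might be done, not the procedure used by Ma et al. [69]. Offspring of a cross are certain to be homozygous for a harmful allele only at sites where both parents are homozygous for it, so the sketch ranks candidate donors first by the number of homozygous harmful sites they share with the recipient and then by their own load. The function name, site identifiers, and donor labels are hypothetical.

```python
def rank_pollen_donors(recipient_sites, donor_sites_by_id):
    """Rank candidate pollen donors for one recipient (maternal) tree.

    recipient_sites: set of site identifiers at which the recipient is
        homozygous for a predicted harmful allele.
    donor_sites_by_id: dict mapping donor label -> set of such sites.

    Returns a list of (donor_id, shared_sites, donor_load), best donor first.
    """
    scored = []
    for donor_id, sites in donor_sites_by_id.items():
        shared = len(recipient_sites & sites)   # sites fixed in all offspring
        scored.append((shared, len(sites), donor_id))
    scored.sort()  # fewest shared sites first, then lowest donor load
    return [(donor_id, shared, load) for shared, load, donor_id in scored]


recipient = {"chr1:100", "chr2:50", "chr3:7"}
donors = {
    "A": {"chr1:100", "chr2:50"},
    "B": {"chr4:9"},
    "C": {"chr3:7", "chr5:1", "chr6:2"},
}
print(rank_pollen_donors(recipient, donors))
```

A real rescue design would also weigh relatedness, local adaptation, and the risk of outbreeding depression, none of which is modeled here.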

The above studies provide an important reference for future research. More genomic data on endangered plants and continuously improved algorithms can help researchers and managers better identify endangered populations and understand how different management methods affect species fitness and genomes.

3.5 Adaptive evolution

Adaptive evolution refers to the continuous accumulation, under the pressure of natural selection, of genetic variation that favors survival and reproduction, allowing a species to better adapt to its environment and increase its fitness. Traditional research on adaptive evolution has mainly relied on candidate gene methods. However, these methods depend on screening, targeted amplification, and comparison of known candidate genes, and their efficiency is low. Genomic data have made it possible to detect signals of adaptive evolution at the whole-genome level, and this approach has been widely applied to the study of adaptive evolution and local adaptation of populations in endangered species.

Compared with genome-wide genetic diversity alone, variation at key regions or loci has a more significant impact on the adaptability of a species. Against the neutral background of the whole genome, such regions appear as clear outliers. Therefore, candidate adaptive regions can be identified by detecting highly differentiated regions or regions under selection among populations [70, 71].
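The outlier logic can be sketched as a per-SNP differentiation scan between two populations. The example below uses a simple Wright-style FST = (H_T − H_S) / H_T computed from allele frequencies with equal population weights, and flags the most differentiated SNPs by an empirical cutoff. The function names and the `top_fraction` threshold are assumptions for illustration. Published scans usually apply sample-size-corrected estimators, windowed averages, and demographic null models.

```python
def wright_fst(p1, p2):
    """Per-SNP F_ST between two populations from allele frequencies.

    Uses F_ST = (H_T - H_S) / H_T with equal population weights;
    no sample-size correction is applied in this sketch.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # total heterozygosity
    if h_t == 0:
        return 0.0                                     # locus fixed in both
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def outlier_loci(freqs_pop1, freqs_pop2, top_fraction=0.01):
    """Return per-SNP F_ST values and the indices of the top outliers."""
    fst = [wright_fst(a, b) for a, b in zip(freqs_pop1, freqs_pop2)]
    n_top = max(1, int(len(fst) * top_fraction))
    ranked = sorted(range(len(fst)), key=lambda i: fst[i], reverse=True)
    return fst, ranked[:n_top]


pop1 = [0.50, 0.10, 0.90, 0.40]
pop2 = [0.50, 0.90, 0.85, 0.45]
fst, outliers = outlier_loci(pop1, pop2, top_fraction=0.25)
print(fst, outliers)
```

SNPs flagged this way are only candidates: high differentiation can also arise from drift or population structure, which is why such scans are commonly combined with genotype-environment association analysis.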

In the conservation genomic study of G. biloba, 25 genes involved in insect and fungal defenses and in responses to abiotic stresses such as dehydration, low temperature, and high salt were identified, suggesting that ginkgo possesses unusually high resistance or tolerance to both abiotic and biotic stress, particularly herbivores and pathogens [41]. In C. japonicum, 823 genes that may be related to local adaptation were detected through FST and genotype-environment association analysis. These genes are mainly enriched in cell development and proliferation, the auxin metabolic pathway, and stress response [58]. Nostoc flagelliforme is a first-class protected wild plant in China, mainly distributed in arid and desert areas in the north and northwest. Genome sequencing and comparative genome analysis of Nostoc flagelliforme showed that the expression of genes related to protection of the photosynthetic apparatus, synthesis of monounsaturated fatty acids, and response to ultraviolet radiation promoted its adaptation to high-intensity ultraviolet radiation and extreme drought. The study provides new insights into the adaptability of blue-green algae to adversity [72].

In some other endangered plant species, genomic loci under natural selection have also been identified. Wang et al. used two genotype-environment association methods to scan the genomes of 185 wild soybean accessions distributed across the three major agro-ecological regions of China. Multiple genes involved in local adaptation, such as genes related to flowering time and temperature, were identified. A positively selected region was found on chromosome 19 that contains two adjacent MADS-box transcription factors, possibly related to the ability of wild soybean to adapt to high-latitude environments [73].

In addition, genome analysis of endangered wild relatives of economically valuable plants can reveal the impact of artificial domestication on adaptive evolution. In a genomic study of 81 cultivated and wild tea (Camellia sinensis) accessions, selective sweep regions on the chromosomes were identified and found to be enriched in the endemic and ancient species. Artificial domestication was found not only to improve the flavor of locally cultivated tea trees but also to improve their resistance to abiotic stress [74]. Niu et al. re-sequenced 38 individuals of Dendrobium officinale and five related species from 13 regions. Combined genome-wide association studies (GWAS) identified 13 GWAS loci in total. The genes at these loci are mostly associated with morphological traits such as plant height, leaf length, and stem length, which may have been affected by artificial domestication [75].

In conclusion, whole-genome sequencing is a powerful tool for detecting signals of natural selection, revealing the genetic basis of phenotypic traits, and identifying local adaptation. It can improve our understanding of the mechanisms underlying genetic variation and adaptive traits and help us take targeted protective measures to promote the adaptation of endangered plants to rapidly changing natural environments.

4. Future prospects

The global rate of species extinction is accelerating, and the loss of biodiversity and degradation of ecosystems pose significant risks to human survival and development. Ensuring the long-term survival of threatened species and building long-term protection mechanisms are urgent tasks.

Genomic research has advanced species conservation in several directions, such as the identification of lineages and inbreeding events and the identification of adaptive loci and outbreeding depression loci based on large numbers of genetic markers. Conservation genomics can not only address the scientific questions that traditional conservation genetics focuses on, such as genetic diversity, population genetic structure, and population dynamics, but also trace the evolutionary history of species from ancient times to the present and analyze the molecular mechanisms of local adaptation and adaptive evolution. In addition, conservation genomics makes it possible to study the genetic basis of interspecific interactions, promoting population-level management rather than individual-level protection.

The classic approach in genomic studies of endangered plants mainly includes the following steps: constructing a high-quality genome, conducting comparative genome analysis to identify expanded and contracted gene families, and elucidating genome evolution through whole-genome duplication (WGD) analysis or repeat sequence analysis. Gene families related to special phenotypes and mechanisms can then be analyzed and verified by combining these results with transcriptome analysis. Together with genome re-sequencing, the demographic history of populations can be revealed and valuable genes can be found for genetic breeding.

At the same time, it should be noted that although genomic research can provide preliminary solutions for biodiversity conservation, it still has limitations. Firstly, to truly achieve the protection of endangered species and the development of their resource value, more genomic information on endangered species should be obtained. The techniques for artificial pollination, hybridization, and propagation of endangered plants also need to be developed. Secondly, the vast majority of comparative and population genomic research is currently based on single nucleotide polymorphism (SNP) markers, while little is known about the role of structural variations (SV) in the adaptive evolution and local adaptation of endangered species. With the widespread application of third-generation sequencing technology, the ability to accurately analyze SV has greatly improved, which should promote understanding of the role of SV in adaptive evolution and local adaptation. Finally, although genomic technology has updated some of our scientific knowledge in conservation biology, it may not necessarily be useful for species conservation. For example, based on genomic analysis, we can interpret the function of genetic variations at the genome level and their impact on individual fitness. However, this is still not enough to evaluate the viability of a population unless individual fitness can be linked with population growth rate [76]. This requires long-term research on individual fitness and its impact on population growth rate, which may be the biggest challenge currently facing conservation genomics.

Acknowledgments

We would like to thank the Zhejiang Provincial Welfare Technology Applied Research Project [LGN21C020007], the National Natural Science Foundation of China [No. 31800187], and the 2021 and 2022 Foreign Specialized Projects of the Ministry of Science and Technology [QN2021016002L, G2022016022L] for financial support.

Conflict of interest

The authors declare no conflict of interest.

References

  1. Haddad NM, Brudvig LA, Clobert J, et al. Habitat fragmentation and its lasting impact on Earth’s ecosystems. Science Advances. 2015;1:e1500052
  2. Newbold T, Hudson LN, Hill SL, et al. Global effects of land use on local terrestrial biodiversity. Nature. 2015;520:45-50
  3. Cronk Q. Plant extinctions take time. Science. 2016;353:446-447
  4. Miraldo A, Li S, Borregaard MK, et al. An Anthropocene map of genetic diversity. Science. 2016;353:1532-1535
  5. UK, R.B.G. State of the world’s plants report-2016. 2016
  6. Xu S, He Z, Zhang Z, et al. The origin, diversification and adaptation of a major mangrove clade (Rhizophoreae). National Science Review. 2017;4:721-734
  7. Yang Y, Ma T, Wang Z, et al. Genomic effects of population collapse in a critically endangered ironwood tree Ostrya rehderiana. Nature Communications. 2018;9:1-9
  8. DeSalle R, Amato G. Conservation Genetics in the Age of Genomics. New York, USA: Columbia University Press; 2009. pp. 1-24
  9. Avise JC. Perspective: Conservation genetics enters the genomics era. Conservation Genetics. 2010;11:665-669
  10. Allendorf FW. Genetics and the conservation of natural populations: Allozymes to genomes. Molecular Ecology. 2017;26:420-430
  11. Allendorf FW, Hohenlohe PA, Luikart G. Genomics and the future of conservation genetics. Nature Reviews Genetics. 2010;11:697-709
  12. Myers N. Threatened biotas: “Hot spots” in tropical forests. Environmentalist. 1988;8:187-208. DOI: 10.1007/BF02240252
  13. Barnosky AD, Matzke N, Tomiya S, et al. Has the Earth’s sixth mass extinction already arrived? Nature. 2011;471:51-57
  14. Ceballos G, Ehrlich PR, Barnosky AD, et al. Accelerated modern human–induced species losses: Entering the sixth mass extinction. Science Advances. 2015;1:e1400253
  15. IUCN. The IUCN Red List of Threatened Species. Version 2022-2. 2023. Available from: https://www.iucnredlist.org
  16. Fuentes-Pardo AP, Ruzzante DE. Whole-genome sequencing approaches for conservation biology: Advantages, limitations and practical recommendations. Molecular Ecology. 2017;26:5369-5406
  17. Hohenlohe PA, Funk WC, Rajora OP. Population genomics for wildlife conservation and management. Molecular Ecology. 2021;30:62-82
  18. Vasemägi A, Primmer CR. Challenges for identifying functionally important genetic variation: The promise of combining complementary research strategies. Molecular Ecology. 2005;14:3623-3642
  19. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences. 1977;74:5463-5467
  20. Mardis ER. Next-generation DNA sequencing methods. Annual Review of Genomics and Human Genetics. 2008;9:387-402
  21. Roberts RJ, Carneiro MO, Schatz MC. The advantages of SMRT sequencing. Genome Biology. 2013;14:405
  22. Loman NJ, Quick J, Simpson JT. A complete bacterial genome assembled de novo using only nanopore sequencing data. Nature Methods. 2015;12:733-735
  23. Lee H, Gurtowski J, Yoo S, et al. Third-generation sequencing and the future of genomics. BioRxiv. 2016;2016:048603. DOI: 10.1101/048603
  24. Wang GT, Wang ZF, Wang RJ, et al. Development of microsatellite markers for a monotypic and globally endangered species, Glyptostrobus pensilis (Cupressaceae). Applications in Plant Sciences. 2019;7:e01217
  25. Bao WQ, Ao D, Wuyun T, et al. Development of 85 SNP markers for the endangered plant species Prunus mira (Rosaceae) based on restriction site-associated DNA sequencing (RAD-seq). Conservation Genetics Resources. 2020;12:525-527
  26. Cai CN, Xiao JH, Ci XQ, et al. Genetic diversity of Horsfieldia tetratepala (Myristicaceae), an endangered plant species with extremely small populations to China: Implications for its conservation. Plant Systematics and Evolution. 2021;307:50
  27. Andrews KR, Good JM, Miller MR, et al. Harnessing the power of RADseq for ecological and evolutionary genomics. Nature Reviews Genetics. 2016;17:81-92
  28. Qi YX, Liu YB, Rong WH. RNA-Seq and its applications: A new technology for transcriptomics. Hereditas (Beijing). 2011;33:1191-1202 (in Chinese with English abstract)
  29. Warr A, Robert C, Hume D, et al. Exome sequencing: Current and future perspectives. G3 (Genes, Genomes, Genetics). 2015;5:1543-1550
  30. Chong ZC, Ruan J, Wu CI. Rainbow: An integrated tool for efficient clustering and assembling RAD-seq reads. Bioinformatics. 2012;28:2732-2737
  31. Eaton DAR. PyRAD: Assembly of de novo RADseq loci for phylogenetic analyses. Bioinformatics. 2014;30:1844-1849
  32. Jones MR, Good JM. Targeted capture in evolutionary and ecological genomics. Molecular Ecology. 2016;25:185-202
  33. Jing ZY, Cheng KG, Shu H, et al. Whole genome resequencing approach for conservation biology of endangered plants. Biodiversity Science. 2023;31(5):22679
  34. Lewin HA, Robinson GE, Kress WJ, et al. Earth BioGenome Project: Sequencing life for the future of life. Proceedings of the National Academy of Sciences, USA. 2018;115:4325-4333
  35. Frankham R. Where are we in conservation genetics and where do we need to go? Conservation Genetics. 2010;11:661-663
  36. Ma H, Liu YB, Liu DT, et al. Chromosome-level genome assembly and population genetic analysis of a critically endangered Rhododendron provide insights into its conservation. The Plant Journal. 2021;107:1533-1545
  37. Rossetto M, Yap J, Lemmon J, et al. A conservation genomics workflow to guide practical management actions. Global Ecology and Conservation. 2021;26:e01492
  38. Waples RS. Definition of “species” under the Endangered Species Act: Application to Pacific salmon. US Government Technology Report. 1991;53:3
  39. Funk WC, McKay JK, Hohenlohe PA, et al. Harnessing genomics for delineating conservation units. Trends in Ecology & Evolution. 2012;27:489-496
  40. Forester BR, Murphy M, Mellison C, et al. Genomics-informed delineation of conservation units in a desert amphibian. Molecular Ecology. 2022;31:5249-5269
  41. Zhao YP, Fan GY, Yin PP, et al. Resequencing 545 ginkgo genomes across the world reveals the evolutionary history of the living fossil. Nature Communications. 2019;10:4201
  42. Liu PL, Zhang X, Mao JF, et al. The Tetracentron genome provides insight into the early evolution of eudicots and the formation of vessel elements. Genome Biology. 2020;21:291
  43. Waples RS, Kays R, Fredrickson RJ, et al. Is the red wolf a listable unit under the US Endangered Species Act? Journal of Heredity. 2018;109:585-597
  44. Allendorf FW, Leary RF, Spruell P, et al. The problems with hybrids: Setting conservation guidelines. Trends in Ecology & Evolution. 2001;16:613-622
  45. Huang XH, Kurata N, Wei XH, et al. A map of rice genome variation reveals the origin of cultivated rice. Nature. 2012;490:497-501
  46. Girod C, Vitalis R, Leblois R, et al. Inferring population decline and expansion from microsatellite data: A simulation-based evaluation of the Msvar method. Genetics. 2011;188:165-179
  47. Elmer KR, Reggio C, Wirth T, et al. Pleistocene desiccation in East Africa bottlenecked but did not extirpate the adaptive radiation of Lake Victoria haplochromine cichlid fishes. Proceedings of the National Academy of Sciences. 2009;106:13404-13409
  48. Franks SJ, Kane NC, O’Hara NB, et al. Rapid genome-wide evolution in Brassica rapa populations following drought revealed by sequencing of ancestral and descendant gene pools. Molecular Ecology. 2016;25:3622-3631
  49. Fahrig L. Ecological responses to habitat fragmentation per se. Annual Review of Ecology, Evolution, and Systematics. 2017;48:1-23
  50. Wang J, Santiago E, Caballero A. Prediction and estimation of effective population size. Heredity. 2016;117:193-206
  51. Wright S. Evolution in Mendelian populations. Genetics. 1931;16:97
  52. Wright S. Inbreeding and homozygosis. Proceedings of the National Academy of Sciences of the United States of America. 1933;19:411
  53. Kempthorne O. Evolution and the genetics of populations. Volume 2: The Theory of Gene Frequencies. JSTOR. 1971;1971:120-121
  54. McElhany P, Ruckelshaus MH, Ford MJ, et al. Viable Salmonid Populations and The Recovery of Evolutionarily Significant Units. U.S. Dept. Commer. NOAA Tech. Memo. NMFS-NWFSC; 2000. pp. 42-156
  55. Franklin IR. Evolutionary changes in small populations. Conservation Biology. 1980;1980
  56. Jamieson IG, Allendorf FW. How does the 50/500 rule apply to MVPs? Trends in Ecology & Evolution. 2012;27:578-584
  57. Chen Z, Ai F, Zhang J, et al. Survival in the tropics despite isolation, inbreeding and asexual reproduction: Insights from the genome of the world’s southernmost poplar (Populus ilicifolia). The Plant Journal. 2020;103:430-442
  58. Zhu SS, Chen J, Zhao J, et al. Genomic insights on the contribution of balancing selection and local adaptation to the long-term survival of a widespread living fossil tree, Cercidiphyllum japonicum. New Phytologist. 2020;228:1674-1689
  59. Chen J, Hao Z, Guang X, et al. Liriodendron genome sheds light on angiosperm phylogeny and species-pair differentiation. Nature Plants. 2019;5:18
  60. Glémin S, Bataillon T, Ronfort J, et al. Inbreeding depression in small populations of self-incompatible plants. Genetics. 2001;159:1217-1229
  61. Frankham R. Inbreeding and extinction: A threshold effect. Conservation Biology. 1995;9:792-799
  62. McKey D, Elias M, Pujol B, et al. The evolutionary ecology of clonally propagated domesticated plants. New Phytologist. 2010;186:318-332
  63. Mattila TM, Laenen B, Slotte T. Population genomics of transitions to selfing in Brassicaceae model systems. In: Statistical Population Genomics. New York, NY: Humana; 2020. pp. 269-287
  64. Visscher PM, Medland SE, Ferreira M, et al. Assumption-free estimation of heritability from genome-wide identity-by-descent sharing between full siblings. PLoS Genetics. 2006;2:e41
  65. Lencz T, Lambert C, DeRosse P, et al. Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia. Proceedings of the National Academy of Sciences. 2007;104:19942-19947
  66. McQuillan R, Leutenegger AL, AbdelRahman R, et al. Runs of homozygosity in European populations. The American Journal of Human Genetics. 2008;83:359-372
  67. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Research. 2003;31:3812-3814
  68. Adzhubei IA, Schmidt S, Peshkin L, et al. A method and server for predicting damaging missense mutations. Nature Methods. 2010;7:248-249
  69. Ma YP, Liu DT, Wariss HM, et al. Demographic history and identification of threats revealed by population genomic analysis provide insights into conservation for an endangered maple. Molecular Ecology. 2022;31:767-779
  70. Nielsen R. Molecular signatures of natural selection. Annual Review of Genetics. 2005;39:197-218
  71. Oleksyk TK, Smith MW, O’Brien SJ. Genome-wide scans for footprints of natural selection. Philosophical Transactions of the Royal Society B Biological Sciences. 2010;365:185-205
  72. Shang JL, Chen M, Hou S, et al. Genomic and transcriptomic insights into the survival of the subaerial cyanobacterium Nostoc flagelliforme in arid and exposed habitats. Environmental Microbiology. 2019;21(2):845-863
  73. Wang J, Hu ZB, Liao XL, et al. Whole-genome resequencing reveals signature of local adaptation and divergence in wild soybean. Evolutionary Applications. 2022;15:1820-1833
  74. Xia E, Tong W, Hou Y, An Y, et al. The reference genome of tea plant and resequencing of 81 diverse accessions provide insights into genome evolution and adaptation of tea plants. Molecular Plant. 2020;13:14
  75. Niu ZT, Zhu F, Fan YJ, et al. The chromosome-level reference genome assembly for Dendrobium officinale and its utility of functional genomics research and molecular breeding study. Acta Pharmaceutica Sinica B. 2021;11:2080-2092
  76. Coulson T, Benton TG, Lundberg P, et al. Estimating individual contributions to population growth: Evolutionary fitness in ecological time. Proceedings Biological Sciences. 2006;273:547-555
