Open access peer-reviewed chapter

Genomics Era for Plants and Crop Species – Advances Made and Needed Tasks Ahead

By Ibrokhim Y. Abdurakhmonov

Submitted: November 23rd 2015; Reviewed: December 1st 2015; Published: July 14th 2016

DOI: 10.5772/62083


Abstract

Historically, unintentional plant selection and subsequent crop domestication, coupled with the need and desire for more food and feed products, have resulted in the continuous development of plant breeding and genetics efforts. Progress toward this goal elucidated the composition of plant genomes and led to decoding the full DNA sequences of plant genomes controlling the entire plant life. Plant genomics aims to develop high-throughput genome-wide-scale technologies, tools, and methodologies to elucidate the basics of genetic traits/characteristics, genetic diversities, and by-product production; to understand the phenotypic development throughout plant ontogenesis with genetic by environmental interactions; to map important loci in the genome; and to accelerate crop improvement. Plant genomics research efforts have continuously increased in the past 30 years due to the availability of cost-effective, high-throughput DNA sequencing platforms, which have resulted in more than 100 fully sequenced plant genomes with broad implications for every aspect of plant biology research and application. These technological advances, however, have also generated many unexpected challenges and grand tasks ahead. In this introductory chapter, I briefly summarize some advances made in plant genomics studies in the past three decades, plant genome sequencing efforts, current state-of-the-art technological developments of the genomics era, and some of the current grand challenges and needed tasks ahead in the genomics and post-genomics era. I also highlight the related chapters contributed by different authors in this book.

Keywords

  • Plant genome sequencing
  • genetical genomics
  • genomic selection
  • 1KP
  • 1001 plant genomes
  • GEEN

1. Introduction

The Plant Kingdom is a key part of the food chain on our planet. Plant domestication by humankind occurred in early societal development, and subsequent agricultural practice and unintentional and intentional plant breeding led to developing productive crop species that provided food and feed products for all living organisms, including humans [1, 2]. Plant species are very diverse, and there are about 300,000 plant species in the world [3]. Humankind presently grows ~2000 plant species [4] on 15.5 million square kilometers of agriculturally suitable land to fulfill the human diet. Crop domestication with subsequent breeding and farming has created 15 priority crop species, which provide more than 90% of food products [1, 5]. Besides their value as food, plants supply clothing and housing materials, balance agrobiocenosis and earth ecology, provide medicines and treatment for many diseases, produce energy and biofuels, and have many other key properties and uses that help us understand life on our planet [6–10].

Plant domestication, coupled with the need and desire to get more food and feed products, has resulted in continuous development of breeding and genetics efforts [2, 4]. Early primitive selection attempts have subsequently developed the methods of shuffling traits/characteristics between plant genotypes via controlled sexual crosses that discovered the genetics of key characteristics of crops. Furthermore, the development of biological sciences and understanding of the Mendelian and quantitative genetics of phenotypic variations in plant genotypes, equipped with optimized, targeted, and efficient selection, phenotyping, and statistical methods as well as advanced agrochemical technologies of the past centuries, have revolutionized crop breeding efforts. These advances have resulted in the development of superior crop genotypes that have helped to increase agricultural production [11]. Thanks to the “Green Revolution” [11, 12], the efficient exploitation of plant genetic diversity and plant germplasm resources, novel cultivar development, and better and suitable agrochemical technologies for the past 50 years, the world average cereal crop yield has increased 2.6 times (1.35–3.51), whereas there was 5-fold increase in maize production [11]. There are many such examples of successful conventional breeding efforts. Despite this, food deficiency and human starvation still exist widely and will become even worse with an increase of global human population to ~9 billion by 2050 [13], whereby ~1 billion people may suffer hunger [14]. There is a desire and need to feed the increasing human population, sustain agricultural production, and overcome newly emerging biosecurity issues in the era of global climate change with ever worsening environmental conditions on earth, and societal globalization and technological advances [15, 16].

These needs prompted the plant research community to enrich and power the conventional plant breeding and genetics methods with precise tools beyond conventional hybridization, selection, and cultivation/farming practices. This is also dictated by the long duration of conventional breeding and crop improvement, which is affected by the limitations in phenotypic evaluations, the masking effect of the environment, the polygenic nature of many key traits with many unnoticed minor genetic components [11], negative genetic correlations between important agronomic traits [15, 17, 18], linkage drag, and distorted segregation in hybridization between diverse genotypes [15, 17–19].

To address all these, plant researchers have attempted to decipher the molecular basis of genetic diversities by cloning and sequencing the genes encoding the trait of interest and to utilize them in plant breeding as tools in vertical or even, in a revolutionary step, horizontal gene transfers [11]. Progress made toward this goal has elucidated plant genome composition and led to decoding the entire DNA sequences of plant genomes conditioning plant ontogenesis. Here comes “genomics,” a term derived from “genome” (a haploid set of chromosomes), which was coined by Winkler in 1920. First used in 1986, genomics defined “the enterprise that aimed to map and sequence the entire human genome” [20]. Similarly, “plant genomics” is a discipline of plant sciences that aims to decode, characterize, and study the genetic (DNA/RNA) compositions, structures, organizations, and functions as well as molecular genetic interactions/networks of a plant genome [20–29]. Plant genomics aims to develop large-scale high-throughput technologies and efficient tools and methodologies to elucidate the basics of genetic traits/characteristics, genetic diversities, and by-product production; to understand the phenotypic development throughout plant ontogenesis with genetic by environmental interactions; to map important loci throughout the genome; and to accelerate crop breeding and selection on a genome-wide scale.

Plant genomics research efforts have continuously increased in the past 30 years. The number of scientific publications on plant genomics research has increased drastically, reaching 17,210 publications in 2015 as indexed in the PubMed database [30], with a first increase in 2000/2001 followed by a significant peak after 2010 (Figure 1). The first fully sequenced plant genome, that of the model plant Arabidopsis, was published in 2000. Almost 50 plant genomes had been fully decoded by 2013 [31], and the plant sciences community had finished more than 100 plant genomes by 2015 [32]. Furthermore, the plant sciences community has set out a broader vision of sequencing 1001 Arabidopsis accessions [33, 34] and 1000 plant species [35] that “will have broad implications for areas as diverse as evolutionary sciences, plant breeding and human genetics” while generating many unexpected challenges and grand tasks ahead.

Figure 1.

Dynamics of “plant genomics” keyword-retrieved scientific publications in the past three decades. Source: PubMed [30].

2. Genome of plants and crop species

2.1. Challenges and advantages

Compared to other eukaryotic systems, plant genomes are more complex, which creates challenges in studying their DNA composition. First of all, the extraction of high-quality DNA from plant tissues, which are rich in phenolic and other metabolic compounds with high affinity to DNA, is conventionally challenging. This interferes with efficient library preparation for whole-genome sequencing [1], although researchers have optimized methodologies to overcome existing issues [36].

Furthermore, plant genomes have widely different chromosome numbers, retain transposon/retro-transposon transcripts, and show highly varied ploidy levels with many supergenes, pseudogenes, and repetitive elements, including low-, medium-, and high-copy number DNA sequences such as transcribed genes, rRNA genes, and retro-elements or short repetitive sequences, respectively. As a result, plant genomes can be 100 times larger than animal or other model eukaryotic genomes [1] and may contain many paralogous DNA sequences that make sequencing and genome assembly difficult and often generate false-positive errors [37]. For instance, two of the largest examples of sequenced plant genomes, sugarcane (12 Gbs) and hexaploid wheat (17 Gbs), consist of 80% repetitive elements [1, 32].

Moreover, these massive repetitive “junk” DNA sequences, organized as a simple tandem repeat, repeat single-copy interspersion, inverted repeats, and compound tandem array arrangements, somewhat mask functionally vital single-copy genes, which create a challenge to characterize and clone important individual genes [32, 37].

Open-pollinated, self-pollinated, and clonally propagated plant species have a high level of nucleotide diversity. This can be exemplified by the nucleotide diversity of maize, barley, and grape genomes, where the maize genome, for instance, has 10-fold (up to 13%) more polymorphic sites between individual genotypes compared to humans with a similar genome size [32, 37]. These polymorphic sites create a challenge in sequence assembly due to the higher rates of nucleotide mismatches to the reference genome.
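To make the idea of polymorphic sites between genotypes concrete, the following is a minimal sketch of how an average pairwise difference (often called nucleotide diversity) could be computed. It assumes the sequences are already aligned to the same length; the toy sequences and the function names are hypothetical and are not taken from the cited studies.

```python
from itertools import combinations


def pairwise_difference(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def nucleotide_diversity(sequences):
    """Mean pairwise difference over all pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment of three genotypes (10 sites each).
toy_alignment = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC"]
pi = nucleotide_diversity(toy_alignment)
print(f"mean pairwise difference: {pi:.3f}")
```

The higher this value is for a species, the more reads from a given genotype will mismatch a single reference sequence, which is the assembly difficulty described above.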

Lastly, plants tend to have abundant copies of chloroplast genome with two inverted repeat organizations as well as large inversions in some plants with some exchanged regions between nuclear genomes. This creates another challenge in the assembly of repetitive and exchanged regions of chloroplast genomes [32]. The same issue exists in the case of mitochondrial genomes, although it is common for animal genomes as well. All these challenges and complications mentioned above may result in generating fragmented, isolated, and incorrect assemblies in the background of high-copy repeats and paralogous sequences.

However, some specific methodologies and bioinformatics tools have been developed to minimize these challenges. These include optimized DNA isolation from difficult plant materials [36], the use of high-density linkage maps, the identification and sorting out of paralogous alleles using local patterns of linkage disequilibrium, and the sequencing of diploid relatives or ancestor-like genomes of polyploid plants [37]. Laser capture microdissection techniques can isolate individual cell types, chromosomes, or chromosome arms, which could minimize the ploidy or paralogy complexities [27]. Moreover, the use of third-generation single-molecule sequencing approaches [1] and novel assembly methods such as optical mapping and long-range Hi-C interactions can also minimize some of the challenging cases with plant genomes mentioned here; these are covered in detail in the chapter by Deschamps and Llaca in this book.

At the same time, along with these challenges and complexities, plants also offer advantages [37] in genome analyses over other eukaryotic systems. Clonal propagation and long-term seed storage create an opportunity to collect the same DNA samples repeatedly for sequencing and to study the corresponding phenotype multiple times over many generations across replicated environments [37]. The repeated use of plant materials raises no ethical issues, whereas this is a sensitive issue in animal studies. The possibility of self-pollination or forced crosses helps to create highly homozygous samples and so reduce existing heterozygosity. There is also an opportunity to obtain doubled haploid plant genomes [37]. Plant genomes tend to have large chromosomal segments conserved across a large number of taxa in closely related plant species. The collinearity and synteny of plant genomes make reference genomes of model species very useful for studying homologous and orthologous genes from yet unsequenced genomes [20].

2.2. Sequenced plant genomes

The ability to sequence DNA molecules, which was made possible in the 1970s with the introduction of the “plus and minus” sequencing technique of Sanger and Coulson [38] and the chemical cleavage method of Maxam and Gilbert [39], is generally considered to be the starting point of genomics sciences. Later, the simple, long-read chain-terminating dideoxynucleotide DNA sequencing method [40] became the method of choice to decode genetic sequences. Its eventual automation [41] extended the capacity and power of chain-termination sequencing to decode the entire genome sequences of living organisms. Because of technological advances and automated sequencing instrumentation [27], large-scale sequencing of cDNA libraries made it possible to perform serial analysis of gene expression (SAGE) and to generate expressed sequence tags (ESTs). These were the first genomics technologies for all organisms, including plants [42]. Furthermore, these advances, powered by microarray tools routinely used by many individual laboratories worldwide, have helped to identify the genome structures and functional and regulatory elements across genomes [27] and have facilitated the development of high-throughput, reliable molecular markers for genome/trait mapping studies.

The development and generation of massively parallel sequencing technologies [44] provided cost-effective, new-generation sequencing (NGS) platforms that have helped to completely decode the entire genome of many different organisms within a short period. For instance, in plants, the first sequenced genome was a model plant Arabidopsis thaliana with 125 Mbs in size, 25,489 individual genes, and 14% repetitive elements, which was published in 2000 [5]. Further, more than 109 plant genomes have been fully sequenced by 2015 [32], including 21 monocots and 83 eudicots, 10 model and 15 non-model plant genomes, five non-flowering plants, and 69 crop species with 6 crop model genomes and 15 wild crop relatives [32]. Following the Arabidopsis model, several rice (Oryza sativa) genomes in 2002 to 2005, black cottonwood (Populus trichocarpa) genome in 2006, and grape (Vitis vinifera) genome in 2008 were fully sequenced. Sequencing whole plant genomes has increased in subsequent years, and 10 plant genomes had been sequenced in 2011. About 80% of sequenced genomes were accomplished in the past 3 years (2012–2014; Figure 2).

Figure 2.

Number of sequenced plant genomes from 2000 to 2014. Source: Ref. [32].

The smallest plant genomes sequenced so far [32] are those of two eudicot plants: corkscrew (Genlisea aurea) with 64 Mbs genome size and 17,755 genes [45] and bladderwort (Utricularia gibba) with 77 Mbs genome size and 28,500 genes [46]. In contrast, the largest genomes sequenced are from gymnosperm plants, including Norway spruce (19,600 Mbs) [47], white spruce (20,000 Mbs) [48], and loblolly pine (23,200 Mbs) [49]. The largest genome sequenced from crop species is that of hexaploid wheat (Triticum aestivum) with a genome size of 17,000 Mbs [50]. The average size of all published plant genomes is 1850 Mbs. Per published plant genome data [32], gene numbers range from 17,755 (corkscrew) to 124,201 (hexaploid wheat) with an average of 40,738 genes for all sequenced genomes. The content of repetitive elements is highly variable among published genomes, ranging from 3% (bladderwort) to 85% (maize, Zea mays) with an average estimate of 46% per genome. These sequenced plant genomes not only provided updated knowledge on the structural composition and complexity of plant genomes but also elucidated the evolution of gymnosperm and angiosperm plants and specific gene families contributing to the radiation of flowering plants. Genome size shows some direct correlation with gene number and repetitive element content, but the relationship is not strict, as several exceptions show. For example, one of the largest genomes, Norway spruce, has ~28,000 genes, which is similar to the number in bladderwort, one of the smallest genomes. Moreover, the medium-sized maize genome (2300 Mbs) or wild tomato (1200 Mbs) contains more or approximately the same (<80%) contents of repetitive elements compared to the largest sequenced genome of loblolly pine (23,200 Mbs) [32].
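The weak link between genome size and gene number can be illustrated with a short calculation of gene density. The sketch below uses only the genome sizes and gene counts quoted in the paragraph above; the dictionary layout and the helper name `genes_per_mb` are illustrative choices, and the Norway spruce gene count is the approximate value (~28,000) given in the text.

```python
# Genome size (Mbs) and gene count as quoted in the text above.
genomes = {
    "corkscrew (Genlisea aurea)": (64, 17_755),
    "bladderwort (Utricularia gibba)": (77, 28_500),
    "hexaploid wheat (Triticum aestivum)": (17_000, 124_201),
    "Norway spruce": (19_600, 28_000),  # gene count is approximate
}


def genes_per_mb(size_mb, gene_count):
    """Gene density: number of annotated genes per megabase of genome."""
    return gene_count / size_mb


density = {
    name: genes_per_mb(size_mb, gene_count)
    for name, (size_mb, gene_count) in genomes.items()
}

# List genomes from the most to the least gene-dense.
for name, value in sorted(density.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.1f} genes per Mbs")
```

With these figures, the Norway spruce genome is more than 250 times larger than that of bladderwort while carrying a similar number of genes, so gene density differs by orders of magnitude across sequenced plants.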

In this book, the chapter by Galla et al. (Section 2) presents the results of the first draft of the full genome sequence and assembly of a fresh salad plant leaf chicory (Cichorium intybus subsp. intybus var. foliosum L., 2n=2x=18, and 1.3 Gbs genome size), named as Radicchio in Italian. The results of decoding the full genome of leaf chicory will “extend the current knowledge of the genome organization and gene composition of leaf chicory, which is crucial for developing new tools and diagnostic markers useful for our breeding strategies in Radicchio” and will be an important addendum to the list of sequenced plant genomes.

2.3. Sequencing “1001 genotypes” and “1000 plant species”

The availability of only a few whole reference genomes limits our full understanding of ecotypic variations that affect the function and adaptive evolution of plant species in various climatic conditions. It reduces the power of genome-wide tagging of biologically meaningful natural variations. In other words, the general perception is that “a single reference genome is not enough” for plant biology to explain and understand the existing natural variations in a particular plant species and its populations [33, 34]. It also limits the development of efficient tools for genome analyses. To address this, as mentioned above, the Arabidopsis plant research community has developed a vision of sequencing a larger number of Arabidopsis accessions, including various ecotypic and experimental population samples. As of today, the “1001 genome sequencing project of Arabidopsis accessions” has completed the full genome sequencing of 1100 Arabidopsis accessions [33, 34] “to record the genetic variation in the entire genome of many strains of the reference plant Arabidopsis thaliana,” with the future objective of developing efficient genome analysis tools and software [33].

To understand the tree of life of the Plant Kingdom and study its evolutionary aspects in comparison to other life forms, the international multi-disciplinary consortium of “The 1000 Plants (oneKP or 1KP) Initiative” has generated large-scale gene sequencing data for more than 1000 plant species [35]. Rather than concentrating on accessions of a single species, as in the 1001 Arabidopsis whole-genome sequencing project [33, 34], the “1KP” project targeted 1000 distinct plant species with the objective of generating only functionally expressed (i.e., transcriptome) gene sequences. The selection of plant species for the project was not restricted, and the samples were “chosen to represent every species known to science, across the Plant Kingdom, at some phylogenetically or taxonomically defensible levels” [35]. The 1KP sample list consists of 1328 entries [51], broadly grouped by phylogeny (angiosperm, non-flowering, and green algae species) and by application (agriculture, medicine, biochemistry, and extremophytes). Most of these species have been sequenced for the first time (Table 1).

To date, an average of 2000 Mbs of transcriptome sequence data has been generated for each of these 1KP plant species using 28 Illumina Genome Analyzer next-generation DNA sequencing machines at the Beijing Genomics Institute (BGI-Shenzhen, China) [35]. Ultimately, the obtained sequence data will be used to analyze the phylogenetic, taxonomic, and evolutionary relationships of plant species, to study plant speciation, and to determine the timing of gene duplications during speciation events [35, 52]. However, the biggest limitation is that only transcriptomes are sequenced rather than whole genomes, so many non-coding and repetitive portions of genomes are not captured. The results of the “1001” and “1KP” sequencing efforts will undoubtedly open a new paradigm for plant genomics and its above-mentioned sub-disciplines. The results should not only accelerate crop improvement and boost agricultural and medicinal production worldwide but also help to understand the basics of plant life, evolution, speciation, and plant adaptation to extreme environments in the era of global climate change and technological advancements.

Plant species*                                  Number of samples
Phylogenetic groups
  Angiosperms                                   830
  Angiosperms: Onagraceae samples               50
  Non-flowering plants                          257
  Green algae                                   241
Application groups
  Medicinal samples                             142
  Medicine - Alkaloid samples                   30
  Medicine - Chemotherapeutic samples           12
  Biochemistry - Lipid Biosynthesis samples     15
  Agriculture - C3/C4 samples                   93
  Agriculture - Weeds                           25
  Extremophyte samples                          31
  Halophytes samples                            18

Table 1.

Plant species samples chosen for the “1KP” plant genome sequencing project.

*The number of samples overlaps among groups. Source: Ref. [51].
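As a quick consistency check on Table 1, the three main phylogenetic groups can be summed and compared with the 1328 entries reported for the 1KP sample list. The sketch below assumes that the 50 Onagraceae samples are already counted within the angiosperm total; that is an interpretation of the overlap footnote, not something the source states explicitly. Variable names are illustrative.

```python
# Sample counts for the main phylogenetic groups, copied from Table 1.
phylogenetic_groups = {
    "Angiosperms": 830,  # assumed to include the 50 Onagraceae samples
    "Non-flowering plants": 257,
    "Green algae": 241,
}

# Total number of entries in the 1KP sample list, as stated in the text.
reported_entries = 1328

phylogenetic_total = sum(phylogenetic_groups.values())
print(f"sum of phylogenetic groups: {phylogenetic_total}")
print(f"matches reported entries: {phylogenetic_total == reported_entries}")
```

Under that assumption the phylogenetic groups account for every entry, while the application groups are overlapping subsets of the same samples.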


In this book, we have presented several chapters targeting to review and discuss the strategies for sequencing and assembly challenges (by Deschamps and Llaca), new-generation sequencing platforms for comparative genomics of cereal crops (by Sikhakhane et al.) and non-model cactus plant Nopal (Opuntia spp.; by Alonso-Herrada et al.), and characterization of small RNA world of plant genomes (Hernández-Salazar et al.). These chapters describe the current advances and future needs on these topics.

3. Crop improvement in the genomics and post-genomics era

3.1. Genomics-assisted selection or genomic selection

At present, the reference genomes for many agricultural plants, including specialty crops, have been sequenced, as reviewed by Michael and VanBuren [32], which has created a new paradigm for modern crop breeding. Crop breeding, which was powered and enriched by molecular markers, genetic linkage maps, QTL mapping, association mapping, and marker-assisted selection methods in the past century [37, 53], has now greatly accelerated and become ever more productive and efficient in the plant genomics era [26]. This is due to the (1) availability of large-scale transcriptome and whole-genome reference sequences [32]; (2) high-throughput SNP marker collections and cost-effective, automated, high-throughput genotyping platforms (HTP) and technologies (e.g., genotyping by sequencing or GBS), allowing breeders to screen multiple genotypes within a short time [23, 26]; (3) identification and use of expression QTLs (genetical genomics) in breeding [22]; and (4) opportunity to perform genome-wide selection (i.e., genomic selection) [26].

The biggest driving force for genomics-assisted crop breeding in the plant genomics era has been the opportunity for inexpensive sequencing and re-sequencing of individuals from genetic crosses and breeding lines. This helps to precisely identify genetic variations and link them to phenotypic expression, taking into account the rare and private allelic variations that are abundant in crop line populations or germplasm resources [26, 53, 54]. Furthermore, the availability of SNP marker collections and automated genotyping platforms has provided better genome coverage for genome-wide genotype-to-phenotype association studies (GWAS) [11, 37]. Also, when whole-genome sequences are not available and SNP markers are present in a limited number, breeders using GBS and HTS platforms can readily genotype their mapping populations and perform genomic selection for the targeted crops of interest [23, 26, 54]. Although it was first applied in animal breeding [55], genomic selection has recently been applied successfully to a number of plant species [56–62], including studies using GBS in the context of genomic selection [26]. Most importantly, the application of available genomics tools, a large number of high-throughput DNA markers, and new-generation genotyping platforms has made “breeding by design” [63] possible and has led to “virtual breeding” approaches [64] for efficient crop improvement. Several chapters in this book cover the advances toward plant resistance genomics and molecular breeding against bacterial diseases in ryegrasses (see the chapter by Dr. Takahashi) as well as biotic/abiotic stress tolerance in agricultural crops (see the chapters by Onaga and Wydra, and Rao et al.).

The availability of genome sequences and a large number of SNP marker collections has also enabled the analysis of copy number variations (CNVs) in crop genomes, and their links to key traits have greatly enhanced crop improvement programs [11, 22, 23, 26, 37]. Furthermore, although challenges are evident, the opportunity provided by post-genome sequencing advances has helped to integrate and enrich genomic selection with key proteome and metabolome markers. This has significantly fostered and powered up the breeding of complex crop traits [22]. Consequently, the knowledge gained through plant genomics, coupled with proteomic and metabolomic advances, has facilitated the emergence of an innovative approach of “personalized” agriculture through the utilization of chemical genomics [21]. This requires translating into agriculture the knowledge and expertise of the pharmaceutical industry in developing “personalized medicine,” which treats each person based on his or her reaction to medical drugs. Because of high-throughput genome analysis, it is now possible to screen many plant compounds, including herbicides, growth regulators and phytohormones, elicitors, low molecular metabolites (e.g., salicylic acids), and/or synthetic hybrid chemicals, for the genetic response of individual crop genotypes and to study their mechanisms of action contributing to agricultural productivity. Once identified, highly genotype-specific chemical compounds can be developed that perform better than the traditionally applied “fit for all” chemicals/growth stimulators and fertilizers. A combination of such a chemical genomics approach, proteomics, and metabolomics with genetic engineering and genomic selection will further provide a way for “personalized” agriculture that sustains crop production (for detailed discussions, see a review by Stokes and McCourt [21]).

3.2. Novel transgenomics tools and biotech crops

Crop improvement is also greatly impacted by novel transgenomics and genome editing technologies developed as a result of plant genome characterization and understanding in the era of plant genomics. In the past two decades, a variety of novel transgenomics technologies have been developed to replace or enrich the traditional transgenesis-based genetic engineering and plant molecular biotechnology [65]. These novel technologies include antisense, RNA interference (RNAi), artificial microRNA expression (amiR), virus-induced gene silencing (VIGS), zinc-finger nuclease (ZFN), transcription activator-like effector nucleases (TALENs), oligonucleotide-directed mutagenesis (ODM) of the Cibus Rapid Trait Development System (RTDS), and clustered regularly interspaced short palindromic repeats/Cas9 (CRISPR/Cas9) technologies [65, 66]. These novel transgenomics technologies, including genome-editing tools (the latter also referred to as genome editing with engineered nucleases, GEEN), are widely developed and utilized to investigate gene function and to solve problems in medicine and agriculture. They have become methods of choice for major functional genomics and biotechnological studies [67]. RNA-mediated genome manipulation (RNAi) tools down-regulate the target genes through gene silencing effects at transcriptional (TGS) or post-transcriptional (PTGS) levels, whereas GEEN systems help to insert, replace, or remove specific regions of DNA from a genome using artificially engineered nucleases that are referred to as “molecular scissors” [68–70]. For a detailed description of RNAi, readers are referred to the chapter by Ricano-Rodriguez et al. in this book as well as to the recently published “RNA Interference” book by InTech Open.

The potential application of RNA-mediated gene silencing methods for crop improvement, including RNAi in plant biotechnology, is huge, and the technology has already generated many successful examples in a wide range of technical, food, and horticultural crops. For example, RNAi was used to improve crop yield, food/fiber quality [18, 71–75], and resistance to pests and biotic/abiotic stresses [76, 77], and the resulting crops are being considered for commercialization or are already in commercial production [78]. ODM-mediated single nucleotide editing in Arabidopsis, targeting the BFP gene, has demonstrated a precise edit of CAC to TAC, converting histidine (H66) to tyrosine (Y66) to yield GFP protein, which offered a non-transgenic breeding tool for crops [66]. Similarly, GEEN tools have also provided a new strategy for “trait stacking,” whereby several desired traits are physically linked to ensure their co-segregation during the breeding processes [79]. The examples include A. thaliana [80–82] and Z. mays [83], where ZFN-assisted gene targeting has helped to heritably insert herbicide-resistance genes (SuRA/SuRB and PAT) into the targeted sites in the genome [83]. Although other GEEN technologies such as TALEN [84–92] and CRISPR/Cas9 [93] are only beginning to be applied in plants, their utilization in Arabidopsis [84], maize [85], rice [86–88], potato [89, 90], wheat [65], barley [91], and plum [92] holds great promise and potential for non-transgenic crop genome modification and improvement [65, 94].
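The single-nucleotide edit described above (CAC to TAC, histidine to tyrosine) can be followed with a few lines of code. This is only an illustrative sketch: the partial codon table covers just the four standard-genetic-code codons needed here, and the helper name `edit_base` is hypothetical; it does not model the ODM procedure itself.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "CAC": "His",
    "CAT": "His",
    "TAC": "Tyr",
    "TAT": "Tyr",
}


def edit_base(codon, position, new_base):
    """Return a copy of the codon with one base replaced (0-based position)."""
    if not 0 <= position < len(codon):
        raise IndexError("position outside the codon")
    return codon[:position] + new_base + codon[position + 1:]


original_codon = "CAC"                             # codon 66 before the edit
edited_codon = edit_base(original_codon, 0, "T")   # C to T at the first base

print(original_codon, "encodes", CODON_TABLE[original_codon])
print(edited_codon, "encodes", CODON_TABLE[edited_codon])
```

A change at the first base of the codon is enough to switch the encoded amino acid, which is why a single-nucleotide edit can alter the protein without inserting foreign DNA.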

4. Grand tasks ahead

The revolutionary advances made in the past three decades in plant genomics and its sub-disciplines have provided a mass of novel opportunities with easy-solution applications and high-throughput, cost-effective, and time-effective technologies. The plant genomics era has increased our understanding of the basis of complex life processes/traits in plants and crop species, and it has paved the way for effective improvement of plants to fulfill our diet and other needs. However, it has also piled up challenging grand tasks ahead for the current genomics and post-genomics era. Several chapters of this book discuss some aspects of these challenges, and I briefly summarize some of them here.

As mentioned above, tremendous achievements have been made toward sequencing more than one hundred plant genomes, including major crop species and specialty, model/non-model, wild, vascular, flowering, and polyploid plants [31, 32]. There are ongoing and fascinating consortia projects sequencing “1001 genotypes of Arabidopsis” and “1000 various plant species” [33–35, 51, 52]. However, the first current and future task ahead is to extend such large-scale, multiple-accession genome sequencing initiatives to each priority agricultural and specialty crop species, including their wild relatives and ancestor-like genome representatives. Although it sounds highly ambitious, this task will be mandatory and important for the next plant genome sequencing phase. Its purpose is to use effectively all variations existing among plant/crop germplasm resources and their ecotypic populations and to design efficient GWAS analyses and consequent genomic selection as well as tools/software programs for better analyzing plant genomes and resolving genome assembly issues [33–35]. This is especially needed for polyploid crops [24, 32, 37] because the sequencing of many polyploids and their subgenomes would increase our understanding of the complexity of polyploidy, gene silencing, epigenetics, and biased retention and expression of genes after polyploidization [24, 95–97]. Furthermore, it would also help to discover all natural variations and the genes lost during crop domestication, which should be useful to restore key agriculturally important traits in the future.

Sequencing the entire genomes of the 1KP samples, rather than concentrating only on the transcriptome/exome, is also a necessary task ahead that would elucidate many important noncoding sequences from these plant species. The results would be useful for plant evolution, speciation, and taxonomy studies. Planning toward this goal is ongoing, and it should not cause much trouble in light of the experience gained and the inexpensive high-throughput sequencing technologies now available [1, 27, 32].

Although high-throughput DNA sequencing instrumentation exists and keeps improving from year to year, a consequent task is still to increase sequence read length, which would resolve many incorrect sequence sites and the genome assembly challenges that plant genomics currently faces [1, 32]. Some of the ongoing efforts and possible solutions arising from third-generation sequencing platforms and the genome assembly tools and methodologies highlighted herein are discussed in several chapters of this book.

A consequent grand task and challenge arising from the completion of the above-highlighted tasks is handling, organizing, systematizing, and visualizing the huge amount of plant genome sequencing data (“Big Data”), which requires urgent attention, effort, collaborative work, and investment. There is an urgent need to develop more efficient bioinformatics platforms for plant genome data because of the challenges, specificities, complexities, and sizes of the currently available and future sequenced plant genomes mentioned herein [1, 98]. Funding this aspect of plant genomics and bioinformatics research is a key step [1] for future advances on this task.

Furthermore, the most important current and future post-genomics grand task is to link sequence variation(s) with phenotype(s), trait expression, and the epigenetic and adaptive responses of plants to their living environment and to extreme conditions. The successful completion of this task will require combining genomics with bioinformatics, proteomics, metabolomics, phenomics, genomic selection, genetical genomics, reverse genomics, systems biology, etc. [11, 21–29, 64, 65, 98]. In other words, there is a need to make sequenced genomes “functional” [31] and biologically meaningful [29, 37]. This requires integrating all available genomic and phenotypic data to identify key networks, followed by the downstream effort of integrating these specific networks with the networks of other systems in order to connect heterogeneous data [29]. It has been suggested that plant genomics should aim to develop a plant genome-specific “Encyclopedia of DNA Elements (ENCODE)” [31, 32], which would be an important achievement in the next phases of development. There is a need to use molecular phenotyping (i.e., using molecular processes such as protein-RNA interactions, translation rates, etc.) in QTL mapping [23], which would help to link sequence variation(s) precisely to phenotype(s). There is also the task of developing and translating the concept of “personalized agriculture” [21], an unexplored area in crops that deserves attention given the availability of sequenced genomes; high-throughput genotype, proteome, metabolome, and phenotype profiling platforms; and rapid crop line development tools such as genomic selection and the new-generation genome-editing tools mentioned above. All of these will help to reduce current crop line development costs through efficient breeding [11, 22, 23, 26]. These particular grand tasks further highlight the need for extended effort on the development of inexpensive high-throughput plant phenotyping [25, 26] and of plant proteome and metabolome profiling tools and instrumentation [27, 28] that can utilize small-amount, single-cell-derived samples [27–29].

A parallel grand task to the above-outlined needs is to concentrate efforts on the timely application of novel transgenomics and genome-editing tools to all types of plants and to optimize them for routine large- and small-scale use in the biotechnology industry. The most pressing tasks are to (1) utilize the complex effects of plant developmental genes (e.g., core microRNA/RNAi machinery) to simultaneously improve key traits and overcome negative trait correlations [15, 18] and (2) optimize and better design novel transgenomics and genome-editing technologies for the key priority crops and for plant by-product production. In addition, there are needs to (3) identify the appropriate choice of plant tissues for genome editing, (4) reduce or eliminate the side effects, off-target toxicity, and mutagenesis associated with novel genome modification technologies, and (5) develop reliable screens for the detection of edited genome samples [99]. The revolutionary effects of these novel genome-editing/manipulation technologies and genome-edited organisms (GEOs), as well as their safer nature compared to conventional transgenesis, are evident. However, without objective and proper regulatory policies that provide understanding and remove confusion among regulatory agencies and stakeholders [94], “these technologies may not live up to their full potential” [64] if they are regulated as genetically modified organisms bearing foreign genes [64, 94]. Therefore, this is one of the most important grand tasks facing the plant science research community in the era of plant genomics and post-genomics.

Finally, the grandest task is the preparation of well-qualified next-generation scientists capable of continuing the plant genomics tasks highlighted herein, with an understanding of conventional plant biology, ecology, plant breeding, evolution, taxonomy, modern “omics” disciplines, and cross-related scientific disciplines (e.g., mathematics, computing, and modeling) [1, 98]. Importantly, they must be able to utilize modern computing and instrumentation platforms and bioinformatics knowledge [29]. For instance, there is a huge need for a new generation of molecular breeders [100] with full knowledge and appreciation of conventional plant breeding, including an understanding of agrotechnology methodologies, the genetic diversity of crop germplasm, and randomized multi-environment field trials. These breeders also need to be able to handle and utilize sequenced genomes and high-throughput genotyping and phenotyping platforms. This is a bottleneck for plant genomics at present, which requires urgent awareness, attention, and investment.

5. Conclusions

Thus, in the past three decades, plant genomics has evolved from the advances made in conventional genetics and breeding, molecular biology, molecular genetics, molecular breeding, and molecular biotechnology, in the light of high-throughput DNA sequencing technologies that have empowered the plant research community to sequence and understand the genetic compositions, structures, architectures, and functions of full plant genomes. Technological and instrumentation advancements, together with the desire and need to feed the increasing human population, overcome biosecurity issues, and sustain agricultural production in an era of global climate change and societal globalization, have been the main driving forces for plant genomics development. These have made it possible to sequence and assemble entire plant genomes, including those of very complex polyploid plants; annotate gene functions; link sequence variation(s) to phenotype(s); and exploit sequence variation(s) in plant/crop improvement on a genome-wide scale or through targeted native modification of plant genomes in a highly sequence-specific manner.

To date, more than 100 plant genomes, including those of a large number of crops as well as flowering, non-flowering, crop wild relative, model and non-model, and specialty plants, have been fully sequenced. This has expanded our knowledge and understanding of many aspects of plant biology, genetics, breeding, and crop evolution and domestication, which has contributed to the development of analytical and breeding tools and resulted in accelerated crop improvement programs. At an even finer scale, more than 1100 Arabidopsis accessions from various eco-geographic origins and experimental populations have been fully sequenced, which will equip plant researchers with better analysis tools and help in tagging and exploiting biologically meaningful variations. Furthermore, transcriptome profiling of 1000 distinct plant species with agricultural, medicinal, biochemical, and evolutionary utility has great value and will be “a gold mining” opportunity for plant biology to explain the evolution of the tree of life and speciation in the Plant Kingdom. All of these successes have significantly accelerated crop improvement using novel genomic selection and new-generation genome-editing and manipulation technologies.

These advances, briefly highlighted herein, have also generated a number of grand challenges and mandatory tasks for the plant genomics and post-genomics era. Many tasks lie ahead for the plant genomics community, and they require more collaboration, integrated approaches, better computing capacity and analytical tools, accelerated training and education of well-qualified researchers, and larger investments. In this book, the authors have tried to highlight some updates on current plant genomics efforts with future perspectives. We trust that the next phase of plant genomics efforts and development will be even more exciting and will help to solve the current and future issues facing humanity.

Acknowledgments

Plant genomics research in Uzbekistan is being jointly funded by the basic science (FA-F5-T030), applied (FA-A6-T081 and FA-A6-T085), and innovation (I-2015-6-15/2 and I5-FQ-0-89-870) research grants of the Academy of Sciences of Uzbekistan and Committee for Coordination Science and Technology Development of Uzbekistan. I thank the Office of International Research Programs (OIRP) of the U.S. Department of Agriculture (USDA) Agricultural Research Service (ARS) and the U.S. Civilian Research & Development Foundation (CRDF) for international cooperative grants P120, P120A, P121, P121B, UZB-TA-31016, UZB-TA-31017, and UZB-TA-2992, which were devoted to cotton genomics, including cotton gene characterization, germplasm analysis, genetic mapping, plant disease resistance, marker-assisted selection, and biotechnology. I also thank Dr. Din-Pow Ma, Mississippi State University, for the critical reading of this introductory chapter.

How to cite and reference


Ibrokhim Y. Abdurakhmonov (July 14th 2016). Genomics Era for Plants and Crop Species – Advances Made and Needed Tasks Ahead, Plant Genomics, Ibrokhim Y. Abdurakhmonov, IntechOpen, DOI: 10.5772/62083. Available from:
