Opuntia DNA content and other plant species
From a genomic point of view, plants are complex organisms. Plants adapt to the environment by developing different physiological and genetic properties and by changing their genomic and expression profiles of adaptive factors, as exemplified by polyploidy studies. These characteristics, along with the presence of duplicated genes and genomes, made plant sequencing with early low-throughput DNA sequencing technologies a challenging task. The development of new technologies for molecular analysis, including transcriptome, proteome, and microarray profiling, opened a new perspective in genomic analysis and made it possible to start programs in species without genomic maps. The opportunity to extend molecular studies from the laboratory model scale toward naturally occurring plant populations has made it possible to address long-standing ecological and evolutionary questions more precisely. Some plant species have unique properties that could help to understand their adaptability to the environment, crop production, pest protection, or other biological processes. Molecular studies on non-model plants, including algae, mosses, ferns, and plants with very specific characteristics, are ongoing.
- Genome size
- wild materials
The first wave of plant genome sequencing has passed, and a new era has started in plant genomics research with next-generation sequencing (NGS) strategies, which respond to a mixture of economic and scientific needs. Several crops have been sequenced and the sequencing of others is underway, which will greatly help to elucidate unknown biological processes and the phylogenetic relationships among crop plants. Furthermore, genomic data analysis and its integration with biological systems will help to establish fundamental models for understanding the evolution, development, and adaptability of plants.
A genomic sequence is important information for basic research and for understanding plant evolution and development. It also serves as a tool for engineering new genotypes. Different plant species have different amounts of DNA. The DNA of most plants comprises from 100 million to, in the largest known example, 150 billion base pairs (the letters of the genetic alphabet), organized into 20,000 to 50,000 genes. The most important contribution in the field of plant genome analysis is the discovery that many higher plants share a common blueprint of gene content. Even distantly related plant taxa such as monocots and dicots, which diverged from a common ancestor about 200 million years ago, retain some common gene order along the genome.
The development of new strategies and technologies for genome sequencing can lead to programs that obtain a partial (transcriptome) or a complete (whole genome sequencing, WGS) DNA sequence for non-model plants. The costs of these projects are now accessible. The first plant genome sequencing project represented an effort of several years and millions of dollars. Now, sequencing costs are in the order of thousands of dollars, and new bioinformatics tools are available for the analysis of the generated sequence data.
The world’s population depends on a few crops such as rice, wheat, maize, and potato for its food. In the following decades the world will face the tremendous challenge of feeding the global population. The study of plant genomics in non-model plants will help to reveal the genetic factors and biochemical pathways involved in many processes such as flowering, nutrition, disease and pest resistance, as well as tolerance of plants to abiotic stresses.
Model organisms are important for biological and agricultural approaches. Research in model organisms has generated a huge amount of important information on different molecular factors that contribute to plant growth and development; however, it has some limitations. The study of wild plants will help to overcome these limitations. Wild plants are well adapted to extreme conditions and are resistant to plant pathogens.
Model organisms offer only a limited number of uniform, narrow-based genotypic samples, or variability limited to a number of specific plants. The study of how some plants survive in extreme conditions may also provide clues about the mechanisms of plant response to biotic or abiotic stresses. Some non-model organisms are an extraordinary source of plant secondary metabolites.
Wild plant genotypes sometimes do not look attractive for breeding programs because of their morphology; however, they are the repository of ancestral genes and very important sources for the rescue of specific traits.
In Mexico, a large number of wild plant populations fit for breeding or sequencing programs are yet to be identified. Wild plant populations are genetically diverse and are a source of genes that encode proteins with potential uses for health, industrial, or ecological purposes.
2. Non-model plants as models for environmental adaptation
The cacti of the Cactaceae family are an example of plants that can adapt to several environmental conditions. One of these plants is nopal (Opuntia spp.), which belongs to the genera Opuntia and Nopalea. This plant is endemic to semiarid areas of Mexico but grows throughout the American continent, from Canada to Patagonia in Argentina, under environmental conditions that differ from area to area. Recently, nopal has become an interesting alternative fruit and forage crop worldwide. Only a few varieties of nopal fruit, originating from the Mexican nopal germplasm, are available in the market.
The first use of nopal in Mexico dates back to the ancient Mesoamerican civilizations; people used to collect cladodes and fruits from wild materials for their nutritional qualities and for medicinal purposes. The Spanish conquerors spread nopal in America and Europe; now it is cultivated in Italy, Morocco, Tunisia, Greece, Israel, India, Philippines, China, Australia, South Africa, Brazil, Argentina, Colombia and the United States [6-8].
Although nopal is propagated asexually for commercial purposes, seed propagation is essential for breeding. Apomixis in nopal makes the screening of individuals obtained from crosses difficult and complicates genetic studies. Although no genomic map exists for this multipurpose plant, several genomic approaches have been attempted, and it has been included in the 1000 genomes sequencing program. To date, extensive efforts have been made on cDNA microarrays, microRNA (miRNA) microarrays, mRNA deep sequencing and molecular marker studies.
To study the genes associated with crassulacean acid metabolism (CAM), an expressed sequence tag (EST) database of different developmental stages of various tissues was created. Sequences were assembled and compared with the available plant and genetic databases; genes involved in circadian regulation and CAM were identified in plants grown under a long-day regime. Three kinds of expression profiles were found: transcripts that oscillated with a 24-h periodicity; transcripts of light-active genes that adapted to cycles of 12-h periodicity; and transcripts with arrhythmic accumulation patterns. Some genes scored best against a 12-h rhythm, suggesting a difference from Arabidopsis at the level of circadian clock gene interactions. The results indicate that changes in CAM are the result of modified circadian regulation at the transcriptional and posttranscriptional levels.
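The three profile classes described above (24-h, 12-h, arrhythmic) can be illustrated with a small scoring routine. The sketch below is not the method used in the cited study; it assumes evenly spaced samples spanning a whole number of cycles, fits a single cosine of each candidate period, and uses a hypothetical threshold (`MIN_STRENGTH`) for calling a profile rhythmic. The function names are illustrative only.

```python
import math

# Hypothetical cut-off: minimum fraction of variance a cosine must explain
# before a transcript is called rhythmic. Not taken from the source.
MIN_STRENGTH = 0.6


def rhythm_strength(times_h, values, period_h):
    """Fraction of variance explained by one cosine of the given period.

    Assumes evenly spaced sampling over a whole number of periods, so the
    cosine and sine terms are orthogonal and can be fitted by projection.
    """
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values)
    if total == 0:
        return 0.0  # a flat profile has no rhythm to explain
    omega = 2 * math.pi / period_h
    a = 2 / n * sum((v - mean) * math.cos(omega * t) for t, v in zip(times_h, values))
    b = 2 / n * sum((v - mean) * math.sin(omega * t) for t, v in zip(times_h, values))
    explained = n / 2 * (a * a + b * b)
    return explained / total


def classify_profile(times_h, values, periods_h=(24, 12)):
    """Label a profile with its best-fitting period, or 'arrhythmic'."""
    scores = {p: rhythm_strength(times_h, values, p) for p in periods_h}
    best = max(scores, key=scores.get)
    if scores[best] < MIN_STRENGTH:
        return "arrhythmic"
    return f"{best}-h"


if __name__ == "__main__":
    # Synthetic example: samples every 4 h over 48 h (two full days).
    times = list(range(0, 48, 4))
    daily = [10 + 3 * math.cos(2 * math.pi * t / 24) for t in times]
    half_daily = [10 + 3 * math.cos(2 * math.pi * (t - 2) / 12) for t in times]
    print(classify_profile(times, daily))
    print(classify_profile(times, half_daily))
```

Scoring each transcript against both periods, rather than against 24 h only, is what allows the 12-h class to be separated from the arrhythmic one; real data would also need noise handling and multiple-testing control, which this sketch omits.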
In addition, gene regulation through miRNAs has been explored. miRNAs are a class of small non-coding RNAs that regulate gene expression. A combination of Northern blot and tissue print hybridization was used to identify conserved miRNAs expressed during nopal (Opuntia ficus indica) fruit development. A comparative analysis detected 34 differentially expressed miRNAs. These miRNAs were clustered into different groups and associated with the different phases of fruit development. Gradual expression of several miRNAs was observed during fruit development. The work provided evidence of miRNA expression in the cactus fruit and a basis for future research on miRNAs in Opuntia.
One significant study analyzed the genomic content of 23 Opuntia species by flow cytometry. The main interest in the DNA content of Opuntia genomes stems from their genetic complexity, because almost all the genotypes have a ploidy level of 4x, 8x or 12x. Across the four ploidy levels found, the 2C DNA content ranged from 3.75 gigabase pairs (Gb) (Opuntia incarnadilla Griffiths) to 5.87 Gb (Opuntia heliabravoana Scheinvar) among the samples analyzed.
When compared with other species such as maize, the 2C DNA content shows that the genome of Opuntia is less complex than that of maize (Table 1), which makes Opuntia suitable for a genomic sequencing program.
|Common name||Scientific name||Family||Ploidy level||2C genome size (pg)||2C genome size (Gb)|
|Soy bean||Glycine max||Leguminosae||2x||2.31||2.25|
|Tuna charola||Opuntia streptacantha||Cactaceae||8x||4.64||4.53|
|Tuna blanca||Opuntia ficus indica||Cactaceae||8x||4.90||4.79|
|Tuna robusta||Opuntia robusta||Cactaceae||8x||4.98||4.87|
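The two genome-size columns of Table 1 are related by a mass-to-length conversion. The text does not state the factor used; the sketch below assumes the commonly used value of 1 pg of DNA ≈ 0.978 Gb, which reproduces the Gb column to within 0.01 Gb. It also derives a per-chromosome-set (monoploid, 1Cx) size by dividing the 2C value by the ploidy level, a standard convention that is likewise an assumption here rather than something reported in the source. The function names are illustrative.

```python
# Assumed conversion factor (not stated in the text): 1 pg of DNA ~ 0.978 Gb.
PG_TO_GB = 0.978

# Rows copied from Table 1: (scientific name, ploidy level, 2C size in pg).
TABLE_1 = [
    ("Glycine max", 2, 2.31),
    ("Opuntia streptacantha", 8, 4.64),
    ("Opuntia ficus indica", 8, 4.90),
    ("Opuntia robusta", 8, 4.98),
]


def pg_to_gb(picograms):
    """Convert a DNA amount in picograms to gigabase pairs."""
    return picograms * PG_TO_GB


def monoploid_size_gb(two_c_pg, ploidy):
    """Size of one basic chromosome set (1Cx), taken as 2C / ploidy level."""
    return pg_to_gb(two_c_pg) / ploidy


def summarize(rows):
    """Return (name, 2C in Gb, 1Cx in Gb) for each table row."""
    return [
        (name, pg_to_gb(pg), monoploid_size_gb(pg, ploidy))
        for name, ploidy, pg in rows
    ]


if __name__ == "__main__":
    for name, two_c_gb, one_cx_gb in summarize(TABLE_1):
        print(f"{name}: 2C = {two_c_gb:.2f} Gb, 1Cx = {one_cx_gb:.2f} Gb")
```

Under these assumptions, an octoploid Opuntia with a 2C value near 4.8 Gb carries roughly 0.6 Gb per basic chromosome set, which is one way to see why a large 2C value need not imply a complex genome.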
Nopal and its products need a deeper analysis to maximize the real value of this crop. It is a multipurpose plant that is very important in people's lives because it impacts the economy, nutrition, medicinal practices, and fuel production. Two main aspects are now in focus for increasing its crop value:
1. Some crops have been sequenced and others are in progress; however, nopal is still waiting to be sequenced. Once available, its sequence would help to understand several mechanisms of plant adaptation to different environments and would give clues about how to control adaptation to extreme conditions in other plants.
2. A new important aspect involves miRNAs, which are thought to be fine-tuning mechanisms in gene regulation. Incorrect expression of miRNAs can produce pleiotropic effects on development. It would be no surprise to discover that several events related to plant adaptation are under the control of miRNA expression. In the future, the expression of miRNAs and siRNAs will serve as a tool for generating new Opuntia phenotypes. Such experiments could reveal the role of different molecules or pathways involved in seed formation, ripening delay or fruit development.
3. Non-model plants as a source of industrial solutions
The development of modern society has led to an increased emission of pollutants into the environment from industrial and domestic activities, as well as from mining, agriculture and crafting. These compounds are a threat to all organisms; therefore, numerous methods have been developed to reduce the impact caused by pollution. Conventional methods for the removal of pollutants from soil and water are often costly and can irreversibly affect the properties of the soil, as well as the organisms that inhabit those places.
Bioremediation is a tool used to clean pollutants from soil and water; it refers to the chemical transformation of pollutants through the use of microorganisms and plants. The genomic content of plants used for remediation has been calculated by different methods, and the sizes are included in Table 2. As shown in Table 2, none of these organisms is genomically complex, and some have already been sequenced.
It is important to consider that a great diversity of plants, belonging to different families, grows under different climates. This gives researchers a wide variety of candidate plants to fit their scientific needs. Some plants of the Asteraceae, Brassicaceae and Solanaceae families have been found to be tolerant to different pollutants. According to Lopez et al., plants use a mechanism to alleviate environmental stresses that comprises the following three phases:
(1) Absorption, excretion and detoxification of pollutants; (2) the distribution of pollutants throughout the plant and their excretion via volatilization; and (3) detoxification of pollutants by phytoremediation, by any one of the following processes: phytoextraction, rhizofiltration, phytostimulation, phytostabilization, phytovolatilization or phytodegradation [15,16].
Phytostabilization reduces the bioavailability and mobility of contaminants, preventing their transport to underground layers or to the atmosphere [15,16]. This process is less expensive than other methods, is easy to apply and is aesthetically pleasing.
Phytodegradation is the transformation of organic pollutants into simpler molecules. In certain instances, the degradation products serve to accelerate plant growth, and in other cases the contaminants are biotransformed. Phytodegradation has been employed for the removal of explosives such as TNT, halogenated hydrocarbons, bisphenol, PAHs, and organochlorine and organophosphorus pesticides.
In phytovolatilization, plants absorb water along with soluble organic and inorganic pollutants (As, Se and Hg). Some of the contaminants can reach the leaves and evaporate or volatilize into the atmosphere. Plants such as Salicornia bigelovii, Brassica juncea, Astragalus bisulcatus and Chara canescens have been used for bioremediation of Se pollution, and Arabidopsis thaliana has been used for bioremediation of Hg.
Rhizofiltration uses plants to remove contaminants from aquatic environments through the roots. In rhizofiltration, the plants are grown hydroponically. When the root system is well developed, the plants are introduced into water polluted with metals, where the roots absorb and accumulate them. Numerous aquatic plants have the ability to accumulate pollutants; some examples are Scirpus lacustris, Lemna gibba, Azolla caroliniana, Elatine triandra, Wolffia papulifera, Polygonum punctatum, Myriophyllum aquaticum, and Mentha palustris (for Al, As, Au, Cd, Cr, Cu, Fe, Hg, Mg, Mn, Ni, Pb, Se, Sr, Zn) [14,15].
|Function||Species||Family||1C (Gb)||1C (pg)||Sequencing||Reference|
|Pb, Zn, Cd, As, Cu, Mn||Hordeum vulgare||Gramineae||5.1||5.5||2012||International Barley Genome Consortium|
| ||Arabidopsis thaliana||Cruciferae||0.125||0.16||2000||Arabidopsis Genome Initiative|
|Cd, Zn, Pb, Ni, Ag, Cr, Cu, Hg||Brassica juncea||Cruciferae||1.49||1.092||--||Johnston et al.|
| ||Helianthus annuus||Compositae||3.5||2.43||2012||Staton et al.|
| ||Brassica napus||Cruciferae||1.12||1.15||2014||Boulos et al.|
| ||Sorghum bicolor||Gramineae||1.68||0.835||2009||Paterson et al.|
| ||Medicago sativa||Leguminosae||1.75||0.86||2011||Young et al.|
|Cd, Pd, Zn, Cu, Ni, Cr||Brassica nigra||Cruciferae||0.632||0.647||--||Johnston et al.|
| ||Helianthus annuus||Compositae||3.5||2.43||2012||Staton et al.|
| ||Cucumis sp||Cucurbitaceae||0.68||0.66||2009||Huang et al.|
| ||Cucurbita sp||Cucurbitaceae||0.34||0.33||--||Šisko et al.|
| ||Helianthus annuus||Compositae||3.5||2.43||2012||Staton et al.|
Phytoextraction, or absorption, is carried out by the plant roots and is followed by accumulation of the polluting metals in the stems and leaves. Some plants used for this approach are: Thlaspi caerulescens; Sedum alfredii, Viola and Vertiveria baoshanensis; Alyssum murale, Trifolium nigrescens, Psychotria douarrei, Geissois pruinosa, Homalium guillainii, Hybanthus floribundus, Sebertia acuminata, Stackhousia tryonii, Pimelea leptospermoides, Aeollanthus biformifolius; Haumaniastrum robertii; Brassica juncea, Helianthus annuus, Sesbania drummondii and Brassica napus (for Ag, Cd, Cr, Cu, Hg, Ni, Pb, and Zn) [14,15].
Phytodegradation in plants and microorganisms is associated with the degradation of organic pollutants into harmless products and their mineralization into CO2 and H2O. Plants such as Populus spp. are introduced to absorb the contaminants in soil pores and prevent them from leaking to other soil layers [15,26].
4. Perspectives for genome sequencing and genome information from non-model plants to plant breeding
Next-generation sequencing (NGS) includes several different technologies, each of which has its own set of characteristics. NGS generates huge amounts of sequence data in a very cost-effective way.
The increased number of WGS projects means that more organisms are becoming important genetic models. At the same time, many molecular studies are focusing on natural variation and adaptation in classical genetic model species, or close relatives of these, such as Arabidopsis, thus closing the gap between model and non-model organisms.
For example, the assembled genomes could be used as a reference sequence for further transcriptome analysis or for re-sequencing and surveys of genetic variation. They may also be used to develop other genomic tools, such as proteomics and microarray hybridization.
After the novel transcriptome has been annotated using a genomic reference species, it can be used as a starting point for more detailed functional characterizations of desired organisms, using gene ontology databases.
RNA-seq protocols with longer sequence reads will also improve applications, because large haplotype blocks including several linked polymorphisms will become available and hundreds of genes can be analyzed simultaneously. Some of these may be involved in important phenotypic variation, which is relevant from the conservation point of view because such variation may be important to maintain within the population.
In the future, the bottleneck is more likely to be in bioinformatics than in producing the sequences, because a huge number of biologists are trying to organize the genomic data in a biologically meaningful way. New approaches for data storage and processing will be needed, because currently available databases might be unable to cope with the rapid generation of new sequencing data.
5. Conclusions and prospects
Plants provide food for all living organisms, and just 15 crop plants provide 90% of the world’s food intake. Plant species are responsible for maintaining the balance of the carbon cycle and for developing soil and protecting it from erosion, and plant products are used as human medicines [33, 34]. For these reasons, there is great interest in sequencing plant genomes, but so far relatively few plant species have been sequenced compared with the hundreds of thousands of species around the world.
Non-model plants are becoming very attractive sources for different purposes because of their ability to adapt to extreme environments and to produce specific metabolites that can be used for food and medicinal purposes. These materials must be characterized at the molecular level in order to develop any strategy for generating genetic data (that is, molecular markers, cDNA sequencing, and cDNA microarrays) and to have reference data to compare with model organisms.
Large complex plant genomes remain a particularly difficult challenge for de novo assembly for various biological, bioinformatics, and biomolecular reasons. Plant genomes can be nearly 100 times larger than the sequenced mammalian genomes. The next frontier for plant genomics is to characterize the diversity of genomic variations across large populations, deeply annotate their functional elements, and develop predictive quantitative models relating genotype to phenotype.
The authors thank Fondo Institucional de Fomento Regional para el Desarrollo Científico, Tecnológico y de Innovación for financial support (CONACyT-FORDECyT 193512).