Functional genomics aims to discover the biological function of particular genes and to uncover how sets of genes and their products work together. Transgenic plants are proving to be powerful tools for studying various aspects of plant science. The emerging scientific revolution sparked by genomics-based technologies is producing enormous amounts of DNA sequence information that, together with plant transformation methodology, is opening up new experimental opportunities for functional genomics analysis.
2. Plant functional genomics methods
The main methods of plant functional genomics are as follows.
2.1. Functional annotations for genes
Gene function prediction is based on the comparison of genomes and proteomes: nucleotide and amino acid databases are searched for sequences from other species that are homologous to the gene of interest and have known functions. Putative genes can be identified by scanning a genome for regions likely to encode proteins, based on characteristics such as long open reading frames, transcriptional initiation sequences, and polyadenylation sites. A sequence identified as a putative gene must be confirmed by further evidence, such as similarity to cDNA or EST sequences from the same organism, similarity of the predicted protein sequence to known proteins, association with promoter sequences, or evidence that mutating the sequence produces an observable phenotype.
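The scan for long open reading frames mentioned above can be illustrated with a minimal Python sketch. It is not a gene predictor: it only reports ATG-to-stop stretches in the six reading frames, and the function names and the default length threshold (min_codons) are assumptions chosen for illustration. Real annotation pipelines also weigh the other evidence listed above (initiation sequences, polyadenylation sites, cDNA/EST support).

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int]]:
    """Report ORFs as (strand, start, end) tuples.

    Coordinates are 0-based and end-exclusive, relative to the strand that
    was scanned; the stop codon is included in the reported interval.
    min_codons counts the codons from ATG up to (not including) the stop.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG of the current ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


# Toy example (hypothetical sequence): one short ORF on the forward strand.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))  # [('+', 2, 23)]
```

A candidate reported by such a scan is only a putative gene; it still requires the confirmation steps described in the paragraph above.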
2.2. Gene-targeted and site-directed mutagenesis. Reverse genetics methods (loss of function)
This approach uses transgenic plants that carry insertions, deletions, or site-specific mutations, so that the host gene is replaced with a mutant allele. The most conventional approach to the analysis of gene function is loss-of-function mutagenesis by chemicals or fast neutrons that introduce random mutations or deletions in the genome (Ostergaard and Yanofsky 2004).
Transferred DNA (T-DNA) tagging or transposon tagging methods were developed to generate loss-of-function mutations because these tag sequences can be used to identify the genes disrupted by these elements (Sundaresan and Ramachandran 2001; Sussman et al. 1999). However, because many plant genes in Arabidopsis, rice, and other plants belong to gene families (Goff et al. 2002; Kaul et al. 2000), the characterization of gene functions by single-gene mutagenesis is not always possible. Many mutants generated by single-gene disruption do not show clear phenotypes because of genetic redundancy.
2.3. Overexpression of normal gene in transgenic plants (gain of function)
Gain-of-function approaches have been used as an alternative or complementary method to loss-of-function approaches as well as to confer new functions to plants. Gain-of-function is achieved by increasing gene expression levels through the random activation of endogenous genes by transcriptional enhancers or the expression of individual transgenes by transformation. Gain-of-function mutagenesis is based on the random insertion of transcriptional enhancers into the genome or the expression of transgenes under the control of a strong promoter (Matsui et al. 2006; Nakazawa et al. 2003; Weigel et al. 2000). In this approach, phenotypes of gain-of-function mutants that overexpress a member of a gene family can be observed without interference from other family members, which allows the characterization of functionally redundant genes (T. Ito and Meyerowitz 2000; Nakazawa et al. 2001).
Alternatively, it is possible to overexpress mutant forms of a gene that interfere with the function of the wild-type gene. Overexpression of a mutant gene may result in high levels of a non-functional protein that interacts in a dominant-negative manner with the wild-type protein. In this case, the mutant version outcompetes the wild-type protein for its partners, resulting in a mutant phenotype.
The advantages of gain-of-function approaches over loss-of-function approaches for the characterization of gene functions include the abilities to (a) analyze individual gene family members, (b) characterize the function of genes from nonmodel plants using a heterologous expression system, and (c) identify genes that confer stress tolerance to plants when introduced as transgenes.
The first gain-of-function approach in plants was the activation-tagging system (Kakimoto 1996). In this system, T-DNA that harbors strong promoter or enhancer elements is randomly integrated into the plant genome. The introduced promoter or enhancer elements activate genes near the site of insertion.
Other recently developed gain-of-function approaches include cDNA overexpression and open reading frame (ORF) overexpression systems. In these approaches, cDNAs from cDNA libraries, representative full-length cDNAs (fl-cDNAs), or ORFs are cloned downstream of a strong promoter and thereby strongly expressed.
The production of a large population of gain-of-function mutants can accelerate the high-throughput screening of desired mutants and the characterization of gene functions.
In the activation-tagging method, plant genes are randomly activated to produce gain-of-function mutants. In this strategy, the promoter or enhancer elements from the cauliflower mosaic virus (CaMV) 35S gene have been exploited (Odell et al. 1985). Genes near the insertion site are activated under the control of enhancer elements.
After the selection of mutants from the population of transformants, T-DNA insertion sites are determined to identify candidate genes.
Plasmid rescue, inverse PCR, or adapter PCR methods can be used to recover the genomic fragments near the T-DNA right and left border sequences (Spertini et al. 1999; Yamamoto et al. 2003). TAIL-PCR is also an efficient method to determine T-DNA insertion sites (Singer and Burke 2003).
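Once an insertion site has been mapped, the search for candidate genes near it can be sketched as follows. This is a minimal illustration under stated assumptions: the Gene record, the gene names and coordinates, and the 10 kb default window are all hypothetical, and the distance over which an enhancer can activate a gene is not specified in the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    name: str
    chrom: str
    start: int  # 1-based start coordinate
    end: int    # inclusive end coordinate


def genes_near_insertion(genes: List[Gene], chrom: str, position: int,
                         window: int = 10_000) -> List[str]:
    """Return names of genes within `window` bp of a T-DNA insertion,
    ordered from nearest to farthest (distance 0 = insertion inside gene)."""
    hits = []
    for g in genes:
        if g.chrom != chrom:
            continue
        if g.start <= position <= g.end:
            distance = 0
        else:
            distance = min(abs(g.start - position), abs(g.end - position))
        if distance <= window:
            hits.append((distance, g.name))
    return [name for _, name in sorted(hits)]


# Hypothetical annotation and insertion site, for illustration only.
annotation = [
    Gene("geneA", "chr1", 1_000, 2_000),
    Gene("geneB", "chr1", 9_000, 12_000),
    Gene("geneC", "chr1", 50_000, 52_000),
    Gene("geneD", "chr2", 1_000, 2_000),
]
print(genes_near_insertion(annotation, "chr1", 5_000, window=5_000))
# ['geneA', 'geneB']
```

The genes returned are only candidates; which of them is actually activated has to be confirmed experimentally, for example by expression analysis and by recapitulating the phenotype.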
Although many algorithms have been developed to predict the presence of transcriptional units within the genome, the accuracy of such predictions is still limited. Empirical information is required to correct annotation, and the main source of experimental information to achieve this is derived from RNA transcript analysis. Notable progress in Arabidopsis genome annotation has been made by the generation of expressed sequence tags, fl-cDNAs (Seki et al. 2002), and whole genome tiling array studies (Toyoda and Shinozaki 2005). The fl-cDNAs are also important as a resource for functional genomics, i.e., in the identification of gene function, because they contain all the information needed for the production of functional RNAs and proteins.
Approximately 240,000 Arabidopsis fl-cDNA clones have been generated (Sakurai et al. 2005; Seki et al. 2002) using the biotinylated CAP trapper method together with trehalose-thermoactivated reverse transcriptase (Carninci et al. 1996; Carninci et al. 1997; Carninci et al. 1998). Large sets of fl-cDNA clones have also been produced from several plants such as rice (Kikuchi et al. 2003), wheat (Ogihara et al. 2004), poplar (Nanjo et al. 2007), soybean (Umezawa et al. 2008), barley (K. Sato et al. 2009), cassava (Sakurai et al. 2007), sitka spruce (Ralph et al. 2008), Physcomitrella patens (Nishiyama et al. 2003), and Thellungiella halophila (Taji et al. 2008). Well-characterized collections of cDNAs play an essential role in defining the function of genes and proteins in plants. The cDNA overexpression system is one of the approaches that use cDNA resources.
Progress in sequencing technology has revealed the genome sequences of many plant species that include Arabidopsis, rice, poplar, grape, papaya, and sorghum (2000; Jaillon et al. 2007; Ming et al. 2008; Paterson et al. 2009; Sasaki et al. 2002; Tuskan et al. 2006). A functional genomics approach is now required to clarify the function of genes in these plant species.
However, transgenic approaches for both forward and reverse genetic studies are not yet practical in many plants in which transformation methodology is inefficient or not available. A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in these plant species.
One study used approximately 10,000 nonredundant fl-cDNA clones from the RIKEN Arabidopsis fl-cDNA collection (Seki et al. 2002). A representative of each fl-cDNA was mixed at approximately the same molar ratio to generate a cDNA mixture and then cloned into an expression vector under the control of the CaMV 35S promoter. This fl-cDNA expression library was used to transform Arabidopsis plants by in planta transformation. In these transgenic plants, fl-cDNAs are randomly expressed in individual Arabidopsis plants so that each plant carries one (or more) fl-cDNA(s).
The introduced fl-cDNAs can be cloned easily using vector-specific primers after the isolation of mutants. Thus, the cDNA that caused the mutant phenotype can be directly linked to a function.
The full-length cDNA over-expressing gene (FOX) hunting system is an alternative gain-of-function approach that uses fl-cDNAs. The FOX hunting system was applied for the high-throughput analysis of rice genes by heterologous expression in Arabidopsis (Matsui et al. 2009). The efficient, rapid, and high-throughput transformation system developed in Arabidopsis, together with the short generation time and compact size of this plant, makes Arabidopsis an ideal host plant.
These advantages have enabled researchers to express heterologous genes in Arabidopsis to analyze their functions.
Whole genome sequencing makes it possible to predict the presence of genes in the genome. Of particular interest are the gene elements that encode proteins, called ORFs. Because ORFs can be distinguished from fl-cDNAs by their lack of 5′ and 3′ untranslated region (UTR) sequences, they can be considered a minimal unit of the gene that encodes information on the functional protein. The Saccharomyces cerevisiae ORFeome project was the first attempt to verify the genome annotation at the genomic scale and to clone all its predicted ORFs (Heyman et al. 1999). ORF collections have been created for functional analysis in various organisms, e.g., RNAi approaches in Caenorhabditis elegans (Piano et al. 2005), cellular localization studies of YFP/GFP fusion proteins in Schizosaccharomyces pombe (Matsuyama et al. 2006), GFP fusion proteins in Escherichia coli (Kitagawa et al. 2005), and proteomics in human (Collins et al. 2004; Rual et al. 2004). These ORF clone collections can facilitate the large-scale analysis of individual genes.
2.4. Studying gene expression using DNA-RNA hybridization, gene silencing
Expressing a transgene together with a reporter gene under the control of an inducible promoter makes it possible to reveal the temporal functional effects of gene expression and the compartmentalization of transgene products. Gene silencing techniques (also known as RNA interference) make it possible to disrupt gene expression temporarily (gene knockdown). This procedure offers the possibility of exploring gene expression more precisely.
RNA-induced gene silencing (RNAi) was originally observed as unusual expression patterns of a transgene designed to induce overexpression of chalcone synthase in petunia plants (Napoli et al. 1990).
In the years following this observation, experiments in many model systems contributed to rapid advancements in understanding the underlying mechanisms, and RNA-mediated gene silencing processes came to be collectively known as RNA interference (RNAi). It is now known that the ‘triggers’ for RNAi are small RNAs, 21–25 nt in length, that are processed from longer, double-stranded (ds) RNAs by endonuclease proteins referred to as dicers (Fire et al. 1998; Hamilton and Baulcombe 1999; Mello et al. 2001; Zamore et al. 2000). These siRNAs cause degradation of mRNAs in a homology-dependent manner and lead to post-transcriptional silencing of the target. Other gene silencing mechanisms involve heterochromatin formation and DNA methylation at regulatory sequences of the target, which induce transcriptional silencing of target loci in a homology-dependent fashion (reviewed by Eamens et al. 2008). It is now understood that RNAi is an evolutionarily conserved mechanism for gene regulation that is critical for many aspects of growth and development.
There are multiple pathways by which small RNA molecules can influence gene expression in plants, at both the transcriptional and post-transcriptional levels. These pathways vary in their sources of small RNAs and specific mechanisms of silencing (S. W. L. Chan 2008; Eamens et al. 2008; Verdel et al. 2009).
Because transgene-induced RNAi has been effective at silencing one or more genes in a wide range of plants, this technology also bears potential as a powerful functional genomics tool across the plant kingdom. A common strategy for functional genomics projects is to generate lines that are deficient for the activity of a subset of genes, and test the knock down lines for phenotypes to characterize the function of the knocked down gene.
In many cases, a single inverted repeat transgene can be designed to silence multiple, closely related genes (Springer et al. 2007).
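Whether one construct could plausibly target several family members can be explored by looking for stretches that the family members share. The sketch below lists the 21-nt windows common to all supplied sequences. It rests on loud simplifications: perfect identity over 21 nt (the lower bound of the 21–25 nt range given above) is used as a stand-in for "homologous enough", which is an assumption and not a rule taken from the cited studies, and the sequences are hypothetical.

```python
from typing import Dict, Set


def kmers(seq: str, k: int = 21) -> Set[str]:
    """All distinct windows of length k in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(sequences: Dict[str, str], k: int = 21) -> Set[str]:
    """Windows of length k that occur in every sequence of the family."""
    sets = [kmers(s, k) for s in sequences.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical gene family: two members sharing a 25-nt conserved core.
core = "ATGGCTAGCTAGGATCCGATCGTAC"
family = {
    "member1": "AAAA" + core + "CCCC",
    "member2": "GGGG" + core + "TTTT",
}
common = shared_kmers(family)
print(len(common))  # 5 shared 21-nt windows, all inside the conserved core
```

A region rich in such shared windows would be a natural starting point for the repeat arm of a multi-gene silencing construct, whereas a region with none would be preferred when only a single family member is to be silenced.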
To induce transcriptional silencing with a transgene, a typical strategy involves designing a construct such that a dsRNA is generated which bears homology to the promoter region of the intended silencing target (Mette et al. 2000). Herein, this method of silencing will be referred to as promoter directed RNA silencing.
To induce post-transcriptional silencing with a transgene, a portion of the coding region of the gene is typically introduced into an inverted repeat (IR) construct, and expression of that transgene will result in a dsRNA with homology to the coding region of the intended silencing target (McGinnis et al. 2005). This type of silencing is likely mediated by components of the trans-acting siRNA pathway in plants (reviewed by Verdel et al. 2009). Herein, this method of silencing will be referred to as coding region directed RNA silencing.
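The layout of an inverted repeat insert can be sketched in a few lines: a sense arm, a spacer, and the reverse complement of the sense arm, so that the transcript can fold back on itself into a dsRNA stem. The spacer sequence and the fragment used here are hypothetical placeholders, and the sketch says nothing about vector backbones or promoters.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat(fragment: str, spacer: str) -> str:
    """Assemble sense arm + spacer + antisense arm.

    The antisense arm is the reverse complement of the sense arm, so the
    two arms of the resulting transcript can base-pair with each other.
    """
    fragment = fragment.upper()
    return fragment + spacer.upper() + reverse_complement(fragment)


# Hypothetical coding-region fragment and spacer, for illustration only.
construct = inverted_repeat("ATGGCC", "TTTT")
print(construct)  # ATGGCCTTTTGGCCAT
```

For promoter directed RNA silencing the same layout applies, with the fragment taken from the promoter region of the target instead of its coding region.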
2.5. Analysis of spatial and temporal expression of studied gene
Genomic studies tend to be done at the whole tissue/organ level due to the ease of collecting samples and/or the lack of tools necessary to isolate sufficient quantities of specific cell or tissue types. Recent studies, however, have shown that most transcriptional responses to environmental stimuli are cell-type specific (Dinneny 2008; Gifford et al. 2008). In addition, the many examples of ion-channels, hormone biosynthetic enzymes and signaling components with spatially complex expression patterns clearly illustrate the need to study all aspects of plant biology at high-spatial and temporal resolution to fully understand the plant–environment interaction.
The root of Arabidopsis provides an excellent system for generating and utilizing such tools due to the simple and stereotypical organization of tissues and cell types. Specific cell layers in the Arabidopsis root have been engineered to express green fluorescent protein (GFP). Fluorescence-activated cell sorting (FACS) can then be used to enrich for GFP-positive cells (Birnbaum et al. 2003; Birnbaum et al. 2005). This method has been used to characterize the global transcriptional profiles of nearly all cell types in roots grown under standard conditions (Birnbaum et al. 2003; Brady et al. 2007) and to characterize transcriptional changes that occur in these cell types in response to salt stress, iron deprivation and nitrogen treatment (Dinneny et al. 2008; Gifford et al. 2008). A detailed description of these studies can be found in the reviews by Dinneny (2010) and Iyer-Pascuzzi and co-workers (2009).
Genetically-encoded fluorophores offer a vast tool kit to study in vivo molecular events such as protein localization and gene expression. Fluorescent proteins have also been engineered to act as biosensors, which either emit fluorescence in response to a specific biological stimulus or undergo a change in intrinsic fluorescence intensity (Frommer et al. 2009).
Transgene expression is usually driven by a constitutive promoter. Thus, high expression levels in inappropriate tissue or developmental contexts might occur. This misexpression can cause the ectopic expression of endogenous genes and might result in a phenotype that is not related to the authentic functions of the transgene. In some cases, this misexpression can lead to the incorrect functional annotation of genes. Tissue-specific expression can provide information on intracellular events in each tissue. Replacing the CaMV 35S promoter with tissue-specific promoters is another way to analyze gene function in certain tissues.
Two-component systems have been developed for conditional gene activation or silencing (Brand et al. 2006). They combine an activator locus, which codes for an artificial transcription factor expressed in restricted tissues at precise developmental times, with a responder locus that is controlled by that transcription factor.
The activation or the ectopic expression of developmentally controlled transcription factors sometimes causes an embryonic or seedling lethal phenotype, making it difficult to analyze the function of the gene. Thus, controlled gene expression by an inducible system might be an efficient approach to identify these genes (Zuo et al. 2002).
Microarrays allow the identification of candidate genes involved in a given process based on variation in transcript levels between conditions and on expression patterns shared with genes of known function. With appropriate controls and repeated experiments, significant data are obtained on gene expression profiles under various conditions (including stresses) or in various organs. Because of the large quantity of data produced by these techniques and the desire to find biologically meaningful patterns, bioinformatics is crucial for analyzing functional genomics data. However, DNA microarray and bioinformatics data alone are not sufficient for determining correct expression profiles because of their limited accuracy. The next stage of investigation explores the properties and functions of the selected genes, and here the construction of transgenic plants is one of the most informative techniques.
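The two selection criteria named above, variation in transcript level between conditions and a shared expression pattern with a gene of known function, can be illustrated with a small Python sketch. The gene names, intensity values, pseudocount and correlation threshold are all hypothetical, and the sketch deliberately omits the normalization and statistical testing across replicates that a real microarray analysis requires.

```python
import math
from typing import Dict, List, Sequence


def log2_fold_change(control: Sequence[float], treated: Sequence[float],
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of mean treated to mean control signal."""
    mean_c = sum(control) / len(control)
    mean_t = sum(treated) / len(treated)
    return math.log2((mean_t + pseudocount) / (mean_c + pseudocount))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpressed_with(profiles: Dict[str, List[float]], reference: str,
                     min_r: float = 0.9) -> List[str]:
    """Genes whose profile across conditions correlates with a known gene."""
    ref = profiles[reference]
    return sorted(g for g, p in profiles.items()
                  if g != reference and pearson(p, ref) >= min_r)


# Hypothetical expression profiles across four conditions.
profiles = {
    "known_gene": [1, 2, 3, 4],
    "geneX": [2, 4, 6, 8],
    "geneY": [4, 3, 2, 1],
    "geneZ": [5, 5, 5, 5],
}
print(coexpressed_with(profiles, "known_gene"))  # ['geneX']
```

Genes selected in this way remain candidates only; as the paragraph above notes, their properties and functions still need to be examined, for instance in transgenic plants.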
2.7. Next generation sequencing
Previously, DNA sequencing was performed almost exclusively by the Sanger method, which has excellent accuracy and reasonable read length but very low throughput. Sanger sequencing was used to obtain the first sequence of the human genome in 2001 (Lander et al. 2001; Venter et al. 2001). Later, the second complete individual genome (that of James D. Watson) became the first human genome to be sequenced with next-generation sequencing (NGS) technology (Wheeler et al. 2008). A common strategy for NGS is to use a DNA synthesis or ligation process to read through many different DNA templates in parallel (Fuller et al. 2009). NGS therefore generates massive amounts of sequencing data, but the read length for each DNA template is relatively short (35–500 bp) compared to traditional Sanger sequencing (1000–1200 bp). NGS technologies have increased the speed and throughput capacities of DNA sequencing and, as a result, dramatically reduced overall sequencing costs (Metzker 2010).
Current NGS approaches can be classified into three major categories:
DNA-Seq. Genome-based sequencing yielding genomic deletions and rearrangements, copy-number variations (CNV) of smaller regions or elements, and single-nucleotide polymorphisms (SNPs).
RNA-Seq. RNA-Sequencing, yielding genome-wide and quantitative information about transcribed regions (exons, and subsequently transcripts).
Chromatin-immunoprecipitation (ChIP)-Seq. a) transcription factor (TF)-based ChIP, yielding genome-wide information about the physical binding sites of individual TFs to within a few hundred base pairs. b) Epigenetic ChIP (DNA methylation and/or histone modifications), yielding information about modifications and the accessibility of genomic regions to TFs and other factors.
Combining NGS-based transcriptome sequencing with ChIP of transcription factor binding and epigenetic analyses (usually based on DNA methylation or histone modification ChIP) completes the picture with unprecedented resolution, enabling the detection of even subtle differences such as alternative splicing of individual exons.
Next-generation sequencing technologies have found broad applicability in functional genomics research. Their applications in the field have included gene expression profiling, genome annotation, small non-coding RNA (ncRNA) discovery and profiling, and detection of aberrant transcription, which are areas that have been previously dominated by microarrays. Thus, functional genomics and systems biology approaches will benefit from the enormous data density intrinsic to NGS applications, which will beyond doubt play an important role both in definition as well as verification of mathematical models of biological systems such as a cell or a tissue.
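To make the quantitative read-out of RNA-Seq concrete, the following sketch converts raw read counts into transcripts per million (TPM), a widely used length-normalized expression unit. TPM is not discussed in the text above and is introduced here only as an illustration; the gene names, counts and lengths are hypothetical.

```python
from typing import Dict


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Transcripts per million from read counts and transcript lengths.

    Each count is first divided by transcript length (reads per base),
    then the rates are rescaled so that they sum to one million.
    """
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical counts: geneB has three times the reads but is three times
# as long, so both genes end up with the same length-normalized value.
values = tpm({"geneA": 100, "geneB": 300}, {"geneA": 1000, "geneB": 3000})
print(values)
```

Dividing by length before rescaling is the design choice that matters here: without it, longer transcripts would appear more highly expressed simply because they yield more reads.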
As mentioned above, the inventory of methods used to study gene product functions in vivo (i.e. in a living organism) includes gene silencing, induced mutagenesis, reporter gene strategies, microarrays, and some others. However, there are limitations inherent to these approaches. First of all, physiologically essential genes cannot be switched off, and induced mutagenesis can lead to concomitant mutations. The use of microarrays can lead to misinterpretation of the results since changes in transcription are not always accompanied by changes in protein level (Mittler et al. 1998). Moreover, the transcription level fails to reflect the post-translational modifications of protein products that often occur in vivo. It is also worth mentioning that when an enzyme possesses many isoforms, it is difficult to measure the activity of each of them in vivo (Slakeski et al. 1990). In view of these limitations, the development of novel models for functional genetics that help to overcome these difficulties is highly desirable.
One such model may be the approach developed in our laboratories, which employs transgenic plants that constitutively express bacterial genes coding for enzymes that are functionally homologous to plant enzymes. This approach has been proposed and used in our laboratory since the mid-1980s (Piruzian et al. 1983; Piruzian and Andrianov 1986). It involves several stages: search for and cloning of a gene of interest, sequencing, sequence modification (if needed, e.g. when codon usage in the gene is different from that in the model organism), gene transfer into the model organism, and studies of the biochemical and phenotypic changes that result from expression of the foreign gene. Such an approach is feasible owing to the similarity of the metabolic pathways and gene networks that regulate the activities of pro- and eukaryotic organisms under normal conditions and under exposure to various biotic and abiotic stresses. In addition, the use of bacterial genes helps to avoid many problems that arise during cloning, modification and expression of eukaryotic genes in plants, whereas the constitutive nature of bacterial gene expression makes it possible to reveal “hot spots” of action of the homologous plant enzymes.
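The codon usage check that precedes the sequence modification stage can be sketched as follows. The sketch counts the codons of a coding sequence and flags those that are rare in the host; the host frequency table, the threshold and the toy coding sequence are hypothetical placeholders, not values from our work or from any real codon usage table.

```python
from collections import Counter
from typing import Dict, List


def codon_frequencies(cds: str) -> Dict[str, float]:
    """Relative frequency of each codon in a coding sequence.

    Trailing bases that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def rare_codons(cds: str, host_freq: Dict[str, float],
                threshold: float = 0.01) -> List[str]:
    """Codons used in the gene whose frequency in the host is below threshold.

    Codons missing from the host table are treated as frequency 0.0.
    """
    return sorted(codon for codon in codon_frequencies(cds)
                  if host_freq.get(codon, 0.0) < threshold)


# Hypothetical bacterial CDS and hypothetical host (plant) codon table.
bacterial_cds = "ATGGCTGCTTAA"
host_table = {"ATG": 0.02, "GCT": 0.005, "TAA": 0.02}
print(rare_codons(bacterial_cds, host_table))  # ['GCT']
```

Codons flagged in this way are candidates for synonymous replacement before the gene is transferred into the model organism.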
3. Usage of the methods of functional genomics for studying fundamental and applied aspects of plant life
3.1. Biotic stress tolerance
Activation tagging has been used for the isolation of mutants with resistance to biotic stress. For example, CDR1-D is a mutant that is resistant to sprayed suspensions of virulent Pseudomonas syringae pathovar tomato (Pst) (Xia et al. 2004). CDR1 encodes an extracellular aspartic protease, which is a member of a large family of aspartic proteases in Arabidopsis. CDR1 functions in the production of a systemic signal that induces basal defenses. Another mutant, FMO1–3D, showed enhanced resistance to virulent Pst DC3000 (Koch et al. 2006). This phenotype is the result of the overexpression of a gene that encodes a class 3 FMO protein.
Recently we have proposed a model for studying the role of plant dioxygenases. Phenolic compounds serve as antioxidants and protect plants from active oxygen species. The content of phenolic compounds changes as plants grow and mature and in response to biotic and abiotic influences, and these changes are achieved through modulation of the enzymatic activities involved in their synthesis and degradation. Enzymes that take part in the oxidation of aromatic compounds include dioxygenases (Tsoi et al. 1988). These enzymes oxidize phenolic compounds by breaking the aromatic ring, and thus enable subsequent biodegradation of phenols. There is evidence that plant dioxygenase (coded for by the lls gene of maize) may participate in the hypersensitive response of the plant to a pathogen attack (Lawton and Maleck 1998). To study the role played by dioxygenase in plants we have chosen the bacterial gene nahC (Y14173) of Pseudomonas putida, coding for 1,2-dihydroxynaphthalene dioxygenase. Our choice was due to the fact that this enzyme possesses broad substrate specificity and can also use pyrocatechin as a substrate (Tsoi et al. 1988), which makes it possible to model a maximum number of dioxygenase isozymes. The expression of bacterial 1,2-dihydroxynaphthalene dioxygenase (coded by the nahC gene) in tobacco plants resulted in marked phenotypic and morphologic changes: chlorosis of the leaves, development of necrotic spots, delayed rooting and growth, and early flowering (Piruzian et al. 2002). Data on the expression of bacterial 1,2-dihydroxynaphthalene dioxygenases in plants have not been reported in the literature. The necrotic spots on the leaves of transgenic plants could have resulted from accumulation of phenolic substances. The above-mentioned changes in phenotype and morphology suggested that the expression of bacterial dioxygenase altered the level of phenolic compounds in the transgenic plant cells.
Measurements of phenolic acid content indicate that the normal metabolism of phenolic compounds is disturbed in the plants, and the disturbance apparently results in induction of a stress response and the appearance of the necrotic spots. In our opinion, such transgenic plants are a promising model for the study of the mechanisms of genome functioning under normal conditions and under stress, as well as for the study of the functions of phenolic compounds.
3.2. Abiotic stress tolerance
Environmental stresses are the major factors adversely affecting plant growth and development as well as productivity. Of the various abiotic stresses, drought and osmotic stress cause considerable agronomic problems by limiting crop yield and distribution world-wide (Chaves and Oliveira 2004).
Drought and osmotic stress induce a range of alterations at the molecular, biochemical, and cellular levels in plants, including stomatal closure, repression of photosynthesis, accumulation of osmolytes, and the inducible expression of genes involved in stress tolerance (Shinozaki and Yamaguchi-Shinozaki 2007).
The accumulation of proline by plants is a common physiological indicator and occurs under various abiotic stresses. There is an increasing body of evidence supporting the role of proline as a compatible osmolyte that maintains cellular osmotic adjustment and stabilizes the structure of proteins and membrane integrity (Verbruggen and Hermans 2008). Overexpression of different genes has been shown to significantly enhance proline levels in transgenic rice and improve its tolerance to environmental stresses (Ito et al. 2006; Liu et al. 2007; Pasquali et al. 2008; Xiang et al. 2007; Xu et al. 2008; Chen et al. 2009).
The transfer of a single gene encoding a specific stress protein does not always result in sufficient expression to produce useful tolerance, because multiple and complex pathways are involved in controlling plant drought responses (Bohnert et al. 1995) and because modification of a single enzyme in a biochemical pathway is usually counteracted by a tendency of plant cells to restore homeostasis (Djilianov et al. 2002). Targeting multiple steps in a pathway may often modify metabolite fluxes in a more predictable manner. Another promising approach is therefore to engineer the overexpression of genes encoding stress-inducible transcription factors.
There is increasing experimental support for the manipulation of the expression of stress-related transcription factor genes as a powerful tool in the engineering of stress-tolerant transgenic crops. This would, in turn, lead to the up-regulation of a series of stress-related genes under their control in transgenic plants (P. K. Agarwal et al. 2006). For example, the overexpression of transcription factor genes, such as ZFP252, SNAC1, OsNAC6, OsDREB1A, and HvCBF4, could enhance rice tolerance to different environmental stresses (Nakashima et al. 2007; Oh et al. 2007; Xiong et al. 2006; Xu et al. 2008; Yamaguchi-Shinozaki et al. 2006).
Following the application of microarray technology, several hundred stress-induced genes, mainly in the model plant Arabidopsis thaliana, have been identified as candidates for manipulation (Shinozaki and Yamaguchi-Shinozaki 2007) and have been classified into three groups (Bhatnagar-Mathur et al. 2008): (a) genes encoding proteins with a known enzymatic or structural function, for example enzymes for the synthesis of osmoprotective compounds, late embryogenesis abundant (LEA) proteins, osmotins, chaperones, channels involved in water movement through cell membranes, ubiquitins, proteases involved in protein turnover, and detoxifying enzymes; (b) genes with as yet unknown functions; and (c) regulatory genes, such as those coding for kinases, phosphatases and transcription factors.
Mutants with abiotic stress tolerance have been isolated by activation tagging and include the edt1 mutant recently identified under drought conditions (Ahad et al. 2003; Ahad and Nick 2007; Pereira et al. 2004; Yu et al. 2008). This mutant showed a drought-tolerant phenotype and reduced stomatal density. The enhanced drought tolerance of edt1 was associated with an increase in the expression of the gene that encodes the transcription factor HDG11. The overexpression of Arabidopsis HDG11 in tobacco can also confer drought tolerance and reduced leaf stomatal density (Zhang 2003).
FOX lines that consist of 43 stress-inducible transcription factors were constructed to elucidate stress-related gene function (Fujita et al. 2007). The T1 generation was screened for salt-stress-resistant lines, which led to the identification of salt-tolerant lines. Among them, four lines harbored the same transgene, AtbZIP60, which encodes a basic domain/leucine zipper class transcription factor. The overexpression of AtbZIP60 leads to the upregulation of stress-related genes, which suggests an important role for this transcription factor in stress-responsive signal transduction.
Transcription factors play an important role in plant development and stress responses. The Arabidopsis genome encodes more than 1,500 transcription factors. Gain-of-function mutagenesis is an ideal approach to uncover the function of transcription factors (J. Z. Zhang 2003).
Weiste and colleagues (Weiste et al. 2007) generated an ORF collection composed of members of the ERF transcription factor family. They constructed a destination vector to enable ectopic expression driven by the CaMV 35S promoter and included an HA-tag sequence to reveal transgene-specific expression. Using this library, they generated transgenic Arabidopsis plants that overexpress HA-tagged ORFs of the ERF transcription factor family. This approach yielded eight plants that show enhanced tolerance to oxidative stress resulting from the overexpression of the same ERF.
Typically, a gene coding for a transcription factor in Arabidopsis is isolated, characterized and shown to improve drought response when overexpressed. The gene is then transferred to a crop plant, where it often confers the same drought-tolerant phenotype. The HARDY (HRD) gene, coding for an AP2/ERF-like transcription factor (Pereira et al. 2007), is an example of this approach. Arabidopsis plants with a gain-of-function mutation in the HRD gene (hrd-D mutants) are drought resistant, salt-tolerant, and overexpress abiotic stress marker genes. Overexpression of the same gene in rice significantly improves water use efficiency both under well-watered conditions (50–100% increase) and under drought (50% increase). These plants also show enhanced photosynthetic assimilation and reduced transpiration (Pereira et al. 2007). Thus, overexpression of HRD confers drought tolerance in both dicots and monocots.
In other cases, a gene coding for a transcription factor is isolated and characterized in Arabidopsis, but it is the orthologue from the crop plant of interest that is identified and overexpressed. For example, Nelson et al. (2007) showed that overexpression of the Arabidopsis CAAT box-binding transcription factor AtNF-YB1 confers improved performance in Arabidopsis under drought conditions. They next overexpressed the orthologue of AtNF-YB1 (called ZmNF-YB2) in maize and found that, under simulated drought conditions, the altered maize plants produced up to 50% more than unmodified plants (Nelson et al. 2007).
A high-throughput gain-of-function approach has been applied to isolate salt stress tolerance genes using cDNAs of Thellungiella halophila (Du et al. 2008). Thellungiella halophila is a salt cress related to Arabidopsis that can grow under high salt conditions. The cDNA library was prepared after salt stress treatment. Approximately 125,000 transgenic Arabidopsis plants that express Thellungiella halophila cDNAs under the CaMV 35S promoter were generated. Novel salt stress tolerance genes were isolated from this mutant collection.
Ethylene response factor (ERF) genes have been successfully introduced into rice, generating transgenic rice with enhanced tolerance to biotic and abiotic stresses. For example, the tobacco OPBP1 (an AP2/ERF transcription factor) can enhance salt tolerance and disease resistance of transgenic rice (Chen and Guo 2008); ectopic expression of the Arabidopsis HARDY gene in rice improves water use efficiency and the ratio of biomass (Pereira et al. 2007); overexpression of a rice DREB transcription factor (OsDREB1F) increases salt, drought, and low temperature tolerance in rice (Chu et al. 2008). Overexpression of the transcription factor Sub1A-1 in a submergence-intolerant Oryza sativa ssp. japonica conferred enhanced submergence tolerance on the transgenic plants (Ronald et al. 2006). The genes SNORKEL1 and SNORKEL2, which encode ERFs, trigger internode elongation and allow rice to adapt to deep water (Ashikari et al. 2009).
By overexpressing the AtHSP101 protein, Katiyar-Agarwal and associates (2003) generated a heat-tolerant transgenic rice (cv. Pusa basmati 1) line. This group showed that almost all the transgenic plants recovered after severe heat stress of 45–50 °C and exhibited vigorous growth during the subsequent recovery at 28 °C, while the untransformed plants could not recover to a similar extent.
In our experiments on salt stress tolerance, we selected a mutant of E. coli able to grow on a medium with a high salt content. The cells of the mutant strain have a high content of proline, and it has been shown that this was due to a mutation in the N-terminal region of γ-glutamyl kinase encoded by the proB gene. The mutation, which consists of a single amino acid substitution (leucine is replaced by glycine), caused a conformational change in the regulatory region of the protein, making the enzyme less sensitive to feedback inhibition by proline. We designated the gene coding for the mutant γ-glutamyl kinase as proBosm (Neumyvakin et al. 1990, 1991). The E. coli genes proBosm and proA have been transferred into tobacco plants, each under the control of either the strong constitutive CaMV 35S promoter, which contains a duplicated enhancer sequence, or the Pmas promoter, which induces gene expression predominantly in roots (Sokhansandzh et al. 1997). The rationale for using a mutant form of the prokaryotic proteins that determine a strictly defined phenotype, osmotolerance, was that the phenotype of the plant model would be easy to assay. The expression of the osmotolerance phenotype in the model eukaryotic organism would indicate that the prokaryotic protein is an ortholog of the eukaryotic protein. Transgenic plants carrying the bacterial proline operon genes had a higher resistance to the toxic proline analogue L-azetidine-2-carboxylic acid and to high salt stress (they were capable of rooting at NaCl concentrations in the medium of over 350 mM) (Sokhansandzh et al. 1997). Thus, our results demonstrate the usefulness of the proposed model and the possibility of simulating the activity of a bi-functional plant enzyme with two bacterial enzymes.
In Arabidopsis, knockout or silencing of HSP101 caused loss of the acquired thermotolerance, whereas the overexpression of HSP101 in transgenic plants improved tolerance to high temperature stress (Gurley 2000; Hong and Vierling 2000). Agarwal and co-workers (2003) provided evidence that AtHSP101 and OsHSP101 impart thermoprotection to yeast cells by dissolution of heat-induced protein aggregates. High-temperature-tolerant rice plants have also been produced by overexpressing a rice small heat-shock protein sHSP17.7 (Sato et al. 2004). Oxidative stress may accompany heat stress by the formation of ROS (Foolad et al. 2007). More recently, Qi and associates (2011) have reported that mtHsp70 over-expression suppresses programmed cell death (PCD) by maintaining mitochondrial membrane potential and preventing ROS signal amplification in rice protoplasts.
Koh and co-workers (2007) reported that knockout (KO) mutants of rice OsGSK1, an orthologue of Arabidopsis BIN2, showed enhanced tolerance to several abiotic stresses including high temperature. In comparison to non-transgenic plants, the wilting ratios for knockout mutants were as much as 26% lower after heat (45 °C) stress. Feng and associates (2007) raised transgenic rice plants overexpressing rice sedoheptulose-1,7-bisphosphatase (SBPase). They showed that overexpression of SBPase resulted in enhanced tolerance of growth and photosynthesis to high temperatures in transgenic rice plants.
Huang and co-workers (2008b) generated transgenic tobacco expressing the rice A20/AN1-type zinc finger protein gene (ZFP177). Compared to wild-type tobacco, the transgenic seedlings showed higher tolerance to temperature stress.
Major efforts have been made to identify genes that are associated with drought stress in a number of plant species (Gong et al. 2010; Huang et al. 2008a; Manavalan et al. 2009; Tran and Mochida 2010; Zheng et al. 2010). In rice, identification of drought-responsive genes has been carried out by means of expression profiling studies such as microarrays, expressed sequence tags (ESTs), RNA gel blot analyses and qRT-PCR (Rabbani et al. 2003; Rabello et al. 2008; Ramachandran et al. 2008; Reddy et al. 2007; Zhou et al. 2007). As a result, hundreds of genes that were induced or suppressed by drought stress have been identified. A number of these genes have been analyzed in detail, resulting in their characterization as regulatory genes, such as transcription factor (TF) and protein kinase encoding genes, whose products regulate other stress-responsive genes. Some of the identified stress-responsive genes are functional genes that encode metabolic components important for stress tolerance, such as late embryogenesis abundant (LEA) proteins and osmoprotectant-synthesizing enzymes (Yang et al. 2010).
Recently, Yang and associates (2010) classified drought-responsive genes into three groups based on their biological functions: transcriptional regulation, post-transcriptional RNA or protein phosphorylation, and osmoprotectant metabolism or molecular chaperones. However, many of the genes affected by drought have unknown functions, and further work is needed to determine the functions of these drought-responsive genes.
Aquaporins, which are water channel proteins that translocate water across cell membranes, have been shown to play roles in various physiological processes including stomatal closure (Li et al. 2008).
The rice plasma membrane intrinsic proteins (OsPIPs) form a subfamily of aquaporins and are divided into two subgroups, OsPIP1 and OsPIP2. Several members of the OsPIP1 and OsPIP2 subgroups were responsive to drought and salt stresses. Transgenic Arabidopsis overexpressing OsPIP2–2 showed enhanced tolerance to salt and drought stresses (Guo et al. 2006).
Transgenic rice overexpressing the Datura stramonium arginine decarboxylase (adc) gene showed increased tolerance to drought due to an increase in polyamine content (Capell et al. 2004), suggesting that OsAdc1 may be a potential candidate for the development of drought-tolerant rice cultivars.
In some cases, however, constitutive expression of a gene that is normally induced only by stress has negative, so-called pleiotropic, effects on growth and development when stress is not present (Chan et al. 2002; Kasuga et al. 1999; Nakashima et al. 2007). One solution is to use inducible (rather than constitutive) promoters that allow expression of a transgene only when it is required, while it is silenced otherwise. For example, constitutive expression in Arabidopsis of DREB1/CBF3, a gene coding for a transcription factor induced by osmotic stress, confers tolerance to stress but causes severe growth retardation under normal growth conditions (Kasuga et al. 1999). However, if this gene is expressed under the control of an osmotic stress-inducible promoter such as RD29A, no growth retardation occurs and the plant is highly resistant to several stress conditions (Kasuga et al. 1999). Similarly, in tomato, overexpression of the Arabidopsis CBF1 gene, encoding a transcription factor belonging to the AP2/ERF family, confers increased drought, cold and oxidative stress tolerance compared to wild-type plants, but plant growth is severely affected (Chan et al. 2002). By contrast, when the same gene was placed under the control of a synthetic promoter derived from the barley HVA22 gene, it was expressed mainly under abiotic stresses, so that the plant had the same tolerance characteristics towards stresses, but plant growth under normal conditions was not affected (Lee et al. 2003).
An ideal stress-inducible promoter would be completely silenced under normal conditions, but induced by stress in a fairly short time (a few hours) after stress onset. The promoter of the Arabidopsis AtMYB41 gene, which is not expressed in any tissues under standard growth conditions but is highly induced in response to drought, salt and abscisic acid (Cominelli et al. 2008), may therefore be a very useful promoter.
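The two criteria for an ideal stress-inducible promoter (silent without stress, induced within a few hours of stress onset) can be written as a simple check on an expression time course. The following is a minimal sketch only: the function name, the thresholds `silent_below` and `induced_above`, the six-hour window and the example values are all illustrative assumptions, not measurements or cut-offs from the cited studies.

```python
def is_ideal_stress_promoter(time_course, silent_below=1.0,
                             induced_above=10.0, max_hours=6.0):
    """Check a promoter's expression time course against two criteria.

    time_course: list of (hours_after_stress_onset, expression_level) pairs,
    where hour 0 is the unstressed state. Thresholds are hypothetical and
    must be in the same arbitrary units as the expression levels.
    """
    unstressed = [level for hours, level in time_course if hours == 0]
    if not unstressed:
        raise ValueError("time course needs at least one unstressed (hour 0) point")
    # Criterion 1: no detectable expression under normal conditions.
    silent = all(level < silent_below for level in unstressed)
    # Criterion 2: clear induction within a few hours of stress onset.
    induced_early = any(0 < hours <= max_hours and level >= induced_above
                        for hours, level in time_course)
    return silent and induced_early


# Hypothetical time courses (arbitrary units), for illustration only.
fast_inducible = [(0, 0.1), (1, 4.0), (3, 25.0), (6, 40.0)]
constitutive = [(0, 30.0), (3, 35.0)]
slow_inducible = [(0, 0.1), (3, 0.5), (24, 30.0)]

print(is_ideal_stress_promoter(fast_inducible))
print(is_ideal_stress_promoter(constitutive))
print(is_ideal_stress_promoter(slow_inducible))
```

In practice the thresholds would have to be chosen relative to the detection limit of the expression assay used.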
3.3. Increase of productivity
The development of unique transgenic plants has an applied aspect, making available highly nutritious "speciality crops," and also adds to the genetic resources that can be used to build a knowledge base on the genetic, biochemical, and physiological regulation of various metabolic pathways and functional metabolites. Transgenic tomatoes that accumulate higher levels of the polyamines Spd and Spm during ripening are a kind of "gain-of-function" genotype, and we are using them to address the role of polyamines in fruit metabolism, in particular their crosstalk with other functional molecules, with the aim of enabling higher nutritional quality of vegetables and fruits.
In this case, a fruit ripening-specific promoter was used to drive the expression of yeast SAM decarboxylase gene, with the result that the introduced gene was not active during the early growth and development of the plant but became active along with the normal ripening process of the fruit (Mehta et al. 2002).
An analysis of the principal, soluble constituents of wild-type and Spd/Spm-accumulating transgenic tomato, generated using high-resolution NMR spectroscopic methods, showed that the same metabolites were present in wild-type/azygous control tomatoes as in the transgenic tomato fruit. However, the latter conspicuously revealed differential metabolite content as compared to the controls (Mattoo et al. 2006). The red transgenic fruit were characterized by higher accumulation of the amino acids glutamine and asparagine; micronutrient choline; the organic acids citrate, fumarate, and malate; and an unidentified compound A. Compared to the control, wild-type fruit, the levels of valine, aspartic acid, sucrose, and glucose in the transgenic red fruit were reduced. These changes reflected specific alteration of metabolism, since the levels of isoleucine, glutamic acid, aminobutyric acid, phenylalanine, and fructose remained similar in the nontransgenic and transgenic fruits. Consequently, the transgenic red fruit have significantly higher fructose/glucose and acid [citrate+malate]/sugar [glucose+fructose+sucrose] ratios (Mattoo et al. 2006), consistent with higher fruit juice and nutritional quality reported in the 2 transgenics (Mehta et al. 2002), attributes favorably considered as higher quality in tomato breeding programs.
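The two quality indices mentioned above, fructose/glucose and acid [citrate+malate]/sugar [glucose+fructose+sucrose], are simple ratios of metabolite levels. The sketch below shows how they could be computed; the function name and all numeric values are hypothetical and chosen only to mirror the direction of the reported changes (fructose unchanged, glucose and sucrose lower, citrate and malate higher). They are not data from Mattoo et al. (2006).

```python
def quality_ratios(fructose, glucose, sucrose, citrate, malate):
    """Return (fructose/glucose, acid/sugar) from metabolite levels.

    All inputs must be in the same unit (e.g. relative NMR signal intensity).
    acid  = citrate + malate
    sugar = glucose + fructose + sucrose
    """
    if glucose <= 0:
        raise ValueError("glucose level must be positive")
    sugars = glucose + fructose + sucrose
    acids = citrate + malate
    return fructose / glucose, acids / sugars


# Hypothetical, illustrative levels only (arbitrary units).
control = dict(fructose=10.0, glucose=10.0, sucrose=2.0, citrate=3.0, malate=1.0)
transgenic = dict(fructose=10.0, glucose=8.0, sucrose=1.0, citrate=4.0, malate=1.5)

for name, levels in (("control", control), ("transgenic", transgenic)):
    fru_glc, acid_sugar = quality_ratios(**levels)
    print(f"{name}: fructose/glucose={fru_glc:.2f}, acid/sugar={acid_sugar:.2f}")
```

With these illustrative inputs both ratios are higher for the transgenic fruit, which is the pattern described in the text.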
3.4. Xenobiotic tolerance
We have created transgenic plants expressing a mutant, glyphosate-resistant EPSP synthase from E. coli (Piruzian et al. 1988). The rationale for using a mutant form of the prokaryotic ortholog protein that determines a strictly defined phenotype, herbicide resistance, was that the phenotype of the plant model would be easy to assay. The expression of the glyphosate tolerance phenotype in the model eukaryotic organism would indicate that the prokaryotic protein is an ortholog of the eukaryotic protein. First, we isolated an E. coli strain with resistance to glyphosate. The mutation was localized to the aroA locus and was shown to result in a replacement of Ala with Pro in the bacterial EPSP synthase (Piruzian et al. 1988). We constructed expression vectors for plant transformation and obtained tobacco plants that tolerated a glyphosate concentration five-fold higher than that which inhibited the growth of control plants (Mett et al. 1991; Piruzian et al. 2000). Thus, transgenic plants expressing the mutant EPSP synthase showed that a bacterial enzyme insensitive to glyphosate can be active in the plant in place of the plant enzyme, which is sensitive to the herbicide (Mett et al. 1991). This experiment also demonstrated that it is possible in principle to model plant enzymatic activity by using bacterial enzymes.
3.5. Studying processes of plant physiology
One of the best examples of the use of the activation-tagging system to identify genes involved in plant development is the isolation of YUCCA genes, which encode flavin mono-oxygenase (FMO) proteins involved in auxin biosynthesis. Six loci, which encode proteins with a role in auxin biosynthesis, have been identified in Arabidopsis by using activation-tagging technology (Pereira et al. 2002; Zhao et al. 2001). The YUCCA family consists of 11 members in the Arabidopsis genome (Cheng et al. 2006). All activation-tagged mutants of YUCCA genes showed phenotypes that were characteristic of auxin-overproducing mutants (Boerjan et al. 1995; Delarue et al. 1998). Double, triple, and quadruple loss-of-function mutants of YUCCA genes showed deleterious developmental disorders, although no single mutants displayed visible phenotypes (Cheng et al. 2006). This functional redundancy might lead to difficulties in the isolation of mutants related to auxin biosynthesis by the loss-of-function approach. Thus, gain-of-function mutagenesis is a powerful tool to elucidate the function of genes that compose a gene family.
The pap1-D Arabidopsis mutant generated by activation tagging has an intense purple color caused by the overproduction of phenylpropanoid derivatives, such as anthocyanins (Borevitz et al. 2000). The PAP1 gene encodes a member of the R2R3 MYB transcription factor family, which comprises more than 100 members in Arabidopsis (Paz-Ares et al. 1998; Weisshaar et al. 1998). The heterologous expression of Arabidopsis PAP1 can enhance the accumulation of anthocyanins in tobacco plants. The activation-tagging method has also been employed to identify a transcriptional regulator of secondary metabolites in tomato (Mathews et al. 2003). The overexpression of ANT1, which encodes a MYB-type transcription factor, caused an intense purple color in many vegetative tissues throughout development and purple spotting on the epidermis and pericarp of tomato fruit.
LeClere and Bartel (2001) generated Arabidopsis lines that overexpress random cDNAs driven by the CaMV 35S promoter. They generated more than 30,000 Arabidopsis transgenic plants and isolated a mutant that showed a pale green phenotype caused by the overexpression of a truncated cDNA that encodes chloroplast ferredoxin-NADP+ reductase (FNR). This phenotype was caused by the cosuppression of endogenous genes by transgene overexpression, which led to dominant loss-of-function phenotypes.
Using Arabidopsis FOX lines, Okazaki and colleagues (2009) isolated six mutants that contained an increased number of chloroplasts in their leaves. The overproduction of plastid division (PDV) proteins that regulate the rate of chloroplast division in Arabidopsis leads to an increase in the number, but a decrease in the size, of chloroplasts.
Another Arabidopsis FOX line that carries the cDNA, which encodes the cytokinin responsive transcription factor 2 (CRF2), also showed an increased number of chloroplasts.
Thus, the FOX hunting system is capable of high-throughput characterization of gene functions.
A high-efficiency transformation method has been developed in rice. This makes rice an ideal host plant for the FOX hunting system (Ichikawa et al. 2007). More than 28,000 rice fl-cDNAs have been generated (Kikuchi et al. 2003). Approximately 12,000 rice lines have been generated in which 13,980 independent fl-cDNAs were overexpressed under the control of the ubiquitin promoter. Among several phenotypes in the T0 generation, three dwarf lines that carry the same novel gibberellin 2-oxidase (GA2ox) gene were isolated.
The ORF overexpression approach can also be used to investigate the function of putative genes identified by computer-based means. Genes encoding small secreted peptides, less than 150 amino acids long, were predicted computationally in order to identify genes involved in plant development in Arabidopsis (Hara et al. 2007). Plants that overexpress 153 of these predicted genes were generated, and this approach led to the identification of the Epidermal Patterning Factor 1 (EPF1) gene. EPF1 is expressed in stomatal cells and their precursors and may be involved in the control of stomatal patterning through the regulation of asymmetric cell division.
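A computer-based pre-selection of this kind can be sketched as a filter over predicted protein sequences. The example below keeps sequences shorter than 150 amino acids that also carry a hydrophobic stretch near the N-terminus, used here as a very crude stand-in for a signal peptide. This heuristic, the function name, the window parameters and the test sequences are all assumptions made for illustration; they are not the prediction pipeline used by Hara et al. (2007), and a dedicated signal-peptide predictor would be needed for real work.

```python
# Residues treated as hydrophobic in this rough heuristic (an assumption).
HYDROPHOBIC = set("AVLIMFWC")


def looks_like_small_secreted_peptide(protein, max_length=150,
                                      n_term=30, window=10, min_hydrophobic=7):
    """Crude filter for candidate small secreted peptides.

    Keeps sequences that start with Met, are shorter than max_length, and
    have at least min_hydrophobic hydrophobic residues in some window of
    `window` residues within the first n_term residues.
    """
    seq = protein.upper().rstrip("*")  # drop a trailing stop-codon marker
    if not seq.startswith("M") or len(seq) >= max_length:
        return False
    region = seq[:n_term]
    for start in range(len(region) - window + 1):
        chunk = region[start:start + window]
        if sum(aa in HYDROPHOBIC for aa in chunk) >= min_hydrophobic:
            return True
    return False


# Hypothetical sequences, for illustration only.
candidates = {
    "secreted_like": "MKLLLLAVFLLALASSA" + "QEGS" * 10,
    "charged_only": "M" + "DEKR" * 20,
    "too_long": "M" + "L" * 200,
}
selected = [name for name, seq in candidates.items()
            if looks_like_small_secreted_peptide(seq)]
print(selected)
```

Candidates passing such a filter would then go into the overexpression screen, where the phenotype provides the actual functional evidence.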
The T-DNA vector pER16, which contains the estradiol-inducible promoter, was used for the conditional activation of nearby genes by the addition of estradiol. This system was used to identify the gene PGA6, which encodes WUSCHEL (WUS), a homeodomain protein involved in the regulation of stem cell fate in Arabidopsis shoot and floral meristems (Mayer et al. 1998).
Recently we have proposed a new strategy for creating experimental models for plant functional genomics. It is based on the expression in transgenic plants of genes from thermophilic bacteria encoding functional analogues of plant proteins with high specific activity and thermal stability. We have validated this strategy by comparing physiological, biochemical and molecular properties of control tobacco plants and transgenic plants expressing genes of β-glucanases with different substrate specificity. We demonstrate that the expression of bacterial β-1,3–1,4-glucanase gene exerts no significant influence on tobacco plant metabolism, while the expression of bacterial β-1,3-glucanase affects plant metabolism only at early stages of growth and development. By contrast, the expression of bacterial β-1,4-glucanase has a significant effect on transgenic tobacco plant metabolism, namely, it affects plant morphology, the thickness of the primary cell wall, phytohormonal status, and the relative sugar content. We propose a hypothesis of β-glucanase action as an important factor of genetic regulation of metabolic processes in plants.
It should also be mentioned that many plant enzymes have numerous isozymes, and for this reason activity assays and functional studies of particular isozymes in vivo are difficult (del Campillo 1999; Libertini et al. 2004). It should be noted that it was the use of thermostable heterologous proteins that enabled us to obtain these results. Overexpression of a homologous plant gene or a gene from a related species, apart from the additional difficulties of cloning plant genes with their exon–intron structure, could result in technical problems in detecting and assessing the activity of these proteins in transgenic plants. Thus, the use of thermostable bacterial proteins that are functional analogs of plant β-glucanases is not only adequate but also more convenient than overexpression of a plant gene from the same genome or from that of a related species.
Our next study was the construction of experimental models for studying the role of isopentenyl transferases in phytohormone synthesis and plant differentiation.
It has been supposed that phytohormones, cytokinins in particular, are largely responsible for the viability of plants following exposure to abiotic stress and to pathogens. Therefore, the use of genes whose expression alters the phytohormone balance is considered especially promising for studying plant metabolism. One such enzyme is isopentenyl transferase, a key enzyme of the cytokinin biosynthesis pathway. As a functional bacterial analogue of this enzyme, we used the isopentenyl transferase encoded by the T-cyt (ipt) gene, the key enzyme of cytokinin synthesis from the T region of the Agrobacterium tumefaciens Ti plasmid. The product of the ipt gene is involved in crown gall formation in plants. In accordance with our strategy, we cloned the ipt gene and introduced it into the tobacco genome (Iusibov et al. 1989). Regenerated shoots at first did not form roots and lacked apical dominance, a fact that was also reported by others. As reported by Zhang and co-workers (1996), the expression of isopentenyl transferase in transgenic tobacco plants may be controlled by auxins. We therefore used exogenous auxin, which allowed us to obtain normal transgenic plants with a cytokinin level intermediate between that of normal non-transgenic plants and that of crown gall tissue (Makarova et al. 1997). On the whole, expression of the agrobacterial gene leads to cytokinin overproduction (a two-fold excess of total cytokinins), a decreased abscisic acid level, an elevated level of chlorogenic acid, and disturbed morphogenesis and regeneration processes in the plants (Makarova et al. 1997; Yusibov et al. 1991). The altered hormone balance naturally affected such a vitally important process as photosynthesis. In particular, we have shown that elevated cytokinin affects the expression of some plant genes.
For example, we found that plants transgenic for the ipt gene had a higher level of mRNA of the chloroplast gene for the large subunit of ribulose bisphosphate carboxylase (RBC), rbcL (Yusibov et al. 1991). Thus, the expression of bacterial isopentenyl transferase causes significant metabolic changes in the transgenic plants. These include altered hormone balance (cytokinins, abscisic acid), altered expression of some chloroplast genes involved in photosynthesis (rbcL), and altered morphology of the plants.
To study the role of enzymes related to carbohydrate metabolism in plants, we chose the xylA gene of E. coli. This gene codes for xylose (glucose) isomerase (P00944; EC 5.3.1.5), which interconverts xylose and xylulose as well as glucose and fructose. The E. coli enzyme is thermostable (Piruzian et al. 1989), a property that makes the enzyme easy to assay in plants. The xylA gene under the control of the CaMV 35S promoter was transferred to tobacco plants using the A. tumefaciens vector system, and it was shown that an active enzyme is produced in the transgenic plants (Goldenkova et al. 2002). The plants had larger leaves, grew faster and had stronger roots than the controls. The expression of bacterial xylose (glucose) isomerase induces morphological changes in transgenic plants that correlate with changes in the expression of chloroplast genes involved in photosynthesis and in maintaining the phytohormone balance. Thus, transgenic plants of the XylA type represent a promising model system for studying photosynthesis as a function of phytohormone activity.
In plant functional genomics, most approaches have introduced genes under constitutive or inducible promoters, resulting in gene overexpression in transgenic plants. In some cases, however, the phenotype of interest has been conferred by gene down-regulation through RNA interference, co-suppression or loss-of-function mutations. Each approach has advantages and disadvantages in different aspects of high-throughput characterization of gene functions. The first aspect is the basic construction strategy used to produce a large population of mutant lines. The second aspect is whether the mutants generated in each system can cover the vast numbers and wide variety of genes. Identification of all gene functions is the final goal of functional analysis at the genome level, and mutants are produced for this purpose. The final aspect is the enhancement of endogenous gene expression with tissue specificity. High-throughput functional genomics is helping to shift the focus from the characterization of individual gene functions to a more systems-based, holistic or synthetic approach to understanding the genetic mechanisms that underlie gene regulation and complex signaling networks.
This work was supported by a grant from the Russian Academy of Sciences ("Biodiversity program") and by a grant from the Russian Foundation for Basic Research (grant # 10-04-01195-Б).