ZFN-mediated genome modifications in plants.
Conventional tools induce mutations randomly throughout the cotton genome, which makes breeding difficult. During the last decade, progress has been made in editing a gene of interest precisely. Targeted genome engineering with engineered nucleases (ENs), specifically zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeat (CRISPR) RNA-guided nucleases (e.g., Cas9), has been described as a “game-changing technology” for fields as diverse as human genetics and plant biotechnology. In eukaryotic systems, ENs create double-strand breaks (DSBs) at the targeted DNA sequence, which are repaired by nonhomologous end joining (NHEJ) or homology-directed recombination (HDR). ENs have been used successfully for targeted mutagenesis, gene knockout, and multisite genome editing (GenEd) in model plants and in crop plants such as cotton, rice, and wheat. Recently, the cotton genome has also been edited by CRISPR/Cas-mediated targeted mutagenesis for improved lateral root formation. In addition, an efficient and fast method has been developed to evaluate guide RNAs transiently in cotton. Targeted disruption of undesirable genes or metabolic pathways can increase the quality of cotton. Undesirable metabolites such as gossypol in cottonseed can be targeted using ENs to develop seed-specific low-gossypol cotton. Moreover, ENs are also helpful in gene stacking for herbicide resistance, insect resistance, and abiotic stress tolerance.
Cotton is an important source of natural fiber and plays a major role in the economy and social structure of several countries. In addition, cotton serves as a cash crop for more than 20 million farmers in Asia and Africa. Despite the availability of synthetic alternatives, cotton remains an important source of fiber because of its cost of production and the unique features of cotton lint. World consumption of cotton products is increasing rapidly, but world cotton production is stagnant because of biotic and abiotic stresses. To meet demand, cotton must be produced in high volume and with good quality. Cotton is also affected by diseases that cause significant losses to industry. The most damaging diseases are Texas root rot, bacterial blight, blue disease, cotton leaf curl disease (CLCuD), and some strains of Verticillium and Fusarium wilt. Abiotic factors (heat, drought, salinity, and waterlogging) affect cotton yield, especially during early stages of plant development. Along with conventional breeding and genetic engineering, novel techniques such as GenEd could help develop resistance in cotton against biotic and abiotic stresses. GenEd tools have also been used for growth, quality, and yield enhancement in other crop plants. Translating this technology to improve the fiber, quality, and yield of cotton could therefore bring long-lasting benefits. In this chapter, we provide a picture of the use of GenEd tools for genetic improvement of cotton and other crop plants.
2. GenEd tools for targeted genome modification
Mutagenesis at target sites has long been a goal in the field of genome engineering and biotechnology. Chemical mutagens, transposons, recombinases, and TILLING technologies have been used historically to mutate certain genes for functional genomics and reverse genetic studies. The last decade has seen a revolution in the field of targeted genome modifications. GenEd has proved successful in both plants and animals. Targeted genome modifications have modernized genome engineering and biotechnology by extending GenEd from unicellular to multicellular and from prokaryotic to eukaryotic organisms. A diversity of organisms from bacteria to humans, such as Arabidopsis thaliana, tobacco, rice, yeast (Saccharomyces cerevisiae), fungi, zebrafish, rats, sheep, Caenorhabditis elegans, human cell lines, Drosophila, viruses [12, 13, 14], bacteria, mouse, insects, cattle, goat, pigs, tomato, grapes, potato, soybean, maize, wheat, and cotton [27, 28], have been targeted successfully with engineered proteins and nucleases.
GenEd tools such as zinc-finger nucleases, transcription activator-like effectors, and CRISPR/Cas have been used extensively for targeted genome modification. These GenEd reagents can search for and bind a specific DNA sequence and, hence, can be programmed to target any DNA sequence of choice. All of the ENs mentioned above have the catalytic ability to create double-strand breaks (DSBs) at the target DNA sequence. Zinc fingers and TALEs are fused with the FokI nuclease domain to induce DSBs on dimerization, while CRISPR/Cas9 has its own catalytic activity with two nuclease domains: RuvC and HNH. DSBs at a predefined DNA sequence can be utilized efficiently for targeted genome modifications. DSBs in the DNA are repaired through the cell's endogenous repair systems: nonhomologous end joining (NHEJ) and homology-directed recombination (HDR). NHEJ is an error-prone repair mechanism in which DSBs are repaired with some insertions and/or deletions (indels). When a homologous DNA template or donor DNA is provided, the DSBs are repaired without errors by HDR. HDR is therefore the pathway used to make targeted insertions and/or gene corrections.
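The two repair outcomes described above can be illustrated with a deliberately simplified string model. This is only a sketch: the function names, the toy sequence, the fixed deletion size, and the exact-match check on the homology arms are all assumptions made for illustration, and real repair is far more variable.

```python
def nhej_deletion(seq: str, cut: int, n_deleted: int) -> str:
    """Toy NHEJ: rejoin the ends after losing n_deleted bases at the cut."""
    return seq[:cut] + seq[cut + n_deleted:]


def hdr_insertion(seq: str, cut: int, left_arm: str, payload: str,
                  right_arm: str) -> str:
    """Toy HDR: insert payload at the cut if the donor arms match the flanks."""
    if cut < len(left_arm):
        raise ValueError("left homology arm is longer than the upstream flank")
    upstream = seq[cut - len(left_arm):cut]
    downstream = seq[cut:cut + len(right_arm)]
    if upstream != left_arm or downstream != right_arm:
        raise ValueError("donor homology arms do not match the target flanks")
    return seq[:cut] + payload + seq[cut:]


# Made-up 12-bp target with a DSB between positions 5 and 6 (0-based).
target = "AAACCCGGGTTT"
print(nhej_deletion(target, cut=6, n_deleted=2))             # indel allele
print(hdr_insertion(target, 6, "CCC", "TAG", "GGG"))          # precise insertion
```

The NHEJ function changes the sequence length in an uncontrolled way, whereas the HDR function only succeeds when the donor carries homology to both sides of the break, which is the distinction the text draws between the two pathways.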
Reprogramming and redesigning of artificial DNA-binding proteins and ENs have made GenEd considerably easier. Most of the software for designing and cloning ENs is freely available online. Apart from ZFNs, TALENs, and CRISPR, other ENs such as homing endonucleases or meganucleases (LAGLIDADG family) have also been used for targeted GenEd, but their applicability is very low compared to the above-mentioned nucleases.
2.1. Zinc-finger nucleases
The first targeted induction of DSBs was achieved using the natural meganuclease I-SceI, which has an 18-bp recognition site. Experiments were performed in tobacco using I-SceI to introduce chromosome breaks at integrated, defective reporter genes which, upon correction by homologous recombination, confer a selectable phenotype [30, 31]. Zinc fingers were fused with FokI nuclease to create an artificial endonuclease for targeting predetermined DNA sites. Zinc-finger nuclease-assisted gene targeting was first implemented in animal systems. In the late 1990s, ZFNs were designed and used for the first time to target genes of Drosophila melanogaster. In ZFs, each monomer (finger) targets three DNA bases. The specificities of ZF monomers have been deciphered, and a table of the possible three-base combinations was built to design ZFs against a DNA sequence (Figure 1a). Two efficient ZFN assembly platforms are available for designing and cloning ZFNs: oligomerized pool engineering (OPEN) and context-dependent assembly (CoDA). Previously, the modular assembly method was used to assemble multi-finger ZFN arrays, but its efficiency was reported to be low because it does not account for the context-dependent activity of neighboring fingers.
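The one-finger-per-triplet rule can be sketched in a few lines of Python. This is a minimal illustration, not a design tool: the function names and the 9-bp example site are invented here, the orientation of the fingers relative to the DNA strand is ignored, and no finger-selection table is included.

```python
from typing import List


def split_into_triplets(site: str) -> List[str]:
    """Split a ZF binding site into 3-bp subsites, one per finger."""
    site = site.upper()
    if set(site) - set("ACGT"):
        raise ValueError("binding site must contain only A, C, G, T")
    if len(site) % 3 != 0:
        raise ValueError("binding site length must be a multiple of 3")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def fingers_needed(site: str) -> int:
    """Number of zinc fingers required to cover the site."""
    return len(split_into_triplets(site))


# Made-up 9-bp site; a three-finger array would cover it.
example_site = "GAGGCTGTA"
print(split_into_triplets(example_site))
print(fingers_needed(example_site))
```

Each triplet returned here would then be looked up in a finger table of the kind mentioned above; platforms such as OPEN and CoDA exist precisely because picking fingers independently per triplet ignores context effects.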
Using two ZFN monomers results in DSB formation by a functional nuclease dimer, as initially shown for FokI endonuclease coupled to three ZFs recognizing 9-bp binding sites [32, 37]. Induction of ZFN expression in Arabidopsis by heat shock during seedling development resulted in mutations at the ZFN recognition sequence. In 10% of induced individuals, mutants were present in the subsequent generation, demonstrating efficient transmission of the ZFN-induced mutations. Homologous recombination was measured by restoring function to a defective GUS:NPTII reporter gene integrated at various chromosomal sites in ten different transgenic tobacco lines. ZFN-mediated gene targeting at the endogenous tobacco acetolactate synthase genes (ALS SuRA and SuRB) was observed at a high frequency, exceeding 2% of transformed cells. Targeting of the SuR loci introduced allelic mutations that conferred resistance to imidazolinone and sulfonylurea herbicides.
Co-expression of ZFNs with a heterologous donor molecule led to precise targeted addition of an herbicide tolerance gene at the intended locus in maize. The mutant maize plants also transmitted the genetic changes to the next generation. HDR-based gene replacement has been achieved by replacing a 7-kb fragment flanked by two ZFN cutting sites with a 4-kb donor cassette carrying genes for kanamycin resistance and red fluorescent protein (RFP). In the last decade, artificial zinc-finger proteins (AZPs) have been used against the begomoviruses beet severe curly top virus (BSCTV) and tomato yellow leaf curl virus (TYLCV) [43, 44]. This strategy can be used for suppression of begomoviruses infecting cotton plants. Moreover, ZFNs and AZPs can be useful for gene insertion, deletion, replacement, and functional genomics studies in cotton. Selected reports of ZFN-mediated genome modification are given in Table 1.
2.2. TALEs and TALENs
TALE proteins are bacterial proteins (from plant pathogens of the genus Xanthomonas) that bind DNA in the infected plant and hijack the host expression system in a way that supports the disease process. Natural TALEs have a binding domain, which binds a DNA sequence, and an effector domain, which alters the expression system of the host. The binding domain consists of a variable number of amino acid repeats; each repeat contains 33–35 amino acids and recognizes a single DNA base pair. DNA recognition is determined by two hypervariable amino acid residues, called repeat variable diresidues (RVDs), at positions 12 and 13 in each repeat. Therefore, TALE repeats can be engineered by varying the RVDs to create a TALE protein that binds a specific sequence in the genome (Figure 1b).
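Because each repeat reads one base, designing a TALE array reduces to a lookup per base. The sketch below assumes the commonly cited RVD code (NI for A, HD for C, NG for T, NN for G), which is not listed in this chapter; NN is known to bind A as well, and other RVDs exist, so the mapping and the function names should be read as illustrative assumptions.

```python
from typing import List

# Commonly cited RVD-to-base code (assumption; NN can also recognize A).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def rvds_for_target(target: str) -> List[str]:
    """Return one RVD per base of the target sequence."""
    try:
        return [BASE_TO_RVD[base] for base in target.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]}") from None


def target_for_rvds(rvds: List[str]) -> str:
    """Return the DNA sequence that a list of RVDs is expected to read."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


# Made-up 8-bp target used only to show the lookup.
design = rvds_for_target("TACGGATC")
print(design)
print(target_for_rvds(design))
```

The one-to-one mapping is what makes modular assembly practical for TALEs, in contrast to zinc fingers where neighboring modules influence each other.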
TALEs and TALENs are easier to design and assemble. Owing to the single base-pair specificity of the TALE RVDs, modular assembly has been used frequently. The Golden Gate assembly of Cermak et al. has the advantage of being fast, simple, and cost-effective. Many free online software tools are available to design TALEs and TALENs. TALEN assembly is also offered commercially by different companies, and many kits are available to construct TALENs against a target sequence. These TALE domains can be linked with a designed effector domain (a nuclease like FokI, a repressor like KRAB, or an activator like VP64) to create a chimeric protein capable of targeted genome manipulation. Successful genome modifications have been achieved using TALENs in different plant species (Table 2).
|Plant species||Targeted gene||Modification||References|
|Tobacco||EBE of Hax3||NHEJ|||
|Rice||EBE (AvrXa7 and PthXo3)||NHEJ|||
|N. benthamiana||FucT, XylT||NHEJ|||
|Tobacco||SuRA, SuRB||NHEJ, HDR|||
|Rice||OsMST8, OsMST7, OsEPSPS||NHEJ|||
2.3. CRISPR/Cas RNA-guided system
CRISPR/Cas is an RNA-guided endonuclease (RGEN) system. RGENs are the simplest ENs to design and clone. Cas9-gRNA targeting is based on simple Watson-Crick base pairing between RNA and DNA, and a 20-nt guide RNA is designed to target a DNA sequence of interest (Figure 1c). The RNA-guided Cas9 system is efficient at rewriting genomic sequence for genetic improvement of crops against threats of multiple origins. Owing to its ease of design, simplicity of cloning, and cost-effectiveness, CRISPR/Cas is the most widely used EN.
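Guide selection can be sketched as a scan for 20-nt stretches followed by a protospacer adjacent motif (PAM). The example below is a minimal sketch under stated assumptions: it uses the NGG PAM commonly associated with Streptococcus pyogenes Cas9 (not specified in this chapter), scans the forward strand only, and runs on an invented sequence; the function name is hypothetical.

```python
from typing import List, Tuple


def find_guides(seq: str, guide_len: int = 20,
                pam: str = "NGG") -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for forward-strand candidate sites."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - guide_len - len(pam) + 1):
        protospacer = seq[start:start + guide_len]
        pam_seq = seq[start + guide_len:start + guide_len + len(pam)]
        # 'N' in the PAM pattern matches any base.
        if all(p == "N" or p == b for p, b in zip(pam, pam_seq)):
            hits.append((start, protospacer, pam_seq))
    return hits


# Made-up sequence: a 20-nt stretch followed by AGG, then two extra bases.
toy_sequence = "ACGT" * 5 + "AGG" + "TT"
for start, protospacer, pam_seq in find_guides(toy_sequence):
    print(start, protospacer, pam_seq)
```

A practical design tool would also scan the reverse strand and screen candidates against the rest of the genome; both steps are left out here to keep the base-pairing and PAM logic visible.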
CRISPR/Cas has emerged as a new tool for targeting DNA using a single-guide RNA (sgRNA), enabling genetic editing of virtually any region in the genome [46, 47]. This single-RNA, single-protein CRISPR system is derived from a natural microbial adaptive immune system that uses an RNA-guided nuclease to recognize and cleave foreign DNA elements. The system consists of two components, a chimeric sgRNA and a CRISPR-associated protein (Cas9), which specifically unwinds and cleaves the target DNA, with the cleavage site dictated solely by complementarity to the sgRNA. The only restriction on targeting a DNA sequence with this system is the requirement for a protospacer adjacent motif (PAM). The CRISPR system has proven to be highly valuable for site-specific genome engineering. Recently, in bacterial and human cells, a nuclease-deactivated version of the Cas9 protein, called dCas9, was created as a programmable RNA-dependent DNA-binding protein. Targeting dCas9 to the coding region of a gene can block the binding and elongation of RNA polymerase, leading to dramatic suppression of transcription. Moreover, dCas9 can also be used to recruit different protein effectors (activators or repressors) to DNA in a highly specific manner to activate (CRISPRa) or suppress (CRISPRi) a gene. More recently, fusing dCas9 with the Krüppel-associated box (KRAB) repressor domain resulted in efficient transcriptional interference [50, 51]. In addition, CRISPRi was used for multiplexed control of endogenous genes and for stable repression of genes with a silencing efficiency comparable to that typically achieved by RNAi, while minimally impacting transcription of nontargeted genes. CRISPR/Cas9 has also been used to target the green fluorescent protein (GFP) gene in a transgenic cotton line carrying a single, previously integrated copy of GFP. Its multiplexing ability distinguishes the CRISPR/Cas system from other ENs.
Multiplexed, targeted gene editing has been achieved in Nicotiana benthamiana for glycoengineering and monoclonal antibody production. The CRISPR/Cas system has been used efficiently for GenEd in plants (Table 3).
|Plant species||Targeted gene||Modification||References|
|Arabidopsis||PDS3, FLS2, RACK1b, RACK1c||NHEJ|||
|Barley, cabbage||HvPM19, BoIC.GA4.a||NHEJ|||
|C. reinhardtii||CpFTSY, ZEP||NHEJ|||
|Cotton||GFP (transgene), CLA1, VP||NHEJ||[28, 53]|
|Flax||EPSPS, BFP (transgene)||NHEJ, HDR|||
|Lettuce, N. attenuata||BIN2, AOC||NHEJ|||
|Lotus japonicus||SYMRK, LjLb1, LjLb2, LjLb3||NHEJ|||
Specific DNA-binding proteins such as zinc fingers, TALEs, and dCas9 can be fused with different effector domains such as activators, repressors, and epigenome modifiers to modulate gene expression (Figure 2). DSBs created by ENs/RGENs can be used for different purposes (Figure 3). Controlled and tunable expression of genes can be of great use for genetic improvement of plants. Moreover, plants carrying only modified epigenetic marks may not be regulated as GMOs. ZFNs, TALEs, TALENs, Cas9, dCas9, and multiplexed Cas9 can be used efficiently for genetic improvement of cotton through gene deletion, insertion, replacement, correction, and modulation of expression.
3. Use of GenEd tools against abiotic stresses in cotton
Abiotic stress tolerance is a multigenic and complex trait. A substantial interaction between several components of signaling, regulatory, and metabolic pathways leads to response/adaptation to abiotic stress [55, 56, 57]. In response to abiotic stress, plants may sometimes undergo whole-genome duplication events, and functional redundancy in multigene families may also be observed. Single-gene knockouts therefore often produce undesirable or uninformative phenotypes, making it difficult to unravel the exact function of a gene. A comprehensive understanding of the molecular basis of abiotic stresses (including drought, salinity, and heat) and of their tolerance mechanisms has been one of the major goals of plant researchers seeking to engineer stress tolerance in plants.
VIGS-mediated silencing of sucrose non-fermenting-1-related protein kinase 2 (GhSnRK2) reduced drought tolerance in cotton plants, indicating that GhSnRK2 positively regulates tolerance to drought stress and low temperature. Moreover, RNAi of cotton PHYA1 genes improved drought, salt, and heat tolerance in transgenic plants, owing to increased photosynthesis and better-developed root systems. Genes of this kind can also be targeted for deletion with a pair of ENs or RGENs. Moreover, ZFs, TALEs, and dCas9 can be used to suppress such genes at the transcriptional level.
Transcription factors are excellent candidates for increasing drought tolerance in cotton. Various transcription factors (such as MYB, WRKY, ERF, NAC, and bZIP) are involved in normal development as well as in the drought stress response. These transcription factors have been cloned and proven useful for stress tolerance in cotton and/or in other plants. Genetic engineering of transcription factor genes could activate drought tolerance pathways and enhance drought tolerance in cotton. Recently, a bZIP transcription factor gene, GhABF2, has been reported to be involved in drought and salt tolerance in Arabidopsis and cotton. Transcriptomic analysis revealed that GhABF2 regulates genes related to ABA. Overexpressing GhABF2 in cotton increased SOD and CAT activities compared to wild-type plants. Moreover, the overexpressing plants performed better in the field, and their yield was higher than that of wild-type plants. Stacking such genes/transcription factors under strong promoters in the best-growing cotton varieties could produce more resistant varieties. In another case, overexpression of GbMYB5 was shown to be positively involved in the response to drought stress in cotton and tobacco; the plants showed reduced water loss from stomata and hypersensitivity to ABA.
Mitogen-activated protein kinases (MAPKs) are important signaling molecules that respond to drought stress. In one study, SlMAPK3 was induced by drought stress, and the CRISPR/Cas9 system was utilized to generate SlMAPK3 mutants. In field tests, transgenic maize plants with reduced ethylene biosynthesis, obtained by silencing 1-aminocyclopropane-1-carboxylic acid synthase 6, showed significantly improved grain yield under drought stress conditions. Similarly, decreasing the sensitivity of maize to ethylene also resulted in higher yield. Overexpression of ARGOS genes, which are negative regulators of the ethylene response, enhances drought tolerance in transgenic maize plants [64, 65].
Owing to its simple design and the efficient cloning of single or multiple gRNAs, the CRISPR/Cas9 system with multiplex genome editing is a promising and powerful tool to specifically modulate the expression and activity of genes involved in abiotic stress responses. Multiplexing through CRISPR/Cas9 has been used successfully in model and crop plants [19, 66, 67]. Multiplex genome editing may also be useful for studying the functions of gene families as well as interactions between multiple genes. Multiple genes involved in stress regulatory networks, signal transduction, and metabolite production may be targeted simultaneously via CRISPR/Cas9 technologies for engineering stress tolerance in crop plants. An additional strategy could be pyramiding/stacking of multiple stress regulatory genes through HDR-mediated gene targeting.
4. Use of GenEd tools against biotic stresses in cotton
Conventional methods have been used for integrated pest management (IPM). Physical, chemical, and biological methods have been used for pest and disease management since the domestication of crops. For insect resistance, the most widely used technology is Bacillus thuringiensis (Bt) technology. Through expression of Bt genes encoding Cry toxins, many insect-resistant crops have been developed. Bt crops have reduced insect damage and pesticide use and, hence, pollution as well. Unfortunately, resistance against Bt has been observed in certain parts of the world, for example in pink bollworm in India. Apart from Bt technology, RNAi technology has also been used for insect resistance in crop plants. The first RNAi-based approach to cotton bollworm resistance relied on expression of dsRNA of an insect-derived cytochrome P450 monooxygenase gene (dsCYP6AE14). Stacking dsCYP6AE14 with plant cysteine proteases, such as GhCP1 from cotton (Gossypium hirsutum) and AtCP2 from Arabidopsis, can increase insect resistance in plants against cotton bollworms. In addition, stacking new genes into existing transgenic cotton varieties should produce more durable resistance against insects. Transgenic alternatives to Bt have also been tested at the laboratory scale to develop new strategies of insect resistance in plants.
To counter insect resistance against Bt crops, alternate strategies include expression of other toxins, engineering with proteases, proteinase inhibitors, receptor proteins [73, 74], and double-stranded RNA. Among these, dsRNA has been proposed as a method of choice and a next-generation insecticide. Moreover, expression of small dsRNA of CYP450 genes in transgenic plants to target vital bollworm functions has been reported as an alternative to Bt applications. Most recently, CRISPR/Cas9 was used to knock out a male-determining factor gene, Nix, in Aedes aegypti mosquitoes, leading to partial sex-change phenotypes. This use of CRISPR/Cas in a mosquito vector suggests that GenEd tools could also be applied against other insect vectors, such as the whitefly that transmits cotton leaf curl virus (CLCuV), the cause of CLCuD.
Viral diseases are generally controlled by eliminating the vector population which transmits them. Scientists have been using conventional breeding [77, 78], pathogen-derived resistance [79, 80, 81], and nonpathogen-derived resistance [82, 83] to control these diseases. Most efforts focused on silencing gene(s) of the helper virus, while genes on satellite molecules were ignored. Such efforts proved effective only for a short period of time, after which the virus relapsed because of multiple infections, synergistic effects, and evolution. A variety of multiplex genome engineering models are available in plants and animals, expressing multiple gRNAs either under a single RNA Pol-III promoter [84, 85] or under different promoters at the same time [86, 87]. The CRISPR/Cas9 system has been used successfully to control BeYDV, BSCTV, and TYLCV with very few off-target activities, and these reports highlight the potential of the CRISPR/Cas system against geminiviruses. Because it offers an inexpensive, simple, and rapid mechanism for triggering site-specific genome modifications, the programmable Cas9-gRNA system is potentially transforming next-generation genome-scale studies. The RGEN system is highly efficient for crop improvement against threats of multiple origins (viral and bacterial diseases), especially CLCuD.
Targeting the Rep gene, or occupying or disrupting the Rep protein-binding sites, with TALEs and TALENs is an attractive strategy because of their high specificity. Recently, it has been demonstrated that artificial TALE proteins could be a platform for broad-spectrum resistance against begomoviruses. Targeting viral DNA, or host factors associated with the pathogenesis of viral disease, for disruption are possible strategies for virus suppression and disease resistance. The idea of using TALENs and TALE repressors for antiviral gene therapy is also progressing, with the aim of suppressing potent viruses, such as HIV, that cause global mortality and morbidity. So far, different regions of viral genomes have been targeted to inhibit replication and to suppress viruses. As a result, many researchers have achieved a decrease in virus titer by using ENs [13, 91].
5. GenEd tools for epigenetic modifications in cotton
DNA methylation is generally defined as an epigenetic mark of transcriptional gene silencing. Although epigenetic regulation is not fully understood, it can be modulated to bring about a desirable change in the genome. Gene regulation without any change in the DNA sequence remained a challenge for years, but the factors responsible for epigenetic suppression or activation of genes have now been deciphered. It has therefore become possible, with the help of engineered proteins, to modulate the expression of a gene epigenetically as well. ZFs, TALEs, and CRISPR/Cas have all been used for this purpose [92, 93, 94, 95], with TALEs and dCas9 becoming available more recently. These proteins have been fused with different effector domains, such as ten-eleven translocation methylcytosine dioxygenase 1 (TET1), lysine-specific demethylase 1A (LSD1), and methyltransferases, and used as potential epigenome editors. ZFs fused with TET1 (ZF-TET1) were successfully used for demethylation. In addition, TET1 was used for demethylation of cytosine at CpG sites, and LSD1 has been used for demethylation of H3K4me1/2 and deacetylation of H3K27.
DNA methylation is a conserved epigenetic mark important for genome integrity, development, and environmental responses in plants and mammals. Active DNA demethylation in plants is initiated by a family of 5-mC DNA glycosylases/lyases (i.e., DNA demethylases). Repeat regions, promoters, enhancers, and gene bodies are the main sites of DNA methylation in the genome. Epigenetic regulation also contributes to splicing. Recent reports suggested a role for active DNA demethylation in fruit ripening in tomato. DNA demethylation was found to be required for tomato fruit ripening through both activation of ripening-induced genes and inhibition of ripening-repressed genes. DNA methylation controls many aspects of plant growth and development. TALE-LSD1 was used to modify the epigenetic marks at different sites, as confirmed through chromatin immunoprecipitation. Gao et al. have reported that TALEs are more effective than dCas9 for epigenome modifications in mammalian cells. Moreover, they evaluated TALEs and dCas9 for gene activation and repression and highlighted the use of designed transcription factors for epigenome modifications.
Epigenetic modifications of chromatin at the DNA or histone level are considered to be one of the major forces that influence gene expression [100, 101]. Genome-wide changes in methylation patterns have been linked with physiological and developmental responses. Genetic imprinting in Arabidopsis endosperm and embryo was also driven by extensive demethylation of the whole genome coupled with hypermethylation of non-CG residues, especially CHH sites on transposable elements [102, 103]. In plants, genes, transposons, and repetitive sequences were found to be methylated at different densities at various developmental stages, which suggested that the transcription of certain genes is controlled epigenetically [104, 105]. Indeed, promoter DNA hypermethylation was related to target gene repression in undifferentiated Arabidopsis cells. Jin et al. reported that an annual pattern of cytosine methylation drives fiber growth in cotton and also studied, on a yearly basis, the degree of CHH DNA methylation in the promoter regions of the growth-regulating genes SUR4, KCS13, and ERF6.
Although the potential application of TALEs for targeting DNA or histones for epigenome editing has been demonstrated, more research is needed for the development and validation of epigenetically modified crops/organisms (EMOs). About 500 genes have been identified that are epigenetically modified between wild cotton varieties and domesticated cotton, some of which are known to relate to agronomic and domestication traits. By selectively turning gene expression on and off, breeders could create new varieties of cotton without altering the genes.
6. Use of GenEd tools for growth, yield, fiber, and seed quality enhancement
Accelerated breeding of plant species has the potential to help crops cope with environmental and biochemical challenges and thereby support global crop security. Lengthy breeding cycles are one of the major limitations to the rapid genetic improvement and commercialization of woody plant species. In recent years, the limitation posed by T-DNA segregation after site-specific genome editing has gained prominence with the widespread use of CRISPR/Cas technology in genetic engineering. The CRISPR/Cas platform will help to strengthen molecular breeding and the development of resistance against biotic and abiotic stresses, as well as yield and quality improvement in cotton.
Jiang et al. used CRISPR/Cas9 to target the FAD2 gene in Arabidopsis thaliana and in the closely related emerging oilseed plant, Camelina sativa, with the goal of improving seed oil composition. C. sativa is allohexaploid, so the allotetraploid cotton should likewise be amenable to targeting with ENs to produce quality seeds. For quality improvement of soybean, TALENs were used to mutate two fatty acid desaturase genes, FAD2-1A and FAD2-1B. The mutations improved shelf life and oxidative stability and decreased polyunsaturated fats. RNAi-based silencing of two key fatty acid desaturase genes, GhSAD-1 and GhFAD2-1, in cottonseeds significantly increased stearic and oleic acid contents in transgenic lines. In addition, palmitic acid contents were significantly lower in both high-stearic and high-oleic transgenic cotton lines. These results provide an opportunity for nutritional improvement of cottonseed oil through genetic engineering. Engineering cotton in the same manner through CRISPR/Cas9 or TALENs could make cottonseeds more valuable for farmers and the oilseed industry.
Cottonseeds contain high-quality protein and oil, so cotton is also an important source of nutrient-rich food and edible oil. For every kilogram of fiber collected, about 1.65 kg of seeds are produced. Cotton could therefore potentially provide the protein requirements of half a billion people if the seed could be used directly as food. However, cottonseeds are toxic for humans and other monogastric animals because of the presence of gossypol in the seed glands. Gossypol is a toxic terpenoid compound that causes heart and liver damage in human beings. Gossypol-free cottonseed may enhance the overall value of cottonseed and may generate a new market for it. Gossypol-free cottonseeds could provide protein to poultry, aquaculture, and millions of humans worldwide. Gossypol is not confined to cottonseeds but is also present in other parts of the cotton plant. In leaves and reproductive tissues, gossypol and other related terpenoids play a protective role against insects, provoking infertility in insects. RNAi has been used successfully to reduce gossypol contents in cottonseeds by silencing (+)-δ-cadinene synthase, which catalyzes the very first reaction, the cyclization of farnesyl diphosphate to (+)-δ-cadinene. However, RNAi has several disadvantages, such as off-target effects and poor reproducibility. Thus, the promise of cottonseed for ensuring food security and meeting the protein requirements of developing countries like Pakistan has remained unfulfilled. Recently, Ma et al. have mapped a gene (GoPGF) that acts as a positive regulator of the formation of pigment glandular trichomes, the storage organs of gossypol. Tissue-specific silencing of this gene should result in gossypol-free seeds while maintaining the level of secondary metabolites in the other parts of the plant. Targeting dCas9 to the regulatory region of a gene can block the binding and elongation of RNA polymerase, leading to dramatic suppression of transcription.
Moreover, dCas9 can also be used to recruit different protein effectors (activators or repressors) to DNA in a highly specific manner to activate (CRISPRa) or suppress (CRISPRi) a gene. More recently, fusing TALEs and dCas9 with the KRAB repressor domain resulted in efficient transcriptional interference [51, 113, 114]. In addition, CRISPRi was used for multiplexed control of endogenous genes and for stable repression of genes with a silencing efficiency comparable to that typically achieved by RNAi, while minimally impacting transcription of nontargeted genes.
Flowering is a critical developmental stage in cotton, and all production depends on it. A flower lasts just 5–7 days from emergence to drying up or falling off. Flowering depends largely on temperature, availability of water, and other environmental conditions. The growth and development stages in cotton, from planting to emergence, from emergence to square, from square to flowering, and from flowering to boll development, are water sensitive. SELF-PRUNING 5G (SP5G) is a repressor of flowering in tomato, and its loss drives loss of day-length sensitivity in flowering. CRISPR/Cas9-based mutation of SP5G resulted in compact growth of tomatoes with rapid flowering. Moreover, the mutation caused a quick burst of flowering that resulted in early yield. Early and uniform flowering in cotton could also ease mechanized picking. The identification of FLOWERING LOCUS T (FT) gained prominence for its use in advanced breeding initiatives. FT is a small globular protein that interacts with FT-INTERACTING PROTEIN 1 and moves to the sieve elements. From the sieve elements, FT is transported to the shoot apical meristem, where it interacts with the bZIP transcription factor FD and the phospholipid phosphatidylcholine for its nuclear localization. Finally, FT activates LEAFY (LFY), APETALA1 (AP1), and SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) to start flower development [117, 118, 119]. Overexpression of FT has been used in many plant species to induce early flowering [120, 121], enabling a more rapid and refined approach to breeding. CRISPR/Cas has also been used successfully to target dihydroflavonol-4-reductase-B (DFR-B), which encodes an anthocyanin biosynthesis enzyme responsible for the color of the plant’s stems, leaves, and flowers. Moreover, the CRISPR/Cas9 system was employed to induce targeted mutagenesis of GmFT2a, an integrator in the photoperiod flowering pathway in soybean.
Li et al. proposed applications of the CRISPR/Cas system for improving cotton growth and development, seed quality, and the timing and control of flowering. They examined targeted mutagenesis in the allotetraploid cotton genome and observed no off-target mutations when sequencing two putative off-target sites, which have three and one mismatched nucleotides relative to GhMYB25-like sgRNA1 and GhMYB25-like sgRNA2, respectively.
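The mismatch counts quoted above (three and one mismatched nucleotides) can be illustrated with a minimal Python sketch that compares a guide with candidate genomic sites position by position. The sequences and the function names `count_mismatches` and `rank_off_targets` are hypothetical and chosen only for illustration; they are not the GhMYB25-like sgRNAs, and real off-target prediction tools also weigh mismatch position and PAM context.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions at which the guide and a candidate site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def rank_off_targets(guide, candidate_sites, max_mismatches=3):
    """Return (site, mismatches) pairs with at most max_mismatches, fewest first."""
    hits = [(site, count_mismatches(guide, site)) for site in candidate_sites]
    hits = [hit for hit in hits if hit[1] <= max_mismatches]
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical 20-nt protospacer and candidate sites (illustrative only).
guide = "GATTACAGATTACAGATTAC"
candidates = [
    "GATTACAGATTACAGATTAC",  # identical to the guide (on-target)
    "GATTACAGATTACAGATTAG",  # differs at the last position
    "GTTTACAGCTTACAGATAAC",  # differs at three positions
    "CCCCCCCCCCCCCCCCCCCC",  # unrelated sequence, filtered out
]

for site, mismatches in rank_off_targets(guide, candidates):
    print(site, mismatches)
```

Sites with few mismatches are the ones usually prioritized for sequencing when checking for off-target mutations.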
Proper development of plant roots is critical for primary physiological functions, including water and nutrient absorption and uptake, physical support, and carbohydrate storage. Crop roots are the main organs that sense and respond to biotic as well as abiotic stresses. Previous studies on crop root development have shown that increased lateral root formation (LRF) has a positive effect on whole-plant development as well as crop yield. The functions of the cotton root system are also strongly influenced by lateral roots. A high number of lateral roots increases the total root surface area of the plant, which may improve overall growth, fiber length, yield, and tolerance to severe conditions. Therefore, engineering cotton plants for an increased number of lateral roots could not only improve yield and fiber content but also make the crop suitable for saline, drought-affected, and low-fertility soils. Recent studies demonstrated that arginine (ARG) is the precursor of nitric oxide (NO) in roots, in a reaction catalyzed by nitric oxide synthase (NOS), and that NO plays a key role in lateral root formation. In Arabidopsis, reduced arginase activity may increase NO content in roots and thereby improve lateral root formation in transgenic plants. Two highly similar, orthologous cotton arginase genes (GhARG), Gh_A05G2143 and Gh_D05G2397, located on the A and D chromosomes, were mutated with CRISPR/Cas9 in upland cotton R18, a transgenic acceptor variety bred from Coker 312, which is a main transgenic acceptor germ line globally. The CRISPR/Cas system was efficient in producing targeted mutations in the selected genes, which improved the lateral root system under both high and low nitrogen conditions and could support adaptation of cotton to a variety of soils. Improved LRF should enhance plant growth and development as well.
7. Use of GenEd tools for gene stacking
Genome engineering with the help of recombinases is no longer a new approach. Site-specific recombinase technology is used to delete, insert, or invert a specific sequence at a target site. A transgenic organism expressing Cre recombinase under a tissue-specific promoter can be crossed to excise a gene located between two loxP sites. Such targeted excision removes the function of the gene within specific tissues. Deletion of genes by site-specific recombinase technology is a particularly advantageous method of gene excision.
Site-specific recombinases are remarkable tools for inserting multiple genes at a single locus or deleting unwanted sequences from the genome. With the discovery of ENs, sequence-specific TALE proteins have been combined with the catalytic domain of the DNA invertase Gin to design new chimeric proteins called TALE recombinases (TALERs). TALERs have been used successfully in bacteria and mammalian cells and offer an alternative approach to targeted GenEd. The DNA-binding domains (DBDs) of hyperactivated variants of the resolvase/invertase family of serine recombinases can also be replaced with engineered ZFs to retarget them to a sequence of interest in the genome. However, imperfect modularity of particular domains, lack of high-affinity binding to all DNA triplets, and difficulty of construction were major limitations to the widespread use of zinc-finger proteins (ZFPs) for genome editing. Mercer et al. designed a TALER through engineered fusion of a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE protein. The TALER architecture significantly increased the targeting capacity of engineered recombinases as well as their potential applications in plant and animal biotechnology. In cotton, meganucleases have also been used for gene stacking based on homologous recombination. TALENs have been described as the most precise technique for targeted stacking of economically important molecular traits in crop plants. The cotton genome has been modified efficiently using GenEd tools; successful reports of GenEd in cotton are given in Table 4.
8. Targeted mutagenesis for functional genomics studies in cotton
GenEd tools are precise and highly specific. For reverse genetics and functional genomics, these reagents also have advantages over existing approaches such as TILLING. For targeting gene families, TILLING is limited by the high specificity of the primers. TILLING is also difficult in polyploid genomes, and its low mutation rate and high screening cost make it more limited than ENs. CRISPR/Cas9 can target multiple genes simultaneously with multiple gRNAs. For functional genomics in cotton, ENs can be used with higher specificity and at low cost. RNAi has previously been used successfully for functional genomics in cotton. However, RNAi works at the posttranscriptional level and may therefore lead to off-target, unreliable, and unpredictable results. Moreover, RNAi may induce nonspecific immune responses and produce incomplete knockdowns. These limitations can be overcome using highly specific, more reliable, and less costly GenEd tools. In addition, ENs act at the DNA level; hence, they are more predictable and can produce complete knockouts. Multiplexing has made RGENs more attractive than other techniques for studying gene families and polygenic traits. Chen et al. demonstrated CRISPR/Cas9-based targeted mutagenesis of the cotton cloroplastos alterados 1 (GhCLA1) and vacuolar H+-pyrophosphatase (GhVP) genes and confirmed site-specific single-nucleotide insertion and substitution in GhCLA1 and one deletion in GhVP.
Multisite GenEd in cotton has also been reported. Wang et al. used a CRISPR/Cas9 system to conduct multisite GenEd in allotetraploid cotton. An exogenous gene, DsRED2, and an endogenous gene, GhCLA1, were targeted with 66.7–100% efficiency, showing that CRISPR is efficient for multisite GenEd with a high success rate. A highly efficient CRISPR/Cas9 platform has also been developed for gene function studies in cotton. In that work, the GhMYB25-like gene was used to generate knockout mutants; deletions of 1–7 nt were observed with one sgRNA, while a 168-nt fragment was deleted using two sgRNAs. An efficient and fast method was also developed to validate sgRNAs in cotton through a transient assay. With this method, sgRNA activity can be validated in 3 days, which helps in selecting effective sgRNAs for stable transformation of cotton. Individual genes (GhPDS, GhCLA1, and GhEF1) were targeted, resulting in typical albino phenotypes through mutation of GhCLA1, simultaneous editing of homoeologous genes, and genomic fragment deletions. Studies of this kind lay the foundation for functional genomics in cotton.
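The fragment deletion obtained with two sgRNAs can be reasoned about with a small calculation. The Python sketch below assumes that SpCas9 cuts about 3 bp upstream of the PAM and that the two breaks are rejoined precisely; both are simplifying assumptions. The locus, the guide sequences, and the function names `cut_position` and `predicted_deletion` are hypothetical, and only guides on the forward strand are handled.

```python
CUT_OFFSET_FROM_PAM = 3  # assumed SpCas9 cut site: ~3 bp upstream of the PAM


def cut_position(sequence: str, protospacer: str) -> int:
    """0-based index of the first base after the predicted cut (forward strand only)."""
    start = sequence.find(protospacer)
    if start == -1:
        raise ValueError("protospacer not found on the forward strand")
    end = start + len(protospacer)
    pam = sequence[end:end + 3]
    if len(pam) < 3 or pam[1:] != "GG":
        raise ValueError("no NGG PAM directly downstream of the protospacer")
    return end - CUT_OFFSET_FROM_PAM


def predicted_deletion(sequence: str, protospacer_a: str, protospacer_b: str):
    """Return (deletion size, edited sequence) assuming precise rejoining of both cuts."""
    left, right = sorted(
        (cut_position(sequence, protospacer_a), cut_position(sequence, protospacer_b))
    )
    return right - left, sequence[:left] + sequence[right:]


# Hypothetical locus: two 20-nt protospacers, each followed by an NGG PAM.
guide_a = "ACGTACGTACGTACGTACGT"
guide_b = "TTGCATTGCATTGCATTGCA"
locus = "CCCC" + guide_a + "TGG" + "A" * 30 + guide_b + "AGG" + "CCCC"

size, edited = predicted_deletion(locus, guide_a, guide_b)
print(size, len(locus), len(edited))
```

In practice, repair by NHEJ often adds or removes a few extra bases at the junction, so observed deletions can differ slightly from the predicted size.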
9. Delivery of artificial DNA-binding proteins and ENs into plants
Sequence-specific nucleases enable facile editing of higher eukaryotic genomic DNA; however, targeted modification of plant genomes remains challenging because methods for delivering genome engineering reagents to plant cells are inefficient. The delivery method for ENs is therefore very important for appropriate expression and optimum results. In animals, TALEs or TALENs have been delivered as nucleic acids, including mRNA, as well as protein [7, 130, 131]. TALEN activity depends mainly on the delivery method, the choice of expression vector, and the transformation method used. Conventional plasmids and viral vectors have been used to express the required proteins inside the cell. ZFNs were delivered using a novel tobacco rattle virus (TRV)-based expression system and produced non-transgenic mutant plants. ZFNs were transiently expressed in a variety of tissues and cells of intact plants to produce genetically modified plants. Geminivirus-based replicons have also been used for transient expression of sequence-specific nucleases (ZFNs, TALENs, and CRISPR/Cas) and for delivery of DNA repair templates. In tobacco, the use of viral replicons enhanced gene targeting efficiency twofold compared with conventional Agrobacterium tumefaciens T-DNA.
Transient expression of the CRISPR/Cas9 ribonucleoprotein complex in protoplasts can result in the production of specifically targeted, transgene-free mutants in the T0 generation in several plant species. A highly efficient and specific transient expression-based genome-editing system was developed for producing transgene-free, homozygous wheat mutants in the T0 generation. Genome-edited, DNA-free bread wheat was produced using CRISPR/Cas9 ribonucleoproteins (RNPs), which were delivered into immature wheat embryos through particle bombardment. The Cas9 protein was expressed in and purified from the Escherichia coli Rosetta strain, and the sgRNA was transcribed using the HiScribe T7 In Vitro Transcription Kit (New England Biolabs). CRISPR/Cas9 RNP-mediated GenEd eliminates the risk of transgene integration into the plant genome and promises targeted gene mutations without off-target effects. Moreover, it is fast and robust compared with other methods.
In the case of TALENs, the use of mRNA is more advantageous than permanent integration of T-DNA into the genome. First, in the pharmaceutical sector, viral vectors are regarded as genetically modified organisms, whereas mRNA is viewed more favorably from a regulatory standpoint. Second, transient delivery of mRNA reduces the risk of unwanted stable integration and mutations in the genome. Gallie introduced mRNA into plant protoplasts efficiently using PEG-based transformation. TALEN mRNA delivery could therefore be attractive for transient expression in plants, both to avoid undesirable results and to expedite the regulatory process. Moreover, because a nuclease introduces double-strand breaks, integration and continuous expression of its gene in the host may have detrimental effects. Synthetic TALEN mRNAs for GenEd are available on request from companies such as TriLink BioTechnologies.
In the biomedical industry, direct injection of CRISPR and TALEN proteins into living organisms has attracted great interest. Direct delivery of proteins may further reduce the posttranscriptional and translational constraints associated with expression from plasmids and mRNA. Direct delivery of purified nuclease proteins into N. benthamiana protoplasts using PEG has been reported and was claimed as a non-transgenic GenEd approach. Direct delivery of EN proteins into plants may prove to be the most favorable approach for regulatory approval of edible crop plants, and of cotton as well. On the basis of the reports discussed above, the production of non-transgenic cotton would be very helpful from the regulatory and public acceptance viewpoints.
10. Comparison of ENs
All of these technologies have a broadly similar mode of action and give similar results, but they differ from one another in nature, components, target specificity, target requirements, target limitations, modularity, and construction and assembly methods. On this basis, current GenEd tools are compared in Table 5.
|Feature||ZFNs||TALENs||CRISPR/Cas9|
|Origin||Xenopus laevis||Xanthomonas (similar proteins also reported in Ralstonia solanacearum and Burkholderia rhizoxinica)||Streptococcus pyogenes (present in 40% bacteria and 90% archaea)|
|Nature||DNA-binding motifs in eukaryotes||Plant pathogenic protein||Prokaryotic defense protein|
|Function||DNA binding as transcription factors||DNA binding and gene modulation of host plant (act like transcription factors)||Endonuclease that cuts DNA of infecting viruses and plasmids|
|Target binding||Protein-DNA (one to triplet)||Protein-DNA (one to one)||RNA-DNA (one to one)|
|Components||DNA-binding domain||DNA-binding domain|
Effector domain (activator/repressor)
|Year of emergence as GenEd tools||2000||2010||2012|
|Target length||~9–36 nt||~12–50 nt||~20–23 nt|
|Target limitations||It binds to a triplet of DNA bases||Needs T base at 5′||Needs PAM region (5′NGG)|
|Size||Small||Relatively big (small in case of TALEs)||Big|
|Mode of action||DNA binding and DSB (NHEJ/HR)||DNA binding, expression modulation/DSB (NHEJ/HR)||DNA binding and DSB (NHEJ/HR)|
|Assembly||Difficult||Technical but easy||Easy|
|Uses||Gene disruption, gene deletion, gene correction, gene addition, tag ligation, ObLiGaRe||Gene activation, gene repression, gene disruption, gene deletion, gene correction, gene addition, tag ligation, ObLiGaRe||Gene disruption, gene deletion, gene correction, gene addition|
|Epigenome editing||Less reported||More reported (natural TFs)||Less reported|
|Delivery||DNA, mRNA||DNA, mRNA, protein||DNA|
|Targeting efficiency||Low and variable||High||High|
|Delivery via viral vector||Easy||Easy||Challenging|
|Delivery as RNA molecule||Easy||Easy||Challenging|
|Delivery as protein||Easy||Easy||Challenging|
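The CRISPR/Cas9 target requirement listed in the comparison (a target of about 20 nt followed by a 5′-NGG PAM) can be illustrated with a minimal Python sketch that scans both strands of a sequence for candidate sites. The function names and the toy input are hypothetical. Positions on the minus strand are reported in reverse-complement coordinates, and the sketch does not score guide quality or off-target risk.

```python
import re


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence made of A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spcas9_sites(seq: str, protospacer_len: int = 20):
    """List (strand, start, protospacer, PAM) for every protospacer followed by NGG.

    A lookahead is used so that overlapping candidate sites are all reported.
    The start index refers to the scanned strand (reverse complement for '-').
    """
    seq = seq.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


# Toy input: one candidate site on the forward strand.
for site in find_spcas9_sites("A" * 20 + "TGG"):
    print(site)
```

A comparable scan for TALEN targets would instead look for a T at the 5′ end of each binding site, as noted in the same row of the table.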
11. Future perspectives
Genome engineering in cotton using ENs will open up new avenues for gene function studies and for understanding complex polygenic metabolic pathways. Improvements in cotton growth and development, together with good fiber and seed quality, can be achieved more precisely using GenEd tools. The reports of GenEd in cotton reviewed above are sufficient to demonstrate the success of targeted gene modification in this crop. Moreover, CRISPR/Cas nickases are used for gene replacement and correction, and using this technology to replace an endogenous promoter with an exogenous constitutive, inducible, or strong promoter can help regulate expression of the endogenous gene. This approach could reduce the risks of foreign gene integration into the genome. Furthermore, tunable, spatial, and tissue-specific expression of endogenous genes can be achieved by inserting new promoters in place of the native promoters. The risks associated with the development of resistance against Bt can be mitigated by gene pyramiding/stacking through ENs. Epigenome marks associated with crop parameters such as flowering, fiber quality, and stress resistance can be modified by fusing epigenome modifiers with artificial DNA-binding proteins (ZFs, TALEs, and dCas9). In conclusion, genetic improvement of cotton using the GenEd toolbox would help solve the prevailing problems and constraints that reduce cotton growth, yield, and fiber quality.