Open access peer-reviewed chapter - ONLINE FIRST

Soybean as a Model Crop to Study Plant Oil Genes: Mutations in FAD2 Gene Family

By Sy M. Traore and Guohao He

Submitted: May 23rd 2021. Reviewed: July 31st 2021. Published: September 27th 2021.

DOI: 10.5772/intechopen.99752



Plants have numerous fatty acid desaturase (FAD) enzymes, encoded by the FAD gene family, that regulate the unsaturation of fatty acids. The FAD2 genes belong to this family and play a vital role in converting monounsaturated oleic acid to polyunsaturated linoleic acid. Oleic acid offers health benefits for humans, such as reduced cholesterol levels and antioxidant properties, as well as industrial benefits such as longer shelf life. The development of genotypes with high oleic acid content in seeds has become one of the primary goals in breeding oilseed plants. The identification and characterization of FAD2 genes in plants have been an important step toward manipulating gene expression to improve seed oil quality. Inducing mutations in FAD2 genes to reduce FAD2 enzyme activity has been an integral approach to generating genotypes with high oleic acid. This chapter describes the FAD2 gene family in the model organism soybean and the correlation of mutations in FAD2 genes with increased oleic acid content. Leveraging advanced research on the FAD2 gene family in soybean promotes the study of FAD2 genes in other legume species, including peanut. The future perspectives and challenges associated with mutations in FAD2 genes are also discussed.


  • legume
  • desaturase
  • genome editing
  • fatty acid
  • mutation
  • protein

1. Introduction

The legume family (Leguminosae) is the third-largest family of flowering plants after the Orchidaceae and Asteraceae, with over 800 genera and 20,000 species [1]. It is classified into three sub-families, Papilionoideae, Caesalpinioideae, and Mimosoideae, based on morphological characters [1]. The family presents incredibly diverse morphological characters, from giant rain forest trees and woody lianas to desert shrubs, ephemeral herbs, herbaceous twining climbers, aquatics, and fire-adapted savanna species [1, 2, 3]. Two sub-families, Caesalpinioideae and Mimosoideae, are mostly woody trees and shrubs. Papilionoideae is the largest sub-family, consisting of 476 genera and ~14,000 species, including most of the economically important legumes [4]. All papilionoids share a common ancestor and bear butterfly-shaped flowers [5, 6]. Within the Papilionoideae, there are four clades, phaseoloids, galegoids, genistoids, and dalbergoids, based on phylogenetic analyses [1, 4]. These clades cover the economically important food and feed legumes. For instance, the phaseoloid clade includes soybean, common bean, cowpea, and pigeon pea; the galegoid clade within the Hologalegina group includes medicago, chickpea, faba bean, lentil, and pea; the genistoid clade includes lupin; and the dalbergoid clade includes peanut (Figure 1) [7, 8].

Figure 1.

Phylogenetic relationships of sub-families, major clades within the sub-family Papilionoideae, and some economically important species in legumes (modified from references [7, 8]). Asterisks (*) refer to model species.

The pea (Pisum sativum L.) was the original model organism used in Mendel’s discovery (1866) of the laws of inheritance, establishing the foundation of modern plant genetics [9, 10]. Although Mendel’s peas were the first “model” plant, legume biology has long lagged behind more successful models from the Brassicaceae family or economically important cereals [10]. Because legumes differ vastly in genome size, chromosome number, ploidy level, and reproductive biology, two legume species with smaller genomes in the galegoid clade, Medicago truncatula and Lotus japonicus, were first selected as model organisms to provide a reference genetic system for legumes [11, 12, 13]. Since the genome of soybean (Glycine max L.) became available in 2010 [14], gene discovery in soybean has become more efficient and feasible, providing a powerful high-throughput and non-targeted approach to gene expression and an excellent resource for comparative legume genomics. Although soybean has a relatively large genome compared with the much smaller genomes of Medicago and Lotus, it is the most widely grown and economically important legume. Given its available genome sequence, soybean is also considered a model organism in legumes [15].

The existence of model organisms is fundamental for advancing genetic and genomic studies in crop species. Comprehensive biological study of model organisms facilitates the transfer of biological knowledge, gene function and expression data, genomic information, and advanced tools to crop species. Fatty acids are essential components of cellular membranes, storage lipids, and precursors involved in plant metabolism and development [16]. The abundance of different fatty acids in plants is regulated by diverse fatty acid desaturase (FAD) enzymes [17]. Among the FADs, the FAD2 enzyme converts monounsaturated oleic acid to polyunsaturated linoleic acid by adding a second double bond at the Δ12 position in the acyl chain. Manipulation of FAD2 gene expression and enzyme activity in seeds enables the accumulation of oleic acid, which benefits industries and consumers. This chapter aims to describe the FAD2 gene family in the model organism soybean. Mutations induced in FAD2 genes and their consequences, from soybean to crop species including peanut, are also discussed.


2. FAD gene family in the model organism soybean

Oil makes up approximately 20% of total soybean seed composition, the greatest concentration of oil among food legumes [18]. However, the concentration of oil depends on the growing region, cultivar, and several environmental factors. As seeds develop, lipids, mostly triglycerides, are stored in oil bodies surrounding the larger protein bodies in the cell [19]. The fatty acid composition of most soybean seeds consists of 11% palmitic acid (16:0), 4% stearic acid (18:0), 25% oleic acid (18:1), 52% linoleic acid (18:2), and 8% linolenic acid (18:3) [20], with 24 other fatty acids in much lower quantities [21]. These less common fatty acids have similar structural configurations and reside in cell membranes and storage lipids. This composition is mainly due to the physiological processes underlying seed dormancy and the need to sustain nutrition for young, recently germinated plants [18].
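As a minimal sketch of how the typical profile quoted above can be summarized, the following Python snippet groups the five major fatty acids by degree of unsaturation and computes the oleic-to-linoleic ratio. The percentages are those cited from [20]; the function and variable names are illustrative assumptions only.

```python
# Typical soybean seed fatty acid profile (% of total), as quoted in the text [20].
composition = {
    "palmitic (16:0)": 11,
    "stearic (18:0)": 4,
    "oleic (18:1)": 25,
    "linoleic (18:2)": 52,
    "linolenic (18:3)": 8,
}


def double_bonds(name):
    """Read the number of double bonds from shorthand such as '(18:2)'."""
    return int(name.split(":")[1].rstrip(")"))


def summarize(profile):
    """Sum percentages by saturation class."""
    totals = {"saturated": 0, "monounsaturated": 0, "polyunsaturated": 0}
    for name, percent in profile.items():
        n = double_bonds(name)
        if n == 0:
            totals["saturated"] += percent
        elif n == 1:
            totals["monounsaturated"] += percent
        else:
            totals["polyunsaturated"] += percent
    return totals


totals = summarize(composition)
oleic_to_linoleic = composition["oleic (18:1)"] / composition["linoleic (18:2)"]
print(totals)
print(round(oleic_to_linoleic, 2))
```

With these figures, polyunsaturated fatty acids account for 60% of the profile, which illustrates why reducing FAD2 activity is an attractive route to shifting the balance toward oleic acid.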

Fatty acids play an essential role in regulating tolerance to various environmental stresses by altering the properties of cell membranes [22, 23]. During the desaturation of fatty acids in plant cells, the number and position of the double bonds in a fatty acid chain influence its physical and physiological properties [24, 25], membrane function, and proper growth and development [24]. The release of genomic sequences has allowed the identification of FAD genes, first in Arabidopsis and then in many crop species, including oilseed crops such as soybean [26, 27], cotton [28, 29], cacao [30], peanut, and olive [31, 32]. Different fatty acid desaturases (FADs) are involved in the desaturation of fatty acids, including the microsomal Δ12 desaturase (FAD2), the microsomal ω3 desaturase (FAD3), the trans ω3 desaturase (FAD4), the Δ7 desaturase (FAD5), the plastidial Δ12 desaturase (FAD6), the plastidial ω3 desaturase (FAD7), and the plastidial ω3 desaturase (FAD8) [33]. Among these desaturases, FAD2 and FAD6 are ω6 desaturases that convert a monounsaturated fatty acid (oleic acid) to a polyunsaturated fatty acid (linoleic acid) in the endoplasmic reticulum (ER) and plastids, respectively. FAD3, FAD7, and FAD8 are ω3 desaturases that synthesize linolenic acid from linoleic acid in the ER (FAD3) and plastids (FAD7 and FAD8) (Figure 2) [34, 35]. FAD4 and FAD5 specifically produce monounsaturated fatty acids from palmitic acid for phosphatidylglycerol (PG) and monogalactosyldiacylglycerol (MGDG), respectively [36]. The content of oleic and linoleic acids affects the oxidative stability and nutritional value of edible oil [37]. Linoleic acid is a polyunsaturated fatty acid that plays a vital role in human health and nutrition; however, it has the disadvantage of decreasing the stability, flavor, and shelf life of edible oil [38, 39]. 
Conversely, oil higher in oleic acid has the advantages of higher oxidative stability and longer shelf life [40], increased structural integrity at higher cooking temperatures [41], and nutritional benefits that include reducing low-density lipoprotein (LDL) cholesterol [42], suppressing tumor formation, and protecting against inflammatory diseases [43]. Therefore, soybean seed oil for human consumption should be higher in oleic acid and lower in linoleic acid. Efforts have been made to identify FAD2 genes that significantly affect fatty acid biosynthesis, to understand their inheritance, and to manipulate gene expression to develop oilseed crops with a high content of oleic acid [44, 45, 46, 47, 48, 49].

Figure 2.

Illustration of a part of the fatty acid biosynthesis pathway.

2.1 FAD2 gene family in soybean

In soybean, the FAD2 gene has two copies, GmFAD2–1 and GmFAD2–2, which have two members (GmFAD2–1A and GmFAD2–1B) and three members (GmFAD2–2A, GmFAD2–2B, and GmFAD2–2C), respectively [47, 50, 51]. Using both the SoyBase and Phytozome databases, two additional novel FAD2–2 members, named GmFAD2–2D and GmFAD2–2E, were identified (Table 1) [52]. Among the identified FAD2 genes in soybean, EST analysis suggested that GmFAD2–1A and GmFAD2–1B are actively expressed in developing seeds and constitute the seed-specific paralogs in the soybean genome [53]. GmFAD2–2A possesses a 100 bp deletion in the coding region and was therefore predicted to be non-functional [50]. GmFAD2–2B and GmFAD2–2C were found to display ubiquitous expression in all vegetative tissues of the soybean plant; GmFAD2–2D was expressed in the flower, seed, and nodule; and GmFAD2–2E expression was confined exclusively to the pod and seed, at a low level.

Gene         Accession number    Chromosome    References
GmFAD2–1A    Glyma.10G278000     10            [47, 50, 51]
GmFAD2–1B    Glyma.20G111000     20            [47, 50, 51]
GmFAD2–2A    Glyma.19G147300     19            [47, 50, 51]
GmFAD2–2B    Glyma.19G147400     19            [47, 50, 51]
GmFAD2–2C    Glyma.03G144500     03            [47, 50, 51]

Table 1.

List of FAD2 genes in soybean.

Because of the nutritional and health value of oleic acid, soybean breeders have paid special attention to screening soybean germplasm for sources of high oleic acid. Two mid-oleic acid mutant lines carrying a mutant allele, GmFAD2–1a, were identified by phenotype-based screening [54]. Through Targeting Induced Local Lesions In Genomes (TILLING), another mutant allele, GmFAD2–1b, was found. When the mutant GmFAD2–1a and GmFAD2–1b alleles were combined in one line, oleic acid content increased to 83%. Similarly, a total of 22 plant introductions (PIs) were screened for high oleic acid content in soybean seeds [50]. Two genotypes, PI 603452 and PI 283327, were identified with increased oleic acid. Sequence analysis showed that mutations had occurred in the FAD2–1A gene of PI 603452 and in the FAD2–1B gene of PI 283327. When PI 603452 was crossed with PI 283327, a soybean line homozygous for both the FAD2–1A and FAD2–1B mutations was found in the subsequent segregating generations. Fatty acid analysis showed that oleic acid content increased to 82–86% and the levels of linoleic and linolenic acids were reduced, compared with only 20% oleic acid in wild-type soybean lines. Further mutation analysis using TILLING-by-sequencing also demonstrated that mutations within GmFAD2–1A and GmFAD2–1B affect seed oleic acid content in soybean [52]. These two genes play an important role in converting oleic acid to linoleic acid and directly determine the oleic acid content of soybean seeds. The FAD2 gene is 1,164 bp long, with an open reading frame coding for about 387 amino acids [55]. It contains two exons and a single large intron that is embedded within the 5′-untranslated region (5′ UTR) and has promoter function, regulating the expression level of FAD2 [56, 57]. In soybean, GmFAD2–1A and GmFAD2–1B share 99% coding sequence identity and are located in paralogous regions of chromosomes 10 and 20, respectively [58].
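To illustrate why the double mutant has to be recovered from segregating generations, the following sketch enumerates the F2 genotypes expected from a cross between two single-mutant parents. It rests on assumptions not stated in the studies cited: the two loci assort independently (plausible, since they lie on chromosomes 10 and 20), the F1 is selfed, and the labels `A`/`a` and `B`/`b` are hypothetical stand-ins for the wild-type and mutant alleles of FAD2–1A and FAD2–1B.

```python
from fractions import Fraction
from itertools import product


def f2_genotype_fractions():
    """Expected F2 genotype fractions from selfing an Aa Bb F1 plant.

    'a' and 'b' are hypothetical labels for the mutant alleles at two
    unlinked loci; independent assortment is assumed.
    """
    # An Aa Bb plant produces four equally likely gametes: AB, Ab, aB, ab.
    gametes = [a + b for a, b in product("Aa", "Bb")]
    counts = {}
    for g1, g2 in product(gametes, repeat=2):
        locus_a = "".join(sorted(g1[0] + g2[0]))  # 'AA', 'Aa', or 'aa'
        locus_b = "".join(sorted(g1[1] + g2[1]))  # 'BB', 'Bb', or 'bb'
        key = (locus_a, locus_b)
        counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


fractions = f2_genotype_fractions()
print(fractions[("aa", "bb")])  # fraction of F2 plants homozygous mutant at both loci
```

Under these assumptions only 1 in 16 F2 plants carries both mutations in the homozygous state, so a reasonably large segregating population and genotyping of both loci would be needed to find such a line.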

2.2 Mutations in FAD2 genes

Natural mutations in both GmFAD2–1A and GmFAD2–1B in soybean led to a high level of oleic acid, indicating that mutations in both genes can suppress FAD2 gene expression or cause loss of enzyme function, resulting in the accumulation of oleic acid and a decrease in linoleic acid content. Consequently, inducing mutations in both genes has become a critical step in improving seed oil. Various mutagenesis tools have been used to target these two genes in the coding region or promoter region. A previous study showed that RNAi silencing reduced GmFAD2 expression and increased oleic acid from 20% to greater than 80% [59]. The transcription activator-like effector nuclease (TALEN) technique was used to target and cleave conserved DNA sequences in both FAD2–1A and FAD2–1B [60]. In four of 19 transgenic soybean lines expressing the TALENs, FAD2–1A and FAD2–1B mutations were observed in DNA extracted from leaf tissues, and three of the four lines transmitted heritable FAD2–1 mutations to the next generation. The fatty acid profile of the seed changed dramatically in plants with homozygous mutations in both FAD2–1A and FAD2–1B: oleic acid increased from 20% to 80%, and linoleic acid decreased from 50% to under 4% [60]. The chemical mutagen ethyl methanesulfonate (EMS) was used on germplasm to generate mutant lines with high oleic acid content [61]. Sequence analysis revealed lines with mutations in FAD2–1A and FAD2–1B. Further crossing of the single mutant lines produced the FAD2–1a and FAD2–1b double mutant with high oleic acid content. Biological mutagens have also been used to induce mutations in the FAD2 gene to develop high oleic acid lines.

In recent years, the RNA-guided CRISPR/Cas9 system has emerged as a promising tool for site-directed mutagenesis. The release of the soybean genomic sequence and the characterization of FAD2 allow mutations to be induced precisely in the coding sequence of these FAD2 genes. Kim et al. [62] were the first to use the CRISPR/Cpf1 system in soybean and successfully induced deletion mutations in FAD2 genes, although edited plants were not available (Figure 3). The CRISPR/Cas9 system was also used to target the soybean FAD2 genes. Expression and sequence analysis confirmed that alteration of the target genes was correlated with oleic acid as high as 65.58% and linoleic acid as low as 16.08% [48]. Homozygous mutations induced by CRISPR/Cas9 in GmFAD2–1A alone generated high oleic acid without adverse effects on plant development [63]. Two gRNAs simultaneously targeting two sites within the second exons of both GmFAD2–1A and GmFAD2–1B produced dramatic increases in oleic acid content to over 80%, whereas linoleic acid decreased to 1.3–1.7% [56]. Transgene-free, high oleic, homozygous genotypes could be obtained in segregating generations, in that case as early as the T1 generation. A gRNA designed to target the coding region in the first exon of GmFAD2–1A and GmFAD2–2A increased the oleic acid content from 17.1% to 73.5% and decreased the linoleic acid content from 62.9% to 12.2% [49]. The coding region of the FAD2 gene contains four transmembrane domains and three histidine boxes (H-boxes) in soybean (Figure 4) [53]. The histidine residues are essential for the catalytic function of the FAD2 enzyme; substituting histidine with a different amino acid disrupts its desaturase function [64]. The high efficiency of mutagenesis with CRISPR-based gene editing provides a promising tool for inducing mutations within the sequence of FAD2 genes. With intensive efforts, the high oleic acid varieties Vistive Gold and Plenish were developed by Monsanto and DuPont, respectively [49].
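The first practical step in designing gRNAs such as those described above is to locate candidate target sites. The sketch below is a simplified illustration, not the procedure used in the cited studies: it assumes the commonly used SpCas9 nuclease, which requires an NGG protospacer-adjacent motif (PAM) immediately 3′ of a 20-nt protospacer, scans the forward strand only, and runs on a made-up toy sequence rather than a real GmFAD2 sequence.

```python
def find_cas9_sites(sequence, protospacer_length=20):
    """Return (start, protospacer, PAM) tuples for forward-strand NGG PAMs.

    start is the 1-based position of the first protospacer base.
    Assumes SpCas9 (NGG PAM); reverse-strand sites are not reported.
    """
    seq = sequence.upper()
    sites = []
    # i is the 0-based index of the 'N' in the NGG PAM.
    for i in range(protospacer_length, len(seq) - 2):
        if seq[i + 1:i + 3] == "GG":
            protospacer = seq[i - protospacer_length:i]
            sites.append((i - protospacer_length + 1, protospacer, seq[i:i + 3]))
    return sites


# Hypothetical toy sequence, used only to show the mechanics.
toy_sequence = "ACGTACGTACGTACGTACGTTGGAAA"
sites = find_cas9_sites(toy_sequence)
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

In real designs, candidate sites would additionally be screened for off-target matches elsewhere in the genome, which matters in soybean because the FAD2 paralogs share high sequence identity.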

Figure 3.

Demonstration of deletion mutations identified at the target site (blue) of FAD2 genes using CRISPR/Cpf1 (cited from Kim et al. [62]).

Figure 4.

Alignment of FAD2–1A and FAD2–1B amino acid sequences. Amino acids that differ between A and B are highlighted in red. There are four transmembrane domains and three H-boxes in the coding region of the FAD2 enzyme in soybean (modified from Tang et al. [53]).

In addition to alterations in the coding region, mutations in the promoter and intron can influence FAD2 gene expression. The FAD2 intron has promoter activity because it harbors promoter-like sequence structures, including TATA and CAAT boxes, as well as many potential cis-elements [56]. Bioinformatic analyses of the FAD2 intron revealed the CGATT motif and the 5′ UTR Py-rich stretch motif, which enhance gene expression [65]. Mutations in the TATA-box of the promoter reduced the promoter’s function [66]. Therefore, mutations induced in both the intron and the promoter can be used to manipulate FAD2 gene expression, though few studies have focused on this aspect in soybean.
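Locating such cis-elements in an intron or promoter sequence is, at its simplest, an exact string search. The following sketch shows the idea using three motif strings quoted in this chapter (CGATT, the RY element CATGCATG, and the 2S seed protein motif CAAACAC). The input sequence is a hypothetical toy string, the search covers the forward strand only, and real analyses would typically rely on curated motif databases rather than this minimal function.

```python
def find_motifs(sequence, motifs):
    """Return 1-based start positions of each motif, including overlapping hits."""
    seq = sequence.upper()
    hits = {}
    for name, motif in motifs.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based coordinates
            start = seq.find(motif, start + 1)  # step by one to allow overlaps
        hits[name] = positions
    return hits


# Motif strings as quoted in this chapter.
motifs = {
    "CGATT motif": "CGATT",
    "RY element": "CATGCATG",
    "2S seed protein motif": "CAAACAC",
}

# Hypothetical toy sequence, not a real FAD2 intron or promoter.
toy_region = "TTCGATTAACATGCATGCATGGGCAAACACTT"
hits = find_motifs(toy_region, motifs)
print(hits)
```

Positions found this way could then serve as candidate targets when the aim is to tune expression through regulatory elements rather than to disrupt the coding sequence.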

2.3 FAD2 genes from model organism soybean to crop species peanut

Peanut (Arachis hypogaea L.) is an economically important oilseed crop like soybean but belongs to a different clade. A comparison of FAD2 genes shows that the peanut gene has an open reading frame without an intron, whereas the soybean gene has one intron. Compared to soybean, peanut seed has a higher content of oleic acid (36–67%) and a lower level of linoleic acid (15–43%) [67]. The first natural mutant peanut genotype, with 80% oleic acid and 2% linoleic acid in seeds, was reported in 1987 [68]. Research has demonstrated that the high oleic acid trait of this natural mutant genotype is associated with mutations in the FAD2 genes. Two homeologous genes, AhFAD2A and AhFAD2B, are responsible for converting oleic acid to linoleic acid and are located on chromosomes 9 and 19 of the A and B genomes in the allotetraploid peanut, respectively [69, 70]. The coding regions of both genes are 1,140 base pairs (bp) long, with 99% sequence homology and only 11 bp differences. The comparison between the high oleic acid line (F435) and the low oleic acid line (Tamspan 90) revealed two mutations in the coding sequences of AhFAD2. The first mutation was a substitution of guanine (G) with adenine (A) at position 448 bp from the start codon in AhFAD2A, resulting in a missense change from aspartic acid to asparagine. The second mutation was an insertion of adenine (A) between positions 441 and 442 bp in AhFAD2B, which shifts the reading frame and consequently generates a premature stop codon [70]. Together, these spontaneous mutations in the AhFAD2A and AhFAD2B alleles led to 80% oleic acid and 2% linoleic acid [71]. Screening of the Chinese mini core collection found that 53.1% of genotypes carried the natural G448A mutation in the AhFAD2A gene and 46.9% carried no mutation [72]. Interestingly, 82.8% of this mutation existed in A. hypogaea subsp. hypogaea, while 15.4% was observed in A. hypogaea subsp. fastigiata. 
In addition, no mutations were detected in the AhFAD2B gene alone in any lines of the collection. Over 4000 peanut genotypes were screened, and two natural mutant lines with high oleic acid, PI 342664 and PI 342666, were identified [73]. In these two natural mutant lines, sequencing of the coding region showed the same G448A substitution in AhFAD2A but a different substitution, C301G, in AhFAD2B, resulting in the amino acid substitution H101D. These reports demonstrated that mutations in the coding region of either one or both of the AhFAD2A and AhFAD2B genes alter enzymatic activity, leading to the higher oleate trait in mutant genotypes [73]. In addition to the natural FAD2 mutations in peanut, various chemical and physical mutagens, for example, X-rays, EMS, gamma rays, and sodium azide, were used to generate mutations in FAD2 genes to increase oleic acid content in seeds. However, these methods generated many mutations elsewhere in the genome in addition to those in the target gene [74, 75, 76, 77]. Yuan et al. [78] were the first to use CRISPR/Cas9 technology in peanut to induce mutations in FAD2 genes. The results showed that the same AhFAD2 mutations that occur in nature could be induced by gene editing. We have increased oleic acid content to different levels using a CRISPR-based gene editing approach targeting several locations in the coding region, as well as the cis-regulatory RY element (CATGCATG) and the 2S seed protein motif (CAAACAC) in the promoter region, of peanut FAD2. The induction of mutations in both coding and promoter regions using CRISPR-based gene editing technology is ongoing in our peanut research. Hopefully, through gene editing, genotypes with high oleic acid content in soybean and peanut will be developed to complement conventional breeding methods.
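The relationship between the nucleotide positions and the amino acid changes reported above (G448A giving Asp to Asn, and C301G giving H101D) can be checked with a short calculation. The sketch below maps a 1-based coding-sequence position to its codon number and translates the reference and mutant codons with the standard genetic code. The reference codons `GAC` and `CAC` are assumptions chosen for illustration, since the text reports only the amino acid changes; either aspartate codon (GAT or GAC) and either histidine codon (CAT or CAC) gives the same result.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def describe_substitution(position, ref_codon, alt_base):
    """Name the amino acid change caused by a single-base substitution.

    position is the 1-based coordinate in the coding sequence (A of ATG = 1);
    ref_codon is the (assumed) reference codon containing that position.
    """
    codon_number = (position - 1) // 3 + 1
    offset = (position - 1) % 3  # 0, 1, or 2 within the codon
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    return f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


print(describe_substitution(448, "GAC", "A"))  # D150N
print(describe_substitution(301, "CAC", "G"))  # H101D
```

Both positions fall on the first base of their codon (codons 150 and 101), which is consistent with the H101D substitution named in the text and places the G448A change at residue 150.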


3. Future perspectives and challenges in the mutagenesis of FAD2 genes

As a model organism and an economically important legume species, soybean has been intensively investigated in genetics and genomics for its genetic improvement. Precision gene editing systems have been used to change the fatty acid profile of soybean seed. TALEN technology has been used to target the FAD2 genes, and the induced mutations resulted in a significant increase in oleic acid content. CRISPR-based gene editing systems have the advantages of ease of use, accuracy, high efficiency, and success in a wide range of crop species for inducing mutations in FAD2 genes. Transgene-free genotypes can be obtained from edited plants in the subsequent segregating generations. However, applying CRISPR-based gene editing is a challenge in polyploid species because of the multiple copies of target genes. Different mutant allele combinations would also change the content of oleic acid. Moreover, a complete loss of FAD2 function could result in serious developmental defects due to the lack of polyunsaturated fatty acids, which play a crucial role in maintaining the fluidity of the cell membrane at cold temperatures. A better strategy for accumulating oleic acid only in the seed may be to use gene editing to target cis-regulatory elements in the promoter that are implicated in seed-specific gene expression, and so avoid knocking down FAD2 expression in the entire plant.

Genetic transformation methods have been developed in soybean using particle bombardment of meristem cells and shoot tips, and somatic embryogenesis. The establishment of these technologies has permitted the generation of soybean lines with improved oil quality. However, legume species are generally difficult to transform and regenerate. The tissue culture procedure is time-consuming and genotype dependent, and explants are recalcitrant to regenerating adventitious shoots, particularly in soybean and peanut. Methods that avoid tissue culture should be developed, such as floral dipping for Agrobacterium-mediated delivery.


4. Conclusions

Fatty acids are essential components of cellular membranes and storage lipids that are regulated in part through the action of fatty acid desaturases (FADs) and related enzymes. The FAD2 gene, encoding the fatty acid desaturase 2 enzyme, is responsible for converting oleic acid to linoleic acid in developing seeds and directly affects seed oil quality in oilseed crops. Intensive genetic and genomic studies of FAD2 genes in soybean as a model organism provide valuable information for understanding FAD2 gene family members in other oilseed crops. Because of the nutritional and health value of high oleic acid, efforts have focused on generating mutations in the FAD2 gene that could lead to high oleic acid content. Mutations in both the FAD2–1A and FAD2–1B genes in soybean can result in the highest oleic acid content. Among the tools used for mutagenesis, CRISPR/Cas9 technology is a promising approach for targeting multiple genes simultaneously and precisely to induce mutations efficiently.



The authors gratefully acknowledge financial support from USDA/NIFA (2018-67014-27572).



© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Sy M. Traore and Guohao He (September 27th 2021). Soybean as a Model Crop to Study Plant Oil Genes: Mutations in FAD2 Gene Family [Online First], IntechOpen, DOI: 10.5772/intechopen.99752. Available from:
