Table 1. The 3-year (2010–2012) protein content range and mean for elite breeding lines of different maturity groups (MGs) created at the Agricultural Institute Osijek in comparison to standard cultivars.
The potential of soybean for the food, feed, and pharmaceutical industries arises from the composition of its seed. Since European countries import 95% of the annual demand for soybean grains, meal, and oil, causing an enormous trade deficit, governments in Europe have started to introduce additional incentives to stimulate soybean cropping. To rebalance the sources of soybean supply in the future, production must be accompanied by continuous research to create varieties that would make European soybean more appealing to the processing industry and profitable enough to satisfy European farmers. This chapter gives an overview of European soybean seed quality research and an insight into the progress in soybean seed quality made at the Agricultural Institute Osijek, Croatia. The studies presented mainly consider maturity groups suitable for growing in almost all European regions. The most important traits of soybean seed quality discussed are protein content and amino acid composition, oil content and fatty acid composition, soluble sugars, and isoflavones. Defining quality traits facilitates parental selection in breeding programs aiming to improve the added-value properties of final soybean products and enables the exchange of materials between different breeding and research institutions to introduce diversity, which is a prerequisite for genetic advance.
- seed quality
- chemical composition
Soybean (Glycine max (L.) Merr.) is the main oilseed crop of the world, a staple crop for protein-rich food and feed, and a significant source of nutraceutical compounds with many different medical benefits. One of the major health-promoting traits of soybean seed is its proposed ability to reduce the risk of metabolic disorders, cardiovascular diseases and cancers [3, 4]. Even though the majority of soybean in the European Union (EU) is used as poultry and pork feed, and to a lesser extent for feeding dairy cows, the high nutritional value of the seed and its health-promoting traits are making soybean more and more appealing for human consumption. Over the last two decades (1998–2017), most of the world’s soybean seed was produced in the Americas (85.6%), whereas Europe produced only 1.6%. Moreover, European countries import 95% of the annual demand for soybean grains, meal and oil from overseas, causing an enormous trade deficit, so governments in Europe have started to introduce additional incentives aiming to stimulate soybean cropping [6, 7]. As a result, the harvested area in Europe has been continuously increasing over the past few years. Furthermore, the EU has a relatively high demand for non-genetically modified (non-GM) soybean in comparison to other parts of the world, and food/feed ingredients containing more than 0.9% of GM material must be labelled. This has generated a system of segregation and identity preservation (IP) that ensures the identity of non-GM soybean is preserved through the entire supply chain. Considering that the EU imports about 2.7 million tons of non-GM IP soybean meal equivalent yearly, that the premium for non-GM IP is 20–30% of the price of non-segregated soybean, and that more than 80% of the soybean area worldwide is planted with GM soybean, encouraging the production of non-GM soybean in Europe becomes a matter of high importance.
To maintain these positive trends and rebalance the sources of soybean supply in the future, growing soybean production must be accompanied by continuous and intensive research to create improved varieties that would make European soybean more appealing to the processing industry as well as profitable enough to satisfy European farmers. Furthermore, because the frequency of adverse weather events has been increasing over the last 20 years, such research is necessary to create stable European varieties with high seed quality that would become an integral part of conservation agriculture.
Regardless of the final goal, the prerequisite of successful crop improvement is genetic variability of the traits of interest. Assessment of genetic diversity is necessary for germplasm characterisation, conservation, utilisation and the establishment of breeding programmes. If sources of genetic variability are available, genetic advance can be achieved not only with genetic engineering but also with continuous breeding efforts using conventional hybridization and selection methods together with modern chemical, biochemical and genetic analyses. Conventional soybean improvement starts with the creation of a large recombinant inbred line population of significant diversity by hybridization of chosen parental components carrying the traits of interest. According to Burton, genetically distant elite parental lines have the greatest chance of producing superior progeny. Genetic characterisation and evaluation of the divergence of parental material prior to hybridization can be based on the pedigree, on measuring and analysing the variability of qualitative and quantitative morphological properties, on biochemical properties [14, 15] and on molecular markers [16, 17]. All phenotype determination methods must consider the interaction of genotype and environment, whereas the efficacy and reliability of molecular markers rest on their ability to determine genotype divergence directly and rapidly, excluding environmental influences. Even if molecular methods of determining differences in genetic constitution are not available, creating a diverse population is made possible by the positive correlation between phenotypic variability and genetic divergence. Therefore, phenotype evaluation is crucial for successful crop improvement.
The process that negatively affects crop improvement is the loss of genetic diversity as a consequence of human activity and the influence of the environment. This process, which Harlan called genetic erosion, appears because diverse indigenous populations are replaced with modern, uniform cultivars and hybrids, and it poses a considerable threat to food production and hence to human survival. A narrow genetic base has been identified in most soybean germplasm studies [20, 21]. However, Hahn and Würschum noted that the genetic base of middle European genotypes was not as narrow as expected because of unsystematic phenotype selection. If narrowing of the genetic base is not stopped, it can result in complete crop destruction due to the lack of tolerance to adverse abiotic or biotic factors. This is why it is important to take all the measures necessary for preserving genetic diversity, not only because it is crucial for crop improvement but also to preserve biodiversity, which ensures natural sustainability for all life forms. Furthermore, it is crucial to make this biodiversity available to breeders, researchers and producers in order to ensure the genetic advance of cultivars in the future. As in other plant species, genetic diversity for soybean is preserved by creating germplasm collections on a local level and gene banks on a global level. Creating large and diverse germplasm collections as the genetic resources necessary for producing high-yielding, high-quality, commercially important cultivars is only possible with continuous research. Since the commercial significance of soybean emanates mainly from the chemical composition of its seed, further improving the genetic basis of soybean seed quality should nowadays be as important as increasing seed yield.
Higher seed yield means more food produced on the same land area, but when 2 billion people worldwide are known to suffer from malnutrition, the quality of that food is of concern as well. This is why crops with high nutritional value, such as soybean, can greatly contribute to improving human and animal diets and health, as well as provide quality stock for the pharmaceutical and functional food industries.
This chapter gives insight into soybean seed quality research that is useful for defining the quality traits in available germplasm and for choosing parental components in soybean breeding programmes aiming to improve the added-value properties of final soybean products. The studies presented here mainly focus on maturity groups (MGs) 00 to II, commonly sown in Central and South-eastern Europe but suitable for growing in almost all European regions. The most important traits of soybean seed quality discussed here are protein content and amino acid composition; oil content and fatty acid composition; content and composition of soluble sugars, especially oligosaccharides; and content and composition of isoflavones.
2. Protein content and amino acid composition
Soybean protein, making up 40% of the dry seed weight (DW) on average, is highly valued for food and feed because of its amino acid composition and high digestibility. Although soybean protein is thought to be the equivalent of animal protein, it is deficient in sulphur-containing amino acids, of which methionine and cysteine are considered the most limiting in animal feed. Nevertheless, after appropriate heat treatment to reduce protease inhibitor activity, soybean proteins are considered superior to the proteins of other legumes in their growth-promoting properties. Soybean seed proteins are made up of four major fractions: 2S, 7S, 11S and 15S, with 7S (β-conglycinin) and 11S (glycinin) being the most abundant. The ratio between the 7S and 11S fractions plays an important role in determining the functional properties of food made from soybean, as well as protein quality, because glycinin has more sulphur-containing amino acids (cysteine and methionine) than β-conglycinin. Furthermore, glycinin is considered to lower cholesterol levels in human serum, and it is important for tofu gel formation, whereas the α subunit of β-conglycinin has been identified as one of the major allergenic proteins in soybean. The nutritional value, utilization and digestibility of soybean proteins are further affected by bioactive compounds with toxic and/or anti-nutritional properties, such as lipoxygenases, lectins, urease, the Kunitz trypsin inhibitor and the Bowman-Birk trypsin inhibitor, which cause digestive and metabolic diseases in animals. Some of these anti-nutritional factors can be destroyed or inactivated by heat treatment and some by supplemental enzymes, but others are unaffected by the methods applied commercially. Nevertheless, in 2018, soybean represented 70% of the protein meal used worldwide.
Defatted soybean meal has the highest level of crude protein among plant-based protein sources, and it is the main source of protein in commercial feed mixtures for poultry, livestock and fish farms. To produce these nutritionally valuable proteins, soybean needs less land area and less N than wheat and other small grains, maize, rice, fruits and vegetables. The emergence of soybean as a dominant global crop partially offset the dramatic increases in global soil nutrient withdrawals that resulted from the expansion of agriculture over newly cultivated land and from rising yields in recent decades while, at the same time, slightly increasing the overall protein content of the global harvest. In the last several decades, the dominance of soybean on the world crop market was further enhanced by the increase in meat consumption and livestock production, which raised the demand for high-protein materials for livestock feed in Europe as well. It was estimated that global meat and milk demand would increase by 57% and 48%, respectively, between 2005 and 2050, due to the fast-growing population and rising incomes in developing countries, while livestock production was estimated to increase by 21% between 2010 and 2025. Plant protein production in the EU, on the other hand, does not follow this increase, so almost 70% of domestic needs are covered by imported plant protein. Because importing 1 kg of dried soybean meal to the European Union from South America is associated with 11.65 kg CO2 equivalent emissions and because the price of proteins on the world market is continuously rising, this creates major environmental, economic and social problems. Therefore, increasing local plant protein production and replacing imported feed protein would not only increase Europe’s protein self-sufficiency but also decrease the negative environmental footprint of animal production.
Furthermore, including legumes such as soybean in crop rotations dominated mainly by cereals and non-legume oilseeds, which is the case in Europe, can have positive effects on soil quality and agrobiodiversity and can contribute to reducing nitrous oxide emissions and nitrate-N leaching. These environmental benefits, together with benefits arising from a better balance of EU agriculture and trade, are the main reasons the provisions of the new Common Agriculture Policy (CAP) include the promotion of protein crops in Europe as a priority.
In general, plant protein production can be increased by expanding the area under high-protein crops or by increasing protein content through breeding. Some management practices can also influence protein content. For example, minimum tillage and herbicide application can decrease protein content in comparison to conventional tillage and no herbicides, and irrigation at the beginning of pod formation and during seed-filling can increase protein content [51–53]. Furthermore, Foroud et al. and Bouniols et al. noted that protein content was highest when irrigation was applied after flowering, whereas continuous water supply decreased it. According to Afza et al., nitrogen (N) fertilization during the seed-filling stage increased protein content, and earlier N application, e.g. at flowering, did not affect it, whereas sulphur-containing amino acid composition fluctuated depending on the nitrogen source and on the availability of reduced forms of sulphur. Zimmer et al. reported that the protein content of non-inoculated soybeans was significantly lower than that of inoculated soybeans, and Vollmann et al. reported that N fertilization at the flowering stage was superior to both the control and rhizobium inoculation in increasing seed protein content. However, as symbiotic N fixation is a highly complex phenomenon, the difference in protein content could have resulted from many different factors and their interactions, which demand further investigation. Although correct management practices can increase the protein yield of soybean crops, improving the genetic base of protein content is the right approach for providing high-quality soybean cultivars.
Furthermore, because the agroclimatic conditions in Europe are not ideal for widespread cultivation of protein crops like soybean, increasing protein yield per area of arable land by plant breeding, in which genetic diversity and population structure have a key role, is the only efficient way of decreasing the plant protein deficit in Europe.
Soybean protein content is a quantitative trait inherited polygenically, and it ranges between 30 and 50% on a dry weight basis. Besides the management practices (M) mentioned earlier, protein content is influenced by genotype (G), environment (E) and G × E × M interactions, but the contribution of each is still not well established [50, 63–67]. The G × E interaction is one of the main problems in genotype selection as well as in the recommendation of cultivars. Because of this, estimating the broad-sense heritability of the trait is essential for successful genetic advance: the larger the estimated value of this parameter, the greater the chance of success with selection. Broad-sense heritability estimates for soybean vary from moderate, which suggests variability due to a combination of genetic and environmental factors, to high, where genetic factors are more pronounced in determining the protein phenotypes [67, 69, 70]. The influence of the environment on protein content has been researched by many authors. According to Popović et al. and Josipović et al., protein content is higher in years with lower average temperatures and more precipitation during the pod formation and seed-filling stages. On the other hand, Hurburgh and Nian et al. reported reduced protein content in northern regions of soybean cultivation with lower temperatures and higher amounts of precipitation, which can be the result of reduced symbiotic N fixation and protein synthesis due to low root-zone temperatures. Gibson and Mullen reported that high daily temperatures in combination with high night temperatures during the period from seed-filling to maturity can produce a positive linear relationship between average temperature and protein content. Furthermore, Dornbos and Mullen reported that high temperatures in combination with drought in the seed-filling period can result in a significant rise in protein content, while Vollmann et al. and Matoša Kočar et al. noticed that seed protein content was highest for soybean crops grown under moderately dry conditions and high temperature during the seed-filling stage. Such contradictory reports are a major concern in plant breeding since they complicate the decision-making process during selection. Breeding for increased protein content is further complicated by the negative correlation between protein content and yield reported by many authors [70, 78, 79], although there are studies indicating no significant correlation between those two parameters [80, 81]. Furthermore, Vollmann et al. suggested that, because the negative correlation between seed protein content and seed yield is only moderate, selecting breeding lines with both improved protein content and an acceptable yield level should be possible. On the other hand, the relationship between protein content and oil content is almost always significantly negative [61, 70, 81], which is caused by either pleiotropy or linkage; every 2% increase in protein content usually decreases oil content by 1%. Apart from protein content, the content of sulphur-containing amino acids should be considered as it affects the nutritional value of soybean meal, so increasing both would be beneficial. However, Wilcox and Shibles determined a weak, inverse correlation between protein content and methionine and cysteine levels, which means protein quality decreases as protein content increases, so Paek et al. concluded that it would be beneficial to focus on improving protein quality over protein content. Nevertheless, Burton et al. noticed no significant changes in methionine content among cycles of recurrent selection for high protein. Furthermore, Krober and Cartter reported a positive relationship between methionine content and protein content, suggesting the possibility of combining high-yield, high-quality protein soybean lines.
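The broad-sense heritability mentioned earlier in this section can be illustrated with a short calculation. The sketch below assumes the common textbook estimator on an entry-mean basis for multi-environment trials, H² = σ²G / (σ²G + σ²GE/e + σ²ε/(e·r)), where e is the number of environments and r the number of replications. The function name and the variance components in the usage example are hypothetical and are not taken from any of the cited studies.

```python
def broad_sense_heritability(var_g, var_ge, var_error, n_env, n_rep):
    """Entry-mean broad-sense heritability for a multi-environment trial.

    var_g     : genotypic variance component
    var_ge    : genotype x environment interaction variance component
    var_error : residual (plot error) variance component
    n_env     : number of environments (e.g. year-location combinations)
    n_rep     : number of replications per environment
    """
    if n_env < 1 or n_rep < 1:
        raise ValueError("n_env and n_rep must be at least 1")
    if min(var_g, var_ge, var_error) < 0:
        raise ValueError("variance components must be non-negative")
    # Phenotypic variance of a genotype mean across environments and replications
    var_phenotypic = var_g + var_ge / n_env + var_error / (n_env * n_rep)
    if var_phenotypic == 0:
        raise ValueError("phenotypic variance is zero")
    return var_g / var_phenotypic


# Hypothetical variance components for seed protein content (illustration only)
h2 = broad_sense_heritability(var_g=2.0, var_ge=1.0, var_error=1.5, n_env=3, n_rep=2)
print(f"H2 = {h2:.2f}")
```

With these made-up components, most of the variance among genotype means is genetic, which is the situation described above as high heritability; a larger G × E component or fewer test environments would lower the estimate.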
As mentioned earlier, improving any trait of interest should start with screening the available materials. In Europe, soybean seed protein content has been researched by many authors over the years. Vollmann et al. determined considerable variation in the protein content of early-maturing genotypes studied over 6 years in Austria. In their study, genotype × treatment and genotype × year interactions were of a much lower magnitude than the effect of genotype on protein content. Sudarić et al. researched 14 MG 0-I soybean genotypes at three locations in eastern Croatia over 5 years and reported highly significant effects of genotype, year, location, and the genotype × year and genotype × location interactions, as well as a highly significant effect of the genotype × year × location interaction, with year effects being larger than location effects. The protein content range in the research by Popović et al. was 36.65–37.66%, and the average value was 37.15%. Fogelberg and Recknagel determined the protein content range of early-maturing, well-adapted soybean cultivars evaluated at three sites in Germany over 3 years to be 37–43%, while Pannecoucque et al. reported a protein content range of 35.5–43.3% in 14 early-maturing soybean varieties tested in two consecutive years at two locations in Belgium. In a study by Kurasch et al., two types of European soybean varieties that maximise protein yield were distinguished: one that produces high grain yields per hectare with protein contents around 39–42% and the other with slightly lower protein yields than the first type but characterised by very high protein contents (above 42%). Soybean breeding at the Agricultural Institute Osijek (Croatia) has resulted in an increase of protein content over time, and this progress can be followed through three studies. The first one, conducted in the 1990s, researched 22 soybean cultivars (11 standard cultivars and 11 promising lines) whose protein content range was determined to be 35.9–38.4%.
Afterwards, in a 3-year study (2001–2003), Vratarić et al. tested 132 elite breeding lines and standard cultivars grouped according to their MG (00, 0, I) and reported that for all MGs, elite breeding lines had significantly higher protein content than their respective standards. The ranges determined were 35.2–38.7%, 36.2–39.6% and 38.7–41.2% for MG 00, 0 and I, respectively. In the third study, Matoša Kočar et al. determined significant variation in soybean seed protein content, influenced by genotype and year, in a 3-year (2010–2012) screening study with 21 soybean genotypes (MG 00 to I) created at the Agricultural Institute Osijek (Croatia). The lack of a significant difference in average protein content between elite breeding lines created at the Agricultural Institute Osijek and standard cultivars (Table 1) can be explained by the significant progress already achieved in previous selection cycles. Nevertheless, the wide ranges of variation for protein content in genotypes from the Agricultural Institute Osijek indicate there is still room for improvement. Although, according to Sato et al., most of the world is focused on increasing soybean seed oil content and oil yield, average protein content at the world level is higher than the protein content reported for Central European soybean varieties. For example, Bueno et al. reported that the average protein content in 18 Brazilian soybean genotypes was 42.44% (40.20–44.49%), Sharma et al. reported an average protein content of 41.4% (39.40–44.40%) in eight genotypes in India, while Ramteke et al. reported a somewhat lower average protein content (40.23%) in 92 soybean varieties in India, with values ranging from 37.69 to 42.74%. Furthermore, protein content determined in the whole US Department of Agriculture (USDA) soybean germplasm collection ranged from 34.1 to 56.8%.
Higher average protein contents in non-European studies may be due to the fact that soybean breeding is still at a relatively low level in Europe compared to the USA or Asia: because most of the soybean demand in Europe has relied on imports, developing high-quality genotypes was not of great economic importance. Therefore, European breeding programmes would benefit from introducing foreign accessions with high protein content to be used as parental components in crosses, provided these accessions are not transgenic, since transgenic material is not socially acceptable in Europe.
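|MG|Elite breeding lines, protein range (% DW)|Elite breeding lines, protein mean (% DW)|Standard cultivars, protein range (% DW)|Standard cultivars, protein mean (% DW)|t|P|
|---|---|---|---|---|---|---|
|00|35.86–43.82|39.87|35.52–43.55|39.44|0.37 ns|P > 0.05|
|0|35.83–44.33|40.42|34.14–42.48|39.08|1.28 ns|P > 0.05|
|I|36.02–44.04|40.19|36.34–42.70|39.07|1.40 ns|P > 0.05|

ns, not significant.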
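The t values in the table compare the mean protein content of elite breeding lines with that of standard cultivars within each MG. As a minimal sketch of how such a statistic is obtained, the example below computes a pooled-variance (Student's) two-sample t statistic. The table does not state which test variant was used, so the pooled form is an assumption, and the protein values in the usage example are hypothetical because the per-genotype data are not given here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances use n - 1)
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical protein contents (% DW), for illustration only
elite = [39.1, 40.4, 41.0, 40.2, 39.8]
standard = [38.9, 39.6, 40.1, 39.0]
t_stat, dof = pooled_t(elite, standard)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The statistic is then compared with the critical value of the t distribution for the given degrees of freedom; a result marked ns (P > 0.05) means the observed difference in means is within what sampling variation alone could produce. If SciPy is available, `scipy.stats.ttest_ind` returns both the statistic and the P value.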
If sources of variation for high protein content are available, the next step is choosing the most efficient breeding method. Sato et al. reported that crossing adapted elite genotypes with advanced high-protein donors exhibiting a seed protein content of approximately 470 g/kg improved protein content in progeny more than crossing adapted genotypes with food-grade cultivars. An increase in protein content from 46.3% in the initial parental population to 48.4% was achieved after six cycles of selection, without significantly reducing yield. Furthermore, Thorne and Fehr concluded that three-way crosses were superior to two-way crosses for creating high-protein, high-yielding lines. Wehrmann et al. reported that selection for high protein between two backcross generations increased protein content while maintaining seed yield. Wilcox and Cavins were able to create a high-protein BC3 line with yield at the same level as the recurrent parents, but it took them 20 years because it was necessary to select for high protein between each backcross. Cober and Voldeng evaluated single-cross and backcross breeding methods for achieving high-yield, high-protein soybean genotypes and found no significant differences between them. Selecting appropriate genotypes during the breeding process can be made easier and more efficient with marker-assisted selection (MAS), especially since quantitative trait loci (QTL) for protein have been mapped on all chromosomes: QTL mapping and genome-wide association studies (GWAS) have identified 252 QTL associated with soybean protein, distributed across all 20 soybean chromosomes. Brummer et al. evaluated eight soybean populations from the Midwestern USA for genetic markers linked to seed protein content, detecting significant associations between markers and traits and identifying both environmentally stable and environmentally sensitive QTL. Sebolt et al. successfully integrated G. soja alleles for high protein content into a G. max background by backcrossing. Genetic marker alleles linked to the QTL allele from G. soja on linkage group (LG) I were significantly associated not only with higher protein content but also with lower oil content, reduced yield, smaller seeds, taller plants and earlier maturity. Hwang et al. used GWAS to identify QTL controlling seed protein content in 298 soybean germplasm accessions exhibiting a wide range of protein content and identified 40 single-nucleotide polymorphisms (SNPs) in 17 different genomic regions significantly associated with seed protein. GWAS also narrowed the genomic region previously reported to contain protein content QTL, which is important because narrower GWAS-defined genome regions allow more precise marker-assisted allele selection and expedite positional cloning of the causal gene(s). Bandillo et al. studied 12,000 accessions from the USDA soybean germplasm collection using GWAS and identified SNPs for protein and oil content with strong signals on chromosomes 20 and 15; the chromosome 20 region, previously reported to be important for protein and oil content, was further narrowed so that it contained only three plausible candidate genes. Besides genomic studies, markers for protein content can be identified using metabolomics, a non-targeted approach monitoring hundreds of metabolites, but there is still little knowledge about the association between metabolites and protein content in soybean seeds, especially in similar genetic backgrounds. Nevertheless, Wang et al. evaluated metabolic diversity in a soybean near-isogenic line (NIL) population derived from parents with contrasting seed oil contents, comparing seed primary metabolites of high-protein/low-oil lines, low-protein/high-oil lines and their parents. Results indicated that the metabolic profiles of all progeny lines could be discriminated based on protein and oil contents.
All such molecular and genomic investigations help to reveal the genetic architecture of complex traits, protein content being one of them, and enable understanding of the genetic basis of trait variation, which is crucial for improving seed quality as well as other important agronomic traits. Nonetheless, despite the wide adaptation of soybean, local cultivar testing is particularly important for the development of the crop, since there can be substantial site effects within regions. This means it would be beneficial for European soybean breeders to develop their own local varieties with increased protein content rather than relying on foreign introductions.
The next step after increasing soybean seed protein content could be breeding for a more favourable amino acid composition and an improved 11S/7S protein fraction ratio. For example, while the manipulation of other aspects of seed composition and processing may improve amino acid assimilation, increasing the relative proportion of methionine, lysine and threonine has become a goal in soybean breeding. Almost 20 years ago, the economic benefit of improved essential amino acid content was estimated to be ~$5 per ton per 10% increase of any of the above-mentioned amino acids, and today the benefit could be even bigger. Knowledge of the molecular mechanism of amino acid biosynthesis in soybean was mainly limited to genetic mapping [106, 107], but recent studies have enhanced our understanding of amino acid genetic architecture in soybean. For example, Vaughn et al. conducted GWAS for methionine, threonine, cysteine and lysine and identified previously unreported loci associated with multiple amino acids across different populations. Furthermore, Qiu et al. predicted 12 candidate genes based on bioinformatic analyses and the synteny between 113 genes of sulphur-containing amino acid synthases and 33 related QTLs in soybean, and many QTL related to the 7S (β-conglycinin) and 11S (glycinin) fractions of soybean storage proteins have been identified on chromosomes 1, 3, 4, 6, 10, 13, 16, 17, 19 and 20 [110–112]. The 11S/7S ratio of soybean seed can vary greatly due to both genetic and environmental differences [113, 114], which means there is room for improvement. In a study by Žilić et al., the 11S/7S ratio varied from 2.43 to 3.29 among seven soybean varieties adapted to South-Eastern European conditions. Murphy and Resurreccion reported the 11S/7S ratio to vary from 2.1 to 3.4 in 12 soybean varieties, while Mujoo et al. determined this ratio to be 1.63–2.05 in 7 soybean varieties.
Since the 11S protein fraction is more favourable for food and feed, increasing the 11S/7S ratio would be beneficial, but at the moment it is still far from a priority for European breeding programmes. Furthermore, according to Zhang et al., breeding efforts for the improvement of soybean seed amino acids have rarely been reported, much less developed on a commercial scale, because the methods used for quantifying amino acids in soybean, such as HPLC, are time-consuming and not cost-effective and are therefore unsuitable for high-throughput screening of a large number of lines.
3. Oil content and fatty acid composition
Soybeans represent 59% of the world’s oilseed production and 29% of the total vegetable oil consumption in the world. Oil from soybean seed can be found in margarine, salad dressings and cooking oils, as well as in industrial products such as plastics and biodiesel fuel. The oil content of soybean seed can range from 12 to 24% DW, while most commercial cultivars contain between 19 and 23%. Soybean oil is composed of triacylglycerols (TAGs), which are products of fatty acid esterification. Fatty acids are responsible for the nutritional value, stability and taste of soybean oil, and commodity soybean oil is on average made up of 13% palmitic acid (16:0), 4% stearic acid (18:0), 20% oleic acid (18:1), 55% linoleic acid (18:2) and 8% linolenic acid (18:3). Fatty acids in soybean can be saturated (palmitic and stearic acid) or unsaturated (oleic, linoleic and linolenic acid). Soybean oil intended for frying should, for example, contain 7% saturates, 60% oleic acid, 31% linoleic acid and 2% linolenic acid, whereas the desired fatty acid composition of oils intended for industrial use is 11% saturates, 12% oleic acid, 55% linoleic acid and 22% linolenic acid. Nevertheless, the most desirable fatty acid phenotypes for soybean oil are those having the most applications for food, feed and industry, i.e. phenotypes with saturates reduced to less than 7%, linolenic acid reduced to less than 3% and oleic acid increased to more than 55%. Lowering saturates makes oil more suitable for the food industry and for consumers concerned about dietary health issues such as high cholesterol and the increased risk of coronary heart disease usually associated with diets high in saturated fats. However, as stearic acid is thought to be heart neutral, increasing its content could satisfy the market demand for healthier foods without compromising the desired functional qualities contributed by saturated fats.
Unsaturated fatty acids can be monounsaturated (MUFA), such as oleic acid (n-9, omega-9), or polyunsaturated (PUFA), such as linoleic (n-6, omega-6) and linolenic (n-3, omega-3) acids. Oleic acid is not considered to be an essential fatty acid (EFA) since the human body can synthesise it in small amounts, but it is significant as a precursor of some EFAs . On the other hand, polyunsaturated linoleic and linolenic acids are considered essential, supporting the cardiovascular, reproductive, immune and nervous systems, crucial for manufacturing and repairing cell membranes . However, these PUFAs are susceptible to oxidation, so they reduce the shelf life of the oil, causing low stability at high cooking temperatures as well as off-flavours . Oxidative stability of soybean oil generally decreases with the increasing degree of unsaturation  and can be assessed by the ratio between MUFA and PUFA contents (MUFA/PUFA) . Soybean oil has poor oxidative stability compared to other vegetable oils, and its MUFA/PUFA is about 0.5 on average . The average value for MUFA/PUFA in the research by Matoša Kočar et al.  was 0.4 suggesting these genotypes would give less stable final product, i.e. edible oil with shorter shelf life and lower stability for cooking at high temperatures . Oxidative stability of soybean oil can be improved in two ways: by trans-isomer producing catalytic hydrogenation or by breeding for higher content of oleic acid (MUFA), which is also known to reduce cholesterol and reduce the risk of arteriosclerosis and heart disease [93, 124, 126]. On the other hand, diets rich in trans-fatty acids increase the risk of cardiovascular diseases , so increasing oil stability by hydrogenation is becoming more and more unpopular. 
Soybean oil with high oleic acid (>70%) and low linolenic acid (<3%) content is desirable for its oxidative stability and health benefits, with a linolenic acid content of <3% being the current market target to increase soybean oil functionality. Just as the oxidative stability of soybean oil can be assessed by MUFA/PUFA, its nutritional quality can be assessed by the ratio of linoleic to linolenic acid (LA/ALA), with an optimal value between 5:1 and 10:1. The three-year average LA/ALA reported in the study by Matoša Kočar et al. was 7.36, which falls within the recommended range for vegetable oils suggested by Rani et al., meaning that the tested genotypes can produce nutritionally acceptable oil. On the other hand, a very high LA/ALA is considered detrimental to human health, so lowering it by breeding can help prevent degenerative pathologies. The demand for vegetable oil is continuously increasing, and this increase is expected to continue in the period from 2018 to 2027. However, as the production of oilseeds is mainly concentrated in a few regions of the world, weather fluctuations and extreme weather that negatively affect yield impact the oilseed market more than other major crop markets, causing market uncertainties and price volatility. This, together with evident changes in consumer preferences concerning the dietary value and functional properties of oil, is making the world oilseed market more competitive, which encourages breeding programmes to focus not only on increasing oil yield but also on increasing oil quality to meet the demands of industry and end-users alike.
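The two indices used above, MUFA/PUFA for oxidative stability and LA/ALA for nutritional quality, are simple ratios of fatty acid percentages. The following minimal sketch computes both for the commodity soybean oil composition quoted earlier in this section. The function names are hypothetical, chosen only for illustration. The result for this single profile need not coincide with the averages cited above, which come from other sets of samples.

```python
# Illustrative sketch only: function names are hypothetical, not taken from the cited studies.

def mufa_pufa(oleic, linoleic, linolenic):
    """Ratio of monounsaturated (oleic) to polyunsaturated (linoleic + linolenic) acids."""
    return oleic / (linoleic + linolenic)


def la_ala(linoleic, linolenic):
    """Ratio of linoleic (LA, n-6) to linolenic (ALA, n-3) acid."""
    return linoleic / linolenic


def la_ala_in_optimal_range(ratio, low=5.0, high=10.0):
    """True when LA/ALA lies within the 5:1 to 10:1 range quoted in the text."""
    return low <= ratio <= high


# Commodity soybean oil composition quoted earlier in this section (% of oil).
commodity = {"palmitic": 13, "stearic": 4, "oleic": 20, "linoleic": 55, "linolenic": 8}

stability_index = mufa_pufa(commodity["oleic"], commodity["linoleic"], commodity["linolenic"])
nutrition_index = la_ala(commodity["linoleic"], commodity["linolenic"])

print(f"MUFA/PUFA = {stability_index:.2f}")
print(f"LA/ALA = {nutrition_index:.2f}, within 5-10: {la_ala_in_optimal_range(nutrition_index)}")
```

A higher MUFA/PUFA points to a more oxidation-stable oil, while LA/ALA is judged against a range rather than maximised or minimised.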
Furthermore, because European consumers still prefer GM-free oilseed end products, while more than 80% of the world’s area under soybean, the main oilseed crop of the world, is planted with GM soybean, European soybean breeders should focus on creating high-oil GM-free varieties with conventional breeding methods so that they are able to provide high-quality and healthy end products to European consumers. Seed oil content is a complex quantitative trait, controlled by multiple genes and affected by environmental factors [124, 133]. Reported broad-sense heritability of soybean oil content varies considerably, from 0.13 to 0.99 [7, 134–136], suggesting that trait inheritance is a function of G, E, M and G × E × M. For example, oil content can be positively correlated with temperature [137, 138], but negative linear relationships and quadratic relationships have also been reported. Water stress negatively impacts oil content, but irrigation at the beginning of pod formation and during seed filling has also resulted in a decline in oil content [51–53]. Some other crop management factors (no-till, seed treatments, foliar N, fungicide and insecticide applications and rotation) had an overall positive effect on oil content because they improved crop growing conditions by conserving or supplying water and nutrients, improving soil physical and chemical properties and protecting the crop from diseases. On the other hand, a decline in oil content with increased N application and a lack of response were reported as well. While monocropping negatively affected soybean oil composition in comparison to crop rotation with maize, early planting benefited oil [145–147]. Furthermore, Assefa et al. reported that planting date had larger impacts at northern latitudes (40–45° N) than at southern latitudes (30–35° N).
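The wide heritability range quoted above (0.13–0.99) is easier to interpret with a definition in mind. A common textbook estimator on an entry-mean basis is H² = σ²G / (σ²G + σ²GE/e + σ²ε/(e·r)), where e is the number of environments and r the number of replications. The sketch below assumes this estimator. The cited studies may have used other estimators, and all variance components in the example are hypothetical values chosen only to show how the same genotypic variance can give a low or a high estimate.

```python
# Minimal sketch, assuming the entry-mean estimator described in the lead-in.
# All numeric inputs below are hypothetical, not data from the cited studies.

def broad_sense_heritability(var_g, var_ge, var_error, n_env, n_rep):
    """Entry-mean broad-sense heritability from variance components."""
    if n_env < 1 or n_rep < 1:
        raise ValueError("n_env and n_rep must be at least 1")
    # Phenotypic variance of a genotype mean over n_env environments and n_rep replications.
    var_phenotypic = var_g + var_ge / n_env + var_error / (n_env * n_rep)
    return var_g / var_phenotypic


# Same genotypic variance, two hypothetical trial designs.
single_noisy_trial = broad_sense_heritability(0.2, 1.0, 2.0, n_env=1, n_rep=2)
multi_env_trial = broad_sense_heritability(0.2, 0.1, 0.4, n_env=6, n_rep=4)

print(f"single noisy environment: H2 = {single_noisy_trial:.2f}")
print(f"six environments, four replications: H2 = {multi_env_trial:.2f}")
```

Under this definition, large G × E and error variances or few environments push the estimate down, which is one way to read the spread of values reported across studies.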
Although the influence of M for oil content was found to be significant [144–147], in a comprehensive analysis of G, M and E factors influencing soybean yield quantity and quality across the US Corn Belt, Assefa et al.  concluded that E is a dominant factor for the significant variability in seed composition and yield. According to Josipović et al.  and Popović et al. , the amount of oil in soybean seed was significantly higher in years with less precipitation and higher air temperatures at the time of pod formation and dry matter accumulation. This was also true for the average oil content determined by Matoša Kočar et al. , noting that hot and extremely humid conditions resulted in the lowest average amount of oil in soybean seed, whereas hot and dry conditions resulted in higher average amounts of oil. On the other hand, Vollmann et al.  reported that low precipitation rates and high temperatures did not favour oil synthesis and that, in genotypes of later maturity, oil synthesis was enhanced by late water availability. The significance of genotype and environmental factors for fatty acid content was determined by many authors as well [7, 61, 65, 67, 127, 146, 148]. Bellaloui et al.  noted that, among fatty acids, oleic acid was the most sensitive to environmental factors and palmitic and stearic acids were the least sensitive in clay soil, whereas stearic, linoleic and linolenic acids were the least sensitive in sandy loam soil. According to the same authors, cooler temperatures favour the synthesis of linolenic acid, whereas the synthesis of oleic acid is negatively affected as a result of the inverse relationship between them . The same was true in the research done by Xue et al. , where increasing air temperature during pod fill significantly increased oleic acid and significantly decreased linoleic and linolenic acid contents which was later confirmed by Matoša Kočar et al. .
Together with understanding the sources of variation, knowing the strength and direction of relationships among different traits is of great value for breeding programmes, as it enables the enhancement of more than one trait at the same time. Although a significantly negative relationship between oil and protein content was reported in most studies [61, 70, 81, 84], in research analysing variations from 21 studies conducted over 15 years (2002–2017), Assefa et al. concluded that, when pooling data across all studies, there was no significant relationship between oil and protein content, but a tendency for a negative relationship was observed when plotting data separately for each of the studies evaluated in the database. Furthermore, oil content increased slowly with yield increase, suggesting a positive relationship, but when relationships were investigated by study, 63% of studies supported a positive relationship, and the other 37% displayed a slightly negative relationship between seed yield and oil content. Wilcox and Shibles reported that oil increased by 1.9 g kg−1 for every 100 kg ha−1 increase in seed yield, whereas protein decreased by 15.6 g kg−1 for each 10 g kg−1 increase in oil, which is in accordance with the significant positive correlation determined between oil content and yield and the significant negative correlation between oil and protein contents. The analysis of the correlation between oil content and different fatty acid contents in eight early maturing soybean genotypes revealed a positive correlation between oil and saturated fatty acids but a negative correlation between oil and unsaturated fatty acids. Matoša Kočar et al. reported a highly significant positive correlation of oil with stearic acid and a significant positive correlation with oleic acid, but a significant negative correlation with linoleic and linolenic acid. The results were somewhat different in the research conducted by Rani et al., where oil was in a highly significant positive correlation with stearic acid and linoleic acid but in a negative correlation with oleic acid. The correlation between favourable oleic acid and linoleic acid was highly significant and negative in the research by Matoša Kočar et al., the same as in some previous studies [126, 150], while the correlation between oleic acid and unfavourable linolenic acid was also highly significant and negative. These relationships indicate that it could be possible to create varieties with a high oil content, a high amount of oleic acid and a low amount of linolenic acid, which is favourable for edible oil.
Whether the improvements in soybean oil yield and its quality are achieved by conventional breeding or genetic engineering [2, 119], having a favourable gene pool concerning the trait of interest is crucial. In Europe, the variability of oil content was studied by many authors [61, 70, 77, 88, 89, 127]. At the Agricultural Institute Osijek, breeding for soybean seed oil over time is documented by three studies. In the first one, the range for oil content determined in 22 soybean cultivars tested from 1993 to 1995 was 18.9–20.5%. The second one tested 132 elite breeding lines and standard cultivars grouped according to their MG (00, 0, I) during 3 years (2001–2003). All three elite breeding line groups had significantly higher average oil contents than their respective standards. The range for MG 00 was 22.1–23.8%, for MG 0 22.5–23.4% and for MG I 21.9–22.7%. The third study from the Agricultural Institute Osijek, conducted between 2010 and 2012 with eight MG 0 advanced soybean breeding lines, reported oil contents ranging from 22.09 to 24.08%. As is evident from Table 2, no significant differences were determined between the average oil content of elite breeding lines created at the Agricultural Institute Osijek in the last decade and that of standard cultivars. This is expected because significant progress was achieved in earlier selection cycles and oil content in genotypes from Osijek is already at a relatively high level. Furthermore, as soybean seed is mostly used for animal feed in Croatia, further increasing oil content is not a priority. However, there were genotypes among the newer elite breeding lines that accumulated significantly higher amounts of oil than the rest, indicating their suitability for use as parental components. Somewhat lower oil content range values (18.55–18.73%) were reported by Kurasch et al., who researched 1008 F5:8 recombinant inbred lines (MG 000–00) that showed good agronomic performance in Central Europe. Pannecoucque et al. determined an oil content range of 20.3–24.1% in 14 very early (MG 0000–00) soybean genotypes selected from the European plant variety catalogue. Outside Europe, Bueno et al. reported a 22.1% average oil content in 18 Brazilian soybean genotypes, with values ranging from 20.72 to 22.81%, and Sharma et al. noted a 16.3% average oil content for 8 genotypes in India, ranging from 14 to 18.7%, whereas the oil content determined in the whole US Department of Agriculture soybean germplasm collection ranged from 8.1 to 27.9%.
Variability of soybean seed fatty acid content in European commercial breeding programmes has seldom been researched, but in other parts of the world where soybean is considered the main crop feeding the oil industry, many studies have been conducted. Significant variability of the fatty acid content was determined in eight advanced soybean breeding lines (MG 0) developed at the Agricultural Institute Osijek, Croatia. The contents were 10.81, 5.78, 26.41, 49.75 and 5.97% for palmitic, stearic, oleic, linoleic and linolenic acid, respectively, which mostly coincided with the fatty acid content of commercial soybean. The positive outcomes of initial breeding efforts for increasing oleic and reducing linolenic acid contents at the Agricultural Institute Osijek can be seen from the significant difference between the average oleic acid content of MG 0 elite breeding lines and that of the MG 0 standard cultivars, and from the significant differences between the average linolenic acid contents of MG 00 and 0 elite breeding lines and those of the MG 00 and 0 standard cultivars, respectively (Table 2). Nevertheless, according to Matoša Kočar et al., none of the newly developed genotypes from the Agricultural Institute Osijek exhibited desirable values for oleic or linolenic acid contents, so the breeding programme would benefit from the introduction of high-oleic and low-linolenic soybean germplasm. Yearly average values determined in a 3-year study with RILs exhibiting a broad variation in fatty acid content ranged between 10.3 and 11.1%, 3.7 and 3.9%, 22.3 and 26.2%, 51.4 and 55% and 7.5 and 8.6% for palmitic, stearic, oleic, linoleic and linolenic acid, respectively. Average fatty acid contents determined in 96 diverse accessions originating from different regions of the world and evaluated for 2 years in Brazil were 10.59, 3.21, 24.54, 52.81 and 6.36% for palmitic, stearic, oleic, linoleic and linolenic acids, respectively. In Indian soybean germplasm, the average fatty acid contents were 11.15, 3.45, 25.6, 52.56 and 6.91%, while in Chinese germplasm they amounted to 11, 3.63, 21.47, 53.99 and 9.93% for palmitic, stearic, oleic, linoleic and linolenic acids, respectively. As can be seen from the data presented, none of the genotypes exhibited the target values for oleic acid (45–60%) and linolenic acid (<3%) contents desirable for avoiding partial hydrogenation, which emphasises the need for further research and screening studies.
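The comparison of average fatty acid profiles against the target values can be written as a small screening routine. The sketch below uses only the average profiles and the targets quoted in the paragraph above. The function names are hypothetical, and treating the lower bound of the 45–60% oleic range as the pass threshold is an assumption made for illustration.

```python
# Average fatty acid profiles (% of oil) quoted in the text, in the order
# palmitic, stearic, oleic, linoleic, linolenic.
profiles = {
    "Osijek MG 0 breeding lines": (10.81, 5.78, 26.41, 49.75, 5.97),
    "Accessions evaluated in Brazil": (10.59, 3.21, 24.54, 52.81, 6.36),
    "Indian germplasm": (11.15, 3.45, 25.6, 52.56, 6.91),
    "Chinese germplasm": (11, 3.63, 21.47, 53.99, 9.93),
}

OLEIC_MIN = 45.0     # assumption: lower bound of the 45-60% oleic target used as threshold
LINOLENIC_MAX = 3.0  # linolenic acid target: below 3%


def meets_targets(profile):
    """True when oleic acid is at least OLEIC_MIN and linolenic acid is below LINOLENIC_MAX."""
    _, _, oleic, _, linolenic = profile
    return oleic >= OLEIC_MIN and linolenic < LINOLENIC_MAX


def target_gaps(profile):
    """Return (oleic shortfall, linolenic excess) in percentage points, floored at zero."""
    _, _, oleic, _, linolenic = profile
    return max(0.0, OLEIC_MIN - oleic), max(0.0, linolenic - LINOLENIC_MAX)


for name, profile in profiles.items():
    oleic_gap, linolenic_gap = target_gaps(profile)
    print(f"{name}: meets targets = {meets_targets(profile)}, "
          f"oleic shortfall = {oleic_gap:.2f}, linolenic excess = {linolenic_gap:.2f}")
```

The gaps give a rough sense of how far each average lies from the targets. Screening individual genotypes rather than averages would be needed to find usable parents.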
|MG||Elite breeding lines (range)||Elite breeding lines (mean)||Standard cultivars (range)||Standard cultivars (mean)||t||P|
|Oil (% DW)|
|00||20.59–25.48||23.29||21.01–23.81||22.54||1.49ns||P > 0.05|
|0||20.24–25.84||22.85||20.06–23.85||22.15||1.45ns||P > 0.05|
|I||20.93–26.40||23.24||20.66–24.40||22.40||1.76ns||P > 0.05|
|Oleic acid (% oil)|
|00||20.58–31.72||25.72||21.37–27.51||23.92||1.67ns||P > 0.05|
|0||19.51–28.91||23.59||18.74–24.84||21.38||2.68**||P < 0.01|
|I||20.27–28.05||24.78||19.48–26.83||23.54||1.30ns||P > 0.05|
|Linolenic acid (% oil)|
|00||4.41–7.04||5.99||5.36–8.27||6.93||2.23*||P < 0.05|
|0||4.99–8.65||7.15||6.97–7.99||7.59||2.54*||P < 0.05|
|I||5.10–7.98||6.59||5.37–8.73||7.17||1.37ns||P > 0.05|
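The t and P columns above compare the mean of the elite breeding lines with the mean of the standard cultivars within each MG. The text does not state which form of the t-test was used, so the following is only a sketch assuming an independent two-sample Student's t-test with pooled variance. The function name is hypothetical and the sample values are invented purely for illustration. They are not data from the table.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Student's two-sample t statistic (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical oil contents (% DW), invented for illustration only.
elite_lines = [23.4, 22.9, 23.8, 23.1, 22.6]
standards = [22.3, 22.8, 21.9, 22.6]

t_value, df = pooled_t(elite_lines, standards)
print(f"t = {t_value:.2f} with {df} degrees of freedom")
```

Turning the statistic into a P value additionally requires the t distribution for the given degrees of freedom, which is typically taken from a statistics package or a printed table.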
Nowadays, breeding for increased oil content can be facilitated by genetic markers or by monitoring metabolites, but although Wang et al.  reported that metabolic profiles of all progeny lines could be discriminated based on oil contents, there is still lack of sufficient information about the association between metabolites and oil content in soybean seeds. Nevertheless, QTL mapping and GWAS have identified 322 QTLs associated with soybean oil distributed over all 20 soybean chromosomes but mainly on chromosomes 5, 15 and 20 . Fan et al.  identified 35 additive QTLs underlying individual fatty acid contents in a single environment and 17 additive QTLs across multiple environments or underlying multiple fatty acids. Priolli et al.  discovered 19 single-nucleotide polymorphism loci on 10 different chromosomes significantly associated with palmitic acid, oleic acid and total oil contents with loci and specific alleles that contributed to lower palmitic and higher oleic acid contents. Thapa et al.  identified independent mutations in the FAD3A associated with a reduced level of linolenic acid. Combs and Bilyeu  reported that seed stearic acid increased to 10–11% in lines containing combinations of FAD2-1A and FAD2-1B mutant alleles plus the SACPD-C missense mutant alleles, but this increase was associated with a decrease in the oleic acid content and did not meet the target of at least 20% stearic acid in the seed oil. Of all the reported QTLs associated with oil traits in soybean, only two (cqPro/oil-15, cqPro/oil-20) have been officially confirmed and repeatedly detected in several different populations, both associated with protein and oil contents and each showing opposite additive effect directions for the two traits, which is why identifying an environmentally stable major QTL regulating seed oil content is of crucial importance [100, 102, 103, 155]. 
Moreover, QTLs directly related to seed oil accumulation in soybean have not been cloned, so the underlying mechanism has not been thoroughly elucidated to date . Although advance in oil content has been achieved over time, further increasing oil content by traditional breeding based on genetic crossing and phenotypic selection may be difficult and inefficient, because of the polygenic nature of oil regulation and the majority of oil-related loci having varying additive, epistatic or QTL × E effects [156–158]. Some studies suggest that increasing soybean oil could be achieved by genetic engineering of transcription factors involved in oil accumulation [159, 160]. As further increasing oil content is deemed challenging without genetic engineering [156–158] and taking into consideration that European oil industry is dominated by sunflower and rapeseed , it is evident that improving oil content and oil quality in Europe is not economically as important as in regions of the world where soybean is the main stock for oil production. Nevertheless, research and genotype screening for determining germplasm with favourable QTLs controlling oil traits are of great importance, especially since fatty acid phenotypes that meet the breeding objectives of providing improved oil quality are still lacking, not just in European soybean germplasm but in genotypes from all regions of the world.
4. Soluble sugars
Completely dry soybean seed contains 33% DW of carbohydrates on average, of which 16.6% DW is soluble sugars. The significance of the soluble sugar profile lies in its effect on the quality, digestibility and nutritional value of soybean for food and feed. The five main soluble sugars in soybean seed are glucose, fructose, sucrose, raffinose and stachyose, with sucrose and stachyose being the predominant ones. The monosaccharides glucose and fructose and the disaccharide sucrose are easily digested and give soybean food products their characteristic sweet taste, whereas the indigestible galactooligosaccharides (stachyose and raffinose) limit the nutritional value of soybean food and feed, reduce metabolizable energy and cause irritation of the gastrointestinal tract in humans and animals [162, 163]. In consequence, the use of soybean for food and feed is limited, but as the demand for animal feed in Europe rises, lowering galactooligosaccharides would increase the share of soybean seed used for feeding livestock, thus increasing the market for seed production. Furthermore, due to the increasing awareness of the health benefits connected with soy food consumption, a more favourable saccharide profile would make soybean more amenable to human consumption. However, oligosaccharides are not necessarily undesirable because they serve as transport carbohydrates in the phloem and as cryoprotectants, and reportedly play a positive role in desiccation tolerance during seed maturation. Furthermore, in human intestines, galactooligosaccharides are fermented into short-chain fatty acids with prebiotic qualities, which makes soybean seed interesting for the pharmaceutical as well as the functional food industry.
Broad-sense heritability is considered to be moderate to high for monosaccharides and high for oligosaccharides [166–168]. Geater et al.  determined that the differences in total sugar (TS) content among tested genotypes were constant in all environments, but significant year effect was determined for raffinose. Taira  found G to be the larger source of variation than E for raffinose and stachyose, whereas the opposite was true for TS and sucrose. Both G and E were statistically significant sources of variation for sucrose in the research by Maughan et al. . Matoša Kočar et al.  reported that TS content was mostly under genetic control, whereas individual soluble sugars, as well as total oligosaccharides (TO), were significantly influenced by the year which means breeding for TS content should be much more predictable than altering sugar profiles in that particular set of genotypes. In mentioned research, glucose, fructose and raffinose contents were higher in years with higher temperatures, whereas the opposite was true for TO, sucrose and stachyose contents . Wolf et al.  concluded that sucrose and stachyose concentrations decreased with the rise of average year temperature; however, the temperature had no effect on glucose, fructose and raffinose. The often occurring absence of G x E interaction , together with the determined maternal and additive effects [171–174], facilitates the selection and breeding for improved saccharide contents in soybean. To lower the expenses and reduce the time invested in analyses, plant breeders rely on favourable relationships between important traits which enable effective indirect selection. Significant positive correlation between raffinose and stachyose [161, 175] and significant and very strong negative relationship between sucrose and raffinose and/or stachyose [175, 176] enable breeders to increase the contents of favourable soluble sugars while decreasing the contents of unfavourable oligosaccharides simultaneously. 
Furthermore, Akond et al.  noted that the lack of strong correlations among tested soluble sugars indicated it should be possible to obtain lines that are high in sucrose but low in raffinose and stachyose. No significant correlation between raffinose, stachyose and sucrose was noticed by Geater et al.  as well. However, Hartwig et al.  reported a significant moderate positive correlation between raffinose and sucrose, while Hou et al.  determined the relationship between sucrose and raffinose to be significant, positive and strong. Another unfavourable relationship is the negative correlation determined between protein content and TS and protein content and sucrose [84, 178–179], whereas the negative correlation determined between protein content and raffinose  and between stachyose plus raffinose and protein content  is considered desirable. The difference in correlation reports from different researchers emphasises the importance of determining trait relationships in each breeding population.
Significant difference in sugar contents, i.e. variability among soybean genotypes that justifies selection, was determined in many studies [25, 161, 172, 176, 180–182]. Geater and Fehr  reported TS determined in 23 soybean cultivars tested at eight Iowa locations to range between 2.19 and 18.4%. The range for TS content determined by Matoša Kočar et al.  in a 3-year trial (2010–2012) with 22 soybean genotypes (MG 00-II) created at the Agricultural Institute Osijek (Croatia) was 5.69–7.68% with the average value being 6.62%. In the same study, glucose content ranged 0.16–0.29%, fructose 0.11–0.28%, sucrose 2.32–3.46%, raffinose 0.66–1.10% and stachyose 1.64–2.41% . Breeding for improved soluble sugar content is at its beginnings at the Agricultural Institute Osijek, but some progress has been achieved in elite breeding lines, in which higher glucose, fructose, sucrose and TS contents and a lower TO content have been determined than the standard cultivars of the respective MG (Table 3). Hou et al.  investigated worldwide soybean germplasm collections with 241 genotypes and reported average contents for TS, glucose, fructose, sucrose, raffinose and stachyose to be 96.4, 5.4, 4.4, 46.8, 8.3 and 31.7 mg g−1, respectively. Values varied in wide ranges with genotypes from Peru having the least TS (72.6 mg g−1) but the lowest contents of raffinose (0.5 mg g−1) and stachyose (11.8 mg g−1) as well . The highest contents for favourable glucose (23.9 mg g−1) and fructose (25.2 mg g−1) were determined in genotypes from Peru as well, while the highest sucrose content (83.1 mg g−1) was determined in genotypes from Angola . If considering only genotypes from European countries, TS contents varied between 91.5 (Bulgaria) and 111.9 mg g−1 (Poland) . 
The lowest glucose and fructose contents (0.3 mg g−1) together with the highest sucrose (49.2 mg g−1) and stachyose (39.2 mg g−1) contents were found in a genotype from Germany, whereas the highest glucose (15.4 mg g−1) and fructose (14.9 mg g−1) but the lowest sucrose (16.5 mg g−1), raffinose (5.2 mg g−1) and stachyose (26.7 mg g−1) contents were determined in three genotypes from Turkey. Geater et al. determined a 6.2–7.1% range for sucrose, a 0.49–0.58% range for raffinose and a 4.7–4.9% range for stachyose in 16 small-seeded soybean cultivars. Wilcox and Shibles reported the combined stachyose and raffinose content in 43 random breeding lines that varied in seed protein content and were grown in three environments to be 38.5–42.2 g kg−1 and the sucrose content to be 43.3–56.7 g kg−1. Wide content ranges determined for soybean sugar traits [25, 161, 178] imply there is sufficient variability for altering TS or individual sugar contents. Furthermore, increasing the digestibility and taste of European soybean meal or food products necessitates the introduction of cultivars with specifically tailored sugar profiles intended for human consumption to be used as parent donors in European soybean breeding programmes, which have so far mostly not focused on soluble sugars.
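The sugar contents above are reported in three different units (%, mg g−1 and g kg−1), which makes direct comparison awkward. Since 10 mg g−1 equals 10 g kg−1 equals 1%, a single division aligns them. The sketch below converts the worldwide averages reported by Hou et al. to percent. The helper name is hypothetical, and comparing the results with the % DW values from Osijek assumes that all studies express contents on the same weight basis, which the text does not confirm.

```python
def mg_per_g_to_percent(value_mg_per_g):
    """Convert mg g-1 (numerically equal to g kg-1) to percent by weight: 10 mg/g = 1%."""
    return value_mg_per_g / 10.0


# Average contents reported by Hou et al. for 241 genotypes (mg g-1), as quoted in the text.
hou_averages_mg_g = {
    "total sugars": 96.4,
    "glucose": 5.4,
    "fructose": 4.4,
    "sucrose": 46.8,
    "raffinose": 8.3,
    "stachyose": 31.7,
}

hou_averages_percent = {
    sugar: mg_per_g_to_percent(value) for sugar, value in hou_averages_mg_g.items()
}

for sugar, percent in hou_averages_percent.items():
    print(f"{sugar}: {percent:.2f}%")
```

The same helper applies to the g kg−1 values of Wilcox and Shibles, because both units are parts per thousand by weight.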
|MG||Elite breeding lines (range)||Elite breeding lines (mean)||Standard cultivars (range)||Standard cultivars (mean)||t||P|
|Glucose (% DW)|
|00||0.17–0.33||0.25||0.12–0.27||0.19||3.05**||P < 0.01|
|0||0.11–0.35||0.23||0.13–0.19||0.16||6.58**||P < 0.01|
|I||0.13–0.42||0.24||0.12–0.23||0.19||2.84**||P < 0.01|
|Fructose (% DW)|
|00||0.04–0.30||0.18||0.05–0.25||0.16||0.65ns||P > 0.05|
|0||0.02–0.38||0.17||0.02–0.18||0.11||2.87**||P < 0.01|
|I||0.02–0.43||0.21||0.02–0.19||0.13||3.11**||P < 0.01|
|Sucrose (% DW)|
|00||2.56–3.91||3.18||2.20–2.83||2.62||4.19**||P < 0.01|
|0||1.77–3.96||3.01||1.89–2.71||2.32||5.74**||P < 0.01|
|I||1.13–3.78||3.03||2.34–2.95||2.63||4.55**||P < 0.01|
|Total oligosaccharides (% DW)|
|00||2.47–3.24||2.76||2.66–3.29||3.08||3.95**||P < 0.01|
|0||1.81–3.53||2.75||3.02–3.34||3.18||6.83**||P < 0.01|
|I||1.05–3.38||2.66||2.47–3.38||3.04||2.94**||P < 0.01|
|Total soluble sugars (% DW)|
|00||5.23–7.67||6.97||5.56–6.95||6.09||3.99**||P < 0.01|
|0||4.96–9.19||6.67||5.01–6.63||5.73||4.17**||P < 0.01|
|I||5.04–9.21||6.68||2.95–7.25||5.69||1.75ns||P > 0.05|
The introduction of germplasm with favourable sugar profiles into European breeding programmes and the selection of favourable phenotypes after backcrossing can be facilitated by marker-assisted selection. The first set of markers associated with sucrose content in soybean was discovered in the F2 population of an interspecific cross between Glycine max and Glycine soja, in which six QTLs were detected on linkage groups (LGs) A, E, F, I, L and M. A novel allele of the putative soybean raffinose synthase gene (RS2), associated with low raffinose and stachyose and elevated sucrose content, was discovered in Korea in PI200508, which was used as a donor parent in crosses with three popular Korean soybean cultivars. The effects of the RS2 allele were confirmed in the F2 progenies through the use of allele-specific molecular markers, proving that breeding and selection for low-raffinose soybean can be done efficiently. Akond et al. identified and mapped 14 significant QTLs for sucrose, raffinose and stachyose contents on eight different LGs and chromosomes, with only two QTLs underlying seed sucrose and stachyose contents having been previously mapped. Although previous studies identified QTLs associated with seed sucrose, raffinose and stachyose contents in different populations [166, 167, 171, 173, 177], only 37 QTLs for sucrose and no QTLs for raffinose and stachyose are found in SoyBase to date, emphasising the need for more research. Nevertheless, a high-sucrose, low-raffinose and low-stachyose germplasm has been developed, which can be introduced into European soybean breeding programmes as a parent for food-grade cultivars.
5. Isoflavones
Soybean is considered to be the most abundant natural source of isoflavones in the human and animal diet . Isoflavones are the main components of flavonoids and the most common form of phytoestrogens, i.e. non-steroidal compounds with oestrogen-like biological properties . They are considered nutraceuticals as it is claimed they have potential benefits in preventing the development of cardiovascular diseases and cancers, menopausal symptoms and osteoporosis, as well as antifungal and antioxidant properties [186–188]. Their role in plants is to encourage infection and nodulation by nitrogen-fixing bacteria  and to help in abiotic and biotic stress resistance . The most studied isoflavones are genistein, daidzein and glycitein . As a quantitative trait controlled by many minor genes , isoflavone content in soybean seed largely depends on the environment [185, 193–195], to the extent that even the smallest changes in the microclimate can cause significant changes in the isoflavone contents. For example, it is reported that lower temperatures during the seed fill period increase the isoflavone content, whereas environments which were warmer and drier resulted in lower isoflavone contents [196–199]. In 76 environments during 12 years, Carrera and Dardanelli  found a 91% decrease in total isoflavone content with mean temperatures during the seed development rising from 14.1 to 26.7°C. Nevertheless, Morrison et al.  evaluated 14 cultivars across 12 years at one location determining that high mean temperature during seed development did not result in isoflavone concentration decrease which might be due to the fact that only one location provides a somewhat narrower range of temperature variation . The negative impact of water deficit on soybean isoflavone concentration was confirmed in many earlier studies [192, 198, 202, 203] as well. 
Furthermore, the synthesis of isoflavones in soybean seed also depends on geographical origin, seed size, seed colour, maturity, disease tolerance and resistance to insects [185, 187, 204–206]. Nevertheless, isoflavone broad-sense heritability was estimated to range from moderate to high, indicating that genotype effects were high enough to enable efficient improvement of isoflavone contents [194, 208]. The predominance of genotype effects over environmental effects was confirmed by Gutierrez-Gonzalez et al., Murphy et al. and Hoeck et al. Successful high-isoflavone cultivar development necessitates investigating the relationships between isoflavones and other important traits. For example, total isoflavone (TI) content was reported to be in a positive [204, 205, 209, 210] and a negative correlation with seed yield, but a lack of interrelationship has also been reported [199, 212], which suggests that the development of high-yield, high-isoflavone cultivars would be possible. The relationship of isoflavone content with both oil and protein content was reported to be negative [194, 199, 205, 209, 211], but an absence of correlation between isoflavones and proteins was reported as well. Conflicting reports on the relationships between total seed isoflavone content and other traits call for studies including a wide range of genetic material grown across a range of environments to provide additional insight into the associations.
Because of the many abiotic and biotic factors influencing isoflavone content, it is expected to vary widely. Wang and Murphy found TI content to vary from 1176 to 3309 μg/g within a single soybean cultivar. Gutierrez-Gonzalez et al. reported an extremely wide TI content range (12.4–317.4 mg 100 g−1 DM) in a population of RILs in the USA. Murphy et al. determined a TI content range of 160–370 mg 100 g−1 in a RIL population tested in 2-year, multilocation trials in Canada, while Adie et al. determined a range of 14.97–39.85 mg 100 g−1 for 10 soybean lines tested in a 1-year trial at eight locations in Indonesia. In Europe, Cvejić et al. found that TI content in 20 F1 soybean progenies ranged from 156 to 366 mg 100 g−1, while Bursać et al. determined a TI content range of 211 to 524 mg 100 g−1 for offspring with different seed coat colours derived from a single cross between a commercial variety and a germplasm collection genotype with a black seed coat. Matoša Kočar et al. investigated 22 MG 00 to II soybean genotypes from the Agricultural Institute Osijek (Croatia) over 3 years (2010–2012) and determined that average TI content varied from 124.06 to 286.2 mg 100 g−1, while average daidzein, glycitein and genistein contents were 49.27 mg 100 g−1, 20.90 mg 100 g−1 and 94.41 mg 100 g−1 DM, respectively. Although breeding for isoflavone content at the Agricultural Institute Osijek is in its early stages, some progress has been made, as can be seen from the higher average isoflavone contents in elite breeding lines than in standard cultivars of the same MG (Table 4).
In the study by Matoša Kočar et al., genistein was the most abundant isoflavone component, followed by daidzein and glycitein. The same order of abundance was determined by Lozovaya et al. for two French and three US cultivars, whereas Sumardi et al. studied 34 black soybean genotypes in Indonesia and found that daidzein content was higher in 31 genotypes. In the research by Cvejić et al., Bursać et al. and Tepavčević et al., total daidzein was the highest, followed by total genistein and total glycitein. Gutierrez-Gonzalez et al. found that glycitein was the most abundant isoflavone, followed by genistein and daidzein. Although all three isoflavones are considered to have health benefits, the order of abundance is important because genistein is reported to have approximately 10 times higher biological activity than daidzein and glycitein, so genotypes with high genistein content should be favoured in selection aimed at creating soybean genotypes suitable for the food and dietary supplement industries.
|MG||Elite breeding lines, range||Elite breeding lines, mean||Standard cultivars, range||Standard cultivars, mean||t||P|
|Daidzein (mg 100 g−1)|
|00||10.65–106.48||65.07||20.34–63.99||45.02||2.11*||P < 0.05|
|0||11.08–93.39||50.54||19.02–48.53||36.71||3.28**||P < 0.01|
|I||25.03–91.35||51.26||9.81–53.51||36.01||2.70**||P < 0.01|
|Glycitein (mg 100 g−1)|
|00||10.72–40.31||24.26||19.08–27.24||22.77||0.68ns||P > 0.05|
|0||13.09–41.76||21.78||12.97–18.62||15.54||6.46**||P < 0.01|
|I||10.09–47.38||19.91||9.70–18.94||13.41||4.71**||P < 0.01|
|Genistein (mg 100 g−1)|
|00||43.35–355.31||184.85||57.59–128.23||93.89||4.08**||P < 0.01|
|0||52.32–112.25||80.49||54.82–86.38||67.44||2.86**||P < 0.01|
|I||48.19–142.22||92.24||54.48–104.39||83.80||1.12ns||P > 0.05|
|Total isoflavones (mg 100 g−1)|
|00||70.49–501.79||273.18||97.22–206.06||161.68||3.50**||P < 0.01|
|0||87.61–212.99||150.05||115.12–148.60||136.25||0.76ns||P > 0.05|
|I||91.34–239.75||160.75||71.14–179.53||129.94||2.34*||P < 0.05|
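As a reading aid for the table above, the short sketch below expresses each elite-line mean as a percentage gain over the standard-cultivar mean of the same MG. The means are copied from the table. The `relative_gain` helper and the dictionary layout are illustrative assumptions, not part of the original analysis, and a large gain does not imply statistical significance (see the t and P columns).

```python
# Means in mg per 100 g, copied from the table: (elite lines, standard cultivars)
TABLE_MEANS = {
    "daidzein": {"00": (65.07, 45.02), "0": (50.54, 36.71), "I": (51.26, 36.01)},
    "glycitein": {"00": (24.26, 22.77), "0": (21.78, 15.54), "I": (19.91, 13.41)},
    "genistein": {"00": (184.85, 93.89), "0": (80.49, 67.44), "I": (92.24, 83.80)},
    "total": {"00": (273.18, 161.68), "0": (150.05, 136.25), "I": (160.75, 129.94)},
}


def relative_gain(elite_mean, standard_mean):
    """Percentage difference of the elite mean relative to the standard mean."""
    if standard_mean <= 0:
        raise ValueError("standard_mean must be positive")
    return 100.0 * (elite_mean - standard_mean) / standard_mean


# Hypothetical summary: one gain value per trait and maturity group.
gains = {
    trait: {mg: relative_gain(elite, std) for mg, (elite, std) in by_mg.items()}
    for trait, by_mg in TABLE_MEANS.items()
}

for trait, by_mg in gains.items():
    for mg, gain in by_mg.items():
        print(f"{trait:<10} MG {mg:<2} {gain:+.1f}%")
```

All twelve gains are positive, which matches the statement in the text that elite breeding lines had higher average isoflavone contents than standard cultivars of the same MG.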
The large influence of environmental factors on isoflavone content emphasises the need for molecular research and the identification of molecular markers associated with favourable isoflavone profiles. To date, almost 90 QTLs associated with seed isoflavone content are registered in SoyBase. Akond et al. identified 16 QTLs for seed isoflavone content on 12 different chromosomes (Chr) or linkage groups. Wang et al. identified 33 expression QTLs on 15 soybean chromosomes, 5 of which overlapped with phenotypic QTLs. In the overlapping region, 11 candidate genes underlying the accumulation of isoflavones were discovered, which could be beneficial for the development of marker-assisted selection to breed soybean cultivars with high isoflavone content. Furthermore, Wang et al. identified 34 isoflavone content QTLs in total, 23 of them new, in 130 RILs derived from a cross between high-isoflavone and low-isoflavone cultivars, of which 6, 7, 10 and 11 QTLs were associated with daidzein, glycitein, genistein and TI, respectively. Akond et al. identified three QTLs on three different linkage groups, one controlling daidzein content and two controlling glycitein content. QTL epistatic interactions are also thought to contribute to isoflavone variation. Although many QTLs have been discovered for isoflavone content, most have minor effects and are often influenced by the environment, so identifying enough loci that are stable across environments and can significantly aid selection is yet to be accomplished.
Although genetic enhancement of soybean seed quality contributes to advances in processing industries and improves the added value properties of final soybean products, soybean as a commodity is still mostly paid for by weight and not by composition. Because soybean in Europe is predominantly used as animal feed, i.e. as a protein source rather than an oilseed crop, European breeders are expected to focus mainly on increasing seed yield and protein content, since this would be most profitable for soybean producers. As soybean is becoming more appealing for human consumption in Europe because of its high nutritional value and health-promoting traits, breeding for improved amino acid, fatty acid and soluble sugar compositions, as well as for increased isoflavone content, is gaining importance, but not yet on a larger scale. Nevertheless, all efforts to describe the variability of important traits through research are invaluable, not only for creating superior progeny but also for germplasm preservation to counter the narrowing of the genetic base. Although soybean is widely adaptable, significant environmental effects on different seed quality traits, as well as the increased frequency of adverse weather events, emphasise the need to develop local cultivars with improved performance and stability. The need for European soybean cultivars also stems from the demand for non-GM soybean, which motivates European breeders to use conventional breeding methods focused on phenotypic selection and marker-assisted selection (MAS). Positive outcomes of breeding for improved seed quality recorded at the Agricultural Institute Osijek (Croatia) and other European soybean breeding programmes indicate that progress can be achieved even without genetic engineering.
As this chapter mainly focuses on MG 00 to II, which are commonly sown in Central and South-eastern Europe but suitable for growing in almost all European regions, the information presented should be useful for soybean breeders and researchers all over Europe and could promote the exchange of germplasm to introduce diversity, which is a prerequisite for any genetic advance.