Glycoside hydrolases (GHs) are enzymes that rearrange plant cell wall polysaccharides and are regulated by development and stress. Such proteins are used in enzymatic cocktails for biomass hydrolysis in second-generation ethanol (E2G) production. In this chapter, we investigate GHs identified in plant cell wall proteomes by predicting their functions through alignment with homologous plant and microorganism sequences and identification of functional domains. So far, 49 cell wall GHs have been identified in sugarcane and 114 in Brachypodium distachyon. We point to candidate proteins that could be targeted to lower biomass recalcitrance. More specifically, we address several GHs with predicted cellulase, hemicellulase, and pectinase activities, such as β-xylosidases, α- and β-galactosidases, α-L-arabinofuranosidases, and glucan β-glucosidases. These enzymes are among the most used in enzymatic cocktails to deconstruct plant cell walls. As an example, fungal arabinofuranosidases belonging to the GH51 family, a family also identified in sugarcane and B. distachyon, have already been associated with the degradation of hemicellulosic and pectic polysaccharides through a peculiar mechanism that is probably more efficient than that of other GH families. Future research will benefit from the information available here to design plant varieties with self-disassembly capacity, making E2G more cost-effective through the use of more efficient enzymes.
- Brachypodium distachyon
- cell wall proteomes
- glycoside hydrolase (GH)
- second-generation ethanol
The production of second-generation ethanol (E2G) provides an additional source of energy in the sugar and ethanol sector by increasing the biofuel yield without expanding the crop area, thus leading to a sustainable production system. However, for the process to become financially competitive in comparison with first-generation ethanol (E1G), it is necessary to reduce the costs related to the lignocellulosic biomass processing required to recover and break the sugars present in plant cell walls . The principal barrier to the conversion of lignocellulosic biomass into bioethanol or chemicals is the insoluble lignin network that surrounds and shields the cellulose microfibrils from degrading enzymes. The high energy and environmental costs of the treatments necessary to overcome these drawbacks constitute major hurdles to commercial E2G production . To overcome these limitations, many studies have focused on the identification of enzymes related to biomass pretreatment and hydrolysis processes. The majority of the enzymes that compose enzymatic cocktails are proteins prospected from fungi that belong to different glycoside hydrolase (GH) families .
The raw material for E2G production is plant fiber, which is mainly composed of cell walls. Nevertheless, less work has been devoted to studying plant cell wall proteins (CWPs) and, thus, to understanding how the plant's own mechanisms loosen and re-tighten the cell wall in order to promote cell growth and adapt to a changing environment. Accordingly, a common opinion today is that it is important to understand how cell walls are built up to improve the biomass deconstruction processes [4, 5].
Plant cells are surrounded by a wall characterized by its specific structure and composition . Cell walls are mainly composed of polysaccharides, lignin, suberin, waxes, proteins, calcium, boron, and water, and have the ability to self-assemble . Plants have two different kinds of cell wall deposition. Primary cell walls are synthesized in still-growing cells, whose form is not definitive, and thus, they can undergo growth and expansion. Secondary cell walls are synthesized in already fully expanded cells which are differentiating to perform specific functions, like xylem and fibers cells. In lignified secondary walls, the proportion of cellulose is higher than in primary cell walls with a higher degree of polymerization and crystallinity specificities . Cellulose, hemicellulose, and pectins are the cell wall polysaccharides, and the biogenesis of cell walls involves their synthesis in intracellular compartments or at the plasma membrane, secretion, assembly, and rearrangement in muro. The primary cell walls of grasses have specific characteristics, and they are called type II cell walls . They have a low content in pectins and xyloglucans, but a high content in mixed linked β-D-glucan [also named (1 → 3, 1 → 4)-β-D-glucans] during growth and in glucuronoarabinoxylans (GAXs). They also present ferulate and p-coumarate esters formed by attachment to the arabinosyl units of GAXs that are absent in gymnosperms, dicots, and other monocots. It has been assumed that plants devote more than 10% of their genome to the biogenesis of cell walls . The cell wall is a dynamic structure involved in several physiological processes such as: cell growth , defense against pathogens , or signaling . In sugarcane, the cell wall also plays a key role in the distribution of sucrose .
Displaying roles in cell growth, enzymes are part of the cell wall proteome. Glycosidases and glycanases have exo- and endo-GH activities, respectively, while trans-glycosidases and trans-glycanases perform exo- and endo-transglycosylation, respectively. Pectin methylesterases and pectin acetylesterases control the degree of homogalacturonan methylesterification and acetylation, respectively . Class III peroxidases (Prxs) can either form covalent bonds by oxidizing aromatic compounds such as monolignols or aromatic amino acids or produce reactive oxygen species that participate in non-enzymatic breakage of covalent bonds of polysaccharides . All these proteins belong to multigene families and their genes are differentially regulated during plant development and in response to environmental changes.
GHs are of special interest, since they can hydrolyze the glycosidic bonds between two carbohydrates or between a carbohydrate and a non-carbohydrate moiety, thus actively contributing to cell wall polymer rearrangements. In Arabidopsis thaliana, about 379 GHs have been identified, belonging to 29 different families, among which approximately 52% were predicted to be cell wall GHs. A great number of plant cell wall GH families have been identified so far in cell wall proteomes (see [18, 19]). Associations between structure and function can be predicted to point to candidate genes amenable to manipulation. Among others, GHs comprise β-glucosidases, β-galactosidases, β-mannosidases, β-glucuronidases, β-xylosidases, β-D-fucosidases, exo-β-1,4-glucanases, lactases, β-glycosidases, α-L-arabinofuranosidases, glucan 1,3- and 1,4-β-glucosidases, and β-amylases (for a complete repertoire, see ).
In this chapter, we describe cell wall GHs identified in the cell wall proteomes of sugarcane and Brachypodium distachyon, two monocots from the grass family. Sugarcane is already largely used for E1G production and could be used for E2G production, whereas B. distachyon is a model plant amenable to genetic transformation. To date, 49 and 114 GHs have been identified in sugarcane and B. distachyon cell wall proteomes, respectively (see ). Based on their amino acid sequences, we have made bioinformatic predictions of functional domains and phylogenic analyses. GH families possibly relevant for improving biomass transformation processes to E2G are highlighted.
2. Plant GH families
In plants, several strategies have been used in order to extract and identify CWPs with a high number of GHs, such as vacuum-infiltration protocol with saline solution and identification of proteins predicted by bioinformatics to be targeted to the secretory pathway [23, 24, 25]. In A. thaliana, around 200 GHs, belonging to 13 families, are assumed to be involved in polysaccharide modification and cell wall reorganization .
Conversely, in the monocot rice, GH17 is the one that presents the highest number of members, followed by GH28 . The cell wall particularities of dicots (e.g., A. thaliana) vs. monocots, among which grasses, are also reflected in the distribution of GH families. For example, the lower proportion of pectins in monocot cell walls relates to a lower number of polygalacturonases (GH28) . Thus, the range of GH families depends on plant species, and each of them has to be studied separately.
By bioinformatic analyses of amino acid sequences, it is possible to classify newly identified GHs into families. We have done it for the GHs identified in the sugarcane and the B. distachyon cell wall proteomes. However, assigning a GH to its family as defined in the CAZy database  does not necessarily provide a clear picture of its function, since proteins from a given GH family can display different roles. For example, GH1 family members can be involved in cell wall metabolism, lignification, signaling, or defense . Phylogenic analyses can help getting more information regarding the functions of specific plant GHs. Since monocots could be a major source of raw material for E2G production, it is interesting to study their cell wall metabolism and outline possible strategies to increase biomass production or lower cell wall recalcitrance to deconstruction . In addition, it might be possible to find interesting analogies by comparing the plant cell wall assembly/disassembly mechanisms to those of microorganisms.
3. GHs identified in sugarcane and B. distachyon cell wall proteomes
In sugarcane, 49 GHs have been identified in cell wall proteomes [28, 29]. They are distributed in 16 GH families. The GH3 family is the best represented (~20%), followed by GH17 (~16%), GH18 (~12%), and GH1 (~8%) (Figure 1). This distribution varies according to organ and developmental stage. In cell suspension cultures, only 4 GH families were identified, among which GH3 was the most populated. In 2-month-old stems, 7 GH families were found, GH3 also being the most represented. Few GHs were recovered from leaves; they belonged to families 19, 27 (young leaves only), 28, and 31 (young leaves only). Apical internodes mainly contained GH3 members, whereas mostly GH17 members were found in basal ones. Notably, the absence of some GH families from a given cell wall proteome could be due to technical limitations or to differential accessibility resulting from differences in cell wall structure.
In B. distachyon, 114 CW GHs were identified in cell wall proteomes [31, 32, 33, 34]. They are distributed into 21 families. The most populated one was GH17 (~17%), followed by GH28 (~13%), GH1 (~9%), GH3 (~8%), and GH35 (~8%) (Figure 1). GH28, followed by GH1, GH3, and GH16, had the highest number of members in young leaves. In mature leaves, GH17, GH18, and GH28 were the most represented. In internodes, they were GH28 and GH17. In seedlings and seeds, GH17 was the most populated family.
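The rounded percentages above can be turned back into approximate member counts. The following is a minimal sketch, assuming only the totals and shares quoted in the text; the function names are illustrative, and the resulting counts are estimates derived from rounded percentages, not values taken from the proteome datasets.

```python
# Totals and approximate family shares (in percent) as quoted in the text.
TOTALS = {"sugarcane": 49, "B. distachyon": 114}
SHARES = {
    "sugarcane": {"GH3": 20, "GH17": 16, "GH18": 12, "GH1": 8},
    "B. distachyon": {"GH17": 17, "GH28": 13, "GH1": 9, "GH3": 8, "GH35": 8},
}


def approximate_counts(species):
    """Convert rounded percentage shares into approximate member counts."""
    total = TOTALS[species]
    return {
        family: round(total * percent / 100)
        for family, percent in SHARES[species].items()
    }


def top_family(species):
    """Return the family with the largest reported share."""
    shares = SHARES[species]
    return max(shares, key=shares.get)


if __name__ == "__main__":
    for species in TOTALS:
        print(species, top_family(species), approximate_counts(species))
```

Under these assumptions, the estimated GH3 counts (about 10 in sugarcane and 9 in B. distachyon) agree with the GH3 numbers given in the section on GH1, GH3, and GH51.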
The large size of the GH1, GH17, and GH28 families is probably linked to their roles in the assembly and in the rearrangement of cell wall polysaccharides . Usually the GH1, GH16, GH17, and GH35 families are less represented in dicots than in monocots . GH17 display glucan-1,3-β-glucosidase activity and possible substrates could be mixed (1,3)(1,4)-β-D-glucans . This is consistent with the fact that only type II grass cell walls present this kind of hemicellulose.
After a survey of the cell wall proteomes described so far and collecting information regarding microorganism enzymes used for biomass deconstruction, we decided to focus this review on the GH1, GH3, GH17, GH27, GH35, and GH51 families. We have predicted functional and structural domains in newly identified CWPs using the PredictProtein bioinformatic software and grouping them in families . Since plant cells perform cell expansion themselves by involving cell wall polysaccharide rearrangements, the plant mechanisms could be mimicked by the enzymes used in cocktails. The comparison of plant and microorganisms enzymes presently used for biomass hydrolysis could contribute to determining their common characteristics and which specificities of plant enzymes could be copied in order to improve industrial cell wall deconstruction processes. Conversely, this comparative study could help in identifying which of the characteristics of microorganism enzymes could be engineered in plant species in order to obtain biomass with less recalcitrant cell walls.
4. GH1, GH3, and GH51
The GH1 family mainly comprises β-glucosidases, which are found in several organisms performing different functions. In plants, they are involved in cell wall catabolism, signaling, lignification, defense, symbiosis, and secondary metabolism. Putative β-glucosidase genes have been shown to be induced by biotic and abiotic stresses and they were considered critical for the success of plant development in stressful environments [36, 37, 38, 39, 40]. Accordingly, plants are the organisms that have the highest number of GH1s, e.g., 48 in A. thaliana  and 40 in Oryza sativa . Over the years, various β-glucosidases hydrolyzing cell wall oligosaccharides have been characterized mainly in monocots, such as in germinating seedlings of barley where they show preference for manno-oligosaccharides in endosperm cell walls , and in rice seedlings, where they hydrolyze different oligosaccharides . Extracellular β-glucosidases can also contribute to the production of toxic compounds, such as hydroxamic acids and cyanide [44, 45, 46]. For this process to occur correctly, the defense molecules are stored in non-active glucosylated forms in the vacuole, while the β-glucosidases are stored in the apoplast or in protein bodies in dicots or in plastids in monocots. The enzyme and its substrate get into contact when the cell is damaged during plant-microorganism interaction.
The plant cell wall is a large polysaccharide repository that contains a large amount of glucosyl residues. β-glucosidases play important roles in cell wall formation and plant development, because they participate in cell wall polysaccharide turnover. In sugarcane and B. distachyon, more GH1 were found in cell wall proteomes of growing organs, such as young leaves and apical internodes, than in mature organs. In addition, as suggested by bioinformatic predictions, several of the identified GH1 (e.g., SCCCCL3001B10.b, SCJFLR1017E03, SCEQLB1066E08, Bradi1g10940, Bradi1g10930, and Bradi2g59650) have a β-glucosidase activity (GO:0008422).
Ten GH3s have been identified in sugarcane [28, 29, 30] and nine GH3s have been identified in B. distachyon [31, 32, 34]. Half of the sugarcane GH3 are predicted to have a β-glucosidase activity (GO:0008422) (e.g., SCEZLB1007A09, SCEQLR1093F09, and SCQSLR1089A04). However, some GH3 are predicted to have xylosidase (e.g., Bradi5g23470) or α-L-arabinofuranosidase (AFase) activity (e.g., SCCCCL4009F05, SCCCSB1003H06, and Bradi3g59020).
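The accession-level predictions named for GH1 and GH3 can be collected into a small lookup structure. This is a minimal sketch, assuming that accessions starting with "Bradi" come from B. distachyon and all others from sugarcane (a naming-convention assumption, not something stated in the text). The two GO identifiers are the ones quoted in this chapter; no GO term is listed for the xylosidase prediction because none is given.

```python
from collections import defaultdict

# (accession, GH family, predicted activity) as named in the text.
PREDICTIONS = [
    ("SCCCCL3001B10.b", "GH1", "beta-glucosidase"),
    ("SCJFLR1017E03", "GH1", "beta-glucosidase"),
    ("SCEQLB1066E08", "GH1", "beta-glucosidase"),
    ("Bradi1g10940", "GH1", "beta-glucosidase"),
    ("Bradi1g10930", "GH1", "beta-glucosidase"),
    ("Bradi2g59650", "GH1", "beta-glucosidase"),
    ("SCEZLB1007A09", "GH3", "beta-glucosidase"),
    ("SCEQLR1093F09", "GH3", "beta-glucosidase"),
    ("SCQSLR1089A04", "GH3", "beta-glucosidase"),
    ("Bradi5g23470", "GH3", "xylosidase"),
    ("SCCCCL4009F05", "GH3", "alpha-L-arabinofuranosidase"),
    ("SCCCSB1003H06", "GH3", "alpha-L-arabinofuranosidase"),
    ("Bradi3g59020", "GH3", "alpha-L-arabinofuranosidase"),
]

# GO identifiers quoted in the chapter; other activities are left unmapped.
GO_TERMS = {
    "beta-glucosidase": "GO:0008422",
    "alpha-L-arabinofuranosidase": "GO:0046556",
}


def species_of(accession):
    """Guess the species from the accession prefix (assumed convention)."""
    return "B. distachyon" if accession.startswith("Bradi") else "sugarcane"


def group_by_activity(predictions):
    """Group accessions by (family, predicted activity)."""
    groups = defaultdict(list)
    for accession, family, activity in predictions:
        groups[(family, activity)].append(accession)
    return dict(groups)


if __name__ == "__main__":
    for (family, activity), accessions in group_by_activity(PREDICTIONS).items():
        go_term = GO_TERMS.get(activity, "not given")
        print(family, activity, go_term, accessions)
```

Keeping the family and the predicted activity as separate keys reflects the point made in this chapter that membership in a GH family does not by itself determine function.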
Fungal β-glucosidases can degrade cellulose together with other kinds of enzymes, such as endoglucanases and cellobiohydrolases. They release glucose molecules from cellobiose and are therefore used in enzymatic cocktails to produce cellulosic bioethanol. In barley, the structure of the GH3 β-D-glucan exohydrolase ExoI was determined by X-ray crystallography, showing a two-domain globular structure different from that of GH1. Besides the catalytic site, this enzyme has another binding site for (1 → 3, 1 → 4)-β-D-glucans, only identified in monocots. Xylan 1,4-β-D-xylosidases release xylose from xylo-oligosaccharides. These enzymes have several uses, such as in industrial processes related to bread dough, animal feed digestibility, and deinking of recycled paper. In enzymatic cocktails, they are considered the most efficient enzymes to break glycosidic bonds of hemicelluloses. The few GH51 members identified in sugarcane and B. distachyon were also assumed to be AFases (EC 3.2.1.55, GO:0046556). AFases catalyze the hydrolysis of terminal non-reducing α-L-arabinofuranoside residues in α-L-arabinosides. They act together with hemicellulases and pectinolytic enzymes to achieve hemicellulose and pectin hydrolysis. Several AFases used commercially belong to the GH51 family, generally originating from fungi. Such enzymes are of special interest for monocot biomass hydrolysis, since this material is particularly rich in arabinoxylans, which need to be degraded with AFases in addition to endoglucanases.
To compare plant and microorganism GH1 and GH3, we have performed phylogenic analyses. Some GH1 and GH3 identified in cell wall proteomes of sugarcane and B. distachyon have been selected and included in this analysis. For plants, we have chosen species at critical evolutionary nodes since terrestrialization: a moss (Physcomitrella patens), a species close to the common ancestor of higher plants (Amborella trichopoda), Sorghum bicolor as the closest plant to sugarcane having a sequenced genome, and two dicots (Lycopersicon esculentum and A. thaliana). Regarding microorganisms, we have retained GHs usually used in enzymatic cocktails for cell wall deconstruction. They belong to bacteria such as Bacillus licheniformis, or to fungi such as Aspergillus nidulans or Trichoderma reesei [52, 53]. As expected, the selected GH1 and GH3 share common ancestors. The two B. licheniformis GH1 root the GH1 tree (Figure 2). A. nidulans and T. reesei GH1 come next, prior to the P. patens GH1.
The tree is then split into two distinct clades, each containing either one or two closely related A. trichopoda GH1. Monocot and dicot GH1 are finally separated in sub-clades. Regarding the GH3, the situation is more complex (Figure 3). Two clades are separated at the base of the tree: clade A is rooted by three B. licheniformis GH3, followed by an A. nidulans GH3, whereas clade B is only rooted by an A. nidulans GH3. As for the GH1 tree, each sub-clade comprises an A. trichopoda GH3, whereas monocot and dicot GH3 form distinct groups. Similar results could be obtained for the other GH families. This phylogenic analysis shows the close relationships between microorganism and plant GH1 or GH3. Additional work is required to define precisely their specificities with the aim of generating new tools for industrial processes of biomass deconstruction.
5. GH17
GH17 are encoded by large gene families in plants. In O. sativa, GH17 is the largest GH family. In A. thaliana and Populus trichocarpa, it comprises 50 and 100 members, respectively. The GH17 family includes β-1,3-glucanase (glucan endo-1,3-β-D-glucosidase, EC 3.2.1.39), glucan 1,3-β-glucosidase (EC 3.2.1.58), licheninase (EC 3.2.1.73), and glucan 1,4-β-glucosidase (EC 3.2.1.74) activities.
β-1,3-glucanases have been shown to be important proteins involved in plant defense reactions against pathogens and are considered as pathogenesis-related proteins of the PR-2 family . Their role is hydrolysis of the β-1,3-glucan bonds, an important structural component of fungal cell walls, resulting in their destabilization and in the release of elicitors that further stimulate defense responses . This antifungal activity was shown both in vitro  and in vivo . In sugarcane, GH17 SCQSRT2031D12 identified in basal internodes was considered similar to the A. thaliana At4g16260 β-1,3-endoglucanase that has been associated with increased resistance to pathogen attack . Noteworthy, β-1,3-glucanases can accumulate in vacuoles of root cells or mature leaf cells in response to pathogen infection, whereas others are secreted to the extracellular space, but they can also be secreted in the absence of pathogen infection . They are, thus, also important during plant development, being involved in cell division, pollen development, seed germination, and maturation as well as in signaling [55, 59, 60].
According to phylogenic analyses, the GH17 family is divided into three distinct clades (denoted α, β, and γ) [61, 62], with 10% of its members having cell wall-related functions. GH17 of the α clade are more related to stem elongation and are also responsive to gibberellin, whereas those of the β and γ clades are more related to stress response and defense against pathogens [62, 63, 64, 65]. In addition to the GH17 domain per se, proteins of the GH17 family comprise other domains, as shown by a study of the β-1,3-glucanase sequences of A. thaliana. Its authors noted that all the sequences had a predicted N-terminal signal peptide linking them to the secretory pathway. Half of them had a C-terminal extension, first classified as an X8 domain. Later, the X8 domain was identified as the carbohydrate-binding module 43 (CBM43) responsible for the interaction with β-1,3-glucans. The other GH17 had either a C-terminal glycosylphosphatidylinositol (GPI)-anchor [66, 68] or a vacuolar targeting peptide. The absence or gain of these domains could be related to ancestral traits. All the γ clade members and more than half of the α clade members retained the CBM43 domain, whereas all the members of the β clade lost it through evolution. It is thought that the loss of this domain facilitates the extracellular secretion induced by biotic stresses, thus improving the response to pathogens [61, 62].
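The relationship between clade membership and domain content described above can be written as a simple consistency check. This is only a sketch under the stated pattern (signal peptide in all sequences, CBM43 retained in all γ members, lost in all β members, and present in only part of the α clade); the class, the function, and the example record are hypothetical and do not correspond to any real protein or published tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GH17Protein:
    name: str                 # hypothetical identifier, for illustration only
    clade: str                # "alpha", "beta", or "gamma"
    has_signal_peptide: bool  # predicted N-terminal signal peptide
    has_cbm43: bool           # C-terminal X8/CBM43 extension


def fits_reported_pattern(protein):
    """Check a record against the clade/domain pattern described in the text."""
    if protein.clade not in {"alpha", "beta", "gamma"}:
        raise ValueError(f"unknown clade: {protein.clade!r}")
    if not protein.has_signal_peptide:
        # All sequences were reported to carry a signal peptide.
        return False
    if protein.clade == "gamma":
        # All gamma clade members retained CBM43.
        return protein.has_cbm43
    if protein.clade == "beta":
        # All beta clade members lost CBM43.
        return not protein.has_cbm43
    # Alpha clade: CBM43 is present in only part of the members,
    # so either state is compatible.
    return True


if __name__ == "__main__":
    example = GH17Protein("example_beta", "beta", True, False)
    print(example.name, fits_reported_pattern(example))
```

Such a check could flag sequences whose predicted domains disagree with their clade assignment, which may point to either an annotation problem or a genuinely unusual protein.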
Other studies also revealed the antifungal effects of plant extracellular chitinases (GH18 and GH19) in combination with those of GH17. Indeed, fungal cell walls are composed of chitin and of branched β-(1,3):β-(1,6) glucans [57, 70, 71, 72, 73]. Accordingly, transgenic plants overexpressing a chitinase and/or a β-1,3-glucanase became less susceptible to fungal attack [74, 75].
6. GH27 and GH35
The GH27 identified in cell wall proteomes of both sugarcane and B. distachyon was predicted to have α-galactosidase activity (EC 3.2.1.22, GO:0004557). α-galactosidases break galactosidic linkages in galactose-containing oligosaccharides, galactolipids, and galactomannans. Since galactomannans are hemicelluloses, α-galactosidases could be used in enzymatic cocktails to enhance the cell wall hydrolysis process by acting as hemicellulases.
GH35 are mainly β-galactosidases (EC 3.2.1.23), but exo-1,4-β-D-glucosaminidases (EC 3.2.1.165) and exo-β-1,4-galactanases (EC 3.2.1.-) also belong to this family. β-galactosidases are found in microorganisms such as bacteria, fungi, and yeast, as well as in animals and plants. They catalyze the hydrolysis of terminal non-reducing β-D-galactose residues in different molecules, such as glycoproteins, oligosaccharides, glycolipids, and lactose.
GH35 can be distributed into two main groups according to their preferred substrates: those that hydrolyze pectic β-1,4-galactans and those that cleave the β-1,3- and β-1,6-galactosyl linkages of the O-glycans of arabinogalactan proteins. In plants, they are associated with secondary metabolism or polysaccharide degradation, performing important roles in physiological events, including cell wall degradation and expansion during plant development, and turnover of signaling molecules [79, 80, 81, 82, 83]. They were also shown to be involved in ripening and abscission of mango, papaya, and orange fruits [84, 85, 86]. The GH35 found in the cell wall proteomes of sugarcane and B. distachyon are predicted to have a β-galactosidase activity (GO:0004565). In B. distachyon, they were only identified in leaves and in seedlings, whereas in sugarcane they were mostly found in internodes. GH14 are very close to GH35 due to sequence similarity, perhaps playing similar roles, and they have only been identified in sugarcane internodes. Interestingly, the SCUTAM2089E05 GH14 was found to be differentially expressed in ancestral genotypes of sugarcane showing differential carbon allocation to lignin or sucrose.
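The glycosidase activities discussed in this chapter all fall under EC 3.2.1 (hydrolases acting on O- and S-glycosyl compounds). The sketch below collects the standard EC numbers for these activities and checks that each one parses as an EC 3.2.1 entry; the dictionary keys and the helper name are illustrative choices, and the table is a convenience summary rather than data from the proteome studies.

```python
import re

# Four dot-separated fields; the last may be "-" for an unassigned serial.
EC_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.(\d+|-)$")

# Standard EC numbers for activities named in this chapter.
ACTIVITY_EC = {
    "beta-glucosidase": "3.2.1.21",
    "alpha-galactosidase": "3.2.1.22",
    "beta-galactosidase": "3.2.1.23",
    "xylan 1,4-beta-xylosidase": "3.2.1.37",
    "glucan endo-1,3-beta-D-glucosidase": "3.2.1.39",
    "alpha-L-arabinofuranosidase": "3.2.1.55",
    "glucan 1,3-beta-glucosidase": "3.2.1.58",
    "licheninase": "3.2.1.73",
    "glucan 1,4-beta-glucosidase": "3.2.1.74",
    "exo-1,4-beta-D-glucosaminidase": "3.2.1.165",
}


def is_glycosidase_ec(ec_number):
    """Return True if the string is a well-formed EC 3.2.1.x number."""
    match = EC_PATTERN.match(ec_number)
    return bool(match) and match.group(1, 2, 3) == ("3", "2", "1")


if __name__ == "__main__":
    for activity, ec_number in ACTIVITY_EC.items():
        print(f"EC {ec_number}\t{activity}\t{is_glycosidase_ec(ec_number)}")
```

A format check of this kind is a cheap way to catch malformed identifiers before annotations are compared across databases.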
7. Concluding remarks
Microorganisms use an arsenal of GHs to degrade plant cell walls in order to establish themselves in their host. Plants are thought to use similar mechanisms to modify their own cell walls, since plant cell walls comprise several types of carbohydrates with a variety of structures and biological functions. For sugarcane biomass deconstruction, the first step proposed is the use of pectinases, such as endopolygalacturonases, AFases, and β-galactosidases, along with pectin methylesterases, to release pectins. Lichenases are used to hydrolyze mixed linked β-D-glucan. The remaining polymers, cellulose and hemicelluloses, would have to be treated with a mixture of enzymes such as endo-β-xylanases, α-arabinofuranosidases, xyloglucanases, α-xylosidases, and β-galactosidases. Finally, cellulose could be the substrate of endo-β-glucanases, cellobiohydrolases, and β-glucosidases.
Besides many studies focusing on microorganism enzymes to optimize E2G production, this work has evaluated the plant enzymes that are assumed to display similar activities. Since plant GHs perform cell wall breakage and expansion, a deeper investigation of their structure could be performed in order to produce more efficient chimeric enzymes to be used in enzymatic cocktails. It is difficult to establish GH functions from their amino acid sequences because proteins from the same GH family may have diverse substrates and roles . However, we were able to predict functions for the GHs identified in the cell wall proteomes of sugarcane and B. distachyon. Thus, the mentioned GH1, some GH3 and GH17 were predicted to have a β-glucosidase activity. Other GH3 had possible β-xylosidase and AFase activities, the latter also predicted for GH51. The GH27 and GH35 families were predicted to have α- and β-galactosidase activity, respectively. Nevertheless, it is crucial to mention that in order to precisely characterize the function of a given protein, one should perform biochemical analyses, involving purification, characterization of substrates, as well as genetic studies on mutants in well-characterized model plants.
Therefore, this work provides candidate target proteins that could be used in future research to make E2G production cheaper, and it allows a more detailed analysis of the cell wall proteomes of two grasses, sugarcane and B. distachyon.
The authors thank Dr. Roberto Ruller for the microorganism suggestions.