Exopolysaccharide Biosynthesis in Rhizobium leguminosarum: From Genes to Functions

© 2012 Ivashina and Ksenzenko, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Introduction
Gram-negative soil -proteobacteria belonging to the genera Allorhizobium, Azorhizobium, Bradyrhizobium, Mesorhizobium, Rhizobium, and Sinorhizobium (Ensifer) are able to infect the roots of their legume hosts in a host-specific way and induce the formation of specialized new plant organs -nodules, in which endosymbiotic bacteria reduce atmospheric nitrogen to ammonia.Rhizobium leguminosarum comprises two biovars, namely trifolii (nodulating Trifolium) and viciae (nodulating Pisum, Vicia, Lathyrus, and Lens) [1].Closely related to the R. leguminosarum is R. etli (formely R. leguminosarum bv.phaseoli) nodulating Phaseolus beans.To avoid confusions, we will use former names for R. leguminosarum bv.phaseoli strains as they were described in original papers.
The R. leguminosarum strains investigated so far synthesize different types of polysaccharides, including acidic exopolysaccharide (EPS), capsule polysaccharide (CPS), gel-forming polysaccharide (GPS), cellulose fibrils, galactomannan, lipopolysaccharide (LPS), and cyclic glycans [2]. The cyclic neutral β-(1,2)-glucans accumulate predominantly within the periplasmic space and play an essential role during hypoosmotic adaptation as well as during plant infection [3]. Glucomannan was shown to be important for lectin-mediated polar attachment to Vicia sativa and Pisum sativum root hairs and for competitive nodulation [4,5]. The CPS is tightly associated with the bacterial cell surface, forming a polysaccharide matrix that surrounds the bacteria [6]. Differences in noncarbohydrate substitutions, such as O-acetyl, pyruvate, and 3-hydroxybutyrate, may distinguish anionic capsule-bound polysaccharides from secreted EPS. In late-stationary-phase cultures, CPS was replaced by a polysaccharide with strong gel-forming properties and an unknown function [7]. The LPS is present in the outer leaflet of the outer membrane and consists of three parts: lipid A, the core polysaccharide, and the O-antigen polysaccharide [8]. A growing body of data indicates that LPS plays a specific role in the later stages of establishment of symbiosis [...] stoichiometric hydroxybutanoyl groups. The distribution pattern of O-acetyl and 3-hydroxybutanoyl groups may vary for some R. leguminosarum strains and was shown to depend on the growth phase of the bacteria and on the culture medium [27,32].

Table 1. Structure of R. leguminosarum and R. etli EPS repeating units. Strains: Rlt 4S [37], Rlv 248 [38], Rlp 127K44 [39], Rlp 127K38 [40], Rlp 127K87 [41]. Abbreviations: Glc, glucose; GlcA, glucuronic acid; Gal, galactose; Pyr, ketal pyruvate group.
However, several R. leguminosarum strains produce EPS with divergent side chains, though with identical backbones and the same β1-6-linked glucosyl residue branching the side chain. Side chains of these EPS may consist of three to seven sugar residues. Up to three Gal residues can be present, as in the R. leguminosarum bv. phaseoli (Rlp) 127K87 EPS, or the terminal Gal residue can be absent, as in the Rlt 4S EPS. In addition, in some cases the side chains of EPS contain GlcA residues (Rlp 127K44, Rlp 127K38, Rlp 127K87, Rlv 248). Besides 1-4 and 1-3 linkages, sugar residues can be attached by 1-6 or 1-6 glycosidic bonds (Rlp 127K38, Rlp 127K87).
The acidic nature of EPS is explained by the presence of uronic acids and negatively charged pyruvyl groups. Similar to other representatives of Rhizobiaceae, the R. leguminosarum strains synthesize EPS in high-molecular-weight (HMW) and low-molecular-weight (LMW) forms [13,42]. The latter were proposed to act as signaling factors during the development of symbiosis [14,16,17,42].

Organization of exopolysaccharide biosynthesis genes
According to the current view, the synthesis of heteropolysaccharides requires a complex pathway that starts with the synthesis of sugar nucleotide precursors and of the non-carbohydrate donors, followed by sequential assembly of the repeating unit on polyprenyl lipid carriers, its modification, polymerization, and export out of the cell [20,21,43].
We started the study of the genetic control of acidic exopolysaccharide biosynthesis with the isolation of non-mucoid Tn5-derived mutants in Rlv VF39. As a result, five non-slimy mutants (GL1-5) were obtained. The mutations were mapped within four separate chromosomal loci. The open reading frames (orfs) interrupted by insertion of the Tn5 transposon were named pss (polysaccharide synthesis) according to Borthakur and coworkers [44]. The Tn5 insertion in the GL4 mutant was localized within the pssA gene [45], the ortholog of which was previously identified in Rlp 8002 [44]. The mutations in GL2 and GL6 were mapped within the pssE and pssD genes, respectively. Their orthologs were found earlier in Rlt LPR5 [46]. Chromosomal walking around these genes in Rlv VF39 allowed us to identify a 15.5-kb multi-cistronic operon, which included a core set of genes needed for the assembly of the repeating unit of the EPS (pssEDCFGHIJS), its modification (pssKMR), polymerization (pssL), and processing (pssW) (Fig. 1) [47]. It should be mentioned that the pssV-E operon was found in all R. leguminosarum and R. etli genomes whose complete or partial sequences are now available. Moreover, nine of the fifteen genes from this operon have orthologs in all these genomes. At the same time, certain Rlv VF39 genes are absent from some other genomes, certain genes are replaced by non-orthologous genes, and some additional genes are also present (Fig. 1). We will discuss the functioning of all these genes below. Here we would like to consider only the problem of their names.
All fifteen genes from the pssV-E operon were named pss genes. In addition, the same gene name abbreviation was assigned to six genes (pssA, pssB, pssN, pssO, pssP, and pssT) localized in other operons. It is easy to see that only five letters of the alphabet are left that can be used with the "pss" body in the names of new genes involved in EPS biosynthesis. Meanwhile, in our opinion, new names already have to be assigned to eight genes from Rlt WSM2304, Re CFN42, Re CNPAF512, and Re CIAT 652. Therefore, we propose (i) to retain the existing names for all orthologous genes, and (ii) to introduce a new set of genes with the body name "psa" (polysaccharide repeating unit assembly). Our proposals concerning new names for certain genes involved in EPS biosynthesis are summarized in Table 2. The pssV-E operon is adjacent to a region comprising several operons that contain genes for the Type I secretion system (prsED), EPS processing (plyA), and EPS polymerization/export (pssTNOP). This whole chromosomal region is now known as the Pss-I gene cluster [48].
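The letter count in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch: the operon gene list is assembled from the functional groups named earlier in the text (pssEDCFGHIJS, pssKMR, pssL, pssW) plus pssV, which is inferred here from the operon name pssV-E, and the variable names are hypothetical.

```python
import string

# Suffix letters of the pssV-E operon genes, grouped as in the text.
# pssV is an assumption inferred from the operon name "pssV-E".
operon_suffixes = set("EDCFGHIJS") | set("KMR") | {"L", "W", "V"}

# Suffix letters of the pss genes located in other operons (from the text).
other_suffixes = set("ABNOPT")

used = operon_suffixes | other_suffixes
free = sorted(set(string.ascii_uppercase) - used)

# Number of operon genes, number of letters in use, letters still available.
print(len(operon_suffixes), len(used), free)
```

Under these assumptions the operon accounts for fifteen suffix letters and the six additional genes bring the total to twenty-one, which leaves five letters unused, in line with the count given in the text.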
The pssA gene, which controls the first step in the repeating unit assembly, is located approximately 90 kb from the Pss-I cluster. The gene was shown to be transcribed as a monocistronic mRNA [45]. Upstream of pssA is the pssB gene encoding inositol monophosphatase [49-51]. The effects of mutations within the pssB gene on the synthesis of EPS and on symbiotic behavior have been analyzed in the Rlv VF39 and Rlt TA1 backgrounds and have been shown to be contradictory. In Rlv VF39 the pssB mutants retained the ability to produce EPS in amounts equal to those of the wild-type strain. In Rlt TA1 the pssB inactivation led to an increased overall production of EPS compared with the wild-type strain, and to alterations in the LPS PAGE-banding pattern and the O-antigen sugar composition [51,52]. Nevertheless, pssB mutants of both strains elicited non-effective nodules on Vicia faba or Trifolium pratense roots, respectively [49,52].
In the case of the GL1 and GL3 mutants we have not carried out extended chromosomal walking around the Tn5 insertions; only short genome sequences flanking Tn5 were determined. We localized the Tn5 insertion in the GL3 mutant within a small orf (only 213 bp). The ortholog of this orf (RL2260) was found in the chromosome sequence of Rlv 3841, located far away from the Pss-I cluster. It encodes a 7.2 kDa positively charged protein (pI 10.8), which is conserved in numerous bacteria. As for the GL1 mutation, we have not yet been able to map it. Probably, this mutation targets a gene located in one of the Rlv VF39 strain-specific plasmids.
Taken together, these data allow one to conclude that the core set of EPS synthesis genes is clustered in the chromosomal Pss-I region. Clustering of genes involved in EPS biosynthesis is not unique to R. leguminosarum, but is widespread among polysaccharide-producing bacteria [53,54]. This type of gene organization could reflect their coordinated expression and tightly regulated control.
Evidently, EPS biosynthesis is linked with other metabolic pathways in the cell. Therefore, the localization of mutations affecting EPS production at a distance from the Pss-I region can reflect this linkage. For example, the pssA gene is located in another chromosomal region, probably because it is involved in the initiation not only of EPS synthesis, but also of the synthesis of other polysaccharides. Recently, several regulatory genes influencing EPS production (psiA, psrA, exoR, expR, rosR, praR) were found to be localized either in different regions of the chromosome or on the endogenous plasmids (reviewed in [24]). We cannot exclude that the mutations in GL3 and GL1 also target regulatory genes.

Synthesis of nucleotide-sugar precursors
According to the sugar composition of acidic EPS, biosynthesis of its repeating units requires the nucleotide sugars UDP-glucose, UDP-glucuronic acid, and UDP-galactose, which are formed by central carbon metabolism. Several genes involved in the synthesis of nucleotide-sugar precursors were identified in R. leguminosarum genomes. The exoB gene encodes UDP-glucose 4-epimerase, responsible for UDP-galactose production [55]. Mutations in this gene have a pleiotropic effect and influence the synthesis of different classes of galactose-containing polysaccharides, namely acidic EPS, glucomannan, lipopolysaccharide, and probably the galactose-rich gel-forming polysaccharide [5,55,56].
The exo5 gene encodes a UDP-glucose dehydrogenase responsible for the oxidation of UDP-glucose to UDP-glucuronic acid. A mutation in exo5 affects the production of extracellular acidic polysaccharide and capsular polysaccharide, both of which contain glucuronic acid residues in the backbone chain [57,58].
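The two conversions described above can be summarized as a small lookup. The following Python sketch is purely illustrative: the dictionary layout and the function name are hypothetical, and it assumes that each product has no alternative route of synthesis in the cell, which the text does not state.

```python
# Precursor-forming enzymes named in the text: gene -> (substrate, product).
PRECURSOR_GENES = {
    "exoB": ("UDP-glucose", "UDP-galactose"),        # UDP-glucose 4-epimerase
    "exo5": ("UDP-glucose", "UDP-glucuronic acid"),  # UDP-glucose dehydrogenase
}

# Nucleotide sugars required for the acidic EPS repeating unit (from the text).
EPS_PRECURSORS = {"UDP-glucose", "UDP-glucuronic acid", "UDP-galactose"}


def missing_precursors(mutated_gene):
    """Return the EPS precursors no longer formed when the gene is inactivated.

    Assumption: the product of each enzyme cannot be made by another route.
    """
    _, product = PRECURSOR_GENES[mutated_gene]
    return EPS_PRECURSORS & {product}


for gene in PRECURSOR_GENES:
    print(gene, "->", sorted(missing_precursors(gene)))
```

Under this simplification, an exoB mutation removes the galactose donor and an exo5 mutation removes the glucuronic acid donor, which matches the polysaccharide classes reported to be affected by each mutation.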

Assembly of the EPS repeating unit
As mentioned above, heteropolysaccharides are polymers consisting of identical repeating units, which can vary only in the distribution of modifying groups per monomer. Obviously, their assembly has to be stringently controlled. The biosynthetic pathways for a number of heteropolysaccharides have been elucidated [54,59]. It is evident from the data obtained that the unique structure of the repeating unit is governed by the specificity of the nonprocessive glycosyltransferase (GT) catalyzing each step of the biosynthesis. This specificity is likely based on the ability of a GT to recognize the sugar residue to be transferred, the acceptor, and the linkage to be formed. At present the complete pathway of EPS biosynthesis has not been determined for any of the R. leguminosarum biovars, and only some individual steps have been characterized. Nevertheless, a detailed consideration of even these fragmentary data, together with a comparative analysis of the available R. leguminosarum genome sequences, allowed us to predict the genetic control of all steps in the repeating unit assembly, at least for Rlv VF39 and closely related strains.
It was shown previously that the assembly of a repeating unit of R. leguminosarum EPS starts with the addition of a glucose residue to the lipid carrier [60]. Biochemical studies and complementation analysis provided strong evidence that this reaction is carried out by the pssA gene product [36,61]. The pssA gene encodes the integral inner membrane protein UDP-glucose:polyprenyl-phosphate glucosephosphotransferase, which belongs to a family of diverse bacterial sugar transferases. Members of this family catalyze the formation of a phosphodiester bond between polyprenol phosphate and hexose-1-phosphate, which is donated by nucleotide sugars. The pssA gene is highly conserved in R. leguminosarum biovars and R. etli [44-46,50]. The pssA mutants do not produce EPS and, as a consequence, are impaired in the normal development of nitrogen-fixing nodules on the appropriate plant hosts and in the formation of biofilms both in vitro and on root hairs [5,45,46,50]. Some contradictory results exist concerning the influence of pssA mutations on the synthesis of CPS, which has a structure similar to that of EPS in terms of glycosyl composition [5,30,62]. In the Rlt 5599 genetic background, pssA mutants still produce CPS at a level similar to that of the wild-type strain [46]. In contrast, both EPS and CPS are absent in the pssA mutant of Rlv 3841 [5], which indicates that the PssA protein might also be involved in the initiation of CPS synthesis. It has been shown that the expression of pssA depends on environmental factors such as phosphate and ammonium concentrations, and also on root exudates [63].
Bossio and co-workers have demonstrated that, subsequent to the addition of glucose to isoprenylpyrophosphate, two glucuronic acid residues are attached [64]. The attachment of the first GlcA is catalyzed by the pssE and pssD gene products [47,61]. This conclusion is based on the results of in vivo reciprocal complementation between the pssED and spsK genes of the Sphingomonas strain S88 and on in vitro sugar incorporation studies. Recently we directly confirmed the function of PssDE. The corresponding mRNAs were translated in a wheat-germ cell-free system in the presence of liposomes obtained from Rlv VF39 phospholipids. The resulting proteoliposomes were used as an enzyme source in experiments on PssDE specificity (Ivashina et al., unpublished).
Both PssD and PssE display similarity to GT family 28 (CAZy database, http://www.cazy.org/). Notably, the amino acid sequences of PssD and PssE are similar to the N-terminal and C-terminal halves of the glucuronosyl-(β1,4)-glucosyltransferase SpsK, respectively. This observation has led us to the conclusion that PssD and PssE represent two subunits of the same enzyme. The proposed catalytic domain is localized in the PssE subunit, a peripheral inner membrane protein, in contrast to PssD, which was shown to be an integral inner membrane protein. We have also found that the integration of PssE into a membrane or liposomes strongly depended on the presence of PssD and vice versa (Ivashina et al., unpublished). It is interesting to note that the same was observed in the case of the yeast proteins Alg13 and Alg14. It was found that Alg14 was needed for the correct positioning of Alg13 on the cytosolic face of the endoplasmic reticulum membrane, mediating the formation of the active UDP-N-acetylglucosamine transferase complex [65,66]. Mutations in the pssE and pssD genes fully abolished EPS production and, as a consequence, resulted in defects of nodule infection [67-69].
PssC belongs to GT family 2, which contains a variety of inverting glycosyltransferases (enzymes that form glycosidic bonds with stereochemistry opposite to that of the glycosyl donor) that utilize a diverse range of nucleotide-sugar donors and participate in the synthesis of various types of polysaccharides [70]. This GT was assigned by Pollock and coworkers as a glucuronosyl-(β-1,4)-glucuronosyltransferase, which catalyzes the attachment of the third sugar residue (GlcA) to the disaccharide (GlcA-β-1,4-Glc) lipid-linked intermediate with the formation of a β-1,4 glycosidic bond [61]. This conclusion was based on comparative data on the genetic control of the first three steps of R. leguminosarum EPS assembly and of sphingan assembly in the Sphingomonas strain S88. Obviously, these data cannot be considered direct evidence, and the assignment of PssC needs additional experimental proof.
Several pssC mutations have been characterized to date in the backgrounds of various R. leguminosarum biovars [46,47]. All of them were mapped at the N-terminus of PssC and resulted in a decreased amount (27-38%) of EPS in culture supernatants. However, structural analysis of the EPSs secreted by these mutants showed them to be identical to those of the wild-type strains [46,47]. We proposed that translation of pssC could be initiated from a second potential start codon, GTG, located downstream of the mutation sites, leading to the synthesis of a protein that retains enzymatic activity. In fact, Western blot analysis with antibodies against PssC demonstrated the synthesis of a truncated protein in the pssC mutant (Ivashina et al., unpublished). Attempts to introduce mutations into the central part of pssC in the Rlv VF39 background were unsuccessful. However, it was easy to homogenotize a pssC mutation in the same strain carrying a mutation in the pssD gene, which fails to produce EPS. These results pointed to a detrimental effect of such mutations, most probably due to the accumulation of lipid-linked intermediates in the cytoplasmic membrane and, as a result, the inability of the lipid carrier to be released for other essential cellular functions. The data obtained with the use of different genetic systems led to the same conclusions [61].
The PssJ protein is the last glycosyltransferase whose biochemical function has been ascertained experimentally. The Rlt RBL5515 strain carrying a mutation in the pssJ gene (known as exo344::Tn5) synthesizes residual amounts of EPS, the repeating unit of which lacks the terminal galactose of the side chain. On the basis of the structural features of the polysaccharides synthesized and the results of an analysis of the enzyme activities involved, it was hypothesized that the galactosyltransferase catalyzing the formation of the 1-3 linkage between the sub-terminal (Glc) and terminal (Gal) sugar residues in the octasaccharide unit is affected in this strain [6]. PssJ has no detectable homologs in protein databases and can therefore be assigned to the family of "not-classified glycosyltransferases" (CAZy database).
It can be seen from the EPS repeating unit structure that the third (GlcA) and the fourth (Glc) sugar residues in the backbone chain are linked by the 1-4 glycosidic bond (Table 1). PssS is the only enzyme that can be responsible for this reaction. According to homology search data, PssS was assigned to GT family 1 (CAZy database), which comprises the retaining glycosyltransferases forming glycosidic bonds with stereochemistry identical to that of the glycosyl donor. These enzymes were shown to be involved in the biosynthesis of exopolysaccharide, lipopolysaccharide, and the slime polysaccharide colanic acid. We were unable to disrupt the pssS gene in the wild-type strain Rlv VF39, but easily inactivated it in the pssD mutant (Eps-). It is likely that in this case the inactivation of pssS also leads to the accumulation of toxic lipid-linked intermediates, as was proposed for pssC mutants.
In summary, specific GTs were assigned to the assembly of the backbone chain of the octasaccharide unit, as well as to the attachment of the terminal Gal in the side chain. It should be emphasized that in all known R. leguminosarum EPS structures the backbone chains are identical (Table 1). At the same time, orthologs of the pssAEDCS genes are present in the genomes of R. leguminosarum strains that synthesize similar EPSs (Fig. 1). Therefore, we can conclude that the prediction has been made correctly. The same is true for PssJ, which catalyzes the attachment of the terminal Gal residue: in the R. leguminosarum genomes where pssJ has not been found, pssK, which has been predicted to modify this Gal residue (see below), is also missing.
As mentioned above, R. leguminosarum and R. etli can produce EPSs with side chains varying in their length, sugar composition, and type of glycosidic linkages. It should be noted that the set of GT genes in Pss-I clusters can also vary. Studies on the genetic control of EPS biosynthesis are considerably impeded by the fact that, when genome sequences are available, nothing is known about the structure of the synthesized EPS, and vice versa. Rlv VF39 and Rlt TA1 represent the only pair for which (i) the structures of the EPSs and the sequences of the Pss-I clusters have been determined; (ii) both strains produce structurally identical EPSs but differ in their sets of GT genes; (iii) data on mutational analysis of Rlv VF39 GT genes have been obtained. Taking all these considerations into account, we have picked the Rlv VF39/Rlt TA1 pair for prediction of the pathway of side chain biosynthesis. In this case the question arises which glycosyltransferase initiates branching by attachment of the Glc residue via the β1-6 bond, and which GT(s) is (are) responsible for the attachment of the two subsequent Glc residues by formation of the β1-4 glycosidic linkage. It is obvious that in Rlt TA1 only two GTs (PssF and PssI) can perform these functions. In addition to PssF and PssI, two other GTs (PssH and PssG) can participate in the side chain assembly in Rlv VF39.
We introduced mutations into all four genes (pssFGHI) in Rlv VF39 and found that the structures of the EPSs of the mutant strains were identical to that of the parental strain, and only the level of acidic EPS production decreased. Based on these results, we can conclude that the GTs considered in this system can substitute for one another.
In our opinion, PssF is the best candidate for the GT that catalyzes the attachment of the Glc residue by formation of the β1-6 bond. Firstly, in all Rhizobium leguminosarum strains the EPS side chain starts with the Glc residue attached to the backbone chain via the β1-6 bond. At the same time, PssF is present in all Pss-I clusters sequenced up to now. Secondly, PssI, PssH, and PssG show a rather high level of similarity to each other, especially in the N-terminal parts of their amino acid sequences, where the catalytic domains are located. In contrast, PssF shows practically no homology to these three GTs, but its N-terminal half shows weak yet reliable homology to GTs that attach the Glc residue via the β1-6 bond (e.g., ExoO from S. meliloti [20]).
If our prediction of the PssF function is correct, the attachment of the two subsequent Glc residues could be achieved by a single GT (PssI) in Rlt TA1, and as many as three GTs (PssI, PssH, and PssG) could participate in this process in Rlv VF39. Apparently, PssI in the Rlt TA1 strain is to a certain extent tolerant of the acceptor structure, and the identity of the EPS repeating units is probably attained through the high specificity of PssJ, which catalyzes the last step of the EPS assembly.
It seems that in Rlv VF39 the subsequent attachment of Glc residues is achieved by two separate GTs, namely PssI and PssH. This assumption is based on a comparison of the amino acid sequences of these homologous GTs. A rather low level of similarity was observed between their C-terminal parts, which contain the putative acceptor recognition domain. In contrast, PssG shows a very high level of homology to PssI over its entire amino acid sequence (more than 80% similarity). It is plausible to assume that PssI and PssG are isoenzymes, which handle the same step in the EPS assembly. Thus, the genetic control of the repeating unit biosynthesis in Rlv VF39 resembles that of S. meliloti, where the attachment of the sugar residue at each step of biosynthesis is catalyzed by a specific GT, and even two GTs can participate in catalysis at some steps of the pathway.
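The predicted assignments discussed in this section can be collected into a small data structure. The Python sketch below is an illustration only: the field names, the evidence labels, and the function names are hypothetical; Greek letters lost from some linkages in the text are left unspecified; and the order in which PssI/PssG and PssH act in Rlv VF39 is an assumption, since the text does not state it.

```python
from collections import Counter, namedtuple

# One glycosyl transfer step: sugar added, linkage formed, candidate GT(s),
# and the kind of support described in the text (labels are illustrative).
Step = namedtuple("Step", "sugar linkage enzymes evidence")

BACKBONE = [
    Step("Glc", "to lipid carrier", ("PssA",), "experimental"),
    Step("GlcA", "beta1-4", ("PssD", "PssE"), "experimental"),
    Step("GlcA", "beta1-4", ("PssC",), "comparative"),
    Step("Glc", "1-4", ("PssS",), "predicted"),
]


def side_chain(strain):
    """Predicted side-chain steps for the two strains compared in the text."""
    if strain == "Rlt TA1":
        elongation = [("PssI",), ("PssI",)]
    elif strain == "Rlv VF39":
        # Assumption: PssI/PssG act before PssH; the text gives no order.
        elongation = [("PssI", "PssG"), ("PssH",)]
    else:
        raise ValueError("no prediction available for " + strain)
    steps = [Step("Glc", "beta1-6", ("PssF",), "predicted")]
    steps += [Step("Glc", "beta1-4", gts, "predicted") for gts in elongation]
    steps.append(Step("Gal", "1-3", ("PssJ",), "experimental"))
    return steps


def repeating_unit(strain):
    """Backbone plus side chain, in the predicted order of assembly."""
    return BACKBONE + side_chain(strain)


for number, step in enumerate(repeating_unit("Rlv VF39"), start=1):
    print(number, step.sugar, step.linkage, "/".join(step.enzymes), step.evidence)

print(Counter(step.sugar for step in repeating_unit("Rlt TA1")))
```

For both strains the sketch yields eight transfer steps, consistent with the octasaccharide repeating unit, and it makes explicit that the two strains are predicted to differ only in the GTs used for the two β1-4 glucosylation steps of the side chain.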
A presumptive circuit of the EPS repeating unit assembly in Rlt TA1 and Rlv VF39 is presented in

Genes involved in modification of EPS
Three genes, pssR, pssM, and pssK, were identified within the Pss-I cluster that may be involved in the modification of EPS in all R. leguminosarum strains as well as in R. etli CFN42 (Fig. 1). A homology search for the pssR gene product revealed similarity of its C-terminal region (amino acids 87-136) to the corresponding parts of a large number of acetyltransferases, including the well-characterized rhizobial NodL. Notably, pssR orthologs were found in all Pss-I clusters (Fig. 1). This observation is in agreement with the data concerning the major site of O-acetylation, which is localized at the second GlcA residue in the backbone chains that are identical in all EPSs of known structure.
Insertional inactivation of pssR in the Rlv VF39 genome does not result in a complete absence of acetyl groups in EPS. This suggests the existence of other gene(s) elsewhere in the Rlv VF39 genome needed for EPS acetylation. The decreased level of acetylation has no effect on nodule development and nitrogen fixation. Similar data were obtained for S. meliloti exoZ mutants, which failed to acetylate succinoglycan. It was shown that the acetyl decoration of succinoglycan is not absolutely required for nodule formation; however, it increased the efficiency of infection thread initiation [79,80].
The amino acid sequence of PssM shares homology with several known and putative ketal pyruvate transferases, including ExoV from S. meliloti and GumL from Xanthomonas campestris. Knock-out of the pssM gene does not result in the loss of the ability to produce HMW EPS, but leads to the absence of the pyruvic acid ketal group at the subterminal glucose in the repeating unit of EPS, as was shown by 13C and 1H NMR analyses. Complementation in trans restored the EPS modification in the pssM mutant [81]. Disruption of the pssM gene led to substantial disturbances in symbiosis: the pssM mutation resulted in the formation of aberrant non-nitrogen-fixing nodules on peas. Ultrastructural studies of mutant nodules indicated that infection thread formation, the release of bacteria into the plant cell cytoplasm, and the early steps of bacteroid differentiation were not affected. However, further stages in symbiosome development and maintenance were arrested. We proposed that the induction of early senescence of symbiosomes depends on a failure in recognition mechanisms and, importantly, that recognition of a micro-symbiont by the host plant is important not only at early stages of symbiosis, but also during its intracellular period of life [81]. Moreover, the accumulation of very large starch granules observed in infected and noninfected cells suggests that the plant-derived photosynthates, which serve as an energy source for nitrogen fixation [82], are not fully consumed in pssM-induced nodules. The mechanisms which shift the "symbiotic" nodule towards starch accumulation may include alterations in starch phosphorylase activity and (or) its expression [83].
Our finding that a mutation in pssM abolishes pyruvylation of only one of the two sugar residues in Rlv VF39 EPS suggests that pyruvylation of the terminal galactose may be controlled by the pssK gene localized within the Pss-I cluster. The PssK amino acid sequence is similar to proteins containing the pyruvyltransferase domain IPR007345, including Pvg1p from Schizosaccharomyces pombe, YveS, YvfF, and YxaB from Bacillus subtilis [84], and EpsL from Streptococcus thermophilus [85]. It was shown that Pvg1p catalyzes the transfer of the pyruvyl group to Galβ1,3-residues in N-linked galactomannan chains [86]. Interestingly, no sequence homology was observed between the PssM and PssK proteins, which may reflect different substrate specificities of these enzymes. No direct evidence for the pssK function has been obtained in any of the R. leguminosarum strains. Our preliminary data indicate that knock-out of pssK abolishes EPS synthesis and results in a non-slimy colony phenotype (Ivashina et al., unpublished). It is possible that pyruvyl modification of the terminal sugar residue is necessary for the efficient polymerization or export of EPS, as was proposed for S. meliloti ExoV, which is involved in the pyruvylation of succinoglycan [19,21].
As seen from Figure 1, in Re CNPAF512 and Re CIAT 652 the pssK gene is absent and pssM is replaced by the non-orthologous psaG and psaF genes, respectively. The latter genes presumably also encode ketal pyruvate transferases, since the IPR007345 domain (Polysacch_pyruvyl_Trfase) was found in their amino acid sequences. Unfortunately, the EPS structure of both R. etli strains remains unknown, but one can suppose that, at least in the side chains, it differs from that of Rlv VF39. This assumption is based on the observation that the sets of GTs in their Pss-I clusters differ from that of Rlv VF39. The ketal pyruvyl transferase genes differ as well. Therefore, this finding additionally argues for a high substrate specificity of these modifying enzymes.

Polymerization and secretion of EPS
At present three pathways are known for the export of carbohydrate polymers in bacteria: (i) Wzx/Wzy-dependent; (ii) ATP-binding cassette (ABC) transporter-dependent; and (iii) synthase-dependent (reviewed in detail by [87,88]). In the Wzx/Wzy-dependent mode, individual undecaprenol diphosphate-linked polysaccharide repeating units are assembled and translocated across the cytoplasmic membrane by a transport process requiring a Wzx protein (putative translocase or "flippase"), followed by their polymerization in the periplasmic space by the Wzy protein [87,89]. Further export of polysaccharides from the periplasm to the cell surface has been shown to depend on additional protein(s) assigned to the polysaccharide co-polymerase (PCP) and the outer membrane polysaccharide export (OPX; formerly OMA) families [87,89]. The best characterized member of the OPX family is the E. coli K30 outer membrane lipoprotein Wza, which forms a multimeric 'secretin-like' structure mediating translocation of the group 1 capsular polysaccharide across the outer membrane. The high-resolution crystal structure of Wza has been determined, and this shed light on CPS traffic across the outer membrane [90]. It has been postulated that the Wza protein together with the co-polymerase Wzc forms a molecular scaffold that spans the cell envelope and promotes the export of CPS (reviewed by [87]).
Current data indicate that polymerization and secretion of acidic EPS in R. leguminosarum biovars might proceed in a Wzx/Wzy-dependent manner. This supposition is based on the structural similarity of the R. leguminosarum proteins PssTNOP and PssL to enzymes involved in CPS/EPS biosynthesis. The main data elucidating the role of these proteins in EPS biosynthesis were obtained in the group of A. Skorupska.
The precise function of PssL has not been determined because of the inability to knock out the pssL gene in the Rlt TA1 strain. However, the amino acid sequence similarity and the hypothetical protein secondary structure allow the PssL protein to be placed among the Wzx-like translocases that belong to the polysaccharide specific transport (PST) family. The predicted secondary structure of the Rlt TA1 PssL inner membrane protein has been supported experimentally with a series of PssL-PhoA and PssL-LacZ translational fusions. The results obtained clearly show that PssL displays the characteristic features of members of the PST protein family, which comprises transporters with 12 membrane-spanning segments, a large cytoplasmic domain located between the sixth and seventh transmembrane segments, and amino and carboxyl termini located in the cytoplasm [91].
In addition to pssL, four closely linked pssTNOP genes were identified in the Pss-I cluster of various representatives of R. leguminosarum (Fig. 1) and proposed to be involved in the polymerization and export of EPS [42,92,93]. The PssT protein has been predicted to be a Wzy-like protein that, together with PssL, might be responsible for Wzx/Wzy-like-dependent EPS polymerization and translocation. This conclusion is based on the structural homology of PssT with inner membrane proteins belonging to the PST family of proteins that are involved in the transport of complex polysaccharides [42]. PssT consists of 12 transmembrane helices, a large periplasmic loop between the ninth and tenth transmembrane segments, and cytoplasmic N- and C-termini. The predicted topology of PssT has been confirmed with the use of a series of PssT-PhoA fusion proteins and a complementary set of PssT-LacZ fusions. The role of PssT in EPS biosynthesis has been investigated further by plasmid integration mutagenesis. The Rlt TA1 pssT mutant lacking the C-terminal part of PssT (starting after the 363rd amino acid, located in the periplasmic loop) produced increased amounts of total EPS with an altered distribution of high- and low-molecular-weight forms in comparison to the wild-type strain [42]. PssT was structurally and functionally homologous to S. meliloti ExoT, which together with the ExoP and ExoQ proteins is involved in the final stages of succinoglycan biosynthesis [21,94].
The PssP protein shares significant structural features with members of the polysaccharide copolymerase (PCP2a) family that are involved in the synthesis of high-molecular-weight CPS/EPS, including the well-characterized ExoP protein from S. meliloti and the Wzc protein from E. coli [95][96][97][98]. The membrane topology of PssP resembles that of ExoP. Both proteins consist of a periplasmic hydrophilic N-terminal domain flanked by two potential transmembrane helices and a cytoplasmic C-terminal domain. The C-terminus contains the conserved Walker motifs A and B for ATP binding. Coiled-coil regions characteristic of PCP2a members were found in both the periplasmic and the cytoplasmic C-terminal domains of PssP [97]. ExoP has been shown to be an autophosphorylating protein tyrosine kinase. Site-directed mutagenesis of specific tyrosine residues in the cytoplasmic domain of ExoP has been demonstrated to alter the ratio of LMW to HMW succinoglycan [98]. It has been hypothesized that the phosphorylation state of ExoP might regulate the degree of succinoglycan polymerization by controlling the polymerization activities of other proteins, e.g., ExoQ and ExoT [94]. A putative site for tyrosine phosphorylation has been found in the PssP protein; however, the functional significance of this site for phosphorylation of PssP is still unknown. Unlike ExoP, PssP has no tyrosine-rich region at its C-terminus.
Several mutations have been introduced into the Rlt TA1 pssP gene and shown to have different effects. The Rlt TA1 mutant with a deletion of the entire pssP coding region is deficient in EPS production. A mutant that synthesizes a functional N-terminal periplasmic domain but lacks the C-terminal part of PssP produces significantly reduced amounts of EPS with a slightly changed ratio of low- to high-molecular-weight forms. A pssP mutant with a disrupted 5'-end of the gene synthesizes exclusively low-molecular-weight EPS, suggesting that a functional N-terminal domain is important for the degree of polymerization [99].
The pssN gene encodes a protein that is homologous to the outer membrane polysaccharide export (OPX) protein family involved in CPS/EPS export [96,99]. Like other members of the OPX family, PssN contains a conserved signal peptidase II cleavage site in the lipobox. With the use of pssN-phoA and pssN-lacZ gene fusions and in vivo acylation with [³H]palmitate, it has been shown that PssN is a lipoprotein associated with the outer membrane, with the N-terminal signal sequence directed to the periplasm. Several experimental approaches (indirect immunofluorescence with anti-PssN and fluorescein isothiocyanate-conjugated antibodies, and protease digestion of spheroplasts and intact cells of Rlt TA1) indicated that PssN is not exposed to the surface but is oriented towards the periplasmic space. Investigation of the secondary structure of the purified PssN-His6 protein by Fourier transform infrared spectroscopy (FTIR) revealed the predominant presence of beta-structure; however, alpha-helices, which could be involved in association with murein and/or other proteins, were also detected. Similar to OPX proteins, PssN has been shown to exist in a homo-oligomeric form of at least two monomers, suggesting that together with PssP it might be involved in the formation of efflux channels for EPS export. No pssN mutants have been obtained so far. However, an increased amount of the PssN protein in Rlt TA1 correlated with a moderate enhancement of EPS production [92,99].
It was hypothesized by Mazur and co-workers [42] that the PssT protein, acting in complex with PssP and PssN, could be involved in controlling the rate of polymerization of repeating units and the export of EPS to the cell surface. PssN could interact with the periplasmic loop of the PssP protein, whereas the transmembrane regions of PssP could associate with the corresponding PST transporter, facilitating polymer export across the bilayer structure.
The pssO gene product shows no homology with known bacterial proteins. However, its participation in EPS biosynthesis has been confirmed by mutagenesis analysis: deletion of pssO in Rlt TA1 abolished EPS production, and overproduction of PssO increased EPS secretion. Subcellular fractionation, pssO-phoA and pssO-lacZ translational fusion analyses, and immunolocalisation of PssO on the Rlt TA1 cell surface by electron microscopy demonstrated that PssO is secreted to the extracellular medium and remains attached to the cell. The secondary structure of PssO-His6, as determined by FTIR spectroscopy, is rich in α-helices (32%) [100]. It was speculated by Marczak and co-workers that PssO may function as a periplasmic "chaperone" coating the EPS polymer and protecting it from the action of glycanases, and/or be co-transported with the polysaccharide through a channel formed in the outer membrane. However, the authors cannot exclude that PssO forms some kind of cell surface structure essential for the assembly and stability of the EPS transporter complex [100].
Using plasmid-borne transcriptional fusions of pss gene promoters with the reporter gene lacZ, the effect of root exudate, phosphate, and ammonia on the expression of the pssT, pssN, pssO, and pssP genes was examined in the wild-type Rlt TA1 background. A stimulating effect of these environmental factors on pssO and pssP was observed. Interestingly, a divergent nod-box element was found within the putative pssO promoter. The pssO promoter was slightly inducible in a flavonoid-dependent manner in the wild-type strains Rlt TA1 and Rlt 843, and very weakly inducible in a mutant of Rlt 843 that lacks the regulatory nodD gene. The regulation of EPS production by NodD might be an important finding that connects EPS synthesis to the symbiosis of R. leguminosarum with clovers [101].
The pssTNOP genes from Rlt TA1 have corresponding orthologs in the genomes of R. leguminosarum and R. etli (Fig. 1), suggesting a common mechanism of action at least in these strains. The pssL gene is less conserved: in Re CNPAF512 and Re CIAT 652 it is replaced by a non-homologous gene designated psaI. However, the PsaI protein can be assigned to the same family, IPR002797 (Polysacc_synth), as PssL. As mentioned above, these are exactly the strains for which EPS side chains were predicted to have a different structure. We suggest that it is PssL/PsaI that is specific for the structure of the EPS to be translocated across the inner membrane to the periplasmic space. The main function of PssL-like proteins probably consists in the stringent control of the identity of the repeating units, which are then polymerized by the action of PssT.
Notably, another gene, designated psaA, was found in the pssV-E operons of the R. etli strains described here. It encodes a protein that can be assigned to the O-antigen ligase-like protein family PF13425. The function of this protein in EPS biosynthesis remains unclear; however, its participation in translocation or polymerization of the EPS repeating units cannot be excluded. It should be noted that psaA homologs were found in all R. leguminosarum genomes under consideration, but they are located in different chromosomal regions far from the Pss-I cluster.

Processing of EPS
It is well documented that several rhizobial species (e.g., S. meliloti, R. leguminosarum, Rhizobium sp. NGR234, Bradyrhizobium) produce two distinct EPS classes that differ in size: HMW forms and symbiotically active LMW forms [13][14][15][16][17]27,102]. LMW forms of succinoglycan can be produced in S. meliloti either by direct export of the low-polymerized octasaccharide repeating units [95,103] or by depolymerization of HMW polysaccharide by specific glycanases [104,105]. These observations also hold for the R. leguminosarum model. As discussed above, mutations in the genes controlling the polymerization and transport of EPS in the Rlt TA1 strain could contribute to the production of LMW EPS [93].
Moreover, three gene products have been shown to participate in the degradation of HMW EPS in R. leguminosarum: PlyA, PlyB, and PssW (formerly PssT in Rlv VF39 and PssT1 in Re CFN42). PlyA and PlyB, which are similar to each other, display homology to bacterial and fungal polysaccharide lyases [106,107]. Ten copies of a novel heptapeptide repeat motif were found in the sequences of these proteins; they may constitute a fold similar to that found in the family of extracellular pectate lyases. These proteins are secreted via the PrsDE Type I secretion system, which is conserved in different Rhizobium species [108]. PlyA appears to remain attached to the cells, while PlyB diffuses beyond the edge of the colony [109]. It has been proposed that the presence of an extra 50 amino acids near the C-terminal domain of PlyA could be responsible for keeping the protein attached to the cell surface. Both proteins were inactive in EPS-defective mutants and did not degrade mature EPS. They may be active only in association with the rhizobial cell surface, suggesting the activation of PlyB by an EPS-related component (nascent EPS or an intermediate in EPS biosynthesis) [109]. The PlyA and PlyB glycanases are not specific for EPS but can also degrade carboxymethyl cellulose (CMC). In cultured bacteria the plyA gene is expressed at a very low level, while a plyB mutant shows a very large reduction in degradation of EPS and CMC [106]. Cultures of plyB mutants contained an increased ratio of EPS repeating units to reducing ends, indicating that EPS was present in a longer-chain form, and this correlated with a significant increase in culture viscosity. A double plyAB mutant retained residual CMC degradation, indicating the existence of additional activities in the cell. Recently, a third gene, named plyC, has been found in the genome of Rlv 3841. PlyC is secreted via the PrsDE secretion system and shares structural features with the PlyA and PlyB proteins [107], indicating that it may perform a similar function. Analysis of the symbiotic properties of a plyAB double mutant revealed that the genes involved are not required for symbiotic nitrogen fixation and that nodulation was not significantly affected [109]. In Rlt TA1 the plyA gene is missing.
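The chain-length argument used for the plyB mutants can be made concrete with a minimal sketch. Assuming that each EPS chain carries exactly one reducing end, the ratio of repeating units to reducing ends approximates the average number of repeating units per chain. The function name and all numbers below are hypothetical illustrations, not data from [106].

```python
def average_chain_length(repeating_units: float, reducing_ends: float) -> float:
    """Estimate the mean number of repeating units per EPS chain.

    Assumes one reducing end per chain; both quantities must be given
    in the same molar units.
    """
    if repeating_units < 0:
        raise ValueError("repeating_units must be non-negative")
    if reducing_ends <= 0:
        raise ValueError("reducing_ends must be positive")
    return repeating_units / reducing_ends


# Hypothetical molar amounts, chosen only to illustrate the comparison:
# the same amount of repeating units, but fewer chain ends in the mutant.
wild_type = average_chain_length(repeating_units=1000.0, reducing_ends=50.0)
plyb_mutant = average_chain_length(repeating_units=1000.0, reducing_ends=10.0)
print(wild_type, plyb_mutant)
```

With these illustrative inputs the mutant value is larger than the wild-type value, which is the pattern described for plyB cultures: fewer reducing ends per repeating unit means longer chains.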
Recently we have characterized another glycosylhydrolase, encoded by the pssW gene, and provided experimental evidence for its participation in EPS processing (Kanapina et al., unpublished). The pssW gene has counterparts in R. leguminosarum and R. etli genomes and is located within the pssV-E operon. The PssW protein was assigned to family 10 of glycosylhydrolases of the GH-A clan, which are retaining glycoside hydrolases displaying endo-1,3-β-xylanase and endo-1,4-β-xylanase activities (CAZy database). We have shown that PssW is synthesized as a precursor of 42.4 kDa, which is then translocated across the cytoplasmic membrane with cleavage of the 46-amino-acid signal peptide. The periplasmic localization of PssW indicates that it might act on the nascent EPS before its secretion outside the cell. Deletion of the pssW gene resulted in an approximately 3-fold increase in the ratio of HMW to LMW EPS and, as a result, in increased viscosity of the culture supernatant. Complementation of the pssW mutation restored the wild-type phenotype and even increased the level of secreted LMW EPS. PssW purified from the periplasmic space did not degrade CMC or succinoglycan, and showed a 2-fold decrease in hydrolysis of EPS from the Rlv VF39 pssM mutant, which lacks one of the two pyruvyl groups. The latter suggests that the pyruvyl modification is important for the degradation activity of PssW. The knock-out of pssW did not significantly affect nodulation of peas, probably because the block in LMW EPS synthesis is incomplete.
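As a back-of-the-envelope illustration of the precursor processing described above, the mass of the mature protein can be approximated by subtracting the mass of the cleaved signal peptide from the precursor mass. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value derived from the PssW sequence; the function name is hypothetical, and the result is an estimate, not a measurement.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption,
# not computed from the actual PssW signal peptide sequence).
AVERAGE_RESIDUE_MASS_KDA = 0.110


def estimate_mature_mass_kda(
    precursor_kda: float,
    signal_peptide_length: int,
    residue_mass_kda: float = AVERAGE_RESIDUE_MASS_KDA,
) -> float:
    """Approximate the mass of a protein after signal peptide cleavage."""
    if precursor_kda <= 0 or signal_peptide_length < 0 or residue_mass_kda <= 0:
        raise ValueError("inputs must be positive (length may be zero)")
    removed = signal_peptide_length * residue_mass_kda
    if removed >= precursor_kda:
        raise ValueError("signal peptide cannot outweigh the precursor")
    return precursor_kda - removed


# Values from the text: 42.4 kDa precursor, 46-residue signal peptide.
estimate = estimate_mature_mass_kda(precursor_kda=42.4, signal_peptide_length=46)
print(f"{estimate:.1f} kDa")
```

An exact figure would require summing the masses of the actual signal peptide residues, which are not given here.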
It can be concluded that R. leguminosarum strains, like S. meliloti, use different strategies to produce LMW forms of the polysaccharide: regulation of the degree of EPS polymerization and hydrolase-mediated cleavage of EPS. The complex mechanisms directing the synthesis of LMW EPS may reflect the evolutionary benefit to rhizobia of possessing different pathways for producing LMW EPS, which can be considered important molecules for cell-to-cell communication during the development of nitrogen-fixing nodules.

EPS-deficient mutants as a possible model for studying bacterial gene expression in symbiosis
A decade ago, in collaboration with the group of Prof. B. Rolfe, we demonstrated that mutations in the pssA gene in the Rlv VF39 and Rlt ANU794 strains not only abolished the capacity of these strains to synthesize EPS but also led to induction or up-regulation of at least 22 proteins. The differences identified in the pssE and pssD mutants of the Rlv VF39 strain were a distinct subset of the protein synthesis changes that occurred in the pssA mutant (9 out of 22 changes) [110]. Genetic complementation of pssA restored wild-type protein synthesis levels. We concluded that the observed alterations in protein synthesis are caused either by dysfunction of the PssA protein itself or are a response to the absence of EPS.
N-terminal sequence analysis of 15 members of the pssA mutant stimulon was performed, and unique amino acid sequences were determined for 11 proteins [110]. However, our attempts to identify these proteins were unsuccessful at the time, because none of the R. leguminosarum and R. etli genomes had yet been sequenced. We have now repeated our attempts and assigned 9 proteins to their known orthologs in Rhizobiaceae species. The results of this analysis are summarized in Table 3. The functions of all of these orthologs are predicted. Five proteins belong to ABC transporter systems, two are assigned as NADH-dependent FMN reductases, one as a taurine dioxygenase, and the last as a disulfide isomerase. Most of the genes encoding these proteins were mapped to indigenous plasmids.
It is intriguing that a mutation in a single gene resulted in induction or up-regulation of genes encoding proteins with such different functions. We propose the following hypothesis, based on the finding that expression of the pssA gene was not observed in bacteroids [63,111]. In our opinion, members of the pssA mutant stimulon might be precisely those proteins that are expressed in the symbiotic state of the bacteria. In this case PssA could, directly or indirectly, play the role of a negative regulator of their gene expression in free-living bacteria. EPS itself could play the same role.
Obviously, the hypothesis can be easily verified: it is sufficient to examine whether the genes listed in Table 3 are expressed in bacteroids. If our assumption turns out to be correct, EPS-deficient mutants of R. leguminosarum could be considered an attractive model for studying bacterial gene expression during symbiosis.

* Spot number: arbitrary numbers were assigned to proteins that exhibited a change in their relative synthesis levels in response to a Tn5 insertion in the pssA gene of Rlt ANU794. Status: induced, when the protein is absent in the wt strain and synthesized in the mutant strain; up-reg (+, ++, or +++), when a 2-, 3-, or 5-fold (or greater) level of individual protein synthesis was observed in the mutant strain. Observed Mr and pI values of the proteins are based on their migration in gels. Data were taken from Table 2 of [110].
** Data on N-terminal amino acid sequences were taken from Table 3 of [110]. Identical amino acids in the matched sequences of the ortholog proteins are shown in bold letters, and similar amino acids are underlined. Positions of these sequences in the orthologs are indicated below the sequences.
*** Mr and pI of the ortholog proteins were calculated from their amino acid sequences.
Table 3. Identification of proteins from the pssA mutant stimulon in Rhizobiaceae species.
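The status labels defined in the footnote to Table 3 can be expressed as a small classification rule. The sketch below is only an illustration under stated assumptions: the 2-, 3-, and 5-fold values are treated as lower bounds of the +, ++, and +++ classes, and the labels "unchanged" and "not detected" are hypothetical additions for cases the footnote does not cover. The function name and the input levels are likewise hypothetical.

```python
def classify_status(wt_level: float, mutant_level: float) -> str:
    """Assign a Table 3 style status from relative synthesis levels.

    Assumption: the 2-, 3- and 5-fold thresholds are lower bounds.
    """
    if wt_level < 0 or mutant_level < 0:
        raise ValueError("synthesis levels must be non-negative")
    if wt_level == 0:
        # Absent in the wild type: induced if the mutant synthesizes it.
        return "induced" if mutant_level > 0 else "not detected"
    fold = mutant_level / wt_level
    if fold >= 5:
        return "up-reg +++"
    if fold >= 3:
        return "up-reg ++"
    if fold >= 2:
        return "up-reg +"
    return "unchanged"  # hypothetical label, not used in the table


# Hypothetical spot intensities (arbitrary units), for illustration only.
for wt, mutant in [(0.0, 4.0), (1.0, 2.5), (1.0, 3.0), (2.0, 12.0)]:
    print(wt, mutant, classify_status(wt, mutant))
```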

Conclusion
The EPSs described in this review share a common structure in that their repeating units possess identical backbone chains and the same 1-6-linked glucosyl residue starting the side chain. The diversity of EPSs is determined by the structure of their side chains. Presumably, the structural information contained in the side chains determines the participation of EPS in symbiosis as signaling factors. This assumption is supported at least by the observation that the absence of a single pyruvyl group in the side chain dramatically disturbed the symbiotic properties of the bacteria.
Most of the R. leguminosarum and R. etli genes involved in EPS biosynthesis are localized within the single cluster Pss-I, and the gene arrangement in this cluster is similar across strains. This implies that the Pss-I clusters of these bacterial species have evolved from a common ancestor. The differences mostly involve rearrangements of GT genes, genes for modification of certain glycosyl residues in the side chain, and the pssL/psaI genes thought to be involved in controlling translocation of the repeating units. It seems that each variant of the Pss-I cluster originated as a result of co-evolution of all these genes.

Figure 2.
It should be noted that PssA, PssDE, PssC, PssS, PssF, and PssI/PssG apparently comprise the basic set of GTs involved in biosynthesis of all EPSs under consideration. Taking into account the variability of the side chains of these EPSs, one can expect that the basic set of GTs in the corresponding R. leguminosarum strains is augmented by additional GTs. Indeed, PssJ is present in Rlv 3841, Rlv VF39, Rlt TA1, Rlt WSM2304, and Re CFN42, but is absent in Re CNPAF512 and Re CIAT 652. At the same time, genes for the family 2 GT PsaC and the family 1 GT PsaD were found in the Pss-I clusters of the latter strains. In addition, genes encoding the family 2 GTs PsaB and PsaE, whose amino acid sequences are similar to GTs attaching the Glc residue via the β1-6 bond, were localized in Rlt WSM2304 and Re CIAT 652, respectively.

Table 2. Proposed names for genes controlling EPS biosynthesis in some R. leguminosarum and R. etli strains.