Insights into Ionizing-Radiation-Resistant Bacteria S-Layer Proteins and Nanobiotechnology for Bioremediation of Hazardous and Radioactive Waste

S-layers are crystalline arrays formed by proteinaceous subunits that cover the outer surface of many different kinds of microorganisms. This “proteinaceous cover” is particularly important in the case of ionizing-radiation-resistant bacteria (IRRB) that might be used in bioremediating hazardous and radioactive wastes (HRW). Despite the exponential growth in the number of comparative studies and solved protein crystal structures, the protein networks, diversity, and bioremediation-relevant structural properties of IRRB S-layers remain unknown. Here, aided by the literature, a tentative model of Deinococcus radiodurans R1 S-layer proteins (SLPs) and the network of its main constituents is proposed, and a domain analysis of this network is presented. Moreover, to show the diversity of IRRB S-layers, comparative genomics and computer modeling experiments were carried out. In addition, using in silico modeling assisted by previously published data, the outermost exposed segments of D. radiodurans SlpA (surface layer protein A) that were predicted to interact with uranium were mapped. The combination of data and results points to various prospective applications of IRRB S-layers in nanobiotechnology for bioremediation of radioactive waste.


Introduction
Hazardous and radioactive wastes (HRW), which contain organic contaminants, toxic metals, and/or actinides as well as other radionuclides released, for example, from nuclear weapons production, mining, and nuclear accidents, can legitimately be considered as pools of ionizing-radiation-resistant bacteria (IRRB), that is, vegetative bacterial cells with a D10 (the acute ionizing-radiation dose required for a 90% reduction in colony-forming units (CFUs)) greater than 1 kilogray (kGy) [1]. Indeed, IRRB have been isolated from a high-level radioactive environment (Kineococcus radiotolerans SRS30216 [2]) and have been engineered for metal remediation in radioactive mixed-waste environments [3][4][5]. Yet, it is important to remember that for D. radiodurans, the most studied "gold medalist" for radiation resistance, ionizing-radiation resistance seems to be uncorrelated with metal or actinide tolerance except in cases where the metal/actinide directly damages DNA [6].
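The D10 criterion defined above can be illustrated with a minimal Python sketch. It assumes that the logarithm of the surviving fraction falls linearly with dose, which is a simplification introduced here for illustration (it is not stated in the chapter, and real survival curves may show a shoulder). The function names and the example values are hypothetical.

```python
import math


def estimate_d10(doses_kgy, surviving_fractions):
    """Estimate D10 (kGy) from paired dose / surviving-fraction values.

    Assumption (not from the chapter): log10(surviving fraction) decreases
    linearly with dose, so D10 is the dose that lowers survival by one log,
    i.e. -1 / slope of the least-squares line.
    """
    if len(doses_kgy) != len(surviving_fractions) or len(doses_kgy) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log10(s) for s in surviving_fractions]
    n = len(doses_kgy)
    mean_x = sum(doses_kgy) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in doses_kgy)
    if sxx == 0:
        raise ValueError("doses must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses_kgy, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("survival does not decrease with dose")
    return -1.0 / slope


def is_irrb(d10_kgy, threshold_kgy=1.0):
    """Apply the chapter's criterion: D10 greater than 1 kGy."""
    return d10_kgy > threshold_kgy


# Hypothetical survival series: one log of killing per 5 kGy.
d10 = estimate_d10([0, 5, 10], [1.0, 0.1, 0.01])
print(round(d10, 3), is_irrb(d10))
```

Under the log-linear assumption, a strain losing one log of CFUs per 5 kGy would be classified as an IRRB, whereas one losing a log per 1 kGy or less would not.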
Since the outer envelope of IRRB, a regular, crystalline, highly porous (glyco)protein meshwork known as the s(urface)-layer ([7,8] and references therein), represents the first front that might encounter HRW and damaging radiation, it is expected to possess pertinent characteristics such as the ability to interact with actinides. Actinides such as americium-241 (241Am), neptunium-237 (237Np), plutonium-241 (241Pu), and thorium-230 (230Th) might also be present in most radioactive wastes; however, uranium-238 (238U) is the priority pollutant ([9] and references therein). Analyses of the complexes formed by uranium bound to S-layer proteins (SLPs) of vegetative cells of IRRB were reported in previous studies [10][11][12]. Although it is assumed that bacteria carrying SLPs use S-layer variation (changing protein expression through DNA rearrangements, etc.) to adapt to different stress factors ([12] and references therein), the role of S-layers in ionizing-radiation resistance has not yet been demonstrated; a role in the response to radiation damage has, however, been proposed [13].
SLP lattices are composed of monomers of a single protein or glycoprotein, with apparent relative molecular weights ranging from 40 to 200 kDa depending on the bacterial species [14]. The presence of SLPs has been reported in hundreds of species belonging to all major phylogenetic groups of (Gram-positive and Gram-negative) Eubacteria [15][14]. Among bacterial SLPs, glycosylated proteins have been observed [17], for instance, in species belonging to the Firmicutes, such as two lactobacilli [18] and Aneurinibacillus thermoaerophilus [19].
A major challenge is to uncover the structure-function relationships of SLPs in order to obtain clues about, for instance, radioresistance and tolerance to toxic molecules. In the decades following their discovery [20], various high-resolution techniques have been used to observe bacterial SLPs. Briefly, the biophysical methods of choice for studying the structure of SLPs can be categorized into three groups: (1) electron-microscopy-based techniques, e.g., electron diffraction (SLPs of Acetogenium kivui [21]), freeze-etch electron microscopy (SLPs of Thermoanaerobacter thermohydrosulfuricus [22]), and atomic force microscopy (AFM) (SbpA of Lysinibacillus sphaericus [23], Hpi layer of D. radiodurans [24]); (2) scanning probe microscopy techniques (e.g., TfsA-GP and TfsB-GP of Tannerella forsythia [25]); and (3) X-ray scattering techniques, e.g., X-ray crystallography (SbsB of Geobacillus stearothermophilus [26]). Except for the last group, these methods are only useful for determining the overall topology of SLPs and the symmetrical associations between the molecular partners involved in their formation. X-ray crystallography, however, provides insight into the atomic arrangement of the amino acids (AAs) of the SLP. Such data can be very useful for structure-function relationship studies as well as for developing biotechnological applications [27]. Yet, the tendency of SLPs to form two-dimensional lattices is considered a major obstacle to growing crystals.
Spectroscopic methods such as nuclear magnetic resonance (NMR) spectroscopy, electron paramagnetic resonance (EPR) spectroscopy (combined with site-directed spin labeling), and Fourier-transform infrared (FTIR) spectroscopy can also complement the structural and biochemical techniques for studying the dynamics of SLPs and their interactions with other molecules (proteins, radionuclides, etc.). For example, the secondary structure of Lactobacillus SLPs and their behavior upon heating were studied by combining FTIR and differential scanning calorimetry (DSC) [28]. In addition, instrumental methods such as X-ray photoelectron spectroscopy and matrix-assisted laser desorption ionization or electrospray ionization mass spectrometry have been introduced for the analysis of the protein and glycan portions of SLPs ([29,30] and references therein). Also, Caulobacter crescentus S-layers immobilized on zincite-coated iron oxide nanoparticles were investigated using FTIR spectroscopy, powder X-ray diffraction (XRD; the diffraction pattern is obtained from a powder of the material rather than from an individual crystal), AFM, and field-emission scanning electron microscopy (FESEM) [31]. Recently, Madhurantakam et al. [32] reviewed the methods employed to analyze the properties and structural characteristics of the S-layer lattice, showing the increased use of X-ray-based methods, spectroscopy, cryo-electron microscopy, and other high-throughput data-processing techniques to study the structure of SLPs over the last two decades.
It goes without saying that theoretical methods, including homology modeling, threading, and ab initio methods, cannot substitute for experimental techniques in accurately determining the molecular structure of proteins. Nevertheless, even inaccurate models can be useful, especially to evaluate the overall fold of the protein, to map certain properties onto the molecular surface, or to probe the three-dimensional patterns and the spatial distribution of certain AAs.
In this chapter, we present a survey of IRRB S-layers based on the literature, with a special focus on D. radiodurans and SlpA. Moreover, we introduce an automated computational pipeline designed for ease of use in the identification and analysis of SLP structures. The proposed pipeline was applied to the completely sequenced genomes of previously known IRRB available in the Genomes OnLine Database (GOLD) [33] and the radioresistant prokaryotes database (RadioP) [34]. Accordingly, we suggest prospective applications of IRRB SLPs in nanobiotechnology for bioremediation of hazardous and radioactive wastes.

Survey of S-layer proteins (SLPs) in Deinococcus radiodurans R1
The order and nature of the D. radiodurans envelope layers have been investigated in many studies ([8] and references therein). Using electron microscopy, Rothfuss et al. [8] identified five layers: (1) the inner membrane, (2) the peptidoglycan cell wall, (3) the interstitial layer, (4) Hpi and the backing layer, and (5) the carbohydrate coat. The D. radiodurans envelope has an unusual structure and composition; its outermost part is the "pink envelope", which contains carbohydrates, proteins, a carotenoid (deinoxanthin [7]), lipids, and most likely the outer membrane [8]. SlpA (DR_2577) and Hpi (DR_2508) are the most abundant components responsible for maintaining the envelope [35]. The regular topology of the S-layer remains distinguishable by electron microscopy even when the hpi gene is altered; however, deletion of slpA leaves Hpi unable to form the hexagonal porous structure on the outer surface of the S-layer and compromises the integrity of the pink envelope [8].
Fagan and Fairweather [16] estimated that the Clostridium difficile S-layer contains up to 500,000 subunits, synthesized at a rate of 140 subunits per second per cell during the exponential growth phase. Several distinct mechanisms have evolved to cope with this high protein flux. In many Gram-negative species, S-layer secretion relies on a specific type I secretion system, and in some studied Gram-positive bacteria, secretion of the S-layer precursor depends on the accessory Sec secretion system. Interestingly, there is a striking degree of genetic linkage between the genes that encode SLPs and their dedicated secretion systems in many bacteria ([16] and references therein). This genetic linkage was not observed in the case of D. radiodurans slpA and hpi (Figure 1). The S-layer is anchored to the cell surface via non-covalent interactions with cell surface structures, usually with lipopolysaccharides (LPSs) in Gram-negative bacteria and with cell wall polysaccharides in Gram-positive bacteria [14]. For example, in C. crescentus, the 225 N-terminal AAs of the 1026 residues of the RsaA SLP are required for binding to LPS on the cell surface [16,37,38]. Experimental evidence that surface layer homology (SLH) domains recognize the cell envelope by binding to peptidoglycan was further provided for the SLPs Sap and EA1 of Bacillus anthracis [39,40], SlpA of Clostridium thermocellum [39,41], and the SLP and cell-wall-associated xylanase of Thermoanaerobacterium thermosulfurigenes EM1 [39]. Importantly, it was shown that pure peptidoglycan was unable to bind the SLH domains [39] situated at the N-terminal region of the SLP of C. crescentus. However, this region recognizes distinct oligosaccharides of the LPS as binding partners [42,43]. On the other hand, it has been proposed that hydrophobic interactions are responsible for the attachment of the S-layer to the outer membrane in the backing layer, as well as for the association of the S-layer units in D. radiodurans [44], while Hpi is involved in in vivo cleavage and is closely associated with SlpA. The SLH domain of SlpA was shown to bind deinococcal peptidoglycan-containing cell wall sacculi [45]. Indeed, it has been previously determined that the backing layer of the pink envelope, rather than the Hpi layer, provides the rigidity and the curvature of the cell envelope [46], suggesting that the SlpA protein interacts with the backing layer [8].
Figure caption (partial): (C) The five layers in the D. radiodurans cell envelope [8], PilQ (DR_0774) as a component related to the S-layer [35], and the predicted associations between (i) SlpA and the deinococcal peptidoglycan-containing cell wall [45], (ii) SlpA and Hpi [45], and (iii) SlpA and deinoxanthin [7]. The association of SlpA with peptidoglycan on one side and Hpi on the other localizes this protein in the "interstitial" layer of the deinococcal cell wall [45]. (D) The predicted association between the dimeric structure of SlpA and the hexameric structure of Hpi [47]. The blue, orange, and purple shapes represent Hpi, SlpA, and DR_0774 (secretin), respectively.
Based on structural, functional, and proteomic data collected from literature, we propose in this chapter a tentative model describing the interaction between SlpA, deinoxanthin, Hpi, and PilQ (DR_0774) within D. radiodurans cell envelope (Figure 2).
The description of the D. radiodurans SlpA and Hpi protein network and domains is vital to our understanding of the function(s) of SLPs in this extremophile. Figure 3 shows the known and potential functional partners of D. radiodurans Hpi and SlpA, predicted using the STRING database, available at http://string-db.org/ [48], and pertinent literature [35,45].
Proteic domains of D. radiodurans S-layer and S-layer-like proteins, presented in Figure 3, were investigated using the SMART database [49], available at http://smart.embl-heidelberg.de/, and listed in Figure 4. Several proteins showed the presence of an SLH domain.

Automated computational pipeline for identification and analyses of S-layers in ionizing-radiation-resistant bacteria (IRRB)
IRRB were previously compiled in the RadioP database [34]. In Deinococcus geothermalis and D. radiodurans, the SLP gene is among the top 20 predicted highly expressed (PHX) genes, whereas K. radiotolerans and Rubrobacter xylanophilus have no similar genes [50]. In this work, a bioinformatics pipeline, illustrated in Figure 5, was developed to perform an integrated analysis of IRRB SLPs (domains, interaction network, orthologs, etc.), including the prediction of exposed residues in their structures, with a view to potential uses of IRRB (e.g., nanobiotechnology for bioremediation of radioactive waste). SlpA and Hpi were taken as working examples.
In an attempt to identify orthologs of the SLPs highlighted in Figures 3 and 4 in the genomes of IRRB species for which D10 is known [1,34,51], computational analyses using the BLASTP program [52] were performed based on the best reciprocal hits (BRH) approach (e-value threshold ≤ 1e-05). While D. radiodurans SlpA has orthologs in related IRRB species, the Hpi protein did not show any significant similarity to proteins in other related, completely sequenced IRRB using the above-mentioned approach. As D. radiodurans Hpi lacks homology to any predicted peptide from other genomes of similar species, hpi can be considered an orphan gene. In addition, analysis of phyletic similarity patterns (patterns of presence or absence, based on similarity measures, of given proteins in the analyzed proteomes) suggests that several IRRB proteins might be key ancestral surface-layer players, as they are not taxonomically restricted (Figure 6). [...] 83778) as a database identified a hypothetical ortholog (320 AAs, WP_056676327) from Angustibacter sp. Root456 with four SLH domains as a statistically significant hit (expect value of 2e-15). Further investigations should be carried out on the genomes of "SLH-negative" IRRB, such as K. radiotolerans, using sensitive methods for detecting sequence similarity among highly diverged homologs.
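The best reciprocal hits step described above can be sketched as follows. This is a minimal illustration, not the chapter's actual pipeline: it assumes BLASTP results are available as lines in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score), that the best hit is the one with the highest bit score, and that the subject identifiers (geoA, geoB, geoC) and all scores are invented for the example.

```python
def best_hits(tabular_lines, max_evalue=1e-5):
    """Map each query to its best-scoring subject among hits passing the e-value cut-off."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def best_reciprocal_hits(forward_lines, reverse_lines, max_evalue=1e-5):
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    forward = best_hits(forward_lines, max_evalue)
    reverse = best_hits(reverse_lines, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Invented example data: genome A searched against genome B, and the reverse search.
a_vs_b = [
    "DR_2577\tgeoA\t45.0\t900\t0\t0\t1\t900\t1\t900\t1e-50\t300",
    "DR_2577\tgeoB\t30.0\t200\t0\t0\t1\t200\t1\t200\t1e-10\t80",
    "DR_2508\tgeoC\t25.0\t100\t0\t0\t1\t100\t1\t100\t0.5\t20",
]
b_vs_a = [
    "geoA\tDR_2577\t45.0\t900\t0\t0\t1\t900\t1\t900\t1e-50\t300",
    "geoB\tDR_2577\t30.0\t200\t0\t0\t1\t200\t1\t200\t1e-10\t80",
]
print(best_reciprocal_hits(a_vs_b, b_vs_a))
```

In this toy input, the second query has no hit below the e-value threshold and therefore yields no ortholog candidate, which is the situation the text describes for an orphan gene.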
The multiple sequence alignment of SlpA orthologous proteins using the M-Coffee program [54] showed that AAs conserved between IRRB species are extremely rare (unpublished data): high sequence identity is limited to the N-terminal region, probably corresponding to the SLH domain, which is responsible for binding the S-layer subunits to the underlying cell envelope layer [17]. The middle and C-terminal parts of the SLPs, comprising the domains involved in the self-assembly process and exposed inside the pores and on the S-layer surface, showed very low sequence similarity.
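The contrast between a conserved N-terminal region and divergent middle and C-terminal parts can be quantified region by region. The sketch below is only an illustration under stated assumptions: the alignment is supplied as equal-length gapped strings, a column counts as conserved when every sequence carries the same non-gap residue, and the toy alignment and region boundaries are invented (they are not the unpublished M-Coffee data).

```python
def conserved_fraction(aligned_seqs, start, end):
    """Fraction of alignment columns in [start, end) that are identical in all sequences.

    Columns containing a gap ('-') in any sequence are counted as not conserved.
    """
    if not aligned_seqs:
        raise ValueError("alignment is empty")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")
    if not 0 <= start < end <= length:
        raise ValueError("region is outside the alignment")
    conserved = 0
    for col in range(start, end):
        residues = {seq[col].upper() for seq in aligned_seqs}
        if len(residues) == 1 and "-" not in residues:
            conserved += 1
    return conserved / (end - start)


# Toy alignment: the first five columns mimic a conserved N-terminal region.
toy_alignment = ["MKKLLAGT", "MKKLLSDE", "MKKLL-QW"]
print(conserved_fraction(toy_alignment, 0, 5))  # N-terminal window
print(conserved_fraction(toy_alignment, 5, 8))  # remaining columns
```

Strict column identity is a deliberately simple measure; a substitution-matrix-based score would be more tolerant of conservative replacements.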
The domain architecture of D. radiodurans SlpA and of its statistically significant hits was investigated using the SMART database [49]. The structure of D. radiodurans SlpA is consistent with previous results, namely the presence of SLH domain(s) [21] at the N-terminal part of many SLPs [41]: all SlpA sequences display at least one SLH domain (Figure 7). Excluding the signal peptide predicted by the SignalP server [56], only 35% of the total sequence of the SlpA protein was assigned to a protein fold. Two main protein segments were modeled with a high degree of confidence by the Phyre2 server [55].
The first segment corresponds to the SLH domain (99.6% degree of confidence) at the N-terminal portion of the sequence and was built based on the B. anthracis homologous protein domain. The domain adopts the form of a pseudo-trimer presenting a helix-bundle fold. A central groove of the domain was previously proposed as the binding site of the secondary cell wall polysaccharide (SCWP) molecules [57].
The second segment corresponds to the C-terminal part of SlpA, which was associated with a β-barrel-like fold typical of porin structures (see Figure 7), with a degree of confidence of 88.4%. In fact, it has been shown that the β-barrel assembly machinery (BAM) complex [58] is required for the assembly of SlpA from Thermus thermophilus HB27 [59], a bacterium closely related to D. radiodurans. Indeed, the signal sequence responsible for the efficient binding of the main protein of the BAM machinery to SlpA was also found in D. radiodurans. In particular, the highly conserved final phenylalanine (Phe/F) residue at the C-terminus is required for the protein-protein interaction that ensures the proper folding of SlpA. Mutants of T. thermophilus lacking the terminal Phe of SlpA showed defective folding of the SlpA protein, which was more sensitive to proteases than in the wild-type strain [59]. The amphipathic character of the β-strands of the C-terminal segment also supports the presence of a β-barrel fold. A phylogenetic relationship between porins and S-layers has already been suggested and was supported by electron microscopy imaging of the T. thermophilus S-layer [60,61].
The threading prediction was also applied to the constructed sequence set of IRRB species (Figure 8).
Figure 8. Prediction of the structures of multiple SlpA proteins from different radioresistant bacteria using the Phyre2 server. The confidence level (from 0 to 100%) indicates how strong the match is between the submitted sequence and the template used to generate the model. In the case of D. radiodurans, for example, the template used is the structure of the OpdL protein (PDB code: 2Y0H). Only structural domains with a confidence above 90%, or below 90% but covering an extended range of the protein sequence, are reported. Each confidence level is associated with one domain highlighted in the same color.
The first observation is the high conservation of the SLH domain in all taxa, which is in agreement with previous findings obtained with sequence analysis tools [41,62]. However, the length of the SLH domain differs, owing to variation in the number of SARP subdomain units, even among paralogs (e.g., Chroococcidiopsis thermalis and Deinococcus proteolyticus). This result suggests that the length of the SlpA SLH domain does not affect its binding to the peptidoglycan layer.
It was also observed that SlpA presents different folding categories for the non-SLH segments. This result suggests that IRRB SLPs have evolved different molecule preferences and specific biological functions, although the evolution of molecule preferences does not necessarily follow protein divergence. In all cases, the Phyre2 server succeeded in modeling either one or two structural domains for the orthologous sequences. Among them, six do not present any structural assignment except for the SLH domain. Except for Deinococcus gobiensis (gi|386855514), all domains are assigned to a bacterial porin-like structure built by the server using different templates. Among the studied IRRB species, Truepera radiovictrix (D10 > 5 kGy [63]) is the only one whose SlpA non-SLH domain presents exclusively an all-α fold type. Eight sequences showed an all-β-barrel fold type for this non-SLH domain. In this context, it is important to note that previous findings have pointed out that Hpi contains a β-strand-rich region [64]. Moreover, the main secondary structure of the PEGA domain (e.g., in DR_1185), which is found in both archaea and bacteria and presents similarity to SLPs, is predicted to be β-strands [65]. Taken together, these observations strongly suggest that the β-barrel fold type is "under positive selection" within the non-SLH domain of SLPs.
Based on the proposed model, it should be emphasized that the C-terminal part of the protein including the region with no assigned fold is the most interesting part being probably the outermost segment of the SLP (unpublished results). The orientation of the protein toward the outer membrane is reported in Figure 8. The residues fulfilling the criteria to chelate radionuclides (exposed, charged and forming structural clusters) generated by our pipeline are highlighted in the next section.

Prospective applications of ionizing-radiation-resistant bacteria (IRRB) S-layers in nanobiotechnology for bioremediation of hazardous and radioactive wastes
Many protocols related to nanobiotechnology with SLPs were previously highlighted by Schuster et al. [66]. Indeed, SLPs possess remarkable features that have attracted attention for their various promising applications as building blocks in nanobiotechnology (1-100 nm) for bioremediation of hazardous and radioactive wastes. Bacterial interactions with uranium have been documented extensively ([12,69] and references therein): (i) biosorption and bioaccumulation; (ii) biotransformation of organic and inorganic uranium complexes; and (iii) enzymatically catalyzed reduction of U(VI) to U(IV), resulting in precipitation as uraninite (UO2). To the best of our knowledge, the molecular interaction of uranium with an SLP has been reported only for L. sphaericus JG-A12 (also known as B. sphaericus JG-A12), isolated from a uranium mining waste pile [12,70]. The percentage AA identity between the known S-layer proteins of B. sphaericus strains was shown to be between 71 and 98% for the N-terminal parts and between 20 and 87% for the C-terminal segments [71]. Interestingly, the ortholog of D. radiodurans SlpA in L. sphaericus (e-value = 7e-08, 1258 AAs, WP_010861478.1, hypothetical protein) shows three SLH domains at its C-terminus, as depicted in Figure 9. However, the question remains of which specific AA sites of SLPs interact with actinides, particularly uranium, and how. In general, the relevant functional groups involved in the interaction between bacteria and metals are reported to be -COOH, -OH, -NH2, and -PO4, among others [72,73]. The segments at the outermost surface of SLPs harbor -COOH, -OH, and -NH2 groups [74]. In addition, analyses of post-translational modifications have highlighted that SLPs can be phosphorylated [10].
Moreover, it was observed that, at acidic pH (4.3), SLPs formed complexes with U(VI), in which uranium is coordinated to carboxyl groups in a bidentate binding fashion and to phosphate groups in a monodentate binding mode [10][11][12]. Using our pipeline, exposed residues of D. radiodurans SlpA were predicted from the model we generated with Phyre2 [55]. This model (Figure 10) highlights the importance of the acidic residues (aspartic acid (Asp/D) and glutamic acid (Glu/E)) and of the phosphorylation of SLPs, which can occur at serine (Ser/S) and threonine (Thr/T) residues. In this context, it is important to remember that the interaction between uranium and SLPs depends on the oxidation state of the actinide, which in turn depends on the extracellular pH. Indeed, the speciation of uranium associated with Idiomarina loihiensis strain MAH1 was previously shown to depend mainly on the pH [11]. Taken together, these observations indicate that the S-layer could serve as a binding matrix to introduce new functional properties such as metal sorption [75].
Mapping of negatively charged (red) and putatively phosphorylated Ser and Thr residues (orange) on the outermost exposed segments of SlpA predicted using the presented workflow. The residues are indicated by spheres corresponding to the Cα atom. The model is truncated because the structure was not fully predicted by Phyre2 [55].
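The residue selection described above (exposed acidic residues and exposed, potentially phosphorylatable Ser/Thr residues) can be illustrated with a short Python sketch. It is a simplified stand-in, not the pipeline itself: it assumes that the set of exposed positions has already been predicted by some other tool, it ignores the structural-clustering criterion, and the sequence and positions used in the example are invented.

```python
ACIDIC = {"D", "E"}            # Asp and Glu carry carboxyl side chains
PHOSPHORYLATABLE = {"S", "T"}  # Ser and Thr can carry phosphate groups


def candidate_binding_residues(sequence, exposed_positions):
    """Classify exposed residues as acidic or putatively phosphorylatable.

    `exposed_positions` holds 1-based sequence positions assumed to be
    surface exposed (the exposure prediction itself is not modeled here).
    """
    acidic, phosphorylatable = [], []
    for pos in sorted(set(exposed_positions)):
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        residue = sequence[pos - 1].upper()
        if residue in ACIDIC:
            acidic.append((pos, residue))
        elif residue in PHOSPHORYLATABLE:
            phosphorylatable.append((pos, residue))
    return {"acidic": acidic, "phosphorylatable": phosphorylatable}


# Invented 8-residue fragment with an invented exposure prediction.
result = candidate_binding_residues("MDESTAKD", {2, 3, 4, 6, 8})
print(result)
```

Residues that are buried, or exposed but neither acidic nor Ser/Thr, are simply left out, which mirrors the red and orange highlighting described for the model.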
There is plenty of room to improve the proposed computational pipeline and enhance its power to predict SLPs within IRRB. Text mining, careful analysis of the sequence data, and structure prediction methods proved useful in probing pertinent functional groups of SLPs. Overall, this chapter is expected to stimulate in vitro research that might elucidate various aspects of IRRB SLPs and the bioremediation of hazardous and radioactive wastes.