The RadioP1 – An Integrative Web Resource for Radioresistant Prokaryotes

The extremely radioresistant eubacterium Deinococcus radiodurans and the phenotypically related prokaryotes whose genomes have been completely sequenced are presently used as model species in several laboratories to study the lethal effects of DNA-damaging and protein-oxidizing agents, particularly the effects of ionizing radiation (IR). Unfortunately, relevant information about radioresistant prokaryotes (RP) is still not available in a centralized and organized form. In this study, we designed the RadioP1 Web resource (www.radiop.org.tn) to gather information about RP defined by the published literature, with specific emphasis on (i) predicted genes that produce and protect against oxidative stress, (ii) predicted proteins involved in DNA repair mechanisms and (iii) potential uses of RP in biotechnology. RadioP1 allows the complete RP proteogenomes to be queried using various patterns in a user-friendly and interactive manner. The output data can be saved in plain text, Excel or HyperText Markup Language (HTML) formats for subsequent analyses. Moreover, RadioP1 provides users with a tool, “START ANALYSIS”, which includes the previously described R packages “drc” and “lethal”, to generate exponential or sigmoid survival curves with D10 and D50 values. Furthermore, when accessible, links to external databases are provided. Supplementary data will be included in the future as the sequences of other RP genomes become available.


Introduction
To be considered an RP, a microorganism should have a D10, the ionizing-radiation (IR) dose necessary to effect a 90 % reduction in colony-forming units (CFU), that is greater than 1 kilogray (kGy), corresponding to efficient physiological, genetic and protein-level protection and repair mechanisms ([1,2] and references therein). In this context, to our knowledge, even when prokaryotic members of a genus that harbours radioresistant species have contrasting optimum temperatures (for example, ranging from 10 to 47 ºC), the least IR-resistant ones do not have D10 values below 1 kGy [3]. Furthermore, as suggested by the D10 and F10 values reported in the literature [4-6], where F10 is the ultraviolet (UV) dose necessary to effect a 90 % reduction in CFU, an RP is tolerant to both IR (e.g. α and β particles, γ- and x-rays, neutrons) and non-IR (UV light), and correlations between the two have been suggested [7]. It is important to note that UV may cause effects similar to those induced by IR [8].

Source of data
The information used was obtained by searching the NCBI database [54]. Clusters of Orthologous Groups (COG) [55] were used to classify orthologous gene records in RadioP1. Orthology was inferred with the Basic Local Alignment Search Tool (BLAST), the best reciprocal hit approach and the InParanoid program.
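The best reciprocal hit step can be illustrated with a minimal Python sketch. It assumes that BLAST results are already available as (query, subject, bit score) tuples, for example parsed from tabular output. The function names best_hits and reciprocal_best_hits and the locus tags are hypothetical and are not taken from the RadioP1 code; the additional in-paralog clustering performed by InParanoid is not reproduced here.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query ID, subject ID, bit score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]
) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data with made-up locus tags (genome A searched against genome B, and back)
a_vs_b = [
    ("A_0001", "B_0101", 250.0),
    ("A_0001", "B_0102", 90.0),
    ("A_0002", "B_0102", 120.0),
]
b_vs_a = [
    ("B_0101", "A_0001", 245.0),
    ("B_0102", "A_0001", 95.0),
]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input only A_0001 and B_0101 select each other as best hits, so only that pair would be recorded as putative orthologs.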

Database construction
We built the database on a recent version of the Linux operating system. MySQL Workbench 5.2.38 was used to handle the database schema and to build the relational database. Perl scripts were developed to retrieve genome data and gene information from GenBank files collected from NCBI using the file transfer protocol (FTP) and to store RP information in the appropriate database tables. A front-end user interface was developed using HTML5, JavaScript, Cascading Style Sheets (CSS) and Hypertext Preprocessor (PHP). Perl CGI (Common Gateway Interface) modules and PHP scripts were developed and used to link the Web interface to the database. These scripts allow all users to send requests via the Web interface to the server, run the jobs on the server and then return and display results on the Web interface (Figure 1). The database schema (Figure 2) consists of 13 tables, allowing any stored biological data to be searched and retrieved. Among the main tables is the species table (primary information: organism name and taxon ID from NCBI), which is linked to the taxonomy and chromosome tables. The latter is connected to the seqfile tables, which detail the different file formats and paths related to each chromosome. The gene table, related to the chromosome table, stores information such as gene name, gene ID
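The relationships between the species, chromosome and gene tables can be sketched as follows. This is a reduced illustration, not the RadioP1 schema: it uses Python's built-in sqlite3 module instead of MySQL, covers only three of the 13 tables, and all column names, the example organism and the locus tag are assumptions made for the example.

```python
import sqlite3

# Hypothetical, reduced schema: three of the tables named in the text,
# with assumed column names.
SCHEMA = """
CREATE TABLE species (
    species_id    INTEGER PRIMARY KEY,
    organism_name TEXT NOT NULL,
    taxon_id      INTEGER NOT NULL
);
CREATE TABLE chromosome (
    chromosome_id INTEGER PRIMARY KEY,
    species_id    INTEGER NOT NULL REFERENCES species(species_id),
    name          TEXT NOT NULL
);
CREATE TABLE gene (
    gene_id       INTEGER PRIMARY KEY,
    chromosome_id INTEGER NOT NULL REFERENCES chromosome(chromosome_id),
    locus_tag     TEXT NOT NULL,
    gene_name     TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up records, for illustration only
conn.execute("INSERT INTO species VALUES (1, 'Example species', 0)")
conn.execute("INSERT INTO chromosome VALUES (1, 1, 'chromosome 1')")
conn.execute("INSERT INTO gene VALUES (1, 1, 'EX_0001', 'geneA')")

# Look up a gene by locus tag and walk back to its chromosome and species
row = conn.execute(
    """
    SELECT s.organism_name, c.name, g.gene_name
    FROM gene AS g
    JOIN chromosome AS c ON c.chromosome_id = g.chromosome_id
    JOIN species AS s ON s.species_id = c.species_id
    WHERE g.locus_tag = ?
    """,
    ("EX_0001",),
).fetchone()
print(row)
```

The parameterized query (the "?" placeholder) keeps user-supplied locus tags out of the SQL text, which matters for any interface that accepts queries from a Web form.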

RadioP1 database user guide
RadioP1 is freely accessible through a Web browser at http://www.radiop.org.tn. There are at least three ways to use the database: browse, search and generate data.

Browse in the database
On the main page of RadioP1, a clickable list of the currently available groups of IRRP (ionizing-radiation-resistant prokaryotes) is displayed at the top left, allowing users to browse pages for each of the groups: IR-resistant archaea (IRRA) and IR-resistant bacteria (IRRB).

Search the database
RadioP1 provides a search engine that extracts information from the database through (i) text search, (ii) BLAST search and (iii) function category search.
The text and homology search contains three categories:
1. "SEARCH GENES": This search category allows annotation information (gene symbol, chromosome name, strand, predicted orthologous genes, etc.) to be extracted using query gene locus tags. The query results are displayed in a table in which each hit is a row containing the corresponding gene ID and a summary of characteristics (gene name, symbol, strand and product). In addition, each row in the output table provides a link to the individual gene pages, which highlight the queried genes as found in the corresponding NCBI pages [54]. Users can get results in HTML, plain text or Excel formats for further analyses.
2. "RETRIEVE SEQUENCES": This search category enables nucleic acid or protein sequences to be extracted using query gene locus tags.
3. "HOMOLOGY DATA": This search category enables predicted orthologous gene clusters to be extracted using query gene locus tags.
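As an illustration of how a result table of this kind could be exported, the following Python sketch serializes rows to tab-separated plain text and to an HTML table. It is not the RadioP1 implementation (which is described as Perl CGI and PHP). The column names follow the fields listed for "SEARCH GENES", while the function names and the sample row are hypothetical. Excel output is not shown; tab-separated text can usually be opened directly in spreadsheet software.

```python
import csv
import html
import io
from typing import Dict, List

# Columns taken from the "SEARCH GENES" description; identifiers are assumed
FIELDS = ["gene_id", "gene_name", "symbol", "strand", "product"]


def to_tsv(rows: List[Dict[str, str]]) -> str:
    """Serialize result rows as tab-separated plain text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_html(rows: List[Dict[str, str]]) -> str:
    """Serialize result rows as a simple HTML table, escaping cell contents."""
    header = "".join(f"<th>{html.escape(name)}</th>" for name in FIELDS)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{html.escape(row[name])}</td>" for name in FIELDS)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"


# Made-up result row, for illustration only
rows = [
    {
        "gene_id": "1",
        "gene_name": "example gene",
        "symbol": "exA",
        "strand": "+",
        "product": "hypothetical protein <test>",
    }
]
tsv_text = to_tsv(rows)
html_text = to_html(rows)
print(tsv_text)
print(html_text)
```

Escaping cell contents before embedding them in HTML prevents annotation text containing characters such as "<" from breaking the page markup.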
The function category search contains four subclasses:

"OXIDATIVE STRESS PRODUCTION":
When the generation of reactive oxygen species (ROS; superoxide (O2•−), hydrogen peroxide (H2O2) and hydroxyl (HO•) radicals) produced by metabolism or irradiation exceeds the capacity of endogenous scavengers to neutralize them, cells become vulnerable to damage, a condition referred to as oxidative stress [56,57]. Typically, during irradiation, ~80 % of DNA damage is caused indirectly by irradiation-induced ROS and the remaining ~20 % by direct interaction between γ-photons and DNA [57]. HO• radicals are the primary product of the radiolysis of water and, in the presence of oxygen, can also generate some O2•− and H2O2 by dismutation of O2•− [57]. In contrast, the primary ROS generated by metabolism are O2•− and H2O2 [56]. The total intracellular titer of cytochromes and flavins might serve as a marker for the proclivity of cells to survive radiation and other oxidizing conditions [58,59]. For instance, the total number of c-type cytochromes in D. radiodurans and Shewanella oneidensis (D10 = 70 Gy [38]) is 7 and 39, respectively [58]. Searching RadioP1 by this function subcategory provides a way to find predicted genes involved in ROS production and subsequently to estimate the level of cellular radioresistance.

"OXIDATIVE STRESS PROTECTION":
Unlike DNA double-strand break (DSB) lesion yields ([6] and references therein), yields of IR-induced protein oxidation can be ~100 times greater in IR-sensitive cells than in IR-resistant cells [60,61]. Indeed, it has now been demonstrated that proteins are major targets of IR damage and that shielding against protein oxidation is an important mechanism for surviving IR exposure. IR resistance in some prokaryotes is highly correlated with the accumulation of a high intracellular concentration of Mn2+, supporting the idea of a common model of Mn2+-dependent ROS scavenging in aerobes ([6,62] and references therein). For example, the aerobic archaeon H. salinarum accumulates a high intracellular concentration of Mn2+, 155 ng/10^9 cells [62,63]. In contrast, the hyperthermophilic anaerobic archaea T. gammatolerans and P. furiosus do not contain significant amounts of intracellular Mn2+: 3 ng/10^9 cells and 14 ng/10^9 cells, respectively [62]. These low concentrations of Mn2+ in anaerobic hyperthermophiles were explained by the low levels of IR-generated ROS under anaerobic conditions combined with efficient detoxification systems [62]. In RadioP1, the "OXIDATIVE STRESS PROTECTION" function subcategory provides users with a summary table giving insights into the radioprotectors of each RP.

"DNA REPAIR GENES":
During irradiation, DNA double-strand breaks (DSBs) are considered the most lethal damage, although they are the least frequent form of cellular DNA damage compared to single-strand breaks and DNA base damage [60]. For example, in D. radiodurans, the PprA protein has an important role in DNA DSB repair [64]. Exploring RadioP1 by the "DNA REPAIR GENES" function subcategory allows users to generate a list of genes (orthologs of the genes in Table 2) for which a functional knockout may change the level of radioresistance of mutant cells.

Generate survival plots and Dx (D10 and D50) values
Cell survival models aim to describe the relationship between the absorbed dose and the fraction of surviving cells, that is, the cell survival curve. Distinct cell survival models have been described [82-86]: the linear-single-hit single-target, the linear-quadratic (LQ) and the repairable-conditionally repairable damage (RCR) models. Other models include those based on target theory, first described by Lea [87], and those described by Tobias [88], Curtis [89] and Sontag [90].
For instance, for UV-C-irradiated prokaryotes, as summarized previously [91], the mathematical dose-response models, which describe the probability of a specific biological response at a given dose, can be represented as follows:
… (t − t_c)] for t ≥ t_c (D)
where:
* N and N0 represent the microorganisms surviving at time t and those initially present at time t = 0, respectively.
* α is a parameter proportional to the applied UV-C intensity and depends on the sensitivity of the microorganism to UV-C exposure.
* t_c is the time during which microorganisms are substantially not inactivated.
* F0 represents the most resistant fraction of a population of microorganisms, characterized by a lower sensitivity to UV-C exposure than the less resistant fraction (1 − F0).
RadioP1 provides a tool, "START ANALYSIS", for users to generate exponential survival curves [92,93]. In addition, it integrates the previously described R packages "drc" [94] for sigmoid curves and "lethal" [95], which computes lethal doses (LD) with confidence intervals [22]. All curves are supplied with D10 and D50 values.
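To make the relationship between an exponential survival curve and the D10 and D50 values concrete, here is a minimal Python sketch. It assumes a purely exponential model with no shoulder or tail, log10(N/N0) = −D/D10, fitted by least squares through the origin, so that D50 = D10 × log10(2). It is not the code behind "START ANALYSIS", which relies on the R packages cited above; the function name fit_exponential_survival and the data are made up for illustration.

```python
import math
from typing import Sequence, Tuple


def fit_exponential_survival(
    doses: Sequence[float], survival: Sequence[float]
) -> Tuple[float, float]:
    """Fit log10(S) = -D / D10 through the origin and return (D10, D50).

    doses    : absorbed doses (any unit, e.g. kGy); at least one must be > 0
    survival : surviving fractions N/N0, each in (0, 1]
    """
    logs = [math.log10(s) for s in survival]
    # Least-squares slope of a line forced through (0, 0), since S(0) = 1
    slope = sum(d * y for d, y in zip(doses, logs)) / sum(d * d for d in doses)
    d10 = -1.0 / slope             # dose giving 10 % survival
    d50 = d10 * math.log10(2.0)    # dose giving 50 % survival
    return d10, d50


# Made-up data: doses in kGy and fractions that follow D10 = 2 kGy exactly
doses = [1.0, 2.0, 4.0, 6.0]
survival = [10 ** (-d / 2.0) for d in doses]
d10, d50 = fit_exponential_survival(doses, survival)
print(f"D10 = {d10:.3f} kGy, D50 = {d50:.3f} kGy")
```

Under this assumed model, an organism would meet the 1 kGy criterion for an RP whenever the fitted D10 exceeds 1. Curves with a pronounced shoulder or a resistant tail need one of the other models mentioned above, and the through-origin fit shown here would then misestimate D10.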

Future directions
RadioP1 is a specialized database intended as a comprehensive repository of identified RP with experimentally determined D10 values. It is complemented by data extraction and analysis tools to support further analysis of RP. Researchers are kindly requested and encouraged to enrich RadioP1 by depositing their new results (D10 values of RP). Submission can be performed either through the "Submit new RP with a D10" form, accessible from the IRRP main page, or by e-mail to the corresponding authors. In the future, we intend to include more detailed information about RP in the areas of evolutionary biology, biotechnology and theranostics. Additional data sources such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and COGs will be integrated to extract further information about gene functions, clusters and pathways, helping users to categorise genes of interest into functional units and to analyse RP genomes more efficiently.

Funding
This work was supported by the Tunisian National Center for Nuclear Sciences and Technology (CNSTN), the Pasteur Institute (Tunis) and a bilateral cooperation project coordinated by Dr Haïtham Sghaier from Tunisia and Dr Houria Ouled-Haddar from Algeria.