Safety Aspect of Recombinant Protein Produced by Escherichia coli: Toxin Evaluation with Strain and Genomic Approach

Escherichia coli is a Gram-negative bacterium that is well known for its pathogenic properties and can cause serious food poisoning, most often indicated by diarrhea or other severe symptoms. Despite this reputation, which stems from the ability of some strains to produce toxins, most E. coli strains are harmless and even beneficial, especially in recombinant protein production. The bacterium is a suitable host for recombinant proteins because it grows rapidly, expresses proteins at a high rate, and has a well-characterized genome. Various proteins have been produced using E. coli expression systems, most notably therapeutic proteins for medical applications. In addition, our group has produced beta-galactosidase from a wild-type E. coli strain, B130, and recombinant human serum albumin using E. coli BL21 (DE3). However, studies of E. coli toxin contamination in recombinant protein production, together with an understanding of the strains and genomes involved, are indispensable, particularly for therapeutic proteins. Therefore, this chapter discusses the safety of recombinant therapeutic proteins with respect to toxin contamination, using strain and genomic approaches.


Introduction
Escherichia coli is a member of the Enterobacteriaceae family and is found in the gastrointestinal tract [1][2][3]. It is widely known to cause a broad range of diseases, including gastrointestinal disorders. Although E. coli is a normal inhabitant of the colon, a number of its strains produce toxins. Shiga toxin-producing E. coli (STEC) and enterotoxigenic E. coli (ETEC) are groups of strains whose toxins may cause several diseases, such as diarrhea [1,4,5].
Although E. coli may cause numerous gastrointestinal diseases, the strains responsible for these pathogenic properties are relatively few. Most E. coli strains are considered harmless and are even useful as hosts for producing recombinant proteins. The bacterium has become a favored host in industrial and medical applications because it grows rapidly, has well-characterized genetics, grows under both aerobic and anaerobic conditions, and readily forms high cell density cultures (HCDC) [6][7][8].
Discussing the advantages of producing recombinant proteins in E. coli alongside the concern about its toxins is like describing two sides of the same coin. It naturally raises the questions "Is it safe to produce recombinant protein in E. coli? Will the product be free of toxin contamination?" Therefore, this chapter discusses the safety of recombinant proteins produced by E. coli with respect to toxins, using genomic and strain approaches.

Toxin produced by Escherichia coli
Several pathogenic E. coli strains are responsible for a broad range of diseases, from mild diarrhea through hemorrhagic colitis to hemolytic uremic syndrome. Among the pathogenic strains, STEC is a common example; it spans a large number of E. coli serotypes and produces toxins called Shiga toxins (Stx) [1,2,5,9,10,11]. While STEC is a common pathogenic example and overlaps with the group named enterohemorrhagic E. coli (EHEC), numerous other pathogenic E. coli exist and cause different diseases and complications. Pathogenic E. coli are classified in Table 1, along with the diseases they cause and their virulence factors [1].
Given the number of pathogenic E. coli, it is useful to classify the toxins by their properties and structure. Knowing whether a toxin is an organic compound or a peptide-based molecule makes it possible to assess the likelihood of contamination during recombinant protein production. Most of the virulence factors listed in Table 1 are proteins attached to the bacterial membrane that mediate adhesion to, or recognition of, the host cell [12]. By contrast, Shiga toxin, the heat-stable and heat-labile toxins, and other cytotoxins are proteins released by pathogenic E. coli. These toxins bind specific receptors to induce their uptake into the host cell, and their virulence mechanisms differ according to the nature of each toxin and its molecular target [4].
STEC serotypes vary and differ in incidence, although O157:H7 is the serotype considered responsible for numerous outbreaks. Shiga toxin has an AB5 structure (see Figure 1): a catalytic A subunit (StxA) and a homopentamer of B subunits (StxB) that recognizes the glycolipids globotriaosylceramide (Gb3) and globotetraosylceramide (Gb4) on the host cell surface, which leads to internalization of the toxin. STEC can produce Stx1 variants (Stx1 and Stx1c), Stx2 variants (Stx2, Stx2c, Stx2d, Stx2e, and Stx2f), or a combination of both [4,12]. Once the toxin has been internalized, the catalytic A subunit disrupts cell metabolism by inhibiting elongation factor-dependent aminoacyl-tRNA binding (see the detailed mechanism in [4]). Its highly specific RNA N-glycosidase activity cleaves an adenine base from eukaryotic ribosomal RNA, precisely in the α-sarcin loop of the 28S rRNA at position 4324 [4].
Meanwhile, the heat-labile (LT) and heat-stable (ST) toxins belong to the ETEC group. LT enterotoxin shares a similar structure with Stx, as it also adopts an AB5 conformation. The B pentamer binds the toxin to its receptor on the host cell, the ganglioside GM1, and the A subunit acts as the toxin: it activates the guanine nucleotide-binding protein Gsα by ADP-ribosylation, which stimulates secretion through a cAMP-dependent mechanism. Elevated cAMP levels cause the CFTR channel to secrete water and ions, thus producing diarrhea [3]. By contrast, the ST structure is relatively simple. The STa class consists of cysteine-rich peptides of 18-19 amino acids, whereas STb has 48 amino acids. ST exerts its virulence by triggering a signaling cascade through guanylyl cyclase C (GC-C) in the intestine, which leads to secretion of water and ions [13,14]. The structures of both ST and LT are shown in Figure 2.
The fact that both the STEC toxin (Stx) and the ETEC toxins (LT and ST) are peptide based indicates that they are encoded in the DNA of the producing strains and synthesized according to the central dogma of protein synthesis. Therefore, a genomic analysis of recombinant E. coli hosts can be conducted.

E. coli as host for recombinant protein expression
The production of recombinant proteins in microbial systems began in the 1970s and expanded rapidly in the 1980s with the production of insulin. This method has undoubtedly revolutionized and widened the field of biochemistry [19]. The ability to express large quantities of protein with less effort than manual synthesis allows production at commercial scale. However, several factors should be considered before production begins, such as the appropriate vector, the location of the protein of interest (soluble fraction or inclusion bodies), the optimum conditions (pH, medium, temperature, aerobic/anaerobic system), a genetic design that eases purification, and, above all, the choice of microbe [7,8,20].
E. coli has become the preferred microbial host for recombinant proteins among researchers and in industry. The simplicity of its expression system compared with that of higher organisms, and the large body of well-characterized genomic data, are advantages when constructing the vector [20]. The abundance of research on E. coli is a further advantage, as it informs the choice of expression conditions. Nevertheless, the E. coli expression system offers limited post-translational modification, so some proteins that require modifications such as alkylation or glycosylation may not be expressed correctly in E. coli. However, several strains of E. coli are able to perform specific post-translational modifications [19,21]. Table 2 provides a brief summary of recombinant proteins produced in E. coli, along with the strains and expression strategies used.
Among the recombinant proteins listed in Table 2, hEGF and hPT-2 are examples of therapeutic proteins. Because they are intended for medical use, therapeutic proteins produced in E. coli must be safe for administration to humans; purification steps and any contaminants present are therefore a major concern in recombinant protein production. Identifying the location of the target protein is fundamental for determining the source of contamination and predicting possible contaminants. Understanding the protein location also guides the purification strategy needed to separate contaminants, specifically toxins, so that highly pure protein is recovered. Choi et al. [29] classify the locations of proteins expressed in E. coli and the general purification steps required (Figure 3).
[Figure 2, caption fragment: (A) ... [16]; (B) heat-stable enterotoxin, STa class (1etn) [17]; (C) heat-stable enterotoxin, STb class (1ehs) [18].]
The location of the protein is determined by either the nature of the expression system or the design of the protein construct. Extracellular and intracellular expression strategies each have their own advantages and disadvantages. Extracellular expression offers simple purification, improved folding, and soluble products. It can be achieved using a signal peptide, co-expression with phospholipase, or co-expression with a chaperone [33,34]. In contrast, intracellular expression favors the formation of inclusion bodies. Although inclusion bodies are easy to separate and protect the protein from protease degradation, they require complex purification steps and a compulsory refolding process. Fusion partners, such as inteins, are often added to the gene construct in intracellular work to make purification more efficient [21,33].
Toxin contamination can be assessed on the basis of protein location. Stx, LT, and ST are all secreted by E. coli, which increases the risk of contamination when the protein of interest is produced extracellularly. Even so, because extracellular protein is soluble, purification remains feasible. Intracellular expression may raise greater concern, since toxins might be clumped together with the product in inclusion bodies, which calls for more care in the solubilization and purification processes. However, these scenarios are only an assessment of risk factors under the assumption that toxins are produced by the E. coli used for recombinant protein expression.

Safety aspects of recombinant protein production against toxin
With a comprehensive understanding of the origin of the toxins, specifically Shiga toxin and the enterotoxins, it is clear that they are peptide based and encoded by specific genes in STEC and ETEC. The stx gene directs production of Stx through the central dogma in E. coli, as do the genes encoding ST and LT. Moreover, the E. coli strains commonly used for recombinant protein work are known. It is therefore possible to examine the safety of recombinant proteins with respect to toxins by aligning the toxin genes against the genomes of common recombinant E. coli strains. Here, E. coli BL21 (DE3) (ACC: NC_012892) and K-12 MG1655 (ACC: U00096.3) were used as representative strains, and the toxin sequences used were Stx (ACC: AY143336.1), LT (ACC: JQ031712), and ST (ACC: P22542.1).
The common recombinant host strains lack the stx gene. Because these strains are clearly different from the toxin-producing strains, E. coli can be considered safe to use as a recombinant host, provided that other contaminants are not neglected.
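The idea of screening a host genome for a toxin gene can be illustrated with a minimal sketch. The chapter does not name the alignment tool that was used, so the k-mer comparison below is a simplified stand-in rather than the authors' method, and a dedicated alignment program would be used in practice. The function names, the value of k, and the toy sequences are all hypothetical and invented for illustration; they are not the stx gene or the genomes of the strains listed above.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def kmers(seq: str, k: int) -> set:
    """Collect every substring of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(genome: str, gene: str, k: int = 8) -> float:
    """Fraction of the gene's k-mers found on either strand of the genome."""
    gene_kmers = kmers(gene, k)
    if not gene_kmers:
        raise ValueError("gene is shorter than k")
    genome_kmers = kmers(genome, k) | kmers(reverse_complement(genome), k)
    return len(gene_kmers & genome_kmers) / len(gene_kmers)


# ASSUMPTION: toy sequences made up for this example, not real toxin or host DNA.
toy_gene = "ATGAAGTGTATATTATTTAAATGGGTACTGTGCCTGTTA"
host_without_gene = "GCGCGCCGGCCGCGGCGCCGCGGCCGGCGCGCCGGCGC"
host_with_gene = "GCGCGC" + toy_gene + "CGCGCG"

print(shared_kmer_fraction(host_without_gene, toy_gene))  # no shared k-mers
print(shared_kmer_fraction(host_with_gene, toy_gene))     # gene fully present
```

A fraction near zero is consistent with the gene being absent from the host, whereas a fraction near one suggests the gene is present. With real data, the genome and gene strings would be read from the sequence records of the accessions given above.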

Expression and characterization of HSA gene in E. coli BL21 (DE3)
This step started with growing the E. coli BL21 (DE3) [pD881-torA-HSA] transformant as a starter culture at 200 rpm and 37°C for 16-18 hours. The starter culture was then transferred, at 1%, into 25 mL of Luria-Bertani medium containing kanamycin as the selection marker. The E. coli BL21 (DE3) culture was grown until the OD600 reached 0.8, the point of induction. Before induction, a 1 mL sample of the culture was set aside as the pre-induction protein fraction (t0). Induction was initiated by adding L-rhamnose to the expression medium to a final concentration of 4 mM. The cytoplasmic protein fraction was obtained by sonication. Lysates from six E. coli BL21 (DE3) [pD881-torA-HSA] transformant colonies showed that HSA was expressed in the cytoplasm, as indicated by the presence of a protein of approximately 67.0 kDa in SDS-PAGE [28]. The expression result is presented in Figure 4.
Based on SDS-PAGE of the protein produced by the E. coli cells at varying concentrations of the inducer L-rhamnose (Figure 5), the best L-rhamnose concentration for inducing production of the protein of interest was 4 mM, because it yielded more target protein, either in the insoluble fraction of the medium or in the form of inclusion bodies, at t20. The results also indicate that not all of the rhEGF translocated into the periplasm was secreted into the medium. hEGF was expressed in E. coli BL21 (DE3) with a molecular weight of 6.2 kDa. The expression result is presented in Figure 5 [35].
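The volumes implied by the 1% inoculation into 25 mL of medium and the 4 mM final L-rhamnose concentration can be worked out with a short calculation. The sketch below is illustrative only: the 1 M stock concentration is an assumption that does not come from the text, the function names are hypothetical, and the calculation ignores the small volume changes caused by the inoculum and by the 1 mL sample removed before induction.

```python
def inoculum_volume_ml(medium_volume_ml: float, inoculum_percent: float) -> float:
    """Volume of starter culture for a given percentage of the medium volume."""
    return medium_volume_ml * inoculum_percent / 100.0


def inducer_volume_ml(culture_volume_ml: float, final_mm: float, stock_mm: float) -> float:
    """Volume of inducer stock to add so that the mixture reaches final_mm.

    Solves stock_mm * v = final_mm * (culture_volume_ml + v) for v, which
    accounts for the volume contributed by the stock itself.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    return final_mm * culture_volume_ml / (stock_mm - final_mm)


MEDIUM_ML = 25.0            # from the text
INOCULUM_PERCENT = 1.0      # from the text
FINAL_RHAMNOSE_MM = 4.0     # from the text
STOCK_RHAMNOSE_MM = 1000.0  # ASSUMPTION: hypothetical 1 M stock, not from the text

starter_ml = inoculum_volume_ml(MEDIUM_ML, INOCULUM_PERCENT)
rhamnose_ml = inducer_volume_ml(MEDIUM_ML, FINAL_RHAMNOSE_MM, STOCK_RHAMNOSE_MM)
print(f"starter culture: {starter_ml:.2f} mL")
print(f"L-rhamnose stock: {rhamnose_ml * 1000:.0f} uL")
```

With a different stock concentration, only `STOCK_RHAMNOSE_MM` needs to change.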
In addition, our group succeeded in producing beta-galactosidase of high purity from a wild-type E. coli strain, B130. The kinetic parameters of the enzyme, Km and Vmax, were 2.417 × 10⁻⁴ mol and 4.664 × 10⁻⁴ mol·minute⁻¹, respectively [36].
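Km and Vmax are the parameters of the Michaelis-Menten rate equation, v = Vmax·[S] / (Km + [S]). The sketch below shows how the reported values would be used, assuming the enzyme follows simple Michaelis-Menten kinetics, which the text does not state explicitly. The function name is hypothetical, and the substrate amount must be given in the same unit as Km, which is reported here as stated in the source.

```python
def michaelis_menten_rate(substrate: float, km: float, vmax: float) -> float:
    """Reaction rate v = vmax * [S] / (km + [S]) for a substrate level [S]."""
    if substrate < 0 or km <= 0 or vmax < 0:
        raise ValueError("substrate and vmax must be >= 0 and km must be > 0")
    return vmax * substrate / (km + substrate)


KM = 2.417e-4    # reported Km, in the unit given in the text
VMAX = 4.664e-4  # reported Vmax, mol per minute

# By definition, the rate reaches half of Vmax when [S] equals Km.
half_rate = michaelis_menten_rate(KM, KM, VMAX)
print(half_rate, VMAX / 2)

# At substrate levels far above Km, the rate approaches Vmax.
print(michaelis_menten_rate(100 * KM, KM, VMAX) / VMAX)
```

A low Km relative to the working substrate level means the enzyme operates close to Vmax, which is the practical reading of these two parameters.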

Conclusions
E. coli is renowned for its pathogenic properties, specifically for causing gastrointestinal disease. In contrast, the same species is also useful for expressing recombinant proteins. These contrasting properties raise questions about the safety of expressing recombinant proteins in E. coli. Pathogenic E. coli strains have been identified and classified according to the diseases they cause. While most pathogenic groups owe their virulence to membrane proteins, some secrete toxins, such as Stx from STEC or LT and ST from ETEC. These toxin-secreting E. coli are important for understanding the contamination risk in recombinant proteins. All three toxins are peptide based, and their production relies on their respective genes. Aligning the toxin genes against the E. coli strains commonly used in recombinant work provides a way to investigate the presence of toxins in recombinant-host E. coli. The BL21 (DE3) and K-12 MG1655 strains were used as representatives in the alignment, which showed no overlap with the toxin genes. The absence of toxin genes in these strains removes this source of toxin contamination risk for recombinant proteins. Therefore, expressing recombinant proteins, especially therapeutic proteins, in E. coli can be considered safe with respect to these toxins.