Polymerization of Peptide Polymers for Biomaterial Applications


Introduction
The term biomaterial, as defined by a consensus conference on Definitions in Biomaterial Science, is a material intended to interface with biological systems to evaluate, treat, augment or replace any tissue, organ or function of the body.[1] Biomaterials range from simple embedded materials to complex functional devices. There are three primary types of biomaterials: 1) metallic, based on metallic bonds, 2) ceramic, based on ionic bonds, and 3) polymeric, based on covalent bonds.[2] Polymers that are mechanically resistant, degradable, permeable, soluble and transparent have been used in both simple and complex biomaterial applications.[3][4][5] The mechanical properties of poly(esters), poly(amides), poly(amidoamines), poly(methyl methacrylate), and poly(ethyleneimine) are particularly well suited to these applications (Table 1). These polymeric materials have given rise to first- and second-generation biomaterials.[6] 'Next'-generation biomaterials will have less toxic degradation products, undergo hierarchical assembly to form supramolecular structures, and maintain a sustainable design. Current synthetic polymers fall short on each of these points. Their degradation products are acidic and cannot be metabolized by biological systems. This can result in bioaccumulation of these products, offsetting the homeostatic balance of the system. The production of synthetic polymers requires bulk separation and crystallization, which often inhibits the formation of higher-ordered structures. Finally, the source materials for these polymers are petrochemicals. The dependency on petrochemicals presents both environmental and sustainability concerns. The development of materials that overcome these limitations would be a significant advancement in the field of biomaterials.
Biological materials have capabilities that far exceed those of chemically synthesized materials.[21,22] Biopolymers, including polypeptides, can serve as replacement materials for synthetic polymers. Polypeptides such as collagen, elastin, and silk are currently being pursued as next-generation biomaterials (Table 2).[23][24][25][26][27][28] Collagen, a major constituent of bone, cartilage, tendon, skin and muscle, is the most abundant protein in the human body.[29] Several different types of collagen have been identified; these proteins are distinguished by their triple-helical structure. Type I collagen forms supramolecular assemblies. This assembly is controlled by environmental parameters such as concentration, pH, and ionic strength, making it of particular interest as a biomaterial.[29] Type I collagen is approximately 1000 amino acids long and contains a tripeptide (-Pro-Hyp-Gly-)n tandem repeat, where Hyp is a post-translationally modified hydroxyproline. Elastin is an extracellular matrix protein responsible for the extensibility and elastic recoil of blood vessels, ligaments and skin. Elastin is an approximately 70 kDa protein composed of crosslinking domains and elastin domains. Elastin domains are composed of repeating poly(Gly-Val-Gly-Val-Pro)n and poly(Gly-Val-Gly-Val-Ala-Pro)n sequences. These domains undergo an inverse temperature transition in which the protein forms a crystalline state on raising the temperature and redissolves on lowering the temperature.[30,31] Silks are fibrous proteins spun by silkworms and spiders. They have a range of functions, including cocoons that protect eggs or larvae, draglines that support spiders, and webs that can withstand the high impact of insect prey.[32] The mechanical strength of dragline spider silk has led to its use in several biomedical applications. Dragline spider silk is a protein of around 300 kDa composed of two repeating domains: poly(Ala)n and poly(Gly-Gly-Gly-Xaa-Gln-Tyr)n, where Xaa can be any amino acid. The strength of this polymer results from the highly crystalline structure of the poly(Ala) domains, while the amorphous structure of the poly(Gly-Gly-Gly-Xaa-Gln-Tyr) repeat allows for flexibility.[33,34] All three of these high molecular weight polypeptides are composed of highly repetitive amino acid sequences. The present review is concerned with the synthesis of polypeptides and how these polypeptides will be used in the production of new biomaterials.

Table 2. Amino acid sequences of polypeptides being used as biomaterials
Polypeptide | Repeat sequence | Molecular weight (kDa) | Ref.
Elastin | (-Gly-Val-Gly-Val-Pro-)n or (-Gly-Val-Gly-Ala-Pro-)n | ~70 | [30,31]
Spider silk | (-Ala-)n or (-Gly-Gly-Gly-Xaa-Gln-Tyr-)n | ~300 | [33,34]

Polymer
The synthesis of polypeptides requires the formation of amide bonds between amino acid monomers. Amide bonds are formed by a condensation reaction between a carboxylic acid and an amine. The conventional chemical method for amide bond synthesis requires the activation of the carboxyl group followed by a nucleophilic attack by a free amine. This requires the presence of coupling reagents, base and solvents (Figure 1A).[35] When this condensation reaction occurs between two amino acids, the resulting amide bond is a peptide bond. Solid phase peptide synthesis (SPPS), developed by Merrifield, provides a fast and efficient way to synthesize polypeptides. In SPPS, the C-terminal amino acid is attached to a solid matrix through its carboxyl group, and its amine group is protected (Figure 1B). The amine group undergoes a deprotection step, revealing a free N-terminal amine. This is followed by a coupling reaction between the activated carboxyl group of the next amino acid and the amine group of the immobilized residue. The side chains of several amino acids contain functional groups that may interfere with the formation of amide bonds and must be protected. This process continues through iterative cycles until the polypeptide has reached its desired chain length. The polypeptide is then cleaved from the resin and purified.[35][36][37] SPPS has enabled the synthesis of oligopeptides of ~40 to 50 amino acid residues. However, limitations in chemical coupling efficiency have made it impractical to synthesize longer polypeptides with reasonable yields.
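The effect of coupling efficiency on chain length can be illustrated with a simple calculation. The sketch below assumes that every coupling cycle proceeds with the same efficiency and that losses from deprotection, cleavage and purification are ignored; the function name and the 99% efficiency are illustrative assumptions, not values taken from the cited work.

```python
def spps_overall_yield(n_residues: int, step_efficiency: float) -> float:
    """Theoretical fraction of full-length chains after stepwise synthesis.

    Assumes (for illustration only) an identical efficiency for every
    coupling cycle and no other losses.
    """
    if n_residues < 1:
        raise ValueError("n_residues must be at least 1")
    if not 0.0 <= step_efficiency <= 1.0:
        raise ValueError("step_efficiency must lie between 0 and 1")
    # The first residue is preloaded on the resin, so a chain of
    # n residues needs n - 1 coupling cycles.
    return step_efficiency ** (n_residues - 1)


if __name__ == "__main__":
    # Hypothetical 99% efficiency per cycle, compared across chain lengths.
    for n in (10, 50, 100, 300):
        print(f"{n:>4} residues: {spps_overall_yield(n, 0.99):.1%} full-length product")
```

Because the yield falls geometrically with the number of cycles, even a high per-step efficiency leaves little full-length product for chains of several hundred residues, which is the range of the structural proteins discussed above.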

Peptide bonds, the key chemical linkage in proteins, are synthesized not only chemically but also biologically. In this process the primary structure of a protein is encoded in a gene, a short sequence of deoxyribonucleic acid (DNA). A two-step process of transcription and translation is required to extract this information from a gene. Transcription is the step in which the genetic information encoded in the DNA is transcribed into messenger ribonucleic acid (mRNA) in the form of a non-overlapping, degenerate triplet code. This process occurs in three stages: initiation, chain elongation, and termination. Initiation occurs when the RNA polymerase binds to the promoter sequence on the DNA strand. Elongation begins as the RNA polymerase is guided along the template DNA, unwinding the double-stranded DNA molecule and synthesizing a complementary single-stranded RNA molecule. Finally, transcription is terminated with the release of RNA polymerase from the template DNA. Translation involves the decoding of the mRNA to specifically and sequentially link together amino acids in a growing polypeptide chain. Decoding of mRNA occurs in the ribosome, a macromolecular complex composed of nucleic acids and proteins. mRNA is read in three-nucleotide increments called codons; each codon specifies a particular amino acid in the growing polypeptide chain (Figure 2). Ribosomes bind to mRNA at the start codon (AUG), which is recognized by the initiator tRNA. Ribosomes then proceed to the elongation phase of protein synthesis. During this stage, complexes composed of an amino acid linked to tRNA sequentially bind to the appropriate codon in the mRNA by forming complementary base pairs with the tRNA anticodon. The ribosome moves from codon to codon along the mRNA. Amino acids are added one by one, translated into polypeptide sequences dictated by DNA and represented by mRNA. At the end, a release factor binds to the stop codon, terminating translation and releasing the complete polypeptide from the ribosome.
The advent of recombinant DNA technology has allowed the facile, large-scale production of many polypeptide sequences. However, polypeptides containing highly repetitive sequences are difficult to express recombinantly due to undesirable elements in the secondary structure of the mRNA.[38][39][40][41] The development of techniques to synthesize high molecular weight polypeptide sequences in a high-yield, low-impact manner will significantly advance the field of biomaterials (Figure 3). The current methods of polypeptide synthesis, namely SPPS and the recombinant DNA technique, have significantly improved the field of biomaterials. The 'next' generation of biomaterials will require high yields of high molecular weight, repetitive sequences bearing modified amino acids. Below we review recent advances in 1) chemoenzymatic synthesis by proteases, 2) peptide synthesis by nonribosomal peptide synthetases (NRPSs), 3) oligomerization of peptides by amino acid ligases, and 4) native chemical ligation, in the context of biomaterial production.
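The codon-by-codon decoding described for translation can be sketched in a few lines. The example below is a minimal illustration that assumes the standard genetic code and a single reading frame starting at the first AUG; the function name and the test sequence (one elastin-like Gly-Val-Gly-Val-Pro repeat) are illustrative and do not model tRNA binding, ribosome kinetics or mRNA secondary structure.

```python
from itertools import product

# Standard genetic code written in the conventional U, C, A, G order;
# '*' marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Return the one-letter peptide encoded from the first AUG onward."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    # Read non-overlapping triplets until a stop codon or the end of the mRNA.
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    # 5' leader + AUG + Gly-Val-Gly-Val-Pro codons + stop + 3' trailer
    message = "GG" + "AUG" + "GGU" + "GUU" + "GGU" + "GUU" + "CCU" + "UAA" + "GG"
    print(translate(message))
```

Tandem repeats of such a coding unit give the highly repetitive mRNA sequences that are reported above to be difficult to express recombinantly.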

Chemoenzymatic synthesis of polypeptides
In 1898 van 't Hoff postulated that, based on the principle of reversibility of chemical reactions, proteases would catalyze peptide synthesis. Bergman et al. actualized this in 1938 by successfully demonstrating the protease-mediated synthesis of Leu-Leu and Leu-Gly dipeptides.[42][43][44] Under standard conditions proteases hydrolyze peptide bonds (Figure 4). The active sites of proteases are composed of subsites located on either side of the catalytic site. The geometry and electrostatic potential of these subsites influence the substrate specificity of these enzymes (Table 3). Protease-mediated polypeptide synthesis proceeds through either a thermodynamically controlled synthesis (TCS) or a kinetically controlled synthesis (KCS).[45] TCS is a reversal of hydrolysis and requires conditions that shift the equilibrium towards synthesis. Any protease is suitable for TCS; the protease increases the rate at which equilibrium is established but does not alter the final equilibrium.[46] TCS is a two-step process (Figure 5). The first step is an exergonic process in which a proton is transferred to -COOH and -NH2. The second step is an endergonic condensation. Reaction conditions should be selected to ensure optimal catalytic activity, and they must also be manipulated to increase the product yield. Strategies such as product precipitation and the introduction of organic or water-immiscible solvents have been employed to favor synthesis. In KCS the protease acts as a transferase; it mediates the transfer of an acyl group to an amino acid- or peptide-derived nucleophile (Figure 5). The reaction requires an activated C-terminal ester of the substrate and a protease containing a Cys or Ser residue in the catalytic site.
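The thermodynamic argument for TCS can be written as a mass-action expression for the overall condensation. This is a generic textbook sketch, not an equation taken from the cited references: the symbol K_syn is introduced here for illustration, activities are approximated by concentrations, and ionization of the acid and amine is not treated explicitly.

```latex
\mathrm{R{-}COOH} + \mathrm{H_2N{-}R'}
\;\rightleftharpoons\;
\mathrm{R{-}CO{-}NH{-}R'} + \mathrm{H_2O},
\qquad
K_{\mathrm{syn}} =
\frac{[\mathrm{R{-}CO{-}NH{-}R'}]\,[\mathrm{H_2O}]}
     {[\mathrm{R{-}COOH}]\,[\mathrm{H_2N{-}R'}]}
```

The protease shortens the time needed to reach this ratio but leaves K_syn unchanged. Removing dissolved product by precipitation, or lowering the effective water concentration with organic solvents, therefore drives further condensation until the ratio is restored.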
In aqueous conditions, homopolymers of L-amino acids, as well as modified polypeptide homopolymers and heteropolymers, have been synthesized using cysteine or serine proteases.[47][48][49][50][51][52][53][54] This reaction is initiated by the formation of an acyl-enzyme intermediate between the Cys or Ser residue in the active site and the ester group of the modified carboxylic acid. In the presence of a high concentration of substrate, the free amine group of the L-amino acid acts as the nucleophile, resulting in propagation. Water can also serve as the nucleophile, resulting in termination (Figure 6). In aqueous environments, competition between synthesis and hydrolysis is significant, and reaction parameters such as protease activity, pH, buffer capacity, substrate concentration, and reaction time influence product formation. Several proteases hydrolyze the peptide bond between -Lys-Xaa, where Xaa represents any amino acid. These proteases were surveyed for optimal conditions for oligo(L-Lys) synthesis. At pH 7.0 bromelain demonstrated a 76% monomer conversion and an average chain length (DPavg) of 3.5. Under basic conditions both the monomer conversion and the DPavg were reduced, to 10% and 3.0, respectively. pH 10.0 was the optimal condition for trypsin-mediated synthesis of oligo(L-Lys); the monomer conversion was 65% and the DPavg was 2.25. At neutral pH the monomer conversion was 10% and the DPavg was below 2.0.
Upon synthesis, insoluble homopolymers of poly(L-Tyr) and poly(L-Ala) underwent self-assembly to form macromolecular structures.[51,54] Polypeptide crystals have been observed in high molecular weight α-polypeptides synthesized by ring-opening polymerization. Papain-mediated polymerization of L-Tyr ethyl ester at pH 7.0 resulted in a homopolymer with a molecular weight greater than 1,000. After 100 minutes, 100% of the monomer was converted to either the polypeptide or an oligomeric state. The resulting polymer precipitated in a globular, highly crystalline state. Wide-angle X-ray diffraction (WAXD) and scanning electron microscopy (SEM) of the crystals demonstrated that rod-like crystal structures originated in the center and grew radially to a diameter larger than 50 μm.[51] Under alkaline pH, L-Ala ethyl ester was polymerized into poly(L-Ala) with a higher molecular weight than the product synthesized at neutral pH. These higher molecular weight polymers showed distinct β-sheet formation and were capable of fibril assembly. This was in stark contrast to the lower molecular weight polymers, which were composed mostly of random coils and formed submicron aggregates.
Chemoenzymatic synthesis of low molecular weight polypeptides has been successful in the production of cholecystokinin and aspartame in non-aqueous media.[55,56] However, the bulk production of high molecular weight homo- or heteropolypeptides has yet to be realized. Advances in enzyme and media engineering will significantly advance this field.
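The two figures of merit quoted for the protease-mediated oligomerizations, monomer conversion and DPavg, can be computed from a measured chain-length distribution. The sketch below assumes that DPavg is a number-average over the chains supplied by the caller; the function names and the example distribution are hypothetical and are not data from the cited studies.

```python
def monomer_conversion(initial_monomer: float, remaining_monomer: float) -> float:
    """Fraction of the initial monomer that has been consumed."""
    if initial_monomer <= 0:
        raise ValueError("initial_monomer must be positive")
    if not 0 <= remaining_monomer <= initial_monomer:
        raise ValueError("remaining_monomer must lie between 0 and initial_monomer")
    return 1.0 - remaining_monomer / initial_monomer


def number_average_dp(chain_counts: dict[int, float]) -> float:
    """Number-average degree of polymerization.

    chain_counts maps a chain length (residues per chain) to the number,
    or molar amount, of chains with that length.
    """
    total_chains = sum(chain_counts.values())
    if total_chains <= 0:
        raise ValueError("chain_counts must contain at least one chain")
    total_residues = sum(length * count for length, count in chain_counts.items())
    return total_residues / total_chains


if __name__ == "__main__":
    # Hypothetical product distribution: chain length -> relative amount.
    distribution = {2: 30.0, 3: 25.0, 4: 20.0, 5: 15.0, 6: 10.0}
    print(f"DPavg = {number_average_dp(distribution):.2f}")
    print(f"conversion = {monomer_conversion(100.0, 24.0):.0%}")
```

Whether unreacted monomer and hydrolyzed free amino acid are counted as chains of length one changes the average, so the convention should be stated whenever DPavg values are compared between enzymes.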

Nonribosomal peptide synthetase
Nonribosomal peptide synthetases (NRPSs) represent another enzymatic approach to synthesizing polypeptides. NRPSs are multimodular complexes of enzymes found in lower organisms that assemble secondary metabolites such as polypeptides, polyketides, and fatty acids (Figure 7).[57] Each NRPS module is subdivided into four catalytic domains: 1) the adenylation domain (A-domain), 2) the peptidyl carrier protein domain (PCP-domain), 3) the condensation domain (C-domain), and 4) the thioesterase domain (TE-domain) (Figure 7).[57] Initiation occurs in the A- and PCP-domains. A-domains serve as 'gatekeepers', recognizing specific amino acids (or hydroxy acids) and activating their carboxyl group in an ATP-dependent manner. The A-domain has broad substrate recognition and allows for the facile incorporation of modified amino acids.[58,59] The activated amino acids are transferred to the PCP-domain, where they are covalently tethered to the 4′-phosphopantetheinyl cofactor.[60] Propagation takes place at the C-domain, where the amino acids are linked via a condensation reaction. Termination occurs at the TE-domain through either a hydrolysis or a cyclization reaction, resulting in a linear or cyclic polypeptide, respectively.[61] NRPSs have enabled significant advances in the synthesis of cyclic polypeptides, and the ability to introduce modified amino acids will assist in the development of scaffolds for biomaterials. However, NRPSs produce low molecular weight polypeptides and require large enzymatic complexes, which are difficult to use in large-scale production of polypeptides.
[65][66][67] The reaction mechanism of these synthetases requires an ATP-dependent ligation of the carboxyl group of one amino acid with the amino or imino group of a second. These enzymes all belong to the carboxylate-amine/thiol ligase superfamily.[68] This superfamily is identified by structural motifs corresponding to the phosphate-binding loop and a Mg2+ binding site located within the ATP-binding domain.[68] Cell-free biochemical studies suggest the existence of dipeptide synthetases, which form α-peptide linkages between L-amino acids.[73,74] Tabata et al., using in silico screening, identified the L-amino acid ligase gene (YwfE) in Bacillus subtilis.[69] Recombinant YwfE protein (also reported as BacD in the literature) demonstrated α-dipeptide synthesis activity using unprotected L-amino acids as substrates. YwfE contains an ATP-binding domain; however, this domain showed no sequence homology with aminoacyl-tRNA synthetases or the A-domains of NRPSs. The crystal structure of YwfE was refined to 2.5 Å and is superimposable on D-alanine-D-alanine ligase (DDL) from E. coli (PDB ID: 2DLN).[75] YwfE is divided into three domains: an N-terminal domain, a central domain and a C-terminal domain. The N-terminal and central domains are structurally similar to those of DDL, while the C-terminal domain is 100 residues longer and contains additional antiparallel β-sheets.[75] Critical differences were observed in the active site cavity of YwfE when compared to DDL: the binding mode of the dipeptide moiety, and the size and electrostatic potential of the active site. These differences contribute to the substrate preference for L-amino acids and play a critical role in the stabilization of the tetrahedral intermediate state, as proposed for the DDL mechanism.[75,76] The specificity of YwfE is currently limited to non-bulky, neutral residues at the N-terminus and bulky, neutral residues at the C-terminus.
Using YwfE as a template sequence, new L-amino acid ligases (L-AALs) have been identified in silico (Table 4).[70][71][72] RizB from B. subtilis NBRC3134 mediated the synthesis of homo-polypeptides of branched-chain L-amino acids and L-Met.[77][78][79][80][81] RizB also synthesized heteropeptides with high specificity at the N-terminus and relaxed specificity towards the C-terminus.[70] RizB polypeptide synthesis is similar to that of other L-AALs; however, RizB uses both amino acid monomers and oligopeptides as its substrates, while other L-AALs use only amino acid monomers. In a subsequent study, RizB was used as a template sequence to identify additional L-AALs that synthesize high molecular weight polypeptides.[71] spr0969 and BAD_1200, from Streptococcus pneumoniae and Bifidobacterium adolescentis, respectively, were identified as RizB homologs. spr0969 showed a modest improvement in polypeptide chain length: spr0969 polymerization of Val resulted in six repeat units, while RizB polymerization showed only four repeats. BAD_1200 was more promiscuous than the other L-AALs investigated. Its activity towards branched-chain amino acids was lower than that of RizB, but BAD_1200 also polymerized aromatic amino acids.[71]

Ligase
Ribosomal protein S6 from Escherichia coli undergoes a unique post-translational modification in which up to six glutamic acid residues are ligated to the C-terminus.[82][83][84] RimK, also a member of the carboxylate-amine/thiol ligase superfamily, mediates this post-translational modification.[84] In vitro, RimK synthesized α-poly(L-Glu) with a maximum length of 46 residues at pH 9.0 and 30 °C. The maximum chain length was pH dependent. Furthermore, RimK demonstrated strict substrate specificity for Glu.[72] The data reported in these manuscripts demonstrate the enzymatic synthesis of homo- and heteropolypeptides. Data mining has expanded both the diversity and the chain length of the resultant polypeptides. Recently, the crystal structure of YwfE has been reported, which will assist in the development of protein design studies. Engineering the reaction media for enhanced solubility of the polypeptides may influence the molecular weight.

Native chemical ligation
Native chemical ligation (NCL) allows the combination of two unprotected peptide segments by the reaction of a peptide bearing a C-terminal thioester with a peptide bearing an N-terminal cysteine in a two-step process (Figure 8). The first step is a transthioesterification in which the thiolate group of peptide 2 attacks the C-terminal thioester of peptide 1 under mild reaction conditions (aqueous buffer, pH 7.0, 20 °C), resulting in a thioester intermediate. This intermediate rearranges by an intramolecular S-to-N acyl shift that results in a peptide bond at the ligation site.[85] NCL has been employed for the synthesis of biomaterials such as type I collagen and hydrogels. The length of type I collagen far exceeds the length that can be synthesized by SPPS.[86] Recombinant expression of this protein is limited due to the high number of hydroxyproline residues.[87] Paramonov et al. were able to overcome these limitations by employing an NCL strategy.[88] Low molecular weight (Pro-Hyp-Gly)n repeats bearing an N-terminal Cys residue and a C-terminal thioester were prepared by SPPS. Following NCL, polymeric material with Mw = 28,000, Mn = 12,000, and PDI = 2.3 was observed. Circular dichroism studies demonstrated that the secondary structure of the ligated collagen was in agreement with that of native collagen. TEM studies of the ligated collagen revealed a dense network of fibers with diameters in the nanometer range and lengths of microns. These high molecular weight products indicated that the collagen was self-assembling in a fashion similar to natural collagen.
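The three quantities reported for the ligated collagen are related through the standard definitions of the number-average molecular weight (Mn), the weight-average molecular weight (Mw) and the polydispersity index (PDI = Mw/Mn). The sketch below applies these definitions to a discrete distribution; the function names and the two-component example are illustrative assumptions, and only the final check uses the Mw and Mn values quoted above.

```python
def number_average_mw(distribution: dict[float, float]) -> float:
    """Mn = sum(N_i * M_i) / sum(N_i) for {molar mass M_i: chain count N_i}."""
    total_chains = sum(distribution.values())
    if total_chains <= 0:
        raise ValueError("distribution must contain at least one chain")
    return sum(mass * count for mass, count in distribution.items()) / total_chains


def weight_average_mw(distribution: dict[float, float]) -> float:
    """Mw = sum(N_i * M_i**2) / sum(N_i * M_i)."""
    total_mass = sum(mass * count for mass, count in distribution.items())
    if total_mass <= 0:
        raise ValueError("distribution must have a positive total mass")
    return sum(count * mass ** 2 for mass, count in distribution.items()) / total_mass


def polydispersity_index(mw: float, mn: float) -> float:
    """PDI = Mw / Mn; equal to 1 for a perfectly uniform sample."""
    if mn <= 0:
        raise ValueError("mn must be positive")
    return mw / mn


if __name__ == "__main__":
    # Hypothetical two-component mixture: molar mass -> number of chains.
    mixture = {10_000.0: 1.0, 30_000.0: 1.0}
    mn = number_average_mw(mixture)
    mw = weight_average_mw(mixture)
    print(f"Mn = {mn:.0f}, Mw = {mw:.0f}, PDI = {polydispersity_index(mw, mn):.2f}")

    # Consistency check on the values quoted in the text.
    print(f"Reported collagen: PDI = {polydispersity_index(28_000, 12_000):.1f}")
```

A PDI well above 1 indicates a broad distribution of chain lengths, as would be expected when ligation of bifunctional segments proceeds in a step-growth fashion rather than to a single defined product.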
Hydrogels are hydrophilic cross-linked polymer networks that can retain water while maintaining a distinct three-dimensional shape.[89] 'Smart' hydrogels have been designed that can shrink, swell or degrade in response to environmental factors. These hydrogels are used for tissue engineering, surgical adhesives and drug delivery.[89][90][91] Polypeptide hydrogels utilize protein scaffolds such as coiled-coil domains and four-helix bundles.[92] These scaffolds do not meet the mechanical robustness required for many biomaterial applications. To enhance these mechanical properties, peptide-based hydrogels were modified with β-sheet-forming peptides by NCL.[93] The storage modulus of the hydrogel with the ligated β-sheet was 48.5 kPa, five times higher than that of the unmodified gel. The hydrogels with the increased stiffness were able to support higher proliferation rates of primary human umbilical vein endothelial (HUVE) cells than the more elastic hydrogels.
NCL has been a useful tool in the development of these biomaterials. However, to further expand this technique for the development of higher molecular weight polypeptides, significant hurdles must be addressed. High molecular weight polypeptides would require iterative ligations with peptides bearing both an N-terminal cysteine and a C-terminal thioester. Under typical NCL conditions the expected product would be the cyclic peptide, preventing the formation of larger products.[94]

Conclusion and future perspectives
As noted earlier, amide bonds are the key chemical linkage in polypeptides. In 2007 the American Chemical Society Green Chemistry Institute named amide bond formation as a top challenge for organic chemistry.[95] Since then, several new amide bond synthesis reactions have been developed. These methods are less expensive and more environmentally friendly. Further development of these reactions, and their transition from small molecules to macromolecules, is a promising route towards polypeptide biomaterials.
Polypeptides represent a class of molecules that are uniquely qualified to serve as biomaterials. They undergo self-assembly to form macroscopic structures and are synthesized from renewable resources. Chemoenzymatic synthesis, identification of new enzyme sequences and native chemical ligation have advanced the more traditional routes of polypeptide production. Despite the successes outlined above, these techniques have been modest in their production of new biomaterials. Progress in the development of 'next'-generation biomaterials will require media and protein engineering, as well as combinations of the methods reviewed above. One of the major limitations in the chemoenzymatic synthesis of high molecular weight polypeptides is the solubility of the product. Short-chain polypeptides composed of hydrophobic amino acids precipitate out of aqueous solution, resulting in a phase separation between enzyme and substrate. Engineering the media to enhance polypeptide solubility will help achieve higher molecular weight products. Screening of organic solvents, as well as of different immobilization techniques, needs to be investigated.

Figure 1. Chemical synthesis of polypeptides. A) The chemical synthesis of an amide bond requires the activation of the carboxylic acid. B) Solid phase peptide synthesis.

Figure 2. Cartoon representation of polypeptide synthesis on the ribosome.

Figure 3. Synthesis challenges facing the use of polypeptides as biomaterials.

Figure 4. Protease-mediated hydrolysis of peptide bonds. A) Hydrolytic reaction scheme. B) The protease active site is composed of subsites (S). Each S has an affinity for residues (P). This "lock and key" mechanism dictates protease specificity.

Figure 5. Protease mediated polypeptide synthesis occurs either through a thermodynamically controlled mechanism or a kinetically controlled mechanism.

Figure 6. Ser/Cys protease-mediated polypeptide synthesis. An amino acid ethyl ester, modified at the carboxy terminus, serves as the monomer. Initiation occurs upon the formation of the acyl-enzyme intermediate. In the presence of high concentrations of the substrate, the amino terminus of the monomer acts as the nucleophile, resulting in chain propagation. A water molecule can also act as the nucleophile, resulting in termination of the reaction.

Figure 7. General scheme of nonribosomal peptide synthesis (NRPS). Each NRPS module incorporates one amino acid into the growing peptide chain. The modules are composed of several domains: the adenylation domain (red) is responsible for substrate selectivity, the peptidyl carrier protein domain (orange) and condensation domain (green) work synergistically to form the peptide bond, and the thioesterase domain (blue) terminates the reaction, resulting in either a linear or cyclic polypeptide.

Figure 8. Native chemical ligation. The sulfur atom of the N-terminal Cys residue of peptide 2 attacks the C-terminal thioester of peptide 1, producing a thioester intermediate that rearranges to yield a peptide bond.

Table 1. Survey of synthetic and natural polymers used as biomaterials

Table 3. Protease specificity. Arrows indicate the cleavage site and Xaa represents any amino acid.

Table 4. Amino acid ligase homologs, substrate specificity and products