Comparative analysis of DNA repair and recombination pathways in E. coli and H. pylori
Helicobacter pylori is a Gram-negative bacterium found on the luminal surface of the gastric epithelium. Infection is generally acquired during childhood and persists life-long in the absence of antibiotic treatment. H. pylori has a long history of co-evolution with humans, dating back at least to the human migration out of Africa about 60,000 years ago [1, 2]. This co-evolution is reflected in DNA sequence signatures observed in H. pylori strains of different geographic origin and has enabled the mapping of human migration out of Africa. This prolonged and intimate relationship is likely to have shaped the large and diverse repertoire of strategies which H. pylori employs to establish robust colonization and persist in the gastric niche. Key challenges that H. pylori encounters are fluctuations in the acidic pH of the gastric lumen, peristalsis of the mucus layer leading to washout into the lower intestine, nutrient scarcity, and the innate and adaptive immune responses promoting local inflammation or gastritis [3-8]. These challenges, particularly host immune responses, are likely to represent the selective pressure driving H. pylori micro-evolution during transmission, leading to persistence in the human host.
Host defences against H. pylori, including the mechanisms which H. pylori uses to avoid or inhibit an effective host immune response, have been extensively studied; a review of these studies is beyond the scope of this chapter (see reviews [9-24]). Instead, key strategies of H. pylori immune escape, with emphasis on the regulation of inflammation, are succinctly presented in the context of H. pylori persistence. H. pylori has evolved to avoid detection by pattern recognition receptors of the innate immune system, such as toll-like receptors and C-type lectins. Indeed, the TLR4 determinant of H. pylori lipopolysaccharide is a very weak stimulus as a result of its altered and highly conserved lipid A structure [25, 26]. In addition, the lipopolysaccharide O-antigen mimics Lewis antigen expressed on host cells and has been shown to regulate dendritic cell function through its binding of DC-SIGN [27-32]. Mutation of the TLR5 recognition site in the flagellin and the sheath protecting the flagella prevent strong activation of the TLR5 signalling pathway [33-35]. H. pylori inhibits the adaptive immune response by blocking T-cell proliferation at different levels via at least three different factors: the gamma-glutamyltranspeptidase, the cytotoxin VacA and its unique glucosyl cholesterol derivatives (produced from the cholesterol H. pylori extracts from host cells). A recent study on the role of the inflammasome during H. pylori infection unveiled the pro-inflammatory and regulatory properties of caspase-1 mediated by its substrates IL-1β and IL-18, respectively. In light of the acid-suppressive properties of IL-1β, the latter observation exemplifies how seamlessly adapted H. pylori is to its human host in its ability to balance gastric pH and inflammation and to avoid overt gastric pathology, thereby maintaining the physiology of its niche and persisting for decades.
It is therefore interesting to note the higher risk for atrophic gastritis in patients with IL-1β polymorphisms that lead to increased expression of IL-1β [41-43], as elevated IL-1β levels might interfere with the dual role of caspase-1 and promote overt inflammation during H. pylori chronic infection. Further studies on the activation and regulation of the inflammasome are warranted to gain new insights into gastric cancer caused by H. pylori infection.
The scope of this chapter is to review H. pylori genetic and epigenetic plasticity and discuss the hypothesis that this plasticity promotes H. pylori adaptation to individual human hosts by generating phenotypically diverse populations. Emphasis has been put on mathematical modelling of H. pylori chronic infection, its micro-evolution and related mechanisms for the generation of diversity including genetic [45-49] and epigenetic diversity [50, 51]. Mechanisms of horizontal gene transfer and the generation of intra-strain genetic diversity are reviewed and the implication of phasevarion-mediated epigenetic diversity is discussed in the context of bacterial population and adaptation.
Examples of experimental strategies to study and decipher H. pylori persistence are presented and include bacterial genetics combined with the use of animal models as well as H. pylori comparative genomics during chronic and acute infection in humans. The chapter summarises the mechanism of H. pylori micro-evolution, in particular the tension between generation of genetic diversity to adapt and genome integrity. Finally, alternatives to antibiotic treatment by targeting H. pylori persistence are discussed based on the urease enzyme.
2. H. pylori persistence: Mathematical modelling
H. pylori survives in the gastric niche in a dynamic equilibrium of replication and death by manipulating the host immune system to keep a favourable balance that allows for persistence and transmission. Blaser and Kirschner developed an elegant mathematical model of H. pylori persistence based on the Nash equilibrium, specifically that H. pylori uses an evolutionarily stable strategy based on cross-signalling and feedback loop regulation between the host and the bacteria. In this model, a set of interactions between bacteria and the host is defined as well as their corresponding rate parameters. Two populations are considered: the non-replicating free-swimming bacteria in the mucus and the adherent bacteria replicating in a nutrient-rich site. This model predicts clearance of the bacteria in the presence of a strong host immunological response and persistence if the host response is weaker. However, this model does not take into account random fluctuations for stochastic phenotype transitions. H. pylori is likely to exhibit phenotypic and genetic plasticity to adapt to changing gastric environments but it has relatively few sensors of gastric environment change (e.g. pH, immunological responses, receptor availability, and nutrients). H. pylori's apparently limited gene regulation and its small genome suggest alternative adaptive mechanisms, different from exclusive maintenance of active sensory machinery that is costly. Possibilities include small RNA regulation, automatic random genetic switches for generating diverse adaptive phenotypes, exemplified by the frameshift-prone repetitive sequences at the beginning of certain phase variable genes [47, 50, 51], and the numerous duplicate and divergent outer membrane genes, which could be part of a more general gene regulation network, so far unidentified.
Thus further refinement of this model is required to understand the mechanisms involved in establishing the optimal balance between sensing changes and random phenotype switching. Introducing random fluctuations for stochastic phenotype transitions in this model is highly relevant to phase variation and phasevarion, two mechanisms H. pylori uses to generate phenotypic changes and adapt.
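The idea of random phenotype switching can be illustrated with a minimal sketch, assuming a single gene that flips between an "on" and an "off" state with fixed per-generation probabilities. This is not the Blaser and Kirschner model: the function names `simulate_switching` and `expected_on_fraction`, the switching probabilities and the population size are hypothetical choices for illustration, and host feedback, growth and selection are deliberately left out.

```python
import random


def simulate_switching(n_cells, generations, p_on_to_off, p_off_to_on, seed=0):
    """Return the fraction of 'on' cells after each generation.

    Each cell independently flips state once per generation with the given
    probability (a two-state Markov chain per cell; hypothetical parameters).
    """
    rng = random.Random(seed)
    cells = [True] * n_cells  # assumption: the founding population is all 'on'
    history = []
    for _ in range(generations):
        for i, is_on in enumerate(cells):
            p_flip = p_on_to_off if is_on else p_off_to_on
            if rng.random() < p_flip:
                cells[i] = not is_on
        history.append(sum(cells) / n_cells)
    return history


def expected_on_fraction(p_on_to_off, p_off_to_on):
    """Stationary 'on' fraction of the two-state chain."""
    return p_off_to_on / (p_on_to_off + p_off_to_on)


if __name__ == "__main__":
    trajectory = simulate_switching(2000, 500, 0.01, 0.01, seed=1)
    print("simulated final 'on' fraction:", trajectory[-1])
    print("stationary 'on' fraction:", expected_on_fraction(0.01, 0.01))
```

In this toy setting the population drifts from a uniform founding state towards a mixed population whose composition depends only on the ratio of the two switching probabilities, which is the kind of sensor-free diversification the text contrasts with active environmental sensing.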
3. Genetic diversity
The above-mentioned mathematical model based on cross-signalling and feedback loop regulation between the host and the bacteria predicts a unique H. pylori population in every human host. In other words, H. pylori transmission results in adaptation to a specific host during the acute phase of transmission as well as in the chronic phase. The Nash equilibrium model for H. pylori colonization is in line with the genetic diversity of H. pylori populations as the result of human migration out of Africa and with vertical transmission. Indeed, H. pylori strains transmitted within families are genetically less diverse than strains from unrelated infected persons. This highlights the isolation of H. pylori strains within a host and genetic adaptation to human subpopulations. Multi-locus sequence typing analysis has identified six ancestral populations of H. pylori named ancestral European 1, ancestral European 2, ancestral East Asia, ancestral Africa 1, ancestral Africa 2, and ancestral Sahul.
3.1. Intra-strain generation of genetic diversity
Adaptive evolution of species relies on a balance between genetic diversity and genome stability promoted by genome maintenance mechanisms and DNA repair preventing mutations and ensuring cell viability. Intra-strain or intracellular genetic changes have several origins including spontaneous chemical instability of DNA such as depurination and deamination, errors during DNA replication and the action of DNA damaging metabolites, either endogenous or exogenous. The DNA repair machinery is essential to all living organisms and has been best studied for the model organism Escherichia coli. Advances in DNA sequencing technologies and comparative genomics have provided a unique opportunity to better understand genome maintenance beyond the E. coli model organism by comparing the DNA repair gene content in different bacterial species. This is of particular interest for bacterial pathogens that have to overcome immune responses and associated DNA damaging oxidative stress. Comparative genomics of nine human pathogens (Helicobacter pylori, Campylobacter jejuni, Haemophilus influenzae, Mycobacterium tuberculosis, Neisseria gonorrhoeae, Neisseria meningitidis, Staphylococcus aureus, Streptococcus pneumoniae and Streptococcus pyogenes) revealed a reduced number of genes in DNA repair, recombination and replication compared to E. coli.
During replication DNA polymerase encountering DNA damage could either be blocked or continue and introduce a mutation into the daughter strand. Maintenance of the template for DNA replication before the replication fork reaches the DNA lesion is therefore an effective DNA repair strategy employed by the cell to avoid mutation or replication arrest. A blocked replication fork requires the homologous recombination machinery to repair the damaged DNA and to resume replication. DNA template maintenance is achieved through several mechanisms pre- and post-replication:
Direct repair that reverses base damage.
Excision repair that removes the lesion from the DNA duplex. There are three types of excision repair: base excision repair (BER), nucleotide excision repair (NER) and mismatch repair (MMR).
Mismatch repair (MMR) is a post-replication mechanism which contributes to DNA polymerase fidelity by identifying mismatched bases and removing them from the daughter strand.
Recombinational repair that exchanges the isologous strands between the sister DNA molecules.
Table 1 shows that nucleotide excision repair is the only fully conserved repair pathway amongst the nine pathogens mentioned above and that the SOS response related genes are completely missing from H. pylori [46, 54]. Direct repair and mismatch repair are often completely absent whereas base excision repair, recombinational repair and replication (dnaA, dnaB, dnaG, gyrA, gyrB, parC, parE, priA, rep, topA and polA) are often missing one or several genes. This absence of DNA repair and replication genes suggests either that functional homologs remain to be discovered or that specific genome dynamics and genome integrity maintenance strategies are at play in different microbial pathogens to adapt to their niche.
|Pathway||Protein||H. pylori gene||Protein function||E. coli||H. pylori|
|Base excision repair|| || || || || |
| ||Tag|| ||Glycosylase I (adenine)||+||-|
| ||AlkA|| ||Glycosylase II (adenine)||+||-|
| ||MagIII||HP0602||Glycosylase (adenine)||-||+|
|Nucleotide excision repair|| || || || || |
| ||UvrA||HP0705||DNA damage recognition||+||+|
| ||Mfd||HP1541||Transcription-repair coupling factor||+||+|
|Mismatch excision repair|| || || || || |
|Mismatch recognition||MutS1|| ||Mismatch recognition||+||-|
| ||MutS2||HP0621||Repair of oxidative DNA damage||-||+|
| ||MutL|| ||Recruitment of MutS1||+||-|
| ||RecA||HP0153||DNA strand exchange and recombination||+||+|
|RecBCD pathway||RecB (AddA)||HP1553||Exonuclease V, β subunit||+||+|
| ||RecC (AddB)||HP0275||Exonuclease V, γ subunit||+||+|
| ||RecD|| ||Exonuclease V, α subunit||+||-|
|RecFOR pathway||RecF|| ||Gap repair protein||+||-|
| ||RecJ||HP0348||5ʹ-3ʹ ssDNA exonuclease||+||+|
| ||RecO||HP0951||Gap repair protein||+||+|
| ||RecR||HP0925||Gap repair protein||+||+|
|Branch migration||RuvA||HP0883||Binds junctions; helicase (with RuvB)||+||+|
| ||RuvB||HP1059||5ʹ-3ʹ junction helicase (with RuvA)||+||+|
| ||RecG||HP1523||Resolvase, 3ʹ-5ʹ junction helicase||+||+|
|Chromosome dimer resolution|| || || || || |
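The presence and absence calls in the rows of Table 1 shown above can be summarised programmatically. The following is a minimal sketch, assuming that the two +/- columns are ordered E. coli first and H. pylori second (an inference from the text, which states that H. pylori lacks MutS1, MutL, RecD and RecF). The names `TABLE_1` and `partition` are hypothetical, and only the rows visible in this excerpt are included.

```python
# Presence (True) / absence (False) per protein, transcribed from Table 1.
# Tuple order is an assumption: (E. coli, H. pylori).
TABLE_1 = {
    "Tag": (True, False),
    "AlkA": (True, False),
    "MagIII": (False, True),
    "UvrA": (True, True),
    "Mfd": (True, True),
    "MutS1": (True, False),
    "MutS2": (False, True),
    "MutL": (True, False),
    "RecA": (True, True),
    "RecB (AddA)": (True, True),
    "RecC (AddB)": (True, True),
    "RecD": (True, False),
    "RecF": (True, False),
    "RecJ": (True, True),
    "RecO": (True, True),
    "RecR": (True, True),
    "RuvA": (True, True),
    "RuvB": (True, True),
    "RecG": (True, True),
}


def partition(table):
    """Split proteins into E. coli-only, H. pylori-only and shared lists."""
    ecoli_only = sorted(p for p, (e, h) in table.items() if e and not h)
    hpylori_only = sorted(p for p, (e, h) in table.items() if h and not e)
    shared = sorted(p for p, (e, h) in table.items() if e and h)
    return ecoli_only, hpylori_only, shared


if __name__ == "__main__":
    ecoli_only, hpylori_only, shared = partition(TABLE_1)
    print("E. coli only:", ecoli_only)
    print("H. pylori only:", hpylori_only)
    print("Shared:", len(shared), "proteins")
```

Keeping the table as a mapping from protein to a pair of booleans makes it straightforward to extend the comparison to the other pathogens mentioned in the text once their gene content is added.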
H. pylori specific DNA repair and replication pathways and their potential role in colonization, virulence and persistence are discussed below based on experimental evidence.
3.1.1. DNA repair and mutagenesis
The most striking feature of the H. pylori DNA repair gene content is the absence of mismatch repair. A distant homolog of mutS was identified [56, 57] and phylogenetic analysis revealed that this MutS belongs to the MutS2 subfamily of proteins that are not associated with MMR. Functional analysis of H. pylori MutS2 identified a role for this protein in the repair of oxidative DNA damage, and MutS2 is required for robust colonization in the mouse model of H. pylori infection. Deficiency in MMR activity leads to an increase in mutation rate and is known as the mutator phenotype in Enterobacteriaceae and Pseudomonas aeruginosa [60, 61]. The apparent lack of MMR is in line with the H. pylori mutation rate, which is about two orders of magnitude higher than that of E. coli. The H. pylori mutator phenotype could confer genetic diversity and a selective advantage to adapt and persist in the changing gastric niche. Alternatively, the mutator phenotype of H. pylori might promote transmission, as postulated for Neisseria meningitidis based on the observation of a high prevalence of mutations in MMR genes in an N. meningitidis epidemic.
Numerous reports have confirmed H. pylori dependence on DNA repair to establish robust colonization and to persist, suggesting that the human gastric niche induces bacterial DNA lesions.
Only four of the base excision repair proteins (MutY, Nth, Ung and Xth) are present in H. pylori [54, 64-67], in addition to a novel 3-methyladenine DNA glycosylase (MagIII) that defines a new class within the endonuclease III family of base excision repair glycosylases resembling the Tag protein [68, 69]. magIII and xth mutants were identified in a signature-tagged mutagenesis screen based on the mouse model of H. pylori infection, suggesting a role during colonization. Deletion mutants mutY, ung and xth exhibited higher spontaneous mutation frequencies compared to wild-type, with the mutY mutant displaying the highest frequency of spontaneous mutation. mutY mutants colonized the stomach of mice less robustly compared to wild-type, demonstrating a role for MutY in base excision repair in vivo to correct oxidative DNA damage. The presence of an adenine homopolymeric tract in mutY suggests that MutY phase varies. This raises the interesting question of whether H. pylori can vary its mutation rate to adapt to its gastric niche, and highlights the tension between mutation and repair. Deletion of the nth gene also led to hypersensitivity to oxidative stress, reduced survival in macrophages and an increased mutation rate compared to wild-type. The nth mutant also colonized the mouse stomach poorly 15 days post challenge and was almost cleared after 60 days.
Mutants in nucleotide excision repair genes uvrA, uvrB, uvrC and uvrD have been constructed in H. pylori [49, 72, 73] and their UV sensitivity phenotype confirmed their role in DNA repair. Surprisingly, however, uvrA and uvrB mutants had lower mutation rates and recombination frequencies. This phenomenon can be explained by nucleotide exchange of undamaged DNA and was hypothesized to be another mechanism H. pylori uses to generate genetic diversity. Furthermore, uvrC mutation led to an increase in the length of DNA import, suggesting that NER influences homologous recombination. UvrD limited homologous recombination between strains [49, 73]. A mutant deficient in Mfd, the transcription repair coupling factor, was found to be more sensitive to DNA damaging agents, suggesting that H. pylori may also detect blocked RNA polymerase as a damage recognition signal in addition to the DNA distortion recognition properties of UvrA and UvrB. In summary, NER has opposite dual functions: maintenance of genome integrity by excision repair versus generation of genetic diversity by increasing the spontaneous mutation rate and controlling the rate of homologous recombination and the corresponding import length of DNA. Full conservation of the NER pathway in H. pylori contrasts with other lacunar DNA repair pathways and highlights the importance of the dual role of NER for H. pylori during its replication cycle to balance genetic diversity and genome integrity. To date, the role of NER in genetic diversification has not been tested in vivo. Only the mfd mutant was identified in a signature-tagged mutagenesis screen based on the mouse model of H. pylori infection, suggesting a role of NER during colonization.
Finally, recombinant H. pylori overexpressing DNA polymerase I displays a mutator phenotype, suggesting a role of replication in generating genetic diversity. Bacterial DNA polymerase I participates in both DNA replication and DNA repair. H. pylori DNA PolI lacks proofreading activity, elongates mismatched primers and performs mutagenic translesion synthesis. Conversely, the DNA polymerase I deficient mutant exhibited a lower mutation frequency compared to wild-type.
3.1.2. DNA recombination
3.1.2.1. Homologous recombination
Recombination between similar sequences is called homologous recombination (HR). HR participates in DNA repair of double strand breaks and stalled replication forks. It is dependent on RecA, a protein that binds and exchanges single stranded DNA. As depicted in Figure 1, HR is a three-step process involving presynapsis, synapsis and postsynapsis. The presynapsis pathway is dictated by the nature of the DNA substrate. Two categories of proteins prepare the single stranded DNA for binding by RecA. A linear DNA duplex with a double-strand end (that could arise during partial replication of incoming single stranded DNA during conjugation, transduction, or DNA damage) is processed by RecBCD. Gapped DNA (that may form during replication) is processed to single-stranded DNA by RecQ and RecJ, whereas RecFOR inhibits RecQ and RecJ activities to allow RecA binding. The result is a nucleoprotein filament that is ready for the search of homologous sequence in the DNA duplex and RecA-mediated strand exchange once that homologous sequence is found. This synapsis step leads to the formation of a structure termed the D-loop. Postsynapsis involves D-loop branch migration and Holliday junction formation catalysed by RuvAB prior to resolution by RuvC or RusA. RecG has also been shown to be involved in recombination and to catalyse branch migration, in addition to its role in replication fork reversal. Interestingly, RuvC-mediated Holliday junction resolution is biased towards non-crossover, avoiding the formation of a chromosome dimer that requires the Xer/dif machinery for resolution.
H. pylori expresses most of the HR proteins of E. coli, including RecA, AddAB instead of RecBCD, RecOR (lacking RecF and RecQ), RuvABC (lacking RusA), RecG, and XerH/difH for chromosome dimer resolution. The presence of most HR genes in H. pylori suggests that HR plays an important role in H. pylori gastric colonization. Intragenomic recombination in families of genes encoding outer membrane proteins leads to H. pylori cell surface remodelling to adapt to the human host by adjusting bacterial adhesive properties, antigen mimicry [75, 76] and modulation of the immune system. HR was suggested to be the underlying recombination mechanism for homA/homB and galT/Jhp0562 allelic diversity [77, 78], whereas gene conversion (non-reciprocal recombination) is responsible for sabA diversity. Mutants in recA, addA or recG had lower rates of sabA adhesin gene conversion, suggesting that RecA-independent gene conversion exists and that this recombination may be initiated by a double-strand break.
RecA deficient mutants are sensitive to DNA damaging agents such as UV light, methyl methanesulfonate, ciprofloxacin, and metronidazole [80, 81]. RecA was the first HR protein to be characterized in H. pylori and it was found not to complement an E. coli RecA deficient mutant. Lack of cross-species complementation was first attributed to the putative post-translational modification of RecA; however, studies showed that the lack of complementation was due to species-specific interaction of RecA with proteins involved in presynapsis, such as RecA loading on the single stranded DNA by AddAB. RecA's role in vivo is supported by poor colonization of a RecA deficient mutant. Interestingly, RecA was shown to integrate the transcriptional up-regulation of DNA damage responsive genes (upon DNA uptake) and natural competence genes (upon DNA damage) in a positive feedback loop. The interconnection of natural competence and DNA damage through RecA highlights the role of HR in persistence and in generating genetic diversity. Alternatively, and not exclusive to a role in generation of genetic diversity, RecA-mediated genetic exchange might represent a mechanism for genome integrity maintenance in an extreme DNA-damaging environment.
Several gene deletion studies have shown that H. pylori has two separate and non-overlapping presynaptic pathways, AddAB and RecOR, contrasting with the redundancy of RecBCD and RecFOR in E. coli [83, 84]. The single addA mutant and double mutant addA recO exhibit similar sensitivity to double strand break inducing agents, suggesting that AddAB is involved in the double strand break repair pathway, and RecOR in gap repair. Finally, RecOR is involved in intragenomic recombination and AddAB in intergenomic recombination. Both pathways are required in vivo for robust colonization and persistence based on the lower colonization loads of single addA, recO, and recR mutants in the mouse model of H. pylori infection with the double addA recO mutant displaying the lowest bacterial load. As expected for recombinational repair proteins, RecN mediated DNA double strand break recognition and initiation of DNA recombination is also required in vivo for robust colonization .
Resolution of the Holliday junction formed by the action of RecA is performed by RuvABC in H. pylori and recA, ruvB, or ruvC mutants exhibited similar UV sensitivities [86, 87]. Colonization of the recombinational repair mutant, ruvC, was affected and 35 days post-infection the ruvC mutant was cleared by mice. Thus, although dispensable for the initial colonization step, recombinational DNA repair and HR are essential to H. pylori persistent infection. Furthermore, the ruvC deletion mutant elicited a Th1 biased immune response compared to a Th2 biased response observed for wild-type, highlighting the role of homologous recombination in H. pylori immune modulation and persistence .
Unexpectedly, the RecG homolog in H. pylori limits recombinational repair  by competing with the helicase RuvB. Mutation of RecG increased recombination frequencies in line with a role of RecG in generating genetic diversity. The term ‘DNA antirepair’ was coined to highlight the tension between the generation of diversity and genome integrity maintenance for H. pylori adaptation to its niche. Further regulation of homologous recombination is mediated by the MutS2 protein that displays high affinity for DNA structures such as recombination intermediates thus inhibiting DNA strand exchange and consequent recombination . MutS2 deficient cells have a 5-fold increase in recombination rate .
3.1.2.2. Non-homologous recombination
The XerH/difH machinery for chromosome dimer resolution was found to be essential for H. pylori colonization. Deletion of xerH in H. pylori caused: (i) a slight growth defect in liquid culture, as is typical of xer mutants of E. coli, (ii) markedly increased sensitivity to DNA breakage inducing and homologous recombination stimulating UV irradiation and ciprofloxacin, (iii) increased UV sensitivity of a recG mutant, and (iv) a defect in chromosome segregation. The inability of the xerH mutant to survive in the gastric niche contrasts with ruvC mutant colonization and further supports the idea that XerH is not involved in DNA repair but in chromosome maintenance such as chromosome dimer resolution, regulation of replication and possibly in chromosome unlinking. This, in turn, suggests that slow growing H. pylori depends on unique chromosome replication and maintenance machinery to thrive in its special gastric niche.
Rearrangement of the middle region of the cagY gene, independent of RecA , leads to in frame insertion or deletion of CagY and gain or loss of function of the CagA type IV secretion system. Recombination of cagY was proposed to be a mechanism to regulate the inflammatory response to adapt and persist in the gastric niche . To date the exact recombination mechanism involving direct repeats in the middle region of cagY remains unknown.
3.1.3. Phase variation
Host adapted human pathogens, such as H. influenzae, Neisseria species and H. pylori, have evolved genetic strategies to generate extensive phenotypic variation by regulating the expression of surface bound (or secreted) protein antigens that directly (or indirectly) interact with host cells. Phenotypic variation of the bacterial external composition will alter the appearance of the bacterium as sensed by the host immune system. One common regulatory mechanism to achieve antigen diversity within a bacterial population is known as phase variation [94, 95].
In pathogens, simple sequence repeats (SSRs) are tandem iterations of a single nucleotide or short oligonucleotides that, with respect to their length, are hypermutable (Figure 2). Reversible slipped strand mutation/mispairing of SSRs within protein coding regions causes frame shifts, resulting in the translation of proteins that vary between being in-frame (on), producing functional full-length proteins, and out-of-frame (off), where a truncated or nonsense polypeptide is produced. Additionally, SSRs may occur in the promoter region of genes where variation in their length may affect promoter strength by mechanisms such as alteration of the distance between the -10 and -35 elements.
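The on/off logic of an intragenic repeat tract can be sketched in a few lines, assuming that the only thing that matters is whether the tract length differs from an in-frame reference length by a multiple of three nucleotides. The function name `frame_status` and the reference lengths used in the example are hypothetical; real genes may have additional constraints that this sketch ignores.

```python
def frame_status(n_units, in_frame_units, unit_length=1):
    """Return 'on' if the downstream reading frame is preserved, else 'off'.

    n_units        -- current number of repeat units in the tract
    in_frame_units -- a repeat count known to give an in-frame gene (assumed)
    unit_length    -- nucleotides per repeat unit (1 for a homopolymeric
                      tract such as poly-C, 2 for a dinucleotide such as GA)
    """
    shift = (n_units - in_frame_units) * unit_length
    return "on" if shift % 3 == 0 else "off"


if __name__ == "__main__":
    # Hypothetical poly-C tract that is in frame at 14 residues.
    for length in range(11, 18):
        print(length, frame_status(length, in_frame_units=14))
```

Under this assumption a homopolymeric tract returns to the "on" state after gaining or losing three residues, while a dinucleotide repeat needs a change of three units (six nucleotides), so a single slippage event in either direction switches the gene off.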
In H. pylori, phase variation regulates the expression of genes that are likely to be important for adaptation in response to environmental changes and for immune evasion in order to establish persistent colonization of the host. Analysis of DNA sequence motifs based on annotated genomes of H. pylori strains 26695  and J99  revealed substantial occurrence, intra- and intergenic, of homopolymeric tracts and dinucleotide repeats. Certain categories of genes (or their promoter region) were particularly prone to contain SSRs, such as those coding for LPS biosynthesis enzymes, outer membrane proteins and DNA restriction/modification systems, and thus have been identified as possible candidates regulated by phase variation .
Further genome analysis of H. pylori strains 26695 and J99 demonstrated an expanded repertoire of candidate phase-variable genes. In addition to previous sequence motif analyses of the annotated genomes by Tomb and Alm, 13 novel putative phase-variable antigens were identified in silico. Poly-A and poly-T repeats were almost exclusively found in intergenic regions whereas poly-C and poly-G repeats were mostly intragenic. Five classes of gene function were described: i) LPS biosynthesis (seven genes), ii) cell surface associated proteins (22 genes), iii) DNA restriction/modification systems (nine genes), iv) metabolic or other proteins (three genes), and v) hypothetical ORFs with unidentified homology (five genes). This analysis highlights the importance for a bacterial species such as H. pylori of being able to regulate the expression of cell surface antigens that interact directly with a changing environment.
Multi locus sequence typing (MLST) analysis was then applied for analysis of sequence motif variation in 23 H. pylori strains selected on the basis of ethnicity and country of origin (Table 2). Four strain types were investigated, i) hpEastAsia, ii) hpLadakh, iii) hpEurope, and iv) hpAfrica1 and hpAfrica2. In conclusion, approximately 30 genes have been identified as likely phase-variable and it has been postulated that a much higher degree of recombination occurs for genes under constant selective pressure as opposed to more neutral genes such as those encoding ‘housekeeping’ functions . DNA sequence analysis of H. pylori strains indicated that recombination of LPS biosynthesis genes may reflect genetic exchange within the population lineage and that phase variable gene evolution occurs at a high rate .
|Gene function||CDS 26695 (HP)||CDS J99 (jhp)||Repeat||Strain variation¹: +²||Strain variation¹: -³||Strain variation¹: Abs⁴|
|FutC (α-1, 2-fucosyltransferase)*||0093-4||0086||C||23||0||0|
|RfaJ (α-1, 2-glycosyltransferase)||0208||0194||GA||20||0||3|
|RfaJ (α-1, 2-glycosyltransferase)||NH⁵||0820||C||8||11||4|
|β-1, 4-N-acetylgalactosaminyl transferase*||0217||0203||G||21||2||0|
|FliP (flagellar protein)||0684-5||0625||C||6||16||1|
|OM adherence protein||1417||1312||GA||14||0||9|
|Streptococcal M protein||0058||0050||C||12||2||9|
|TlpB (chemotaxis protein)||0103||0095||G||20||0||3|
|BabA (Leb binding protein)||1243||0833||CT||Putative|
|BabB (Leb binding protein homologue)||0896||1164||TC||22||0||1|
|SabA (sialic acid binding adhesin)||0725||0662||TC||17||2||4|
|SabB (sialic acid binding adhesin)||0722||0659||TC||7||16||0|
|PldA (phospholipase A)||0499||0451||G||20||3||0|
|HcpA (cysteine rich protein)||0211||0197||-||Putative|
|HcpB (cysteine rich protein)||0335||0318||G||13||0||10|
|Type III restriction enzyme||1369-70||1284||G||18||3||2|
|Type II R/M enzyme||1471||1364||G||14||1||8|
|Type III R/M enzyme||1522||1411||G||6||0||17|
|HsdR (type I restriction enzyme)*||0464||0416||C||23||0||0|
|MboIIR (type II restriction enzyme)||1366||1422||A||6||12||5|
|FrxA (NADPH flavin oxidoreductase)||0642||0586||G||6||15||2|
3.1.3.1. Lipopolysaccharide (LPS) biosynthesis
All Gram-negative bacterial outer membranes contain a structurally important component called LPS (or endotoxin). H. pylori LPS consists of three major moieties: a lipid A membrane anchor, a core and an O-polysaccharide antigen. Although structurally similar to that of many other Gram-negative bacteria, H. pylori LPS has low immunological activity. The O-polysaccharide chain of the LPS of most H. pylori strains contains carbohydrates that are structurally related to human blood group antigens, such as Lewis a, b, x and y. The structural oligosaccharide pattern of the LPS of some pathogenic bacteria, including H. pylori, is regulated by phase variable fucosyl- and glycosyltransferases, enzymes that transfer sugar residues to their acceptors.
H. pylori strain NCTC 11637 (ATCC 43504, CCUG 17874) expresses the human blood group antigen Lewis x (Lex) in a polymeric form (Lex)n on its core antigen. However, Lex expression is not stable and can lead to different LPS variants in single cell populations. Loss of α1, 3-linked fucose resulted in a non-fucosylated (lactosamine)n core antigen, known as the i antigen, that was reversible. Other LPS variants lost the (Lex)n main chain resulting in the expression of monomeric (Ley)-core-lipid A or had acquired α1, 2-linked fucose expressing polymeric Lex and Ley simultaneously. Most H. pylori isolates have been shown to be able to switch back to the parental phenotype but with varying frequency .
Moreover, poly-C tract length variation causes frame shifts in H. pylori α3-glycosyltransferases that can inactivate gene products in a reversible manner. Serological data suggested that LPS structural diversification arises from phase variable regulation of glycosyltransferase genes, provisionally named futA and futB . Phase variation of futA and futB genes independently has been confirmed and genetic exchange between these loci was shown to occur in single colonies from the same patient and also during in vitro passage .
H. pylori strain NCTC 11637 has also been shown to express blood group antigen H type 1. This epitope demonstrated reversible high-frequency phase variation. Insertional mutagenesis of gene jhp563 (an ORF that contains a poly-C tract and is homologous to glycosyltransferases) in NCTC 11637 produced LPS that lacked the H type 1 epitope. DNA sequence analysis confirmed gene-on and gene-off variation. In H. pylori strain G27, mutagenesis of jhp563 yielded a mutant expressing Lex and Ley, as opposed to the wild-type (H type 1, Lea, Lex and Ley). Jhp563 may encode a β3-galactosyltransferase involved in H type 1 synthesis that phase varies due to poly-C tract changes.
H. pylori ORF HP0208 and its homologues HP0159 and HP1416 show homology to the waaJ gene, which encodes an α1,2-glycosyltransferase required for core LPS biosynthesis in Salmonella typhimurium. HP0208 contains multiple repeats of the dinucleotide 5ʹGA at its 5ʹ end, and expression of its gene product has been predicted to be controlled by phase variation. Most strains examined, including strains 26695, J99 and NCTC 11637, had repeat numbers inconsistent with expression of the gene, i.e. placing the translational initiation codon out-of-frame with the full-length ORF. A ‘phase-on’ HP0208 was constructed in the genome of strain 26695. Tricine gel and Western blot analysis demonstrated a role for HP0208 as well as HP0159 and HP1416 in the biosynthesis of core LPS. It is therefore likely that not only the biosynthesis machinery of the H. pylori LPS O-antigen side chain but also that of the core oligosaccharide is subject to phase variation. These complex processes possibly give rise to the diversification of LPS observed in clonal populations of H. pylori.
Lewis expression in vitro
The α1,2-fucosyltransferase (futC) of H. pylori catalyses the conversion of Lex to Ley, the repeating units of the LPS O-antigen. futC is subject to phase variation through slipped strand mispairing involving a poly-C tract. Single colonies (n=379) from in vitro cultures have been examined for Lewis expression; they showed an equal distribution of Lex and Ley expression, and the phenotypes correlated with futC frame status. The founding population remained, since phenotypes did not change significantly over additional hundreds of generations in vitro.
Two single colonies of the same H. pylori isolate that expressed Ley of different molecular weights retained the wild-type Lewis phenotype after 50 in vitro passages when a larger cell mass was expanded. However, after 50 in vitro passages of single colonies, ~5% of the analysed strains also expressed considerable levels of Lex in addition to low levels of Ley, suggesting reduced expression of futC. Successive in vitro passaging of single colonies thus introduced a much more frequent phenotypic diversification in terms of O-antigen size and Lex expression.
Lewis expression in vivo
With a limited number of laboratory passages of the strains, analysis of the phenotypic diversity of Lewis antigen expression in 180 clonal H. pylori populations from primary cultures of 20 gastric biopsies indicated a substantial difference in Lewis expression in 75% of the patients. The variation of Lewis expression was unrelated to the overall genetic diversity. In experimentally infected rodents, however, Lewis expression was highly uniform. Intra-population diversity of Lewis expression has since been confirmed. H. pylori isolates with identical DNA signatures (arbitrarily primed PCR) from the same chronically infected patient demonstrated variations in the amount and size (length) of the O-antigen, and immunoassays detected exclusively the presence of Ley, suggesting simultaneous expression of both α1,2- and α1,3-fucosyltransferases. LPS diversification has also been investigated in transgenic mice expressing Leb on gastric epithelial cells. The challenging strain expressed a high molecular weight O-antigen and showed a strong antibody response against Ley. More than 90% of the mouse output isolates produced glycolipids of low molecular weight compared with the input strain. Subsequent immunoblot analysis demonstrated decreased or no Ley expression.
Adhesins and cell surface proteins
Expression of bacterial outer membrane proteins can be regulated by environmental changes through signal transduction as well as by the generation of genetic changes controlling protein function. Cell surface associated proteins are the most abundant group of H. pylori proteins that is subject to phase variation. Such proteins include so-called adhesins, flagellar and flagellar hook proteins, pro-inflammatory proteins, cysteine-rich proteins as well as some other categories. With the exception of adhesins, most proteins in this group remain uncharacterized. The outer membrane of H. pylori partially comprises adhesins, which bind to host gastric epithelial cell surface receptors. Gene functions of H. pylori adhesins, many of which belong to the so-called hop gene family, are regulated through phase variation.
The sialic acid binding adhesin (SabA) of H. pylori adheres to glycosphingolipids that display sialyl Lewis x antigens. Such antigens have been shown to be upregulated on human epithelial cell surfaces as a result of gastric inflammation. DNA sequence analysis demonstrated that locus HP0725 (sabA) of H. pylori strain 26695 contained repetitive CT dinucleotides at the 5ʹ end of the ORF. Depending on the repeat number, translation either terminates prematurely (non-functional protein) or yields a full-length functional adhesin. sabA has a promoter poly-T tract as well as the CT repeats at the 5ʹ end of the ORF. Multiple length alleles have been shown to occur in single colonies isolated from the same individual, and genome sequence analyses of isolates of H. pylori strains demonstrated genetic and phenotypic variation of SabA.
The H. pylori protein HopZ is a candidate for involvement in adherence to host gastric epithelial cells, and hopZ is likely phase variable due to a CT dinucleotide repeat in the signal sequence of the gene. Human volunteers have recently been challenged with an H. pylori strain with hopZ ‘off’ status. Out of 56 re-isolated strains (from 32 volunteers at 3 months post-inoculation), 68% had switched to a hopZ ‘on’ status. After 4 years, paired isolates had 54% hopZ ‘on’ status. Sequence analysis of hopZ ‘on’ and ‘off’ status in 54 H. pylori strains representing seven different phylogeographic populations (hpAsia2, hpEurope, hpNEAfrica, hpAfrica1, hpAfrica2, hpEastAsia and hpSahul) and 11 subpopulations demonstrated variability between and within most populations; only two subpopulations (hspAfrica2SA and hspAmerind) were exclusively hopZ ‘off’.
Many H. pylori strains bind to the human blood group antigen Leb. This adherence is mediated by the blood group antigen binding adhesin BabA. Some strains contain two alleles: babA1, which is ‘silent’, and babA2, which expresses the adhesin. Although BabA expression has not been identified as phase variable by DNA sequence analysis, experimental infection of rhesus macaques showed that, in some animals, output strains had lost the BabA expression of the parent challenge strain due to an alteration in dinucleotide CT repeats in the 5ʹ coding region. Output strains from other macaques that had also lost Leb binding had babA exchanged for babB. Duplication of babB has also been observed in human clinical H. pylori isolates. babB and babA have very similar 5ʹ and 3ʹ end sequences, but babB lacks the mid region that codes for the Leb binding epitope. BabB is an uncharacterized outer membrane protein; babB contains repetitive sequence motifs and is likely subject to frameshift-based phase variation [57, 97].
Gene function of hopH (oipA) depends on slipped strand mispairing in a CT dinucleotide repeat in the signal sequence of the gene. HopH has been associated with increased interleukin-8 (IL-8) production in epithelial cells and gastric epithelial adherence in vitro [114, 115]. H. pylori isolates with a hopH ‘in-frame’ status were more common than ‘out-of-frame’ strains in patients with chronic gastritis, a feature that correlated with the virulence factor status of the strains, particularly cagA. In a patient setting, hopH (oipA) ‘in-frame’ status was associated with a higher colonization density of H. pylori and with clinical presentation in terms of gastric inflammation and mucosal IL-8 production.
Acid adaptation
H. pylori can reversibly change its membrane phospholipid composition, producing variants with differing concentrations of lysophospholipids. Lysophospholipid-rich cells are more adherent, secrete more VacA and are more haemolytic. In contrast to neutral culture conditions, growth at low pH (3.5) leads to an accumulation of membrane lysophospholipids. This variation in lipid composition is mediated by phase variation in the phospholipase A (pldA) gene. A change in the C-tract length of the ORF results in either a functional full-length or a truncated non-functional gene product.
The structure and composition of LPS adapt to an acid environment. Under acidic growth conditions (pH 5) in vitro, the LPS core and lipid A moieties seem not to be altered. However, the O-side chain backbone is partially fucosylated, forming Lex, whereas the terminal sugar residues on the O-side chain are modified differently and terminate with Ley instead of Lex.
Immune modulation
H. pylori genomes contain a family of genes coding for proteins designated Helicobacter cysteine-rich proteins (Hcp). HcpA, a secreted protein, has been partially characterized. Recombinant HcpA, as opposed to HcpC, induced maturation of non-adherent human THP-1 monocytes into macrophages (star-like morphology with filopodia) with phagocytic ability and surface adherent properties.
Surfactant protein D (SP-D), a component of innate immunity, is expressed in human gastric mucosa. SP-D has an affinity for simple sugars and likely functions as a mucosal receptor that recognizes pathogen associated molecular patterns (PAMPs). SP-D induces aggregation of microorganisms, facilitating pathogen clearance by neutrophil and macrophage phagocytosis. Some H. pylori strains lack a ligand for SP-D, and this ‘escape’ mechanism is associated with phase variation of the LPS structure. Fucosylation of the O-side chain, determined by slipped strand mispairing in a fucosyltransferase gene and leading to terminal Lex (SP-D binding) or Ley (escape), controls the H. pylori ligand recognized by SP-D.
DC-SIGN, a C-type lectin, is a surface receptor expressed on dendritic cells that captures and aids the internalization of microbial antigens. H. pylori strains that express Lewis antigens on their LPS bind DC-SIGN and thereby block T helper cell (Th) 1 development, whereas some strains that do not express Lewis antigens escape binding to dendritic cells and promote a Th1 response. Phase variation of LPS in terms of Lewis expression may influence host immunity through the dendritic cell pathway. Clonal populations of H. pylori, with a high frequency of subclone phase variation in LPS biosynthesis genes, are proposed to manipulate the Th response for optimal persistence that prevents severe atrophy and destruction of the ecological niche [29, 121].
Experimental infections
The switch status of phase-variable OMP genes of H. pylori has been investigated in an experimental mouse model of infection. The gene switch status of several OMPs (hopH, hopZ, hopO and hopP) influenced H. pylori density in gastric tissue and its ability to colonise mice. If two or more of the genes were in the switch ‘off’ mode, the colonization ability was markedly reduced. These results correlated well with observations in humans, i.e. patients with strains whose hop genes were in the ‘off’ mode had a lower bacterial load.
In a murine model of infection, repeat length and function of 31 phase variable genes of several H. pylori strains were followed for up to one year. At endpoint, 15 genes had a change in repeat length. However, a third of these changes did not lead to an alteration in protein expression. Ten genes demonstrated a frame shift to an ‘on’ mode of the encoded protein. At early time points (3 and 21 days), mixed pldA phenotypes rapidly and exclusively changed to ‘on’ status, followed by LPS biosynthesis genes modifying terminal sugars. Glycosyltransferases modifying LPS core structures remained in the ‘off’ configuration throughout the study. From 21 days onward some OMPs (babB and hopZ) switched from ‘on’ to ‘off’. Restriction/modification systems did not show a particular pattern over time.
BabA expression has been shown to be lost during experimental infection in rhesus macaques, either by allele replacement with babB or by phase variation. A follow-up study investigated this phenomenon in other animal hosts using additional H. pylori strains. Murine and gerbil experimental infection models further demonstrated loss of Leb binding (as a result of babA recombination) of output strains by varying mechanisms. In the mouse, BabA expression was lost due to phase variation in a 5ʹ CT repeat region of H. pylori strain J166.
To conclude, H. pylori randomly exhibits phase variation in sets of genes that directly interact with the environment. These include LPS biosynthesis genes, adhesins and genes with an impact on the structural composition of the bacterial outer membrane. Phase variation also influences the expression of some genes affecting host immune responses. Taken together, these traits are likely to aid a continuous adaptation to the ecological niche and persistence of the bacterium.
Selection pressure in terms of host niche physiology and maturation of host immune response likely contributes to the genetic regulation and diversification of bacterial adherence properties as well as the composition of the outer membrane. For bacteria such as H. pylori that may cause life-long colonization, surface antigen diversity likely requires parallel evolution with host cell subsets over time for continuous adaptation to a dynamic host environment.
3.1.4. Epigenetic diversity: phasevarion
As previously described, phase variation via the high-frequency reversible ON/OFF switching of gene expression is beneficial to pathogenic bacteria, including H. pylori, as a means of rapidly generating the genotypic and phenotypic diversity required for adaptation to the host environment and evasion of the immune system. However, certain evolutionary advantages arise when expression of a repertoire of genes is brought under the influence of a single phase-variable gene, as is the case for a phase-variable regulon or “phasevarion” (Figure 2).
The phasevarion is an epigenetic regulatory system whereby the expression of a set of genes is randomly switched as coordinated by the activity of the modification (Mod) component of a restriction-modification (R-M) system. R-M systems share two components: the restriction (Res) component, which specifically cleaves unmethylated DNA at a recognition sequence, and the Mod component, which methylates the same recognition sequence to prevent cleavage by the Res component. Most R-M systems fall within one of three major families (Type I, II, III). Type II and type III Mod proteins recognise specific DNA sequences, whereas type I Mod proteins require an additional specificity subunit. A role for R-M systems in host-pathogen interaction was unexpected, as these systems generally function to protect the genome integrity of bacteria from invasion by foreign DNA by restriction of DNA that does not share the same modifications as that of the host. However, where the Res component is absent or not functional, phase variation of the Mod component results in the random switching of methylation of DNA sequences recognized by the mod-encoded methyltransferase. For genes where DNA methylation by the phase-variable Mod affects promoter activity, either by altering the DNA binding affinity of regulatory proteins for promoters or by other mechanisms, differential gene expression results.
Although phase-variable R-M systems had previously been identified in a number of bacteria, the first experimental evidence for the phasevarion was from H. influenzae, a pathogen of the upper respiratory tract. The mod gene of the sole type-III R-M system of H. influenzae contains tetranucleotide repeats consistent with an ability to phase vary. Microarray analysis of a mutant in this gene revealed differential expression of 16 genes, including genes implicated in pathogenicity, in the absence of mod. The differential expression of these genes was shown by reporter gene fusions to be dependent on the phase variation of mod. Hence the phase variation of a single gene, mod, was shown to influence the expression of multiple genes, suggesting the presence of a phase-variable regulon, or “phasevarion.”
Phasevarion-mediated gene switching has now been confirmed in H. influenzae, N. gonorrhoeae and N. meningitidis, and H. pylori. This wide distribution, and the allelic diversity of phase-variable methyltransferases, indicates that there is strong selective pressure on phasevarions. It has been postulated that it may be simpler for a gene to evolve to join a phasevarion than to become phase variable. The evolution of phase variation requires the generation of repeat sequences without destroying either promoter or gene function, whereas joining a phasevarion requires only a few key point mutations to generate the methyltransferase recognition site in a region where methylation would affect gene expression. A further evolutionary advantage of the phasevarion may be that it represents an extension of the regulation achieved by phase variation. Rather than randomly and reversibly switching a single gene, the phasevarion switches a whole set of genes, thereby differentiating a bacterial population into two different cell types based on many phenotypic characteristics [50, 130]. This switching between different physiological states, rather than merely individual proteins, may assist the bacterium in taking advantage of microenvironments within the host.
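The contrast between independent phase variation and a phasevarion can be made concrete by enumerating the expression states each scheme allows. The following is a minimal sketch under simplifying assumptions: the gene names are hypothetical, every gene is treated as strictly on or off, and each phasevarion member is assumed to respond to Mod methylation in a fixed direction (activated or repressed).

```python
from itertools import product
from typing import Dict, List

def independent_states(genes: List[str]) -> List[Dict[str, str]]:
    """Every on/off combination when each gene phase varies on its own."""
    return [dict(zip(genes, combo))
            for combo in product(("off", "on"), repeat=len(genes))]

def phasevarion_states(activated_by_methylation: Dict[str, bool]) -> List[Dict[str, str]]:
    """The two coordinated states produced by one phase-variable mod gene.

    activated_by_methylation maps each (hypothetical) member gene to True if
    methylation switches it on, or False if methylation switches it off.
    """
    states = []
    for mod_on in (False, True):
        state = {"mod": "on" if mod_on else "off"}
        for gene, activated in activated_by_methylation.items():
            expressed = mod_on if activated else not mod_on
            state[gene] = "on" if expressed else "off"
        states.append(state)
    return states

if __name__ == "__main__":
    members = {"geneA": True, "geneB": False, "geneC": True}  # illustrative only
    print(len(independent_states(list(members))), "independent combinations")
    for state in phasevarion_states(members):
        print(state)
```

With n independently switching genes the population can in principle occupy 2^n expression combinations, whereas a phasevarion restricts the same genes to two coordinated profiles, which matches the description of two distinct cell types.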
Genome sequencing of H. pylori 26695 and J99 revealed a surprisingly large number of R-M systems (22 in 26695) when compared to other sequenced bacterial and archaeal genomes. Each strain of H. pylori contains its own complement of R-M systems, some of which may have the potential to phase vary due to the presence of repeat regions [47, 97, 131, 132]. Typically, multiple mod genes may be present in any given strain, and there may be multiple different mod alleles for each mod gene within a given species. Diversity of mod genes can be driven by recombination of DNA recognition domains between non-orthologous genes and by horizontal gene transfer [133, 134]. Phylogenetic analysis of clinical isolates of H. pylori revealed a diverse set of 17 alleles of the modH gene that differed in the DNA recognition domain and phase-varying repeat region. This diversity in mod genes indicates corresponding diversity in the set of genes regulated by Mod and also indicates that there may be many phasevarions present in H. pylori. Any R-M system present within the genome with an inactive Res may represent a phasevarion. Phasevarions may therefore represent common epigenetic mechanisms for generating rapid reversible phenotypic diversity in bacterial host-adapted pathogens such as H. pylori.
Functional analysis of the type-II R-M systems in J99 and 26695 revealed that less than 30% were fully functional (with both Res and Mod functional) and that there were many functional Mod enzymes with no apparent functional Res partner [132, 135], suggesting that these mod genes may regulate phasevarions. Repeat sequences and homopolymeric tracts, indicative of phase variation, were identified in a number of R-M systems (both type-II and type-III) in H. pylori [47, 57, 97, 98, 130]. Direct experimental evidence of phase variation within an R-M system was first given by lacZ reporter fusions to a type-III R-M system. An R-M system has also been shown to phase vary in the mouse model, suggesting a role for these systems in host adaptation.
Given that H. pylori contains a diverse repertoire of R-M systems that phase vary and R-M systems with orphaned Mod proteins, it is likely that many of these systems regulate a phasevarion. The first evidence that the activity of a phase-variable Mod influenced the transcription of other genes in H. pylori came from the analysis of the type-II methyltransferase M.HpyAIV in 26695. M.HpyAIV was found to phase vary due to the presence of a homopolymeric tract of adenine residues. Analysis of the genomes of 26695 and J99 revealed 60 genes common to both strains where the M.HpyAIV DNA-methylation sites occurred in the intergenic region upstream of ORFs, indicating a possible role for these sites in regulating gene expression. Differential expression of these genes was studied by qPCR, and the expression of the catalase-encoding katA was shown to be significantly decreased in a 26695 mutant of M.HpyAIV.
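A screen of this kind, looking for methyltransferase recognition sites in the intergenic regions upstream of ORFs, can be sketched in a few lines. The code below is a hedged illustration rather than the method used in the cited study: the function names, the toy upstream sequences and the demonstration motif `GANTC` are assumptions introduced here, and the recognition sequence is passed in as a parameter in IUPAC notation. Only the given strand is scanned, which is sufficient for palindromic motifs; a non-palindromic motif would also need a scan of the reverse complement.

```python
import re
from typing import Dict, List

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def find_sites(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence.upper())]

def genes_with_upstream_sites(upstream_regions: Dict[str, str],
                              motif: str) -> Dict[str, List[int]]:
    """Keep only genes whose upstream region contains at least one site."""
    hits = {}
    for gene, region in upstream_regions.items():
        sites = find_sites(region, motif)
        if sites:
            hits[gene] = sites
    return hits

if __name__ == "__main__":
    # Hypothetical upstream sequences keyed by hypothetical gene names.
    regions = {"geneA": "TTGAATCAAGACTCTT", "geneB": "TTTTTTTT"}
    print(genes_with_upstream_sites(regions, "GANTC"))
```

Running the same function on the upstream regions of two genomes and intersecting the resulting gene sets would give the kind of shared candidate list described above, which would then still need expression data to confirm any regulatory effect.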
Further evidence of the presence of phasevarions in H. pylori came from deletion of the modH5 type-III methyltransferase from H. pylori P12. Microarray analysis of the mutant strain revealed six genes, including those encoding the outer membrane protein HopG and flagellar associated proteins, that were differentially expressed compared to wild-type.
A phasevarion regulated by hpyAVIBM may also be present in H. pylori. The presence of AG repeats in this type-II methyltransferase gene indicates that it may have the potential to phase vary, although this has not been experimentally demonstrated. Deletion of hpyAVIBM from H. pylori SS1 and clinical isolate AM5 resulted in differential regulation of a diverse set of genes, including those encoding outer membrane proteins and genes involved in motility, pathogenicity, LPS biosynthesis and R-M systems, when compared to wild-type by microarray analysis. Further analysis revealed corresponding alterations in the phenotype of the hpyAVIBM mutant, such as altered motility, increased expression of CagA as determined by Western blot, an altered LPS profile, improved ability to induce IL-8 production in human AGS cells, and a decrease in transformation efficiency. Differences in gene expression and phenotype due to deletion of hpyAVIBM varied between the two strains investigated, potentially due to the different distribution of the HpyAVIBM methylation recognition site across the genomes of the two strains. Analysis of the occurrence of the hpyAVIBM allele in H. pylori strains isolated from individuals with duodenal ulcer and from healthy individuals revealed that the methyltransferase was present in most strains isolated from symptomatic patients but absent in most strains isolated from healthy individuals, indicating that hpyAVIBM expression may be clinically relevant. Even if hpyAVIBM does not phase vary, this study demonstrates the wide-ranging regulatory capabilities of DNA methyltransferases.
3.2. Inter-strain generation of diversity
3.2.1. Natural transformation
H. pylori is naturally competent: it can be transformed by the uptake and incorporation of foreign DNA into its genome. The potential reasons for natural competence in bacteria are a matter of discussion. It is postulated that bacteria may utilize the uptake of foreign DNA for nutrition, for DNA repair, or for evolution via horizontal gene transfer, or that DNA uptake is an evolutionary spandrel of adhesion and twitching motility [140, 141]. As we will discuss, it is becoming increasingly clear that natural transformation plays an important role in H. pylori genome evolution and host adaptation.
Quick overview of players in uptake and recombination
A complete picture of the process of DNA uptake followed by integration into the genome in H. pylori is beginning to emerge, although many details remain to be clarified. Uptake of foreign DNA into the bacterial cell is achieved by a two-step mechanism, as shown in Figure 3. Firstly, dsDNA is taken up through the outer membrane by the ComB type-IV secretion system. The ComEC system is then responsible for DNA uptake through the inner membrane [143, 144]. Transport of DNA by ComEC likely results in the entry of single-stranded DNA into the cytoplasm, based on the function of ComFA in Bacillus subtilis, although this has not been directly experimentally demonstrated in H. pylori. Incoming DNA is at some point subjected to the activity of restriction endonucleases. Once in the cytoplasm, DprA and RecA cooperatively bind the incoming ssDNA, forming a heterodimer. During recombination, RecA mediates strand invasion of incoming DNA with chromosomal DNA, and this process is subject to interference from UvrD and MutS2 [49, 73, 90, 147]. RecA-mediated synapsis with chromosomal DNA results in the formation of four-way branched DNA intermediate structures referred to as Holliday junctions, whose migration and resolution are mediated by DNA processing enzymes. Branch migration of Holliday junctions is mediated by either of the competing RuvAB or RecG helicases. Resolution of the Holliday junction is primarily by DprB in instances of natural transformation with homeologous DNA from other H. pylori strains, although RuvC, the DNA-repair resolvase, can partially compensate for a loss of DprB.
Structure of the uptake system
In contrast to other bacterial species where DNA uptake occurs via systems related to type IV pili, the ComB system of H. pylori is related to type IV secretion systems, and its components have been named for their homologues in the Agrobacterium tumefaciens VirB type IV secretion apparatus [142, 149]. The genes encoding ComB are organized in two separate loci, with one operon consisting of comB6–10 and a second operon consisting of comB2–4 [142, 150-152]. All the comB and comEC genes are essential for competence with the exception of comB7, which is postulated to play a stabilizing role for the ComB complex [142, 143, 151]. Sequence homology with the VirB type IV secretion system and topological mapping of the ComB proteins have given some insight into the structure of the ComB apparatus [151, 153]. ComB2 is postulated to be located as a “stump structure” in the external membrane and to have a role in initial DNA uptake. It has also been associated with adhesion to human gastric tissue. ComB7 is also associated with the outer membrane and may serve to stabilize ComB9, which is present in the periplasm with an anchor to the outer membrane via a disulphide bond. ComB8 contains a large periplasmic domain, spans the inner membrane and may interact with ComB9 and/or ComB10, which is postulated to be anchored in the inner membrane where it may be present as a homodimer. ComB4 is cytoplasmic and serves as the ATPase that energises the ComB machinery. The role of ComB3 is largely unknown, although it is predicted to contain one transmembrane domain.
Process of DNA uptake
The process of DNA uptake by H. pylori has been studied using fluorescently labelled DNA and single molecule analysis with laser tweezers. The initial step in transformation is the binding of extracellular DNA to the surface of the bacterium. The ComB machinery may play a role in DNA binding to the cell, as mutants lacking inner-membrane components of the ComB machinery showed impaired DNA binding. Once bound, DNA was found to be rapidly taken into the periplasm of the cell through ComB via an ATP-dependent mechanism likely driven by the ATPase ComB4. Multiple DNA-uptake complexes were found to be simultaneously active. Uptake of DNA through the outer membrane by ComB appears to be non-specific, as uptake of DNA was not distinguished on the basis of DNA sequence. Following transport of DNA into the periplasm by ComB, ssDNA enters the cytoplasm via ComEC. Transport of DNA by ComB and ComEC appears to be spatially and temporally uncoupled. The identity of any motor driving uptake of DNA by ComEC remains to be uncovered. Transport of DNA by ComEC appears to be more discriminating than transport by ComB, as covalently labelled DNA transported by ComB could not enter the cytoplasm via ComEC. As DNA sequence does not seem to play a role in the initial uptake of DNA by H. pylori, discrimination of incoming DNA and protection from the potentially hazardous consequences of the incorporation of foreign DNA may come from the numerous restriction-modification systems of H. pylori.
The restriction barrier – frequency
The fate of incoming DNA is either restriction or recombination. Restriction of foreign DNA forms the most significant barrier to natural transformation in H. pylori. Like other bacteria, H. pylori discriminates self from non-self DNA by the methylation of bases. Restriction-modification (R-M) systems consist of a methyltransferase that methylates specific DNA sequences and a restriction endonuclease that cleaves non-methylated DNA at the same sequence. Incoming foreign DNA that does not share the same methylation pattern as that of the host is thus digested. In this manner, many R-M systems function to prevent transformation and protect the host from foreign DNA. The number and diversity of R-M systems in the H. pylori genome is notable, with many being strain specific. The diversity of R-M systems can be driven by deletion and acquisition of such systems by horizontal gene transfer. These strain-specific R-M systems may be responsible for the observation that competence varies between H. pylori strains. The barrier posed by R-M systems to competence has been experimentally demonstrated by assessing transformation frequency in the presence and absence of R-M systems. The removal of four active type-II restriction endonucleases from H. pylori 26695 led to higher transformation efficiency with donor DNA from both E. coli and other H. pylori strains. The removal of two active type-II restriction endonucleases from NSH57 greatly reduced the barrier to transformation, resulting in greater transformation frequency with DNA from a J99 donor.
The restriction barrier – integration length
In addition to transformation frequency, restriction of incoming DNA also influences the length of incoming DNA that is integrated into the host chromosome. It has been proposed that although H. pylori takes up long DNA fragments by natural transformation, only shorter fragments are integrated into the genome. Genomic sequencing of isolates revealed that sequences recombined with imported DNA varied in length from 261 to 629 bp and were clustered. This observation was suggested to be consistent with the uptake of long stretches of DNA, corresponding to the length of the region in which recombination sites were clustered, that were subsequently broken up and partially integrated. Mutants of H. pylori NSH57 lacking active type-II restriction endonucleases were found to integrate longer DNA fragments into their genome following natural transformation than the wild-type strain.
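The relationship between active restriction endonucleases, donor methylation and the length of the surviving fragments can be shown with a deliberately simplified model. In the sketch below, which is an illustration under stated assumptions and not a description of any specific H. pylori enzyme, the recognition sites and the DNA sequence are toy values, each enzyme is assumed to cut completely at the start of its site, and a site is spared only if the donor methylates that motif (`protected`).

```python
from typing import Iterable, List

def cut_positions(dna: str, active_sites: Iterable[str],
                  protected: Iterable[str] = ()) -> List[int]:
    """Positions where the host would cut the incoming DNA.

    active_sites: recognition sequences of the host's active endonucleases.
    protected:    motifs already methylated on the donor DNA (not cut).
    """
    shielded = set(protected)
    positions = set()
    for site in active_sites:
        if site in shielded:
            continue
        start = dna.find(site)
        while start != -1:
            positions.add(start)
            start = dna.find(site, start + 1)
    return sorted(positions)

def digest(dna: str, active_sites: Iterable[str],
           protected: Iterable[str] = ()) -> List[str]:
    """Split the incoming DNA at every unprotected recognition site."""
    fragments = []
    previous = 0
    for position in cut_positions(dna, active_sites, protected):
        if position > previous:
            fragments.append(dna[previous:position])
        previous = position
    fragments.append(dna[previous:])
    return fragments

if __name__ == "__main__":
    incoming = "AAAAGATCTTTTCATGCCCC"  # toy donor DNA
    for enzymes in (["GATC", "CATG"], ["GATC"], []):
        print(enzymes, [len(f) for f in digest(incoming, enzymes)])
```

Removing enzymes from the `active_sites` list leaves fewer cuts and therefore longer fragments available for recombination, which is the qualitative pattern reported for the restriction endonuclease mutants. Real digestion is partial and competes with protection of the incoming DNA, so the model overstates fragmentation.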
DprA overcomes restriction
Although restriction presents a barrier to transformation, H. pylori achieves a balance between restriction and recombination. It has been proposed that the concentration of restriction enzymes in the cell may be limited so that incoming DNA is only partially cleaved, allowing a basal level of transformation. In addition, the DNA processing protein A (DprA), which is widely conserved, has been shown to lower the barrier to recombination in a number of bacterial species. Deletion of dprA in H. pylori has been reported to result in either a significant decrease in [158, 159], or abrogation of, transformation. In H. pylori, DprA has a polar localisation and interacts with incoming DNA, binding ssDNA and, to a lesser extent, dsDNA. DprA protects incoming DNA from restriction both by preventing the access of type-II restriction endonucleases to DNA and by enhancing methylation of incoming DNA through direct interaction with methyltransferases. It thus appears to play a key role in the balance between restriction and recombination. However, the temporal and spatial aspects of restriction endonuclease cleavage and DprA activity are not yet clearly understood. Single-stranded DNA is thought to enter the cytoplasm following uptake by ComEC, and DprA binds preferentially to ssDNA, yet restriction enzymes, which contribute the most significant barrier to transformation, cleave ssDNA poorly and prefer dsDNA as a substrate.
DNA processing enzymes and competence
As depicted in Figure 3, following entry into the cytoplasm, incoming DNA is co-operatively bound by DprA and RecA. RecA acts to mediate strand invasion of foreign DNA with the host chromosome and promotes homologous recombination. The nucleotide excision repair helicase, UvrD, likely disrupts this process by removal of RecA from DNA, preventing potential recombination events. Mutants in RecA are therefore unable to undergo recombination, whereas mutants in UvrD display a hyper-recombination phenotype [49, 73]. MutS2 is also proposed to interrupt RecA-mediated strand invasion. Deletion of MutS2 from H. pylori results in an increase in transformation efficiency, suggesting a role for MutS2 in the regulation of homologous recombination. Analysis of MutS2 indicates that it is not a member of the mismatch repair pathway but rather has a distinct function in strand displacement of incoming DNA from RecA-mediated D-loop formation during strand invasion of incoming DNA with the host chromosome. This function is independent of the degree of homology of the two strands [90, 147]. Thus, in addition to the restriction barrier, enzymes that determine whether the D-loop is formed or dissolved also appear to play a role in determining the fate of foreign DNA. The influence on transformation of other DNA processing enzymes acting during the recombination event is less clear. Reports regarding transformation efficiency of a recG mutant vary [145, 161], and integration length in a recG mutant was reported to be decreased. In contrast to an initial report where deletion of ruvC resulted in a decrease in transformation frequency, two reports have found no decrease in transformation frequency [145, 148], but an increase in integration length in a ruvC mutant was noted.
Reports regarding the phenotype of a mutant in dprB, the recombination specific Holliday junction resolvase, are consistent that deletion of dprB results in a decrease in transformation frequency but does not influence integration length [145, 148].
Other factors
A further factor required for competence that appears to be unique to H. pylori is ComH. Mutants in comH could not be transformed with chromosomal or plasmid DNA [163, 164]. ComH is a surface-exposed outer membrane protein that binds ssDNA but whose function in competence remains to be further characterized .
The role of the nuclease NucT in H. pylori competence is also unclear. NucT is an outer membrane bound nuclease that preferentially cleaves ssDNA. The observation that transformation rates are reduced in a nucT mutant led to the proposal that NucT functions either in initial DNA binding or in translocation into the cell. In contrast, a later study found that deletion of nucT did not influence transformation efficiency, but did result in an increase in the length of DNA integrated into the chromosome. The presence of NucT as a stable outer membrane nuclease and the requirement of H. pylori to scavenge purines for growth as a result of its inability to synthesise purines de novo led to an investigation of the role of NucT in purine acquisition. NucT was found to be required for growth when exogenous DNA is the only purine source, indicating that the primary function of NucT is the digestion of human DNA in the gastric mucosa as a purine source. Digestion of dsDNA in the absence of its preferred ssDNA substrate in vitro could be responsible for the conflicting results obtained in assays of transformation efficiency.
Regulation of competence by DNA damage
Competence is usually a tightly regulated and transient feature in bacteria. H. pylori is unusual in that it displays very high levels of competence. Competence was found to vary across stages of growth, with each strain displaying a different pattern of peaks in transformation efficiency in different growth phases. The pattern observed was independent of the type of donor DNA. Competence is upregulated in response to DNA damage by the induction of the transcription and translation of competence machinery. A marginal increase in transformation frequency was observed after UV induced DNA damage. Transcription of several genes involved in competence was found to be upregulated in cDNA microarrays both of cells where acute double strand break DNA damage had been induced by ciprofloxacin treatment and of cells where chronic DNA damage had accumulated in addA mutants deficient in double-strand break repair. These genes include components of the ComB system and a lysozyme proposed to function by lysing neighbouring cells to provide DNA for uptake. This transcriptional response was dependent on both recA and the ability to take up DNA. The authors proposed a RecA-dependent positive feedback loop in the induction of DNA damage responsive genes that is initiated by DNA damage and amplified by DNA uptake. This system was found not to be important in initial mouse colonization but may be of relevance to persistence. These results demonstrate that, unlike other bacteria, H. pylori does not mount an SOS response to DNA damage but relies on homologous recombination for maintenance of genome integrity. It seems plausible that H. pylori responds to stress induced by the immune system during persistence in the human host by increasing competence, not only to repair DNA damage but also to increase genetic diversity in order to continually adapt to a changing host-mediated niche.
3.2.2. Competence in vitro
3.2.2.1. Substrate requirements for transformation
The characteristics of natural transformation in H. pylori in vitro have been studied in detail. Investigations into the influence of the properties of substrate DNA on natural transformation of H. pylori 26695 revealed a number of interesting features :
Transformation efficiency decreases with shorter DNA fragments although transformants could be obtained with fragments as short as 50 bp.
Although uptake of DNA occurs within minutes of exposure, transformation efficiency increases with increasing time prior to transformant selection to allow time for uptake, recombination, and expression of a new phenotype.
Transformation frequency of selectable alleles decreases with decreasing length of flanking sequences, although transformants were obtained with as little as 5 bp flanking sequence.
Transformation frequencies are higher with chromosomal DNA than PCR products.
Transformants could be obtained with both single stranded and double stranded DNA but with 1000-fold greater efficiency for double stranded DNA.
Transformation efficiency is greater with homologous DNA than homeologous DNA.
Different H. pylori strains vary in their competence.
DNA uptake could be saturated at high DNA concentrations.
3.2.2.2. Length of insertions
Study of transformation in vitro has also revealed that H. pylori typically imports fragments of short length into the chromosome in comparison to other bacteria and that these imports are regularly interrupted by wild-type recipient sequences. Mean lengths of between 1294 and 3853 bp of integrated DNA were observed from transformations of rifampicin sensitive recipient strains with DNA from resistant donors with different import lengths observed with different donor/recipient combinations . Transformation of streptomycin sensitive strains with DNA of streptomycin resistant strains obtained a mean length of integration of 1300 bp. Furthermore the effect of the restriction barrier may have been observed, as endpoints of integration were found to be clustered, consistent with restriction of incoming DNA at those sites . Short length of imported fragments has also been recorded in vivo .
3.2.2.3. Why are insertions interspersed?
The region of integrated DNA is commonly reported to be interrupted by short interspersed sequences of the recipient (ISR) and multiple explanations for this observation are present in the literature. Lin et al. have suggested ISR could result from two separate but neighboring strand invasion events that would be consistent with the restriction of an invading strand prior to recombination . Conversely, Kulick et al. reported that overexpression of the base excision repair glycosylase MutY increased the occurrence of ISR within imported regions, implicating it in their formation. MutY was also found to influence integration length. It was proposed that MutY-mediated DNA repair at impaired bases following recombination results in the insertion of host sequence within recombined regions . The allelic variation generated by ISR is a further mechanism increasing genotypic variation.
3.2.2.4. Promiscuity of DNA uptake
H. pylori is very promiscuous in that it does not require incoming DNA sequences to be as closely related to its own as other bacteria do for transformation. This may be due to the absence of a mismatch repair system, a lack of DNA sequence specificity in DNA uptake machinery, absence of DNA uptake sequences, and a relaxed restriction barrier. Recombination between unrelated strains in co-infected hosts has been observed. Analysis of synonymous distance between recombined alleles within the genomes of H. pylori isolates from two South African families revealed recombination with sequences from unrelated strains, in contrast to studies of S. enterica, E. coli, and B. cereus where recombination only occurred between members of the same lineage. Recombination can also occur between different Helicobacter species. Transformation of H. pylori 26695 with both homologous and homeologous streptomycin resistance conferring rpsL DNA, derived from 26695 and H. cetorum respectively, yielded transformants, although with 1.5 log10 greater efficiency with homologous DNA. No transformants could be obtained with DNA from C. jejuni. Competition between DNA from different sources for transformation revealed that H. pylori DNA could compete with H. pylori DNA but DNA from unrelated sources could not compete with H. pylori DNA, indicating that H. pylori can distinguish DNA from different sources on the basis of DNA sequence [155, 173]. This discrimination does not occur at the level of DNA uptake by ComB/ComEC, as cytoplasmic uptake of λ-phage DNA was comparable to H. pylori DNA, but is likely a consequence of the restriction barrier. Thus, although transformation of H. pylori is efficient with DNA from unrelated strains, and even to a lesser degree with DNA of other Helicobacter species, it has not been observed with DNA from other bacterial genera.
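As a small worked example of the arithmetic behind the efficiencies quoted above, the following Python sketch converts a difference expressed in log10 units into a linear fold change and shows one common way of expressing transformation frequency (transformants per viable recipient cell). The function names and the plate counts are hypothetical and are used for illustration only; they are not data from the studies cited.

```python
import math


def log10_to_fold(log10_difference: float) -> float:
    """Convert a difference given in log10 units into a linear fold change."""
    return 10 ** log10_difference


def transformation_frequency(transformants: float, viable_cells: float) -> float:
    """Transformants per viable recipient cell."""
    if viable_cells <= 0:
        raise ValueError("viable_cells must be positive")
    return transformants / viable_cells


# A 1.5 log10 difference corresponds to roughly a 32-fold difference.
fold = log10_to_fold(1.5)

# Hypothetical colony counts (illustration only, not published data).
freq_homologous = transformation_frequency(3.2e3, 1.0e8)
freq_homeologous = transformation_frequency(1.0e2, 1.0e8)
observed_log10_gap = math.log10(freq_homologous / freq_homeologous)

print(f"1.5 log10 is about {fold:.1f}-fold")
print(f"hypothetical gap: {observed_log10_gap:.2f} log10 units")
```

Reporting differences on a log10 scale is convenient here because transformation frequencies span several orders of magnitude.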
H. pylori displays considerable allelic diversity and genetic variability. The high degree of competence facilitates frequent horizontal transfer of genetic material to the extent that the genome has been found to be in linkage equilibrium [174, 175]. Although genetic variability typifies H. pylori, with populations being regarded as panmictic, clonality is observed in the natural transmission of strains within closely related and co-habitating individuals . Also, H. pylori populations can be grouped according to geographic location as strains located within a region are more closely related to each other than to strains outside the region . The uptake of foreign DNA by natural transformation and resulting recombination generates a considerable portion of the genetic diversity observed in H. pylori.
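The notion of linkage equilibrium mentioned above can be made concrete with a short calculation. The sketch below, under the simplifying assumption of two biallelic loci, computes the linkage disequilibrium coefficient D (the observed frequency of the A-B haplotype minus the product of the two allele frequencies) and the normalised r² from haplotype counts. The counts are invented for illustration; a freely recombining population gives values near zero, whereas strictly clonal descent gives r² close to 1.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) for two biallelic loci from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total  # frequency of allele B at locus 2
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Hypothetical counts: alleles associate at random (linkage equilibrium).
d_free, r2_free = linkage_disequilibrium(25, 25, 25, 25)

# Hypothetical counts: alleles are inherited together (clonal structure).
d_clonal, r2_clonal = linkage_disequilibrium(50, 0, 0, 50)

print(d_free, r2_free)
print(d_clonal, r2_clonal)
```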
3.2.3. Conjugation of plasmids and mobilizable transposons
In many bacterial species plasmid transfer by conjugation is a significant contributor to the acquisition of genetic material by horizontal gene transfer and often mediates the dissemination of genes of particular phenotypic importance such as antibiotic resistance, virulence determinants, and the ability to utilize certain substrates. Bacterial conjugation can generate chromosomal rearrangements due to plasmid insertion and excision and can also transfer chromosomal genetic material when errors in excision occur. Although diversity exists in bacterial conjugative mechanisms, a generalized overview of the process can be formed. Initially a mating pair of cells must make contact and be brought into proximity. This is typically achieved by the expression of the sex pilus, an elongated tubular appendage, by the donor cell. Once in contact a conjugative pore or some other mechanism for the transfer of DNA to the recipient must be established. This is commonly achieved by a type-IV secretion system. In order to transfer plasmid DNA, a relaxase and accessory proteins, the relaxosome, bind to the plasmid origin of transfer (oriT) where they cleave a single strand. The single strand and bound relaxase is recognized by a coupling protein and transferred into the recipient cell in tandem with the rolling circle replication of the intact strand of the plasmid. The relaxase then circularizes the single strand which is replicated to form an intact plasmid within the recipient .
H. pylori commonly carries plasmids of varying sizes, often of low copy number, which can be divided into two groups [179, 180]. The first are homologous to Gram positive vectors that replicate via the rolling circle mechanism. The second, more common group are proposed to replicate via the theta mechanism, predominantly utilising the replication protein RepA, which binds to short tandem repeats in the plasmid origin of replication. There are several reports characterising the properties of various plasmids in H. pylori [180-185].
Plasmid transfer has been demonstrated to occur between different strains of H. pylori [186, 187]. Mating experiments with H. pylori P8 and P12 revealed that transfer of two conjugative H. pylori plasmids could occur through three distinct routes: firstly, natural transformation, which was dependent on the ComB type-IV secretion system and was DNaseI sensitive; secondly, DNaseI insensitive mobilisation, which was dependent on both ComB and the plasmid encoded relaxase (mobA); and finally, an alternative DNaseI resistant (ADR) pathway that is independent of ComB. No evidence was found for the involvement of chromosomal relaxases in any pathway. Backert et al. also demonstrated conjugative plasmid transfer in H. pylori but used two mobilizable vectors containing a broad host range oriT. In this instance transfer was insensitive to DNaseI and dependent on viable cell contact, indicative of a conjugative process. No role was found for ComB, indicating that natural transformation was not taking place, but both the chromosomally encoded TraG coupling protein homolog and relaxase rlx1 were required. The mechanism of transfer thus appears similar to the ADR pathway observed by Rohrer et al. It may be that the ADR pathway is a minor pathway for conjugative plasmids with an endogenous relaxase as utilized by Rohrer et al., but may be the only pathway for transfer of mobilizable plasmids via the utilization of chromosomal mobilisation genes. Transfer of H. pylori conjugative plasmids was found to occur at a rate three orders of magnitude higher than that for the introduced mobilizable plasmids (10⁻⁴ vs 10⁻⁷). The observation that H. pylori is capable of transferring plasmids with a broad-host range oriT utilising a chromosomal mobilisation system indicates that H. pylori may be promiscuous in its ability to take up foreign plasmids. In both studies the cagPAI and tfs3 type-IV secretion systems were found to play no role in plasmid transfer.
Given that no type-IV secretion apparatus has been implicated in the ADR pathway, the mode of DNA transfer by this route has yet to be determined.
The high rate of plasmid occurrence and the presence of three pathways for plasmid acquisition indicate that genes carried by plasmids may assist in host adaptation and be an important source of genetic diversity amongst H. pylori strains. The observation of chromosomal sequences in H. pylori plasmids indicates that they are capable of acquiring genetic material from the chromosome [181, 188]. Additionally, the presence of genes, displaying homology to chromosomal genes of unknown function, flanked by repeat and IS sequences on pHel4 and pHel5 gave rise to the proposal that H. pylori plasmids have a modular structure where genes can be integrated from the chromosome or other plasmids at repeat sequences by recombination or IS sequences by transposition events. Such shuffling of genes could represent a mechanism generating inter-strain diversity in H. pylori .
A role for plasmid encoded genes in virulence or host adaptation has yet to be clearly demonstrated, as most plasmids characterized to date carry predominantly elements required for replication and hypothetical genes of unknown function. Genes with homology to the E. coli microcin toxin operon have been identified in plasmids pHel4 and pHPM8. These genes function in E. coli to block protein biosynthesis in closely related bacteria, but their function has yet to be demonstrated in H. pylori. Similarly, both plasmids also contain a gene with homology to the tetracycline resistance determinant, tetA, which has been demonstrated not to confer tetracycline resistance to host cells and thus remains of unknown function [179, 180, 188].
Plasmids are not the only entities transferred by conjugation in H. pylori. Initial evidence that H. pylori may be capable of transferring chromosomal elements by conjugation came from mating experiments in the presence and absence of DNaseI . Subsequently, the transfer of chromosomally encoded streptomycin resistance by conjugation was demonstrated from H. pylori into C. jejuni. The transfer required cell to cell contact and was independent of any known type-IV secretion system .
More recently, the horizontal transfer of H. pylori plasticity zones has been investigated. Comparison of the first two genome sequences of H. pylori revealed the presence of chromosomal regions, termed plasticity zones, which contained strain specific genes. These regions differ from the rest of the genome in their G+C content, indicating possible acquisition from a foreign source. Plasticity zones have been found to contain genes of interest, including DNA processing enzymes, type IV secretion systems [191, 192], and genes implicated in disease outcome. Plasticity regions have been found to be diverse in their gene content and presence in H. pylori strains [191, 193-195]. A large number of studies have implicated plasticity region localized genes, particularly JHP947 and surrounding genes, as virulence determinants in H. pylori. The presence or absence of these genes in clinical isolates has variously been correlated with gastric carcinoma and duodenal ulcer disease status and has been associated with variations in inflammatory cytokines [194-202]. The plasticity zones of H. pylori have been found to be able to transfer horizontally as conjugative transposons [191, 193].
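The use of G+C content as a signature of possible foreign origin can be illustrated with a simple sliding-window scan. The Python sketch below is a minimal illustration rather than the method used in the studies cited: it computes the G+C fraction of consecutive windows and reports those that deviate from the whole-sequence average by at least a chosen threshold. The toy sequence, window size, and threshold are all hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def deviating_windows(genome: str, window: int, step: int, threshold: float):
    """Return (start, gc) for windows whose G+C differs from the genome-wide value."""
    baseline = gc_fraction(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - baseline) >= threshold:
            hits.append((start, round(gc, 3)))
    return hits


# Toy "genome": 50% G+C flanks around a 100 bp island with 0% G+C.
toy_genome = "ATGC" * 50 + "ATAT" * 25 + "ATGC" * 50

hits = deviating_windows(toy_genome, window=100, step=100, threshold=0.2)
print(hits)
```

On real genomes, window size and threshold trade sensitivity against noise, and compositional deviation alone is suggestive rather than conclusive evidence of horizontal acquisition.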
Conjugative transposons are a form of integrative and conjugative elements (ICEs). These elements reside within, and are replicated by, the host chromosome and contain the elements required for their excision and integration. Following excision they circularize, are replicated and transferred to a recipient cell by conjugation. Following transfer, the ICE integrates into the recipient chromosome and the copy remaining in the donor also reintegrates. Many plasticity zones and genomic islands present in many bacterial species may be functional ICEs or remnants of mobile elements . Given the importance of genomic and pathogenicity islands in the physiology of many bacteria, in particular the cag pathogenicity island and genes implicated in disease within the plasticity zones of H. pylori, ICEs may represent a rapid means of generating genetic diversity of clinical relevance between different H. pylori strains. Deletion of plasticity zones has been found to decrease the fitness of some strains in vivo and mediate the induction of inflammatory cytokines in human AGS cells .
An investigation into the nature of plasticity zones by sequencing H. pylori strains to identify their gene content and genomic location led to the proposal that they were conjugative transposons, termed ‘transposon plasticity zone’ (TnPZ). The plasticity zones in the sequenced strains could be classified according to their gene content and arrangement. Plasticity zones of a particular class could be identified at different genomic contexts in different strains, indicating their ability to move as a discrete unit. The plasticity zones were often found to be flanked by direct repeats and contained terminal inverted sequences, characteristic of transposable elements. There was also evidence of recombination occurring between different classes of TnPZs, and many strains appeared to contain vestigial remnants of TnPZs, including J99 and 26695, whose plasticity regions were described as ‘complex mosaics of TnPZ remnants’. Full length TnPZs were found to contain DNA processing enzymes, including a candidate transposase, xerT, which displays homology with the recombinases XerC and XerD of E. coli and the H. pylori dif-specific XerH recombinase. Transfer of TnPZ1 of H. pylori strain P12 was demonstrated by co-cultivation experiments performed with both WT and mutant donor and recipient strains. Transfer of the plasticity zone was observed by a conjugative mechanism that could occur, although at reduced frequency, in the absence of the TnPZ-encoded type-IV secretion system and was dependent on the TnPZ-encoded xerT. Circular DNA intermediates from donor strains could be visualised by PCR. Excision of the TnPZ was found to leave one copy of the direct repeat in the donor chromosome while the circular intermediate contained the second.
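The outcome of excision described above, with one copy of the direct repeat remaining in the donor and the other carried on the circular intermediate, can be pictured with a toy string model. The Python sketch below is purely illustrative: the sequences are invented, the function name is hypothetical, and the circular intermediate is represented as a linear string whose two ends are understood to be joined.

```python
def excise_between_direct_repeats(chromosome: str, repeat: str):
    """Model excision by recombination between two copies of a direct repeat.

    Returns (donor_after, circle): the donor keeps one repeat copy and the
    excised element carries the other.
    """
    first = chromosome.find(repeat)
    second = chromosome.find(repeat, first + len(repeat)) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("two copies of the direct repeat are required")
    end_first = first + len(repeat)
    end_second = second + len(repeat)
    # Donor: everything up to the end of the first repeat, then the
    # sequence downstream of the second repeat.
    donor_after = chromosome[:end_first] + chromosome[end_second:]
    # Circle: the element plus the second repeat copy (ends joined in vivo).
    circle = chromosome[end_first:end_second]
    return donor_after, circle


# Invented sequences: upper case = chromosome and repeats, lower case = element.
repeat = "TTGACC"
chromosome = "AAAA" + repeat + "gattaca" + repeat + "CCCC"

donor_after, circle = excise_between_direct_repeats(chromosome, repeat)
print(donor_after)
print(circle)
```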
The genes present on TnPZs may represent a pool of genes that can be quickly acquired as a set, as the conjugative method of transfer is not subjected to the same restriction barrier as transformation. It seems that plasticity zones have been present in H. pylori for a long time, as similar zones are also found in other Helicobacter species including H. acinonychis and H. cetorum, and many have been subject to gene deletion and recombination over that time. Plasticity zones are widely present in H. pylori strains, with each strain having a unique combination of such zones and of the strain specific genes found within; they thus contribute significantly to the Helicobacter pan-genome. The transfer of TnPZs may represent a further method utilized by H. pylori to generate a diverse population that displays fitness in a diverse set of niches. The physiological function of genes in plasticity zones requires further investigation to clarify their potential roles in host adaptation and disease outcome. Although conjugative transfer of TnPZs, other chromosomal elements, and plasmids does occur in H. pylori, the precise mechanisms of such transfer and their regulation are not completely understood. The contribution of conjugative processes to genetic diversity in H. pylori may not be as significant as that contributed by natural transformation but is still of evolutionary significance.
3.2.4. Transduction by bacteriophage
Phages are often important drivers of genetic diversity and host adaptation. Phages represent a significant mode of horizontal gene transfer responsible for a large portion of strain-specific genetic diversity in many bacterial species. The action of temperate phages may play a significant role in the evolution of bacterial genomes by the disruption of host genes during insertion events, delivery of bacterial DNA for recombination by transduction, or the introduction of new genes (morons) that are carried, but not required, by an integrating phage. This has phenotypic consequences and the presence of prophages in the bacterial genome may alter bacterial fitness and influence host-bacterial interactions. This is particularly significant in many instances where morons encode virulence factors such as bacterial toxins . Despite the significance of phages in the evolution of other bacterial species, the presence of phages in Helicobacter has drawn little attention until recently.
Bacteriophages are relevant to the physiology of H. pylori in vivo. Early observations of H. pylori noted the presence of bacteriophage-like particles within cells in electron microscope images of gastric biopsies. Similarly, electron microscopy of clinical isolate SchReck 290 revealed the presence of the bacteriophage HP1 that could be propagated in a lytic cycle in vitro. Active H. pylori bacteriophages can be isolated from human faeces.
More recently, several bacteriophages have been characterised in H. pylori. Genome sequencing of clinical isolate, B45, revealed the presence of a 24.6 kb prophage within the genome . The phage was found to be similar to a previously identified phage in H. acinonychis  and could be induced by UV irradiation, producing phage particles. Screening of a further 341 clinical strains from different geographical regions by PCR revealed that the phage was widespread, being detected in 21.4% of isolates. This high prevalence could indicate that bacteriophages play a role in Helicobacter evolution .
The bacteriophage 1961P was isolated from a clinical isolate of H. pylori. It has a genome of 27 kb containing 33 genes. Sequence comparison revealed that prophages or prophage remnants similar to 1961P are likely present in six other sequenced H. pylori strains. Interestingly, 1961P was demonstrated to be a transducing phage demonstrating that transduction of H. pylori by bacteriophages does occur .
The sequence of two phages isolated from East-Asian-type isolates from Japanese patients has been reported  and one, KHP30, has been characterized further . The phage did not appear to be integrative and may represent a novel category of bacteriophage.
Although it is becoming clear that bacteriophages may be widely present in H. pylori, with some recently well characterized, a potential role for bacteriophages in genomic evolution and host-adaptation in H. pylori remains largely unexplored. Genomic sequencing revealed the presence of two prophages within the feline-associated H. acinonychis genome which account for 26% of genes present in the sequenced H. acinonychis but absent in H. pylori J99 and 26695. Although most of these genes appear to have been acquired after the divergence of the two species and largely encode hypothetical proteins, three genes from one of the two prophages were implicated by the authors as potentially having played a role in enabling the host jump of the H. acinonychis ancestor into felines .
Lehours et al. have raised the hypothesis that plasticity zones in H. pylori could have a phage origin much as phages mediate the transfer of pathogenicity islands in Staphylococcus . Kersulyte et al. have proposed that plasticity zones transfer as conjugative transposons . Further research will be required to determine if the bacteriophages in Helicobacter have played any significant role in the evolution of its genome.
4. Small RNAs and regulation
To date only a limited number of transcriptional regulators have been discovered in the small 1.67 Mbp genome of H. pylori, including a mere three sigma factors, namely RpoD, FliA and RpoN . Consequently H. pylori was thought not to have any extensive regulatory networks. However, a recent comprehensive analysis of the H. pylori primary transcriptome has revealed that the compact genome produces an abundant number of antisense and small RNA (sRNA) transcripts portending a great potential for the use of riboregulation by H. pylori in gene expression .
Small RNAs have emerged as essential regulators that allow organisms to cope with environmental changes and stresses [216, 217]. Like transcription factors, sRNA can modulate the expression of multiple target genes and thereby function as key regulators of metabolic pathways and stress responses. In bacteria, sRNAs have been discovered to regulate processes as varied as carbon metabolism, iron homeostasis, RNA polymerase function, virulence, quorum sensing, biofilm formation, as well as response to stresses such as oxidation and outer membrane perturbation [216, 218-220].
Bacterial sRNAs exhibit a diverse range of molecular mechanisms of action. One class of bacterial sRNA acts by binding directly to protein targets and modulating their activity [221, 222]. Another group of sRNAs known as riboswitches consist of RNA sequences located in the 5ʹ untranslated regions (UTRs) of mRNAs. Riboswitches regulate gene expression through their capacity to adopt different conformations that are mediated by factors such as temperature or small molecule metabolites that specifically bind to the riboswitch [223, 224].
The most extensively studied class of sRNAs are those that act through antisense base pairing with target mRNAs. These sRNAs are classified as either cis-encoded, because they are transcribed from the strand of DNA opposite their mRNA targets and so have extensive complementarity to their target, or as trans-encoded sRNAs, which are transcribed from a genomic location different from those of their targets. The base pairing interactions of trans-encoded sRNAs tend to involve less complementarity with their targets, permitting the regulation of multiple genes . Furthermore many trans-encoded sRNAs in enteric bacteria require binding to Hfq, a chaperone protein that aids in sRNA stability and base pairing with target mRNAs . Antisense base pairing of sRNAs can exert either negative or positive regulatory effects on their mRNA targets including RNA degradation, termination of transcription, inhibition of translation and the relief of intrinsic translation-inhibitory structure formation in the mRNA’s 5ʹ UTR (For detailed reviews of basepairing sRNAs and their regulatory mechanisms, refer to [220, 226-228]).
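The difference in complementarity between cis- and trans-encoded sRNAs can be illustrated with a short calculation. The Python sketch below is a deliberately simplified model: it considers only Watson-Crick pairs, ignores G-U wobble pairs, gaps, and RNA secondary structure, and uses invented sequences. A cis-encoded antisense RNA is modelled as the exact reverse complement of its target, whereas the hypothetical trans-encoded sRNA pairs imperfectly.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def paired_fraction(srna: str, target: str) -> float:
    """Fraction of positions that form Watson-Crick pairs when the sRNA is
    aligned antiparallel to an equal-length target, without gaps."""
    if len(srna) != len(target):
        raise ValueError("sequences must have equal length")
    perfect_partner = reverse_complement(target)
    matches = sum(a == b for a, b in zip(srna, perfect_partner))
    return matches / len(srna)


# Invented target region of an mRNA (5' to 3').
target = "AUGGCUAACGGA"

# cis-encoded: transcribed from the opposite strand, so fully complementary.
cis_srna = reverse_complement(target)

# Hypothetical trans-encoded sRNA with three mismatched positions.
trans_srna = "UCCGAUAGGCAA"

print(paired_fraction(cis_srna, target))
print(paired_fraction(trans_srna, target))
```

Real target prediction relies on thermodynamic models of hybridisation rather than a simple match count, so this serves only to make the cis versus trans distinction concrete.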
The discovery of substantial antisense transcription as well as a high number (>60) of sRNAs, which included potential regulators of cis- and trans-encoded mRNA targets, has revealed a new class of molecules that should be considered for their role in mediating H. pylori persistence. Since sRNAs in H. pylori were discovered only recently, knowledge regarding their target genes and the mechanisms by which they regulate them is still very limited. The current literature on riboregulation in H. pylori has been recently reviewed. When corrected for genome size, the sRNA repertoire of H. pylori rivals that of E. coli, the model organism of bacterial RNA research, and thus the mechanisms of sRNA action in H. pylori are likely to be much more diverse than what is currently known.
With the exception of a handful of housekeeping RNAs, none of the enterobacterial sRNAs are conserved in H. pylori . H. pylori has no Hfq homolog, an RNA chaperone shown to be important for mediating sRNA function in enteric bacteria, and H. pylori lacks homologs of endonucleolytic RNases E/G and other common processing factors of stable RNAs [57, 230]. It has been hypothesized that antisense-mediated processing by double-strand specific ribonuclease RNase III may compensate for this paucity and act as the major regulator as has been recently suggested for S. aureus [229, 231]. To date the only sRNA shown to be essential in H. pylori and required for stress response is tmRNA, which is involved in the rescue of stalled ribosomes .
One of the most abundant transcripts identified in H. pylori is a homolog of 6S RNA, a ubiquitous riboregulator which mimics an open promoter complex and thereby sequesters RNA polymerase [233, 234]. In E. coli, deletion of 6S RNA has no obvious phenotype during exponential growth; however, altered growth phenotypes are observed during stationary phase and under extreme stress conditions. Whether 6S RNA has a similar role during stress response or stationary growth, or whether it impacts on virulence as seen for Legionella and therefore has implications for H. pylori persistence in the host, still needs to be investigated.
Antisense transcripts have been identified for one third of the putative phase-variable genes with functions in lipopolysaccharide biosynthesis, surface structure and DNA restriction/modification, raising the possibility of antisense regulation of surface structure and host interactions. Furthermore, acid-stress-induced antisense RNAs opposite to known acid-stress-repressed genes have also been detected, suggesting that antisense regulation may also play an important role in adapting to changing environmental pH. To date only a few physiological effects of sRNA regulation have been demonstrated in H. pylori. The ureAB operon has been shown to be negatively regulated by a cis-encoded antisense 5′ ureB-sRNA, which is induced by unphosphorylated ArsR under neutral conditions. The chemotaxis receptor TlpB, which is involved in pH-taxis, quorum sensing, colonization and gastric mucosa inflammation, has been shown to be negatively regulated by the trans-encoded HPnc5490 sRNA [237-239].
It has become increasingly clear that sRNAs serve as diverse regulators that impact almost every aspect of bacterial physiology in response to changes in the environment. Riboregulation is therefore likely to have an important role in H. pylori persistence. The stomach is a highly dynamic environment undergoing constant changes in pH, due to long periods of fasting interspersed with the intake of various food materials, while continual shedding of gastric mucus, where a majority of the bacteria reside, leads to washout into the lower intestine, where the bacteria encounter an anaerobic environment and greater exposure to the host immune system through Peyer’s patches. Here riboregulation may be very important for the rapid changes, such as initiation of peptidoglycan remodelling and alterations in lipid content [241, 242], necessary for the transition from the spiral to the coccoid form that is phagocytosed by dendritic cells in Peyer’s patches.
5. Experimental strategies to study Helicobacter pylori persistence
A multidisciplinary approach combining microbiology, immunology, genetics and clinical research is required to study persistence in vivo. Although powerful techniques in cellular microbiology and a flurry of transgenic mouse strains have been instrumental in investigating H. pylori pathogenesis, in vivo studies are limited to loss-of-function mutants based on gene deletion. This prevents the distinction between colonization and persistence functions of the gene of interest. Furthermore, deletion mutants suffer the drawback of potential rapid genetic adaptation under strong selective pressure in vivo (which might often be present already in vitro), and this is particularly relevant in view of H. pylori’s genome plasticity.
5.1. Gene deletion
Both bacterial and host genetics have led to tremendous progress in our understanding of H. pylori pathogenesis. However, the mechanisms of persistence of H. pylori have not yet been rigorously tested in animal models due to the difficulty in assessing the temporal requirement of virulence factor expression for colonization and/or persistence. Only a few genes have been strictly shown to be required for persistence based on the inability of mutant strains to colonize in the long term despite short-term colonization loads identical to those of wild-type. For example, the peptidoglycan deacetylase (pgdA) mutant displayed significant attenuation in its ability to colonize mouse stomachs at 9 weeks of infection, although pgdA is dispensable for up to 3 weeks during the initial colonization. Furthermore, compared to wild-type H. pylori-infected mice, elevated levels of MIP-2, IL-10 and TNFα were observed in mice infected with the pgdA mutant, indicating that peptidoglycan deacetylation modulates the host immune system. Another study reported the DNA repair ruvC mutant to be spontaneously cleared from the gastric mucosa of mice between 36 and 67 days after the initial challenge. Interestingly, the ruvC mutant was less efficient in modulating the murine immune system, suggesting that DNA recombination is critical for immune modulation and persistence of H. pylori. Catalase (KatA) and the catalase-associated protein (KapA) have also been implicated in persistence of H. pylori infection. KapA and KatA were proposed to promote survival of H. pylori in the inflamed gastric mucosa, where the concentration of reactive oxygen species, particularly hydrogen peroxide, is high. Very recently, a comparative analysis of the colonization loads between wild-type H. pylori and the lpxE isogenic mutant, harbouring a partial TLR4 determinant restoration, was conducted in C57BL/6J and the corresponding TLR4 knockout transgenic mice.
This two-sided genetic approach demonstrated that although LpxE-mediated avoidance of TLR4 recognition had little effect on the early colonization phase (15 days), the lpxE mutant was prevented from persisting for more than 45 days due to TLR4 activation that likely led to sterilizing adaptive immunity.
The natural competence of H. pylori has also recently been demonstrated to promote chronic colonization. A competition experiment between wild-type and comB10 or dprA mutant strains showed that mutation of either the Com apparatus or a cytosolic competence factor resulted in reduced persistence despite normal initial colonization. These recent data strongly suggest that the exchange of DNA within a heterogeneous population, together with genome plasticity and integrity, is important for H. pylori to maintain chronic infection.
5.2. Gene mutagenesis: The urease example
As an alternative to gene deletion, protein engineering was applied to the urease complex to investigate its role in the host-pathogen interaction without affecting the enzyme's ureolytic activity, which is essential for colonization. Surface-exposed loops that display high thermal mobility were targeted for in-frame insertion mutagenesis (Figure 4). H. pylori expressing urease with insertions at four different sites retained urease activity and were able to establish robust infections. Bacteria expressing one of these four mutant ureases, however, had reduced bacterial loads after longer-term (3 to 6 months) colonization. These results indicate that a discrete surface region of the urease complex, distinct from the ureolytic activity per se, is important for H. pylori persistence during gastric colonization.
H. pylori urease consists of a ball-like supramolecular structure of about 1000 kDa. It is composed of twelve UreA-UreB subunit pairs (26.5 and 61.7 kDa, respectively), assembled in a high molecular weight complex of four alpha/beta trimers [248, 249]. Urease is an abundantly expressed enzyme (10% of the cell weight) that decreases the acidity of H. pylori’s immediate environment by generating ammonia and carbonate from urea, which the host secretes as metabolic waste, and is one of the first characterized factors identified as essential for colonization by H. pylori [250-252]. The majority of the urease is localized in the bacterial cytoplasm; however, urease is also present on the cell surface and in the extracellular medium. The secretion of urease is still a matter of debate, as both autolysis and specific secretion [250, 253, 254] have been postulated to be responsible for the release of urease.
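The quoted size of the complex can be checked against the subunit masses with a short calculation. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes the complex contains exactly twelve UreA-UreB pairs and nothing else (no bound metal ions or accessory proteins).

```python
# Subunit masses in kDa, as quoted in the text.
UREA_KDA = 26.5
UREB_KDA = 61.7

# Assumption: four (alpha/beta) trimers, each with three UreA-UreB pairs.
N_TRIMERS = 4
PAIRS_PER_TRIMER = 3


def urease_complex_mass_kda() -> float:
    """Summed mass of all UreA-UreB pairs in the assembled complex."""
    n_pairs = N_TRIMERS * PAIRS_PER_TRIMER
    return n_pairs * (UREA_KDA + UREB_KDA)


if __name__ == "__main__":
    print(f"Estimated complex mass: {urease_complex_mass_kda():.1f} kDa")
```

The sum comes to roughly 1060 kDa, which is in line with the figure of about 1000 kDa given for the supramolecular structure.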
Uptake of urea into the cytoplasm is controlled by an acid-gated urea channel, UreI, that does not open until the periplasmic pH falls below 6.5, thereby avoiding alkalization of the cytoplasm [255-257]. The recruitment and interaction of the urease complex with UreI at the inner membrane is hypothesized to enable coupling of urea transport and urease activity for efficient pH homeostasis of the periplasm. It is tempting to postulate that the surface properties of urease play a key role in the intrabacterial regulation of its enzymatic activity by docking of the urease complex to the urea channel (UreI) to maintain periplasmic pH.
Although such local control of gastric acidity is considered essential, urease-negative H. pylori strains were unable to colonize piglets whose acid secretion had been suppressed, suggesting an additional role for urease. Possible explanations include the utilization of urease-generated ammonia by H. pylori to synthesize essential metabolites, especially amino acids; protection from peroxynitrite; enhanced survival in macrophages; evasion of phagocytosis and of complement-mediated opsonisation; and decreased viscosity of gastric mucin at high pH, facilitating H. pylori motility. An alternative explanation invokes the existence of urease-host tissue interactions that are independent of urease enzymatic activity, and is based on in vitro studies that detected urease-mediated activation of macrophages, monocytes and blood platelets [266, 267], dysregulation of gastric epithelial tight junctions, and induction of cytokine production from gastric epithelial cells through binding to CD74 (MHC class II invariant chain).
The above examples highlight the importance of studying the kinetics of colonization so as to gain deeper insights into the H. pylori persistence mechanisms and revisit the role of well-known and studied virulence factors such as urease.
5.3. Conditional mutants
The use of conditional H. pylori mutants makes it possible to investigate genes for their role in persistence by turning off target gene expression in vivo once colonization has been established. Several H. pylori conditional knockouts based on the lac operon have been constructed to study essential genes in vitro [271, 272]. So far, however, this system has not been tested in the mouse model of H. pylori infection. Owing to limitations including leakiness of transgene expression and the limited bioavailability of the inducer IPTG, this lac operon-based system is not well suited for in vivo studies. In contrast, tetracycline-mediated regulation of gene expression has been extensively used to construct mouse conditional knockouts as well as to study host-pathogen interactions using tetracycline-regulated bacterial conditional knockouts [273, 274]. Tetracycline gene regulation has recently been developed in H. pylori [275-277] (Figure 5) and successfully applied in vivo to monitor changes in colonization load upon down-regulation of ureB gene expression.
This system will enable study of the temporal requirements of specific genes during H. pylori colonization as well as during persistence. More importantly, tetracycline-mediated H. pylori gene regulation will allow for subtle characterization, in vivo and ex vivo, of changes in the innate and adaptive immune responses by monitoring the activation of antigen presenting cells and T cell proliferation, respectively.
The tetracycline gene regulation combined with host genetics will facilitate the investigation of H. pylori persistence mechanisms and may allow for the discovery of new markers for cancer risk and for novel therapeutic interventions specifically targeting persistence.
6. Comparative genomics
The genome era created the possibility to study the gene content of individual H. pylori strains, allelic diversity and related genetic plasticity. The first H. pylori strain to be sequenced, 26695, was isolated from an English patient with chronic gastritis and sequenced in 1997. The chromosome of strain 26695 is circular and composed of 1 667 867 base pairs. J99, isolated from an American patient with a duodenal ulcer, was sequenced in 1999, and for the first time a comparison of two unrelated genomes was made. Compared to 26695, J99 has a slightly smaller circular chromosome (1 643 831 base pairs). H. pylori is believed to possess a large degree of genomic diversity, but the overall genomic organization, gene order and predicted proteomes of the two sequenced strains were found to be similar. The two genomes nevertheless displayed a high degree of diversity in terms of insertions and deletions. There are 1406 genes shared by both strains, but 86 open reading frames are absent from strain 26695. Both strains contain a complete cag pathogenicity island that codes for a type IV secretion system to deliver the CagA cytotoxin protein. Comparison of the two genomes revealed that between 6 and 7% of the genes were specific to each strain, with almost half of these genes being clustered in a single hyper-variable region or plasticity zone. The presence of a variable gene pool is indicative of horizontal gene transfer. Because strain-specific genes could be involved in gastric adaptation during co-evolution, this flexible gene pool was extensively investigated in many strains worldwide. In silico analysis of the two H. pylori genomes revealed that both housekeeping genes and virulence genes are transferred between H. pylori strains [278, 279].
Next generation sequencing has enabled the study of unrelated H. pylori strains isolated from unrelated individuals and further studies have focused on evolution of H. pylori within the same host. Commonly, isolates have been derived from chronically infected individuals that have been exposed to H. pylori for decades, such as 26695 and J99. More recently, analysis of the genetic relationships between strains of H. pylori that had been sequentially isolated from the same host at different times also showed high mutation rates and genetic diversity.
Inter-strain diversity is represented by variations in the number and contents of genes, chromosomal rearrangements and allelic diversities, and is not unique to H. pylori . For H. pylori, each strain contains many strain-specific genes. It has been proposed that a particular bacterial species contains a core set of genes and auxiliary genes. The core genome contains genes that are present in all or nearly all of the strains. It determines the properties that are characteristic for the species. The auxiliary genes are present in some strains, and are determinants of the biological properties unique to each strain.
6.1. Chronic infection
The first studies of genetic diversity and evolution amongst H. pylori strains were performed using isolates derived from chronically infected individuals. Early studies of genetic change during H. pylori chronic infection were restricted to selected housekeeping genes within a bacterial population in single and mixed infected hosts.
Molecular fingerprinting studies using RAPD  and amplified fragment length polymorphisms (AFLP)  first showed that strains isolated from single patients were closely related, but were subtly different. Strains isolated at the same time or months and years apart also showed similar divergence [281-285]. Thus, while populations diverge within the host, divergence appears to occur slowly.
Newer technologies allowed for more accurate comparisons of H. pylori genomes. Salama et al. reported genetic diversity amongst 15 unrelated H. pylori strains from different geographical locations and identified strain-specific genes using a whole-genome H. pylori DNA microarray. Analysis of the genomic content of H. pylori strains from chronically infected patients found that 22% of H. pylori genes were dispensable and defined a minimal core genome of 1,281 H. pylori genes. Of these, more than 300 genes were not homogeneously distributed and many of the dispensable genes were located in plasticity zones and in the cag pathogenicity island. Core genes encoded mostly metabolic and cellular processes, while strain-specific genes included genes unique to H. pylori, restriction modification genes, transposases and genes encoding cell surface proteins.
Gressmann et al. used a collection of 56 globally representative H. pylori strains that included examples from all known populations and subpopulations in comparative genome hybridizations with a microarray representing the genes of the combined genomes of 26695 and J99. The study found 1,150 genes were conserved between the 56 strains and estimated that the core genome conserved in all H. pylori strains is ~1,111 genes. The remaining 400 genes that each H. pylori genome contains come from a pool of genes that are strain-specific and located within the cag pathogenicity island, consistent with the findings of the previous study.
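The two microarray studies above can be put on a common footing with a little arithmetic. The following is a minimal sketch with hypothetical function names; the implied total gene count is derived here from the quoted core count and dispensable fraction and is not a figure reported in the text.

```python
def core_fraction(core_genes: int, variable_genes: int) -> float:
    """Share of a genome's genes that belong to the conserved core."""
    return core_genes / (core_genes + variable_genes)


def implied_total_genes(core_genes: int, dispensable_fraction: float) -> float:
    """Total gene count implied by a core count and a dispensable fraction."""
    return core_genes / (1.0 - dispensable_fraction)


if __name__ == "__main__":
    # 56-strain study: ~1,111 core genes plus ~400 strain-variable genes.
    print(f"Core share per genome: {core_fraction(1111, 400):.1%}")
    # 15-strain study: 1,281 core genes, with 22% of genes dispensable.
    print(f"Implied genes surveyed: {implied_total_genes(1281, 0.22):.0f}")
```

Under these figures roughly three quarters of the genes in any one genome are core, which agrees with the 22% dispensable fraction reported by the earlier study.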
Other studies also identified the cag pathogenicity island and plasticity zones as regions of genetic diversity and DNA exchange. Israel et al. examined the gene content of two H. pylori strains that had similar virulence genotypes (cagA+ vacA s1a iceA1) but were phenotypically different, and showed by H. pylori whole-genome microarray analysis that the gene content within the cag island differed substantially. Occhialini et al. examined the composition of the plasticity zone in a collection of 43 H. pylori strains from diverse clinical origins and showed that the plasticity zone is highly mosaic and represents a large fragment of foreign DNA integrated into the genome.
Genetic variation can be generated in a bacterial population by mutation and/or recombination between different strains. The studies described so far focus on unrelated clinical isolates from unrelated infected individuals. However, to address the question of mutation and recombination rates of H. pylori during chronic infection, a comparison of H. pylori isolates from the same host must be made.
One of the first such studies compared the genome of J99, originally isolated in 1994, with multiple strains that had been isolated from the same patient six years later . All follow-up strains differed from the original isolate by one or multiple gene losses or gains. Randomly amplified polymorphic DNA PCR and DNA sequencing of four unlinked loci revealed that these isolates were closely related to the original strain. In contrast, microarray analysis revealed differences in genetic content among all of the isolates that were not detected by randomly amplified polymorphic DNA PCR or sequence analysis . Whole-genome microarray revealed a difference in genomic composition in 3% of J99 loci as well as the relatedness of these isolates to each other when compared with H. pylori isolates from other individuals.
Subsequent studies observed sequentially isolated H. pylori strains from the same host. One of the early reports examined the sequences of ten gene fragments (encoding housekeeping enzymes and virulence-associated proteins) of paired sequential isolates from 26 patients in two geographical areas at intervals of between 3 and 48 months (average 1.8 years) and found that point mutations occur in the stomach of a single host and that mostly small DNA segments (median size of 417 base pairs) are exchanged with other bacteria in the stomach. A Bayesian model was used to calculate mutation rate, import size and the frequency of recombination. On average, pairs of bacteria differed by ~100 DNA imports, corresponding to three percent of the genome or 50 kb. A mutation rate of 4.1 × 10⁻⁵ per year and base pair was estimated. Recombination occurred at a rate of 60 imports spanning 25,000 base pairs per genome per year. The authors concluded that recombination is so frequent that appreciable fractions of the entire genome are exchanged during the colonization of a single human, resulting in a highly flexible genome content and frequent shuffling of sequence polymorphisms throughout the local gene pool.
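The figures quoted for this study are mutually consistent, as a brief calculation shows. This is a sketch under one stated assumption: the chromosome size of strain 26695 (1 667 867 base pairs) is used as the reference genome length, which the study itself may not have used. Function names are hypothetical.

```python
# Assumption: strain 26695 chromosome length stands in for "the genome".
GENOME_BP = 1_667_867


def mean_import_length_bp(span_bp_per_year: float, imports_per_year: float) -> float:
    """Average import length implied by a yearly span and import count."""
    return span_bp_per_year / imports_per_year


def genome_fraction(replaced_bp: float, genome_bp: int = GENOME_BP) -> float:
    """Fraction of the genome covered by the imported DNA."""
    return replaced_bp / genome_bp


if __name__ == "__main__":
    # 60 imports spanning 25,000 bp per genome per year.
    print(f"Mean import length: {mean_import_length_bp(25_000, 60):.0f} bp")
    # ~100 imports covering about 50 kb between paired isolates.
    print(f"Genome share of 50 kb: {genome_fraction(50_000):.1%}")
```

The yearly recombination rate implies a mean import of about 417 base pairs, matching the reported median import size, and 50 kb corresponds to about three percent of a 1.67 Mb genome.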
Kraft et al. examined paired strains of H. pylori with respect to their genomic contents using the DNA microarray method and also reported evolutionary changes in the H. pylori genome. Isolates were obtained from the same patients at intervals of 3 to 36 months. Of the 21 pairs of strains examined, 4 pairs showed differences in their genomic contents, suggesting the occurrence of evolutionary recombination events. These included a complete deletion and a partial loss of the cag pathogenicity island, a replacement of an open reading frame of unknown function, an acquisition of 14 genes in the plasticity zone, a duplication of the ceuE genes (HP1561/HP1562) and a truncation of tandem arranged ackA and pta genes resulting in the formation of pseudo-genes .
A more recent study estimated the short-term mutation and recombination rates of H. pylori by sequencing an average of 39,300 base pairs in 78 gene fragments from 97 isolates. These isolates included 34 pairs of sequential samples at intervals of 3 months to 10.2 years. They also included single isolates from 29 individuals from 10 families. The accumulation of sequence diversity increased with time of separation in a clock-like manner in the sequential isolates. Approximate Bayesian Computation was used to estimate the mutation and recombination rates, mean length of recombination tracts, and average diversity in those tracts. The short-term mutation rate was estimated to be 1.4 × 10⁻⁶ (serial isolates) to 4.5 × 10⁻⁶ (family isolates) per nucleotide per year, and three times as many substitutions were estimated to be introduced by recombination as by mutation. Comparisons with the recent literature show that short-term mutation rates vary dramatically and can span a range of several orders of magnitude.
The above studies analysed the genetic relationships between strains of H. pylori sequentially isolated from the same patient. The studies are based mainly on multi-locus sequence analysis of mostly house-keeping genes and have limitations. Bayesian inference was used to estimate recombination and mutation but the multi-locus approach did not allow conclusions about the chromosomal distribution of import events or about the relative frequencies of imports in different categories of genes.
More recently, mutation and recombination rates have been estimated by a genome-wide analysis using pyrosequencing technology. Kennemann et al. analysed the genomes of five sets of sequential isolates of H. pylori, including four pairs of isolates from the earlier studies (with isolation intervals of 3 years) and recent follow-up isolates for two of the pairs that were obtained 16 years after the first isolates. The genome comparisons of the four sets of isolates confirmed previous estimates of the length of imported fragments, with an average length of 394 base pairs, in agreement with the previous estimate of 417 base pairs. The 16-year isolates differed from the initial isolates by far more SNPs and CNPs than the 3-year isolates, indicating that diversity caused by mutation and recombination had accumulated over time. The average genome-wide mutation rate for the four 3-year pairs of sequential isolates from chronically infected individuals was found to be 2.5 × 10⁻⁵ (range = 0.5–6.5 × 10⁻⁵) per year per site. This rate is ∼18-fold faster than the mutation rate previously calculated for serial H. pylori isolates based on analysis of housekeeping genes. The rate of recombination was 5.5 × 10⁻⁵ recombination events per initiation site and year, similar to what was previously estimated, but 122-fold higher than the rate of 4.4 × 10⁻⁷ calculated from housekeeping genes. These differences can be attributed to multiple factors. Faster rates can be expected for a genome-wide analysis, because housekeeping genes are likely to be under strong purifying selection, whereas the genome-wide analysis comprises noncoding DNA as well as genes under diversifying selection.
In addition, the rates of mutation and recombination varied strongly between different infected individuals in both studies, which could be because of strain properties, the extent of mixed infections that determines the availability of exogenous DNA, or varying selective forces in infected hosts. This study of H. pylori genomic evolution during human infection shows genome-wide recombination in H. pylori colonizing humans in a high-prevalence area with a high rate of mixed infections. Genome-wide analyses of the length of individual import events were in good agreement with earlier estimates, but two important findings have emerged: 1) imports were often clustered and 2) imports frequently affected genes coding for outer membrane proteins of the Hop family.
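The fold differences between the genome-wide and housekeeping-gene estimates can be reproduced from the rounded rates. The sketch below uses a hypothetical helper name and takes the housekeeping-gene mutation rate for serial isolates as 1.4 × 10⁻⁶ per nucleotide per year. Because the inputs are rounded, the ratios are only approximate.

```python
def fold_difference(genome_wide_rate: float, housekeeping_rate: float) -> float:
    """Ratio of a genome-wide rate to the matching housekeeping-gene rate."""
    return genome_wide_rate / housekeeping_rate


# Mutation: 2.5e-5 per site per year genome-wide versus 1.4e-6 from
# housekeeping genes of serial isolates.
mutation_fold = fold_difference(2.5e-5, 1.4e-6)

# Recombination: 5.5e-5 versus 4.4e-7 events per initiation site per year.
recombination_fold = fold_difference(5.5e-5, 4.4e-7)

if __name__ == "__main__":
    print(f"Mutation rate ratio: {mutation_fold:.1f}")
    print(f"Recombination rate ratio: {recombination_fold:.1f}")
```

The mutation ratio rounds to the quoted 18-fold. The recombination ratio from these rounded inputs comes out slightly above the quoted 122-fold; the gap is presumably a rounding effect in the published rates, which is an assumption rather than something stated in the text.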
Taken together, these studies provide evidence of genetic variability and DNA exchange among H. pylori strains and demonstrate that the gene contents of H. pylori isolates from the same and different individuals display 3% and 22% variability, respectively. The data indicate that within an apparently homogeneous population, remarkable genetic differences exist among single-colony isolates of H. pylori. They provide direct evidence that this bacterium has the capacity to lose and possibly acquire exogenous DNA, consistent with the theory of continuous microevolution within a cognate host.
H. pylori is naturally competent for transformation, and non-randomly distributed repetitive sequences are found in the genome, which leads to frequent recombination events. Calculations of mutation and recombination frequencies with respect to insert sizes revealed that the genetic diversity displayed by the panmictic population structure is a result of continuous DNA exchange between parental strains and daughter strains, which have accumulated mutations. This was supported by gene content analysis of isolates taken from single patients at different time points, which demonstrated that the great majority of genetic changes were caused by homologous recombination, indicating that adaptation of H. pylori to the host individual is more frequently mediated by sequence changes acquired by recombination events than by loss or gain of genes.
The comparative studies of the H. pylori genome reveal the genomic changes during the cycle of invasion, colonization and transmission to a new host. Invasion into a new host seems to have little effect on the gene composition of H. pylori, suggesting that the current genome of H. pylori has sufficient capacities for permitting bacterial invasion into a human host. Once the infection is established, the bacterium has to cope with the dynamic changes of the physiological environment during the long-term coexistence with the host. Genomic diversifications, or gain and/or loss of genes, occur in response to these changes. The diversifications involve genes that are mainly those strain-specific genes observed from comparative studies of unrelated strains of H. pylori. Intra-host evolution of H. pylori, thus, results in the creation of a unique and strain-specific combination of genes enabling persistence in individual hosts.
6.2. Acute infection and human challenge studies
Most studies of H. pylori adaptation to the human host have relied on samples from adults who had most likely been infected since childhood, thus showing changes only after a long adaptation of the bacterium to the individual host. It is also of interest to understand the evolution of the H. pylori genome in the first weeks and months after infection, which reflects the selective pressure occurring during adaptation to a new host.
A limited number of H. pylori challenge studies in humans have been performed over the past 2 decades, mainly to support the development of anti-Helicobacter vaccines. These studies have provided valuable samples for the study of intra-host adaptation of H. pylori during the acute phase of H. pylori infection.
The first human challenge study was conducted by Graham et al. to develop a reliable challenge model for evaluating H. pylori vaccine candidates. Healthy human volunteers were infected with the H. pylori strain BCS 100 and H. pylori clones were re-isolated from stomach biopsies 3 months after challenge. The Baylor challenge strain (BCS 100) is cag pathogenicity island negative, and positive for vacA s1c-m1.
Another challenge study using the same BCS 100 strain was reported for the development of a vaccine against H. pylori. In this study, performed in Germany at the Paul Ehrlich Institute, a live vaccine against H. pylori was tested in human volunteers who were sero-negative for H. pylori and without evidence of active infection. Volunteers (n=58) were immunised orally with Salmonella enterica serovar Typhi Ty21a expressing H. pylori urease or HP0231, or solely with Ty21a, and then challenged with H. pylori. Gastric biopsies were taken before and after vaccination and pre- and post-challenge.
A third, unpublished acute human challenge study using a CagA-positive H. pylori strain was reported by Malfertheiner et al. at the AGA Annual Meeting 2011 (AGA abstract #432, S-86). In brief, 34 healthy subjects who tested negative for H. pylori infection received an experimental vaccine or placebo prior to challenge with the H. pylori isolate at a dose of 5 × 10⁶ CFU. The isolate was susceptible to all current antibiotics used in H. pylori therapy. Subjects were evaluated at 2, 4 and 12 weeks after challenge, and the reported data revealed changes in pepsinogen I and II and gastrin-17 levels. The authors concluded that challenge with a CagA-positive H. pylori strain induced moderate dyspeptic symptoms that resolved within a few days and profoundly affected gastric physiology, with a distinct reactive pattern of serum pepsinogen I and II and gastrin-17.
The challenge strain BCS 100 and the reisolate 8A3 form a pair of isolates for which the time elapsed between infection and reisolation is exactly known, allowing the rate of genetic change in the early phase of H. pylori infection to be addressed. The genomes of the challenge strain (BCS 100) of the vaccine trial and a reisolate (8A3) from a volunteer (non-vaccinated control group) who had been infected with BCS 100 for 3 months were sequenced. Whole-genome sequence comparison revealed very few differences between the two isolates. Three point mutations were identified and confirmed by Sanger sequencing. All three were nonsynonymous mutations, leading to predicted single amino acid changes in the proteins δ-1-pyrroline-5-carboxylate dehydrogenase (PutA; HP0056), pyridoxal phosphate biosynthetic protein J (PdxJ; HP1582), and HP1181 (a predicted multidrug efflux transporter). In addition to point mutations, repeat length differences (RLDs) were noted in two different dinucleotide repeat sequences and in one repeat consisting of multiple copies of an 8-base-pair motif.
In striking contrast to the pairs from the earlier study of isolates from chronically infected individuals, not a single recombination event was detected. This is most likely due to the lack of co-infection, which is more commonly observed in chronic infection. H. pylori prevalence has fallen in most Western countries, including Germany where the study was conducted, reducing the risk of acquiring multiple strains. A recent study of two pairs of sequential H. pylori isolates from Sweden also did not detect any evidence of recombination during chronic infection. Likewise, there was no evidence of recombination during 3 months of infection of a volunteer in Germany, providing evidence that H. pylori can establish chronic infection after infection with a single strain and that its genome can be stable in the absence of mixed infection.
In addition, H. pylori isolates collected 15 days or 90 days post-infection from four healthy adults (patients 101, 103, 104, and 105) who participated in the BCS 100 human challenge study were examined. The adult volunteers were not related to each other or to the patient from whom the challenge strain was obtained. Therefore, if host-specific differences select for genetic changes in the bacteria, such conditions would be present during this human challenge experiment. Comparative genome array analysis with single colony isolates demonstrated that their genomic contents were identical to that of the challenge strain. No rapid changes in gene content or sequence divergence were observed up to 3 months after transmission.
To date, no data have been published on the infectivity and colonisation rate of vaccinated and non-vaccinated subjects challenged with the CagA-positive strain, or on the genetic sequence of isolates collected at post-challenge time points compared to the original challenge strain. This type of genome sequence analysis would provide new information on the adaptation of H. pylori during the acute phase of infection in the host and would allow comparison of genetic events occurring in the presence or absence of virulence genes such as cagA.
A fourth H. pylori human challenge study has recently been performed by investigators at Ondek Pty. Ltd. for the development of mucosal delivery technology based on live H. pylori (Benghezal et al. unpublished data). Five genotypically different strains of H. pylori, including CagA-positive strains, were used to challenge healthy human volunteers in a Phase I study (clinical trial reference: SCGH HREC #2009-062). Briefly, 36 H. pylori sero-negative subjects were screened and randomised into 6 groups to receive either one of the 5 H. pylori strains or placebo. Subjects were monitored for the duration of the 12 week study and stomach biopsies were collected at 2 and 12 weeks post challenge, time points that represent the acute phase and the start of the chronic phase of infection, respectively. Single colony isolates collected at the 2 and 12 week time points represent a unique resource for the investigation of H. pylori adaptation in humans based on a multi-strain, multi-person experimental study. Indeed, humans differ in physiologic and immunologic traits, and these traits change with age and in response to H. pylori infection. Such host diversity should select for adaptive changes in H. pylori genes important for host interaction. Yet the types of adaptive changes that H. pylori undergoes during colonisation, the underlying mechanisms and functional significance, and the resulting changes in the host response remain largely unknown. Thus a sweeping characterization of H. pylori adaptation is now made possible by the unique opportunity to access strains from the Ondek-sponsored clinical trial in which 6 subjects per group were each challenged with one of several H. pylori strains. H. pylori recovered during the acute (2 week) and chronic (12 week) phases of infection will be compared with input strains by genome sequencing to gain new perspectives on bacterial adaptation to different human traits. Of note, comparative genomic analysis of the input/output strains of H. pylori will provide insight into the genome plasticity and stability of a live bacterial vector after human challenge for further development of a novel mucosal delivery technology.
7.1. Insight into bacterial pathogenesis and microbial evolution
7.1.1. Genomic diversity
The sequencing of H. pylori host isolates has revealed the nature and extent of recombination arising from inter-strain horizontal transfer in vivo. Generation of genetic diversity by recombination likely relies on the presence of co-infection with multiple strains. Individuals who are already colonised can be infected with additional strains. A study of 127 individuals from three regions in Venezuela found evidence of mixed colonization in 55% of subjects. Genome sequencing of sequential H. pylori isolates from chronically infected individuals in Colombia revealed high rates of recombination attributable to uptake of DNA from unrelated strains during co-infection. Thus co-infection with multiple strains is likely to be common, at least in areas with endemic H. pylori infection [172, 298], as was the case through most of human history. Where recombination occurs between strains, it has been found to introduce 100-fold more genetic alterations than mutation and occurs genome-wide.
However, co-infection may be less pervasive in most Western countries where H. pylori prevalence is low. In the absence of mixed infection recombination has little effect on genetic diversity . No evidence of recombination was found in a German individual experimentally infected with a single strain  or in sequential H. pylori isolates from Sweden . The genome sequencing of H. pylori isolates from 52 members of two South African families found both high and low rates of recombination which the authors attributed to the presence of co-infection with multiple strains or infection with only a single strain, respectively . Thus it is hypothesized that in instances of co-infection recombination can introduce genomic diversity but that a single strain of H. pylori can still establish a robust colonization and stably maintain its genome in the absence of recombination of DNA between strains, suggesting that intra-strain genetic diversity is a sufficient driver to adapt to the changing host. However, lack of opportunity for horizontal transfer between strains in mixed infections in developed countries may be accelerating the disappearance of H. pylori in these regions . Indeed transient and self-clearing infections do occur  suggesting, in these instances, an inability of the bacterium to adapt to the host or a role of multiple infections in H. pylori transmission.
The competence of H. pylori likely confers evolutionary advantages. Competence has been found to confer a fitness advantage over non-competent strains in vitro. Wild-type G27 was able to adapt to laboratory conditions more rapidly than a competence-null comH mutant. Although competence is not required for gastric colonization in the gerbil model, competence may be important for host adaptation. Natural transformation generates diversity by the introduction of new alleles and mosaic alleles in recipient strains. Greater genetic diversity amongst strains increases the pool of variants on which selection can act, conferring an advantage on those that are fitter in the changing environment of the host or in different individuals. In the presence of co-infection, recombination occurs throughout the time-course of infection. The continued requirement for host adaptation results in the in vivo selection for genes that have undergone recombination that favour persistence [157, 172, 284, 301]. Evidence for recombination is often found in genes related to virulence and persistence.
Analysis of a mixed infection from a Lithuanian patient found that recombination between strains resulted in the loss of the cag pathogenicity island and changes in alleles encoding VacA and outer membrane proteins . Genome sequencing of H. pylori strains isolated sequentially from chronically infected individuals demonstrated that the number of clustered nucleotide polymorphisms (CNPs), attributable to genetic recombination with DNA from different strains during co-infection, was much greater in strains isolated later in infection (16 years) than in those isolated earlier (3 years), demonstrating that recombination continues through the time-course of infection. CNPs occurred more regularly in some gene families than others; most notably in genes encoding outer membrane proteins, particularly those of the hop family  whose proteins are known to play key roles in adhesion . This indicates that diversifying selection acts in vivo on recombined genes . Indeed, genetic evidence of low effective population sizes in isolates from South Africa may reflect population bottlenecks arising from selection pressures imposed by the host immune system .
Horizontal gene transfer by natural transformation in the presence of multiple strains is a key driver in the genetic diversity of H. pylori. Evidence has emerged that this generation of genetic diversity plays a role in the ability of Helicobacter to continually adapt to the host throughout prolonged infection. Recombination has the advantage of allowing multiple beneficial mutations to be combined within the same genetic background by horizontal transfer rather than by sequential mutation. This would allow more rapid fixation of beneficial alleles . Indeed, evidence is also emerging that H. pylori experiences selective pressure for fitter variants by the host immune system.
7.1.2. Epigenetic diversity
Recently, SMRT sequencing has been utilized to sequence the methylomes of H. pylori 26695 and J99. Analysis of the methylome following genetic manipulation of candidate methylases has allowed characterisation of a number of methyltransferases in H. pylori. Interestingly, a methyltransferase (HP1353) has been identified that contains two phase-variable repeat sequences. One phase-variable repeat appears to function canonically to regulate expression of the methyltransferase. Phase variation of the second repeat appears to switch the protein between two forms that vary in their methylation recognition sequence. The protein thus has three different phase-variable states: off, methylation of recognition sequence 1, and methylation of recognition sequence 2. It is tempting to speculate that this represents an additional layer of complexity to the phasevarion. Should this methyltransferase regulate the transcription of other genes, it could, as a single methyltransferase, function to regulate two separate phasevarions.
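The three-state behaviour described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each repeat can be reduced to a two-valued switch; the names `MethylaseState`, `methylase_state` and `enumerate_states` are hypothetical and do not come from the cited study.

```python
from enum import Enum
from itertools import product


class MethylaseState(Enum):
    OFF = "off"
    SITE_1 = "methylates recognition sequence 1"
    SITE_2 = "methylates recognition sequence 2"


def methylase_state(expression_on: bool, second_repeat_form: int) -> MethylaseState:
    """Map the two (assumed binary) repeat configurations to a phenotypic state.

    expression_on: whether the first repeat leaves the gene expressed.
    second_repeat_form: 1 or 2, the assumed form selected by the second repeat.
    """
    if not expression_on:
        # When the gene is not expressed, the second repeat has no visible effect.
        return MethylaseState.OFF
    if second_repeat_form == 1:
        return MethylaseState.SITE_1
    if second_repeat_form == 2:
        return MethylaseState.SITE_2
    raise ValueError("second_repeat_form must be 1 or 2")


def enumerate_states() -> dict:
    """List the state for each of the four repeat combinations."""
    return {
        (on, form): methylase_state(on, form)
        for on, form in product((False, True), (1, 2))
    }


if __name__ == "__main__":
    for combo, state in enumerate_states().items():
        print(combo, "->", state.value)
```

Under these assumptions, four repeat combinations collapse onto three distinct states, because the specificity switch is masked whenever expression is off.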
Phase-variable methyltransferases have been identified in H. pylori and the investigation of their corresponding phasevarions has been initiated. The coordinated reversible switching of many genes within a phasevarion is postulated to play a role in host adaptation, immune evasion and pathogenicity. However, despite the potential significance of the phasevarion, experimental evidence is lacking regarding its role in vivo in host colonization, persistence, and disease pathology. In one study investigating the significance of methylation by the iceA-hpyIM R-M system, expression levels of hpyIM as determined from RT-PCR and/or RNA slot-blot analysis of RNA isolated from gastric biopsy specimens could not be correlated with iceA expression, disease sequelae, colonization density, or mucosal IL-8 levels. However, this methyltransferase has not been identified as having the potential to phase-vary and appears to be a stationary growth phase regulator.
Phase variation of individual genes and phasevarions imparts to the bacterium an ability to rapidly and reversibly switch between a vast array of different phenotypes. It is postulated that for pathogenic bacteria, there is significant pressure to constantly avoid detection by the immune system. The ability to generate epigenetic and phenotypic variation in this way may enable a bacterial population to play a lottery against the human immune system in which the phenotype of at least one individual imparts sufficient fitness within a changing environment to allow persistence. Thus, both the continued characterisation of phasevarions and their in vivo relevance for host adaptation and virulence are of importance in our understanding of H. pylori persistence and pathogenesis.
7.1.3. H. pylori regulation of genetic diversity and evolution constraints?
The ability of H. pylori to generate genetic diversity is likely to be regulated so as to achieve a subtle balance between the generation of genetic diversity and maintaining the integrity of its small genome to establish persistent infection. The different mutation rates reported to date are in line with regulation of generation of genetic diversity in H. pylori. Phase variation of DNA glycosylase MutY, an enzyme of the base excision repair process, represents further evidence for regulation of genetic diversity [66, 67]. Alternatively, a constitutive high mutation rate (named the mutator phenotype in E. coli) could be counterbalanced by efficient mechanisms to maintain genome integrity, such as homologous recombination using DNA template from neighbouring cells by natural competence. This possibility has gained some support due to recent reports describing an increase of DNA uptake and RecA homologous recombination upon DNA damage  and poor persistence of the dprA mutant deficient in natural competence, although the initial colonization level was comparable to wild-type .
The ability of H. pylori to genetically adapt to the human host is remarkable, as shown by its adaptation to human subpopulations, with 6 ancestral populations of H. pylori named ancestral European 1, ancestral European 2, ancestral East Asia, ancestral Africa1, ancestral Africa2, and ancestral Sahul. As described above, H. pylori genetic plasticity is the result of several mechanisms, ranging from poor replication fidelity and lack of DNA repair genes to horizontal gene transfer promoted by natural competence, bacteriophage transduction, conjugation of TnPZs and plasmid transfer. An extreme example of H. pylori’s adaptive ability is the host jump from humans to large felines, predicted to have occurred some 200,000 years ago, which resulted in a new species, Helicobacter acinonychis. This particular host jump was accompanied by relative conservation of the core genome, inactivation by different mechanisms of genes encoding surface proteins, and acquisition of genes involved in sialylation of the bacterial surface to evade immune responses in the new host.
Altogether, H. pylori micro-evolution within the human host and the above host jump conserved the core genome, with changes affecting mainly hotspots and accessory genes. This suggests that constraints in the genome architecture and gene repertoire exist and that these limit the evolutionary trajectories of H. pylori. Recent advances in evolutionary biology suggest a plurality of constraints on evolution, including the sequence type (coding sequences, structural RNAs, microRNAs and others) and the genome architecture and gene repertoire underlying the sum of all phenotypic traits of the organism, or phenome. Thus, to make sense of H. pylori’s high mutation and recombination rates and to investigate the role of genetic diversity in phenotypic adaptation to the human host, it is important to investigate the constraints, their networks and the interactions limiting H. pylori evolvability.
7.1.4. Robustness and evolvability
Robustness of systems is the resistance to change under perturbation. Robustness, either mutational, or phenotypic, has been proposed to make biological systems more evolvable [305-308]. Studying H. pylori mutational and phenotypic robustness is a highly attractive approach to understand its adaptation to, and persistence in, the human host based on the latest findings and concepts of evolutionary biology.
The evolvability, or capacity of biological systems for adaptive evolution, was proposed to depend on their genotype-phenotype maps. Evolutionary change takes place in a population, with each population member having some genotype, so that the population defines a collection of genotypes in the genotype space. Through mutations, members of the population can change their location in the genotype space. In the population, some individuals have a phenotype either superior or inferior to the existing well-adapted phenotype. Natural selection, while eliminating poorly adapted individuals, preserves the well-adapted ones and selects superior ones. The genotype space reconciles the key problem of evolutionary adaptation: finding the rare superior genotype while preserving the population of well-adapted ones. A first characteristic of the genotype space is the existence of sets of genotypes with the same phenotype, or connected genotype networks. The second characteristic of the genotype space is the number of genotypes that can be derived from any one genotype via mutation. The set of these genotypes is named the neighbourhood genotype, and its size is a simple measure of the phenotypic variability of a genotype in response to mutation. In summary, the first feature of the genotype space allows individuals in a population to preserve their phenotype while changing their genotype, and the neighbourhood genotype allows exploration of novel superior phenotypes in the population. Thus robustness is beneficial to both the individual and the population. As a consequence, robustness in the presence of genetic mutation/recombination allows cryptic genetic diversity to accumulate (in the sense of a capacitor) and promotes evolutionary adaptation through greater phenotypic variability in the population.
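The two characteristics of the genotype space described above can be made concrete with a deliberately small sketch. It assumes binary genotypes of length 4 and an arbitrary, hypothetical genotype-phenotype map (the number of 1s, capped at 2); none of the function names or parameters below come from the cited literature, and the map has no biological meaning.

```python
ALPHABET = "01"


def phenotype(genotype: str) -> int:
    """Hypothetical genotype-phenotype map: number of '1' alleles, capped at 2."""
    return min(genotype.count("1"), 2)


def neighbours(genotype: str) -> list[str]:
    """All genotypes exactly one point mutation away (the one-mutant neighbourhood)."""
    result = []
    for i, allele in enumerate(genotype):
        for alt in ALPHABET:
            if alt != allele:
                result.append(genotype[:i] + alt + genotype[i + 1:])
    return result


def genotype_network(start: str) -> set[str]:
    """Genotypes reachable from `start` through single mutations that keep the phenotype."""
    target = phenotype(start)
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for candidate in neighbours(current):
            if candidate not in seen and phenotype(candidate) == target:
                seen.add(candidate)
                frontier.append(candidate)
    return seen


def robustness(genotype: str) -> float:
    """Fraction of one-mutant neighbours that keep the same phenotype."""
    ns = neighbours(genotype)
    return sum(phenotype(n) == phenotype(genotype) for n in ns) / len(ns)


def accessible_phenotypes(genotypes: set[str]) -> set[int]:
    """Novel phenotypes one mutation away from any genotype in the given set."""
    own = {phenotype(g) for g in genotypes}
    return {phenotype(n) for g in genotypes for n in neighbours(g)} - own


if __name__ == "__main__":
    network = genotype_network("1111")
    print("network size:", len(network))
    print("robustness of 1111:", robustness("1111"))
    print("novel phenotypes from 1111 alone:", accessible_phenotypes({"1111"}))
    print("novel phenotypes from its network:", accessible_phenotypes(network))
```

In this toy map the fully robust genotype `1111` has no novel phenotype in its own neighbourhood, yet the genotype network it belongs to does reach a novel phenotype. This is one way to picture how neutral drift along a genotype network can expose new variation.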
Robustness as a variability principle in the context of the mutator phenotype of H. pylori (due to its high mutation and recombination rates) would reconcile the interest of the single cell and the cell population and may underlie H. pylori persistence in the gastric niche. A robust cell phenotype during persistent colonization of the gastric mucosa would not easily be disturbed by mutations, would be beneficial to single cells and would allow greater phenotypic variability to be achieved at the cell population level on a (micro-) evolutionary time scale by accumulation of cryptic genotypic diversity.
The presence of phasevarions in H. pylori raises the possibility to influence robustness by dramatically increasing phenotypic variability of the neighbourhood genotype by simple mutation. Greater robustness of H. pylori would enhance adaptation and persistence in the changing human gastric niche. Other interesting questions related to evolutionary dynamics are the size of the H. pylori population and its mutation rate and how these parameters influence robustness and adaptation to the host. For example, one could hypothesize that the lack of inter-strain recombination weakens the robustness of H. pylori leading to its disappearance in Western countries where multiple infections are rare.
To conclude, the investigation of H. pylori robustness to better understand the adaptation of this bacterium to the human host will require both computational modelling and experimental data on genotype-phenotype maps of H. pylori populations made accessible by the latest sequencing and high-content technologies.
7.1.5. Refining H. pylori persistence mathematical model
H. pylori apparently lacks active, costly sensory machinery for gene regulation, and its small genome suggests alternative adaptive mechanisms, including small RNA regulation, automatic random genetic switches for generating diverse adaptive phenotypes [47, 50, 51, 53], and the numerous duplicate and divergent outer membrane genes, which could be part of a more general gene regulation network so far unidentified. Introducing random fluctuations for stochastic phenotype transitions, in the spirit of Kussell and Leibler, into the mathematical model of H. pylori persistence published by Blaser and Kirschner is highly relevant to the robustness principle of biological systems and their evolvability. Such a refinement could further our understanding of how H. pylori establishes the optimal balance between sensing changes and random phenotype switching to adapt to its niche. Model predictions could be tested experimentally in animal models, using the power of molecular-bacterial genetics, including tetracycline-based gene regulation [275, 277], to validate or invalidate aspects of the model and gain insight into the dynamics of H. pylori populations.
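The idea of random phenotype switching in a fluctuating environment can be illustrated with a toy calculation. The sketch below is not the published persistence model nor the formulation of Kussell and Leibler; it assumes two phenotypes, two alternating environments, symmetric switching, and arbitrary fitness values, and the function name `long_term_growth` and all parameters are hypothetical.

```python
import math


def long_term_growth(switch_rate: float, env_duration: int,
                     cycles: int = 200, fit: float = 2.0,
                     unfit: float = 0.5) -> float:
    """Mean log growth per generation of a two-phenotype population.

    The environment alternates between 'A' and 'B' every `env_duration`
    generations. Phenotype A multiplies by `fit` in environment A and by
    `unfit` in environment B, and vice versa for phenotype B. After
    selection, a fraction `switch_rate` of each phenotype switches to the
    other one, regardless of the environment (no sensing).
    """
    a, b = 0.5, 0.5          # fractions of phenotypes A and B
    log_growth = 0.0
    generations = 0
    for _ in range(cycles):
        for env in ("A", "B"):
            for _ in range(env_duration):
                wa, wb = (fit, unfit) if env == "A" else (unfit, fit)
                a, b = a * wa, b * wb                      # selection
                a, b = (a * (1 - switch_rate) + b * switch_rate,
                        b * (1 - switch_rate) + a * switch_rate)  # switching
                total = a + b
                log_growth += math.log(total)
                a, b = a / total, b / total                # renormalise
                generations += 1
    return log_growth / generations


if __name__ == "__main__":
    for rate in (0.0, 0.01, 0.1, 0.5):
        print(f"switch rate {rate}: {long_term_growth(rate, 10):.3f}")
```

With these arbitrary parameters, an intermediate switching rate outperforms both no switching and maximal switching. This is the qualitative trade-off that a refined persistence model would need to quantify for H. pylori.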
7.2. Therapeutic potential of targeting persistence mechanisms
H. pylori chronic infection remains a significant health burden: a long regimen of triple or quadruple antibiotic therapy is currently the only available treatment, and antibiotic resistance is emerging. Since no prophylactic or therapeutic vaccine has been successfully developed to date, new target ideas are needed for the development of innovative drug therapies to achieve H. pylori eradication. Targeting H. pylori persistence mechanisms is one such strategy that could encompass immune evasion, micro RNA regulation (of urease for example), genetic and phenotypic diversity generation underlying host adaptation, and robustness of biological systems. In this regard, phasevarions, recombination (XerH for example), DNA repair, replication (low fidelity Polymerase I for example) and competence would represent target pathways of choice. Also, a drug targeting discrete sites on the urease complex surface shown to be required for infection would have several advantages over eliciting site-specific urease antibodies to inhibit H. pylori persistence, e.g. better bioavailability in the mucus layer and a higher concentration that is more likely to achieve inhibition of the large amount of urease produced by H. pylori. Other potential targets are the UreI channel (whose structure is available for drug design), which is required for buffering the periplasm during colonization [255-257], the gamma-glutamyltransferase required for immune modulation and bacterial metabolism to maintain colonization, and H. pylori LPS biosynthetic proteins contributing to gastric adaptation, adhesion and modulation of the host immune response by changing antigen expression in the same human host over time. Compounds targeting the biosynthesis of glycolipids that H. pylori uses for immune evasion showed promising inhibition in vitro, and further work is required to assess their potential. With the recent adaptation of the tetracycline-based gene regulation system in H. pylori and its functionality in vivo [275, 277], potential targets can now be validated in animal models using tetracycline-based conditional knockouts. Furthermore, conditional knockouts will be instrumental in testing the rate of emergence of resistance for target candidates, so that targets less prone to genetic adaptation can be selected before embarking on a long and costly drug discovery and development path. H. pylori-specific drugs targeting persistence mechanisms would have the advantage of leaving the gastrointestinal microbiota intact, avoiding side effects such as diarrhoea and increasing patient compliance. Based on the use of triple and quadruple antibiotic therapies and the genetic plasticity of H. pylori, it is likely that more than one H. pylori-specific drug targeting persistence will be required to avoid emergence of resistance and achieve a high eradication rate.
This work was supported by an NHMRC Sir Macfarlane Burnet Fellowship grant (572723) to BJM, an NHMRC Project Grant (634465) to BJM and MB, and an NHMRC Early Career Fellowship grant to AWD (1073250). We thank the team at Transittranslations who proofread the manuscript.