The copy number of MIRU-VNTR loci in BCG Russia genome.
Background: The genome stability of attenuated live BCG vaccine preventing the acute forms of childhood tuberculosis is an important aspect of vaccine production. The purpose of our study was a whole genome comparative analysis of BCG sub-strains and identification of potential triggers of sub-strains’ transition.
- BCG sub-strains
- Mycobacterium bovis
- genome stability
- genome rearrangements
Tuberculosis (TB) is one of the top causes of death in the world. Currently, BCG, first applied in 1921, remains the only authorized vaccine for primary vaccination of children against TB. It is broadly used in different countries as part of national childhood immunization programs. Despite attempts to control TB through the widespread introduction of vaccination, an estimated 9.6 million people worldwide fell ill with TB in 2014. Nevertheless, vaccination against TB reduced TB prevalence by 42% in 2015 compared to that in 1990.
The World Health Organization (WHO) controls BCG vaccine, and the WHO Expert Committee on Biological Standardization (ECBS) has developed the international requirements for the manufacture and control of BCG vaccine. In 2009, WHO ECBS established WHO Reference Reagents for BCG vaccines of three different sub-strains (Danish 1331, Tokyo 172-1 and Russian BCG-I). In addition, quality control requirements comprising molecular genetic characterization of final lots and working seeds of BCG vaccines were suggested. Russian research laboratories performed whole genome sequencing (WGS) of the BCG Russia sub-strain genome, as recommended by WHO and good manufacturing practice (GMP) [3, 4, 5]. Currently, ten whole genome sequences of BCG sub-strains, including BCG Russia, are available in GenBank. Since the 1920s, cultivation of the original BCG strain has resulted in the emergence of numerous sub-strains that evolved from it. We can therefore investigate the evolution of BCG sub-strains and assess its endpoints in the same way as in studies of the Darwinian evolution of biological species. The reason for BCG sub-strains’ transition remains unclear because the progenitor of BCG strains was lost. Comparative analyses of the genome features of different BCG sub-strains can help to solve this problem.
We focused on the mobile elements of BCG sub-strain genomes, especially prophage sequences, because of their contribution to bacterial genome patterning. Following Brüssow et al., 12 years later, we can reaffirm that there is a renaissance of phage research, because a large amount of information about bacterial and phage genomes is now available in international databases. It has been noted that reintroduction of fitness factors by phages usually influences the pathogenic factors of bacterial cells. Thus, phages are of great importance for short-term bacterial adaptation, and our goal was to estimate the potential contribution of prophage sequences to the mosaic structure of vaccine BCG sub-strain genomes.
2. BCG genome sequencing
3. Comparative genome analyses as proof of BCG Russia genome stability
In vaccine manufacture, genome stability is one of the important features of a BCG sub-strain. BCG vaccine quality control and production therefore now include characterization of the BCG sub-strain genome. The importance of molecular genetic characterization is confirmed by the WHO requirements. In accordance with these requirements, WGS of the latest seed lot of BCG Russia (BCG Russia 368, 2006) was performed. In addition, two BCG Russia sub-strains from the seed lots of 1963 and 1982 (BCG Russia 311 and BCG Russia 977) were analyzed by WGS.
Comparative analyses of three BCG Russia sub-strains from different seed lots revealed only two differences. The first difference was the
The second change in the genome affected the glycerol-3-phosphate acyltransferase gene and is shown in Figure 1. The mutation in this gene at position 2,744,580 (an insertion of TGT bases in place of a C base) truncated the protein. Nevertheless, the mutation did not affect the conserved domain of glycerol-3-phosphate acyltransferase, and the protein could still be functional. Not all reads carried the TGT insertion: the change was registered in only 14% of the reads of BCG Russia 311 and 54% of the reads of BCG Russia 977. In the latest generation, BCG Russia 368, this mutation was not found. The genome structures of the three different BCG Russia seed lots therefore remain stable. The latest generation, BCG Russia 368, which is discussed in the rest of the text, was deposited in GenBank under Accession Number NZ_CP009243.1.
4. In silico genotyping of BCG Russia sub-strain
A genomic feature of BCG Russia, shared by all BCG sub-strains, is the large deletion of the 10-kb region of difference 1 (RD1).
Spoligotyping profile is the second known characteristic of BCG sub-strain. This method is based on detection of Direct Repeats (DR) on the right and left sides of IS6110. DR loci are members of a universal family of sequences, designated as clustered regularly interspaced short palindromic repeats (CRISPR) sequence family.
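A spoligotype is usually recorded as the presence or absence of 43 spacers between the direct repeats and is often abbreviated as a 15-digit octal code. The sketch below illustrates that conversion, assuming the common convention of reading the first 42 positions as 14 binary triplets and appending the 43rd position as a final 0 or 1. The function names and the example pattern (spacers 3, 9, 16 and 39 to 43 absent, a pattern commonly reported for M. bovis BCG) are illustrative assumptions and are not taken from this chapter.

```python
def pattern_from_absent(absent_spacers, n_spacers=43):
    """Build a binary presence/absence string ('1' = spacer present).

    absent_spacers holds 1-based spacer numbers that give no signal.
    """
    return "".join(
        "0" if spacer in absent_spacers else "1"
        for spacer in range(1, n_spacers + 1)
    )


def spoligotype_octal(binary):
    """Convert a 43-character binary spoligotype to a 15-digit octal code."""
    if len(binary) != 43 or set(binary) - {"0", "1"}:
        raise ValueError("expected a 43-character string of 0s and 1s")
    # First 42 positions: 14 triplets, each read as one octal digit.
    triplets = [binary[i:i + 3] for i in range(0, 42, 3)]
    octal_digits = "".join(str(int(triplet, 2)) for triplet in triplets)
    # The 43rd position is appended unchanged as the final digit.
    return octal_digits + binary[42]


# Illustrative pattern only (assumption, not data from this chapter).
example = pattern_from_absent({3, 9, 16, 39, 40, 41, 42, 43})
print(example)
print(spoligotype_octal(example))
```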
Spoligotype profile of BCG Russia was typical for
The Mycobacterial interspersed repetitive unit (MIRU) profile of BCG Russia sub-strain based on
The complete data for the BCG Russia MIRU-variable-number tandem repeat (VNTR) loci are summarized in Table 1. The repeat unit size (bp) and the repeat number are given in brackets; the partial repeat size (bp) is given outside the brackets. One discrepancy in polymerase chain reaction (PCR) and
|Locus||(Repeat unit size × copy number) + partial repeat, bp*||Locus||(Repeat unit size × copy number) + partial repeat, bp|
|MIRU_2||(53 × 2) + 8||MIRU_23||(53 × 4) + 5|
|Mtub04||(51 × 0) + 30||MIRU_24||(54 × 2) + 30|
|ETR_C||(58 × 4) + 37||MIRU_26||(51 × 5) + 13|
|MIRU_4/ETR_D||(77 × 3) + 4||MIRU_27/QUB-5||(53 × 3) + 25|
|MIRU_40||(54 × 2) + 19||Mtub34||(54 × 2) + 51|
|MIRU_10||(53 × 1) + 51||MIRU_31/ETR_E||(53 × 2) + 3|
|MIRU_16||(53 × 3) + 18||Mtub39||(58 × 2)|
|Mtub21||(57 × 0) + 34||QUB-26||(111 × 4) + 24|
|MIRU_20||(77 × 2) + 11||QUB-4156||(32 + (59 × 0) + 19)|
|QUB-11b||(69 × 3) + 10||MIRU_39||(53 × 1) + 29|
|ETR_A||(75 × 5) + 20||QUB-3232||(56 × 5) + 48|
|Mtub29||(13 + (57 × 1) + 35)||VNTR-3820||(59 × 5) + 47|
|Mtub30||(58 × 1) + 53||VNTR-4120||(57 × 2) + 23|
|ETR_B||(57 × 5) + 8|
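The entries of Table 1 can be read as small arithmetic expressions: the repeat unit size multiplied by the copy number, plus any partial repeat and flanking fragments. The sketch below is a minimal illustration of that reading; the function names are hypothetical, and the interpretation of the first factor as the unit size and the second as the copy number is taken from the table description above.

```python
import re


def _normalise(expr):
    """Strip spaces and replace the multiplication sign with '*'."""
    s = expr.replace("×", "*").replace(" ", "")
    # Only digits, '+', '*' and brackets are expected; brackets may
    # group a product or the whole sum, but never multiply a sum.
    if not re.fullmatch(r"[0-9+*()]+", s) or "*(" in s or ")*" in s:
        raise ValueError(f"unsupported expression: {expr!r}")
    return s


def locus_length(expr):
    """Total length in bp described by a Table 1 entry."""
    flat = _normalise(expr).replace("(", "").replace(")", "")
    total = 0
    for term in flat.split("+"):
        product = 1
        for factor in term.split("*"):
            product *= int(factor)
        total += product
    return total


def unit_and_copies(expr):
    """Return (repeat unit size in bp, copy number) from an entry."""
    match = re.search(r"(\d+)\*(\d+)", _normalise(expr))
    if match is None:
        raise ValueError(f"no 'unit × copies' term in {expr!r}")
    return int(match.group(1)), int(match.group(2))


for locus, entry in [("MIRU_2", "(53 × 2) + 8"),
                     ("QUB-4156", "(32 + (59 × 0) + 19)"),
                     ("Mtub39", "(58 × 2)")]:
    print(locus, unit_and_copies(entry), locus_length(entry))
```

Under this reading, an entry with a copy number of 0 (for example Mtub04) still has a non-zero length, because the partial repeat remains.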
5. Is the original BCG Russia sub-strain
The high degree of genomic stability of the BCG Russia sub-strain is regarded by some scientists as inexplicable. One explanation was proposed by Keller et al. in a highly cited paper. They postulated
Among the mechanisms of bacterial evolution, the leading role belongs to recombination events. The large-scale rearrangements, deletions and duplications were revealed during comparative genomics analyses in
Keller et al. detected the single-nucleotide insertion of “C” at the 5′ end of the
The findings of Keller et al. may indicate that the sub-strain used by the authors was not the original one or had changed during cultivation. The latter is possible. According to our data, both whole and truncated variants of the glycerol-3-phosphate acyltransferase gene were identified in one of the BCG Russia generations (Figure 1).
6. The “early” sub-strain genomes comparison
Genome sequences were compared using the genome of BCG sub-strain Tokyo 172, a member of the “early” sub-strains group, as the reference. First, within this group, the BCG Tokyo sub-strain is closest to the BCG Russia sub-strain with regard to the time at which it was provided by the Pasteur Institute (to Tokyo in 1925). Second, it was lyophilized in the 1940s and later used as a freeze-dried vaccine, as was the BCG Russia sub-strain. Then, in 1960, the 172nd transfer on bile-potato medium was freeze-dried and adopted as a primary seed lot. Finally, the genome of this seed lot was one of the first BCG genomes to be accurately sequenced, assembled, and submitted to GenBank.
We observed no significant diversity between the sequences of the BCG Russia 368 and BCG Tokyo 172 genomes. The genomic differences revealed are summarized in Table 2 and can be subdivided into three groups: regions of difference (RDs), insertions/deletions (ins/del) and SNPs. Only two RDs were detected between the “early” sub-strains. First, a 22 bp insertion relative to the reference was found in the TetR family transcriptional regulator gene of the BCG Russia 368 genome. One variant of the Japan BCG vaccine (Type I), submitted to GenBank, carries the corresponding deletion (RD16). An RD16 band identical to those of other BCG sub-strains was found in the Type II strain. The second RD was a 1602 bp deletion in the BCG Russia 368 genome, corresponding to the region from 4,110,452 to 4,112,053 bp in BCG Tokyo 172, which begins in JTY_RS19265 (ribonuclease gene), includes JTY_RS19270 (antitoxin VapB48 gene) and ends inside JTY_RS19275 (glutamate-cysteine ligase gene).
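The size of the second RD follows from its reference coordinates when both end positions are counted, as is usual for 1-based genome coordinates. A minimal check, with a hypothetical helper name:

```python
def inclusive_length(start, end):
    """Length in bp of a 1-based interval that includes both end points."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates of the deleted region in BCG Tokyo 172, as quoted in the text.
print(inclusive_length(4_110_452, 4_112_053))
```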
|Type of differences||Number of differences, BCG Russia 368/BCG Tokyo 172||Number of differences, BCG Tokyo 172/BCG Pasteur 1173P2|
|Region of Differences (more than 20 bp)||2||20|
|Insertions/deletions <20 bp (1–9 bp)||10||20|
|SNP in total||52||68|
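The two columns of Table 2 can be compared directly as ratios. The short sketch below uses only the counts from the table; the dictionary keys and the function name are illustrative.

```python
# Counts from Table 2: (Russia 368 vs Tokyo 172, Tokyo 172 vs Pasteur 1173P2)
table2 = {
    "RD (>20 bp)": (2, 20),
    "ins/del (<20 bp)": (10, 20),
    "SNP": (52, 68),
}


def fold_difference(early_pair, early_late_pair):
    """Ratio of differences in the second genome pair to the first."""
    return early_late_pair / early_pair


fold = {name: fold_difference(a, b) for name, (a, b) in table2.items()}
for name, value in fold.items():
    print(f"{name}: {value:.2f}-fold")
```

On these counts the RDs differ tenfold and the small insertions/deletions twofold, while the SNP totals differ by a factor of about 1.3.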
The sub-strains used for vaccine production in Bulgaria (BCG Sofia) and India were obtained from BCG Russia. Nowadays, UNICEF uses four variants of BCG vaccine on behalf of the Global Alliance for Vaccines and Immunization: the Statens Serum Institute in Denmark produces BCG-Denmark; Bulbio (BBNCIPD) in Bulgaria and the Serum Institute in India produce BCG-Russia (genetically identical to BCG-Bulgaria); and the Japan BCG Laboratory produces BCG-Japan.
We could trace the genome characteristics of BCG Russia daughter sub-strains using published data. Stefanova et al. analyzed the BCG sub-strain used for production in Bulgaria (named Sofia SL222) with
According to Seki M. et al., the differences between the “early” sub-strain Tokyo and the “late” sub-strain Pasteur were more significant, and the number of RDs increased tenfold.
Fewer ins/del differences were found between the BCG Russia and BCG Tokyo genomes than between the BCG Tokyo and BCG Pasteur genomes. The ins/del differences were small: only 1–9 bp.
However, the number of SNPs was nearly the same in the two pairs of genomes. Non-synonymous SNPs in BCG Russia 368 amounted to 60%, but most of them were associated with conservative substitutions in the proteins. Only seven proteins had radical substitutions, and three of these were from the PE-PGRS/PPE family. This finding emphasizes the significance of these proteins for BCG sub-strain adaptation.
7. A whole genome restriction map analysis
A large body of published literature emphasizes the important role of RDs in the differentiation of BCG sub-strains. To check these statements, methods from OpGen Incorporated were used. First, the DU2 region was assembled and the number of tandem duplications in this region of the BCG Russia 368 genome was determined with the Argus™ Optical Mapping System. A whole genome map (WGM) of the sub-strain BCG Russia 368 was created by the laboratory of OpGen Incorporated (Maryland, USA) according to the Argus™ Optical Mapping System user manual. DNA was digested with
The cluster constructed from the map similarity of the six reference BCG sub-strains is shown in Figure 4. The cluster was split into two groups: BCG Tice (ATCC 35743) was attributed to the “early” group, while BCG Mexico was attributed to the “late” group of sub-strains in accordance with the
8. Genome map construction and the analyses of repetitive elements of BCG Russia 368
After verification of the number of DU2 repeats, the complete gapless BCG Russia 368 chromosome was visualized in GeneWiz (Figure 5). The genome atlas option of GeneWiz, primarily the GC skew, was selected as an appropriate instrument for verifying the accuracy of the genome assembly and for OriC detection. The position at which the GC skew changes agreed with the OriC and the first nucleotide position of the BCG Russia 368 genome. Other DNA properties (intrinsic curvature, stacking energy, position preference, global direct repeats, global inverted repeats, and AT content) were essential for describing the genome structure.
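The idea behind using GC skew to locate the origin can be sketched in a few lines. The example below is not the GeneWiz implementation; it assumes the common definition of GC skew as (G − C)/(G + C) per window and the usual heuristic that the minimum of the cumulative skew lies near the replication origin. The function names, the window size and the synthetic test sequence are all illustrative assumptions.

```python
def gc_skew(window):
    """GC skew (G - C) / (G + C) of one sequence window."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def cumulative_gc_skew(seq, window=1000):
    """Running sum of per-window GC skew along the sequence."""
    seq = seq.upper()
    totals = []
    running = 0.0
    # Only full, non-overlapping windows are used.
    for start in range(0, len(seq) - window + 1, window):
        running += gc_skew(seq[start:start + window])
        totals.append(running)
    return totals


def skew_minimum_position(seq, window=1000):
    """End coordinate of the window where the cumulative skew is lowest."""
    totals = cumulative_gc_skew(seq, window)
    if not totals:
        raise ValueError("sequence is shorter than one window")
    lowest = min(range(len(totals)), key=totals.__getitem__)
    return (lowest + 1) * window


# Synthetic sequence: C-rich, then G-rich, then C-rich again, so the
# cumulative skew reaches its minimum at the first switch (position 200).
synthetic = "CCCA" * 50 + "GGGA" * 100 + "CCCA" * 50
print(skew_minimum_position(synthetic, window=20))
```

In a correctly assembled circular chromosome that starts at the origin, the minimum would be expected close to the first nucleotide, which is the kind of agreement described above.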
|PPE protein gene||66|
|PE protein gene||33|
|PE_PGRS protein gene||69|
The locations of IS elements, repeats, prophage sequences, and PE, PPE, and PE_PGRS genes in the BCG Russia 368 genome are visualized in Figure 6. Most of the repetitive elements of the BCG Russia 368 genome, including some prophage sequences, coincide, overlap, or interconnect, which makes some fragments of the genome hard to annotate. Differentiating bacterial and phage genes was especially difficult during the characterization of PE/PGRS genes.
9. The phages predicted in
Along with other mobile elements, the variability of predicted prophages may be the best index for characterizing the mosaic structure of the genome. All of the prophages described in
The mosaic BCG genome structure has been verified by comparative prophage analyses. A partial similarity of BCG Tice/BCG Montreal prophage fragments has been identified after pair-wise alignment of BCG Montreal and BCG Tice phage sequences with BCG Russia 368 whole genome. The regions of similarity defined as the purple (BCG Tice) and blue (BCG Montreal) blocks on circle B are phage sequences discovered in different
In turn, the 7.5-kb BCG Russia 368 prophage was split into 0.9- and 6.6-kb fragments located in different regions of the BCG Tice/Montreal genomes. Moreover, these fragments lacked the transposase gene, which was specific to the 7.5-kb BCG Russia 368 prophage.
In addition, the 7.5-kb BCG Russia 368 prophage had 922-bp repeats in the BCG Russia 368 genome (red bar in Figure 6, circle B) and was located near the ISMt1 insertion element. Interestingly, besides prophage fragments, PHAST predicted two intact prophages associated with insertion elements in the BCG Russia 368 genome: the 11-kb and 7.5-kb prophage sequences were linked with the IS6110 and IS1560 elements, respectively. The connection between prophage sequences and IS elements therefore has a considerable impact on BCG genome evolution.
10. Phylogeny reconstruction
Phylogeny reconstruction was made using the genome sequences of analyzed
Numerous comparative genomics investigations of BCG sub-strains confirmed significant genomic polymorphism of BCG sub-strains which arose from one progenitor. RDs, indels and SNPs are real evidences still going on in the
Here we propose that the genomic evolution and diversity of BCG sub-strains are a direct consequence of prophage-associated genome rearrangements. It is well known that prophage sequences account for 10–20% of bacterial genomes. Most prophages are damaged and mutated. Nevertheless, recombination events between homologous prophage sequences are possible. Moreover, some genes of defective prophages can still be functional. Prophage genome content is thus an important biological driver or trigger of genomic rearrangements and evolution. An extensive contribution of prophages to bacterial fitness has been proposed on the basis of unexpected evolutionary prophage patterns. In support of our assumption, outstanding differences between prophage profiles have been revealed in our comparative genome analysis of nine BCG sub-strains and three
Migration of people from regions with a high incidence of TB and the growing number of HIV-infected individuals over the last decade have made TB vaccination necessary not only among children but also among adolescents and adults. In 2015, 15 vaccine candidates were in clinical trials. Replacement of the BCG vaccine or boosting for the protection of adolescents and adults was considered. Recombinant BCGs, recombinant viral-vectored platforms, protein/adjuvant combinations, attenuated