Mycobacterium tuberculosis Transcriptome In Vivo Studies – A Key to Understand the Pathogen Adaptation Mechanism

Infectious diseases caused by intracellular bacterial pathogens, such as M. tuberculosis, are among the most important public health problems worldwide. The development of an infectious process depends on intricate interactions between the host defence systems and the specific systems regulating mycobacterial gene expression. Changes in expression in response to host defence are a necessary condition for M. tuberculosis survival and functioning. Tracking these changes makes it possible to analyze the biochemical cascades that are triggered in response to host defence mechanisms and to find targets for designing new therapeutics and monitoring bacterial infections. In addition, these results are useful for both theoretical research (for example, the dynamics of pathogen transcriptome changes during long-term persistence in the host) and applied research (for example, the study of the bacterial response to various therapeutic interventions).


Introduction
The completion of the M. tuberculosis genome sequence in 1998 (Cole et al., 1998) marked the beginning of the so-called post-genomic era, the main characteristic of which is large-scale study of the functional activity of the genome. Information on the organization of the bacterial genome made it possible to construct macro- and microarrays containing fragments of most ORFs, which enabled analysis of variations in the pathogen's transcription profile under different conditions. It is no wonder that the first study of the M. tuberculosis transcriptome using microarray technology was carried out within a year of the publication of the genome sequence (Wilson et al., 1999). Within as little as 5 years, many reports had been published on the use of microarrays for in vitro analysis of the mycobacterial transcriptome (for reviews, see Butcher, 2004; Kendall et al., 2004).
However, in vivo analysis of mycobacterial gene expression during the infection process, which is of special scientific interest, is rather complicated, and this may explain the relatively small number of such studies. An experiment analyzing a pathogen transcriptome in vivo is defined by the choice of: (1) an experimental model of infection; (2) a method of pathogen RNA isolation; (3) a method of analysis of RNA or cDNA enriched in bacterial transcripts. These steps are briefly described in this review. The methods of RNA isolation and enrichment are described in detail in our recent review. Here we would like to emphasize that, until now, practically all ways of preparing cDNA libraries have led to subsequent analysis by microarray hybridization. Hybridization-based microarray techniques are widely used in transcriptome studies, but the development of novel high-throughput DNA sequencing methods (reviewed in Shendure & Ji, 2008) has clearly exposed the limitations of microarrays, such as high background levels owing to cross-hybridization, difficulty in detecting rare transcripts, and problems with expression quantification (Shendure, 2008).
We proposed a new method for identifying sequences of bacterial pathogens that are specifically transcribed in infected tissues. It is based on the coincidence cloning approach, which allows isolation of representative bacterial cDNA pools from infected organs. Co-denaturation and co-renaturation of an excess of bacterial genomic DNA with the cDNA transcribed from total RNA of the infected tissue enabled selective isolation of the bacterial cDNA fraction from the sample, and a single round of coincidence cloning resulted in >1000-fold enrichment of bacterial transcripts. The enriched cDNA library is suitable for high-throughput sequencing analysis, which is less biased and more reliable than other methods, including microarrays.
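The ">1000-fold enrichment" figure can be read as a ratio of two fractions: the share of bacterial cDNA in the sample after coincidence cloning divided by its share before. The following is a minimal sketch of that arithmetic only. The function name and the example counts are hypothetical and are not taken from the experiments described in this chapter.

```python
def fold_enrichment(bacterial_before: int, total_before: int,
                    bacterial_after: int, total_after: int) -> float:
    """Return (bacterial fraction after) / (bacterial fraction before).

    All arguments are counts of cDNA fragments (or clones) in a sample.
    The ratio is computed in cross-multiplied form to limit rounding error.
    """
    if min(bacterial_before, total_before, total_after) <= 0:
        raise ValueError("counts before enrichment and both totals must be positive")
    return (bacterial_after * total_before) / (total_after * bacterial_before)


# Hypothetical illustration (NOT data from the study):
# 1 bacterial fragment per 10,000 before enrichment,
# 2,000 bacterial fragments per 10,000 after one round.
example = fold_enrichment(1, 10_000, 2_000, 10_000)
print(f"{example:.0f}-fold enrichment")
```

With these invented numbers the bacterial fraction rises from 0.01% to 20%, which corresponds to a 2000-fold enrichment.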

Experimental models of infection
Experimental models of tuberculosis used for whole transcriptome studies in vivo are very diverse. The most frequently used models and examples of their applications are described below.

Host phagocytes (mouse and human)
The first work with a whole-genome description of M. tuberculosis gene expression is that of Schnappinger et al. (Schnappinger et al., 2003). Mouse macrophages, either activated by IFN-γ or non-activated, were used in this work as a model system. Rachman et al. characterized the M. tuberculosis genes with enhanced expression in activated and non-activated mouse macrophages, both relative to each other and relative to mycobacteria in vitro (Rachman et al., 2006). Rohde et al. studied changes in the M. tuberculosis transcriptome at the initial stages of infection of mouse macrophages and demonstrated a dynamic increase in the expression level of some genes during the first 24 hours post infection (Rohde et al., 2007). Cappelli et al. were the first to characterize whole-genome M. tuberculosis gene expression in human macrophages (Cappelli et al., 2006). Fontan et al. analyzed transcriptomes of M. tuberculosis from macrophages of the THP-1 cell line at 4 and 24 hours post infection (Fontan et al., 2008). Tailleux et al. performed the first comparative analysis of gene expression in M. tuberculosis from infected macrophages and dendritic cells (Tailleux et al., 2008). An in vivo transcriptome comparison of two M. tuberculosis strains of different virulence (H37Rv and H37Ra) was first done by Li et al. (Li et al., 2010).
To date, the most extensive study of mycobacterial gene expression in vivo is that by Homolka et al. (Homolka et al., 2010). The authors performed a comparative analysis of expression profiles for 3 clinical isolates of M. africanum, 12 clinical isolates of M. tuberculosis, and two reference laboratory strains (M. tuberculosis H37Rv and CDC1551) in activated and non-activated mouse macrophages. This work identified groups of genes whose expression changes irrespective of the mycobacterial strain and/or the activation status of the host macrophage.

Models of M. tuberculosis infection in laboratory animals
In 2004, Talaat et al. performed the first analysis of whole-genome M. tuberculosis gene expression under natural conditions in a living organism (mouse) (Talaat et al., 2004). They studied changes in the composition of the pathogen transcriptome at different times post infection (7, 14, 21 or 28 days) and for different host genotypes (immunocompetent Balb/c mice and immunodeficient SCID mice). In 2007, Talaat et al. published a paper devoted to the analysis of M. tuberculosis gene expression in the lungs of Balb/c mice at later stages of the infection process (Talaat et al., 2007). In 2010, researchers of our group used a new approach to the enrichment of bacterial cDNA for analysis of M. tuberculosis gene expression in lung tissues of infected mice. Our data on the quantitative and qualitative composition of the bacterial transcriptome were in good agreement with similar data from Talaat's group (Talaat et al., 2007).

Examination of the M. tuberculosis transcriptome in human tissues
The report of Rachman et al. published in 2006 is thus far the only work in which M. tuberculosis gene expression was studied directly in human lungs (surgical samples) (Rachman et al., 2006). In this work, whole-genome gene expression profiles of the pathogen were obtained from granuloma, pericavitary lung tissue and morphologically normal lung tissue.

Intracellular M. tuberculosis transcriptome
The data obtained by different authors in in vivo systems make it possible to single out a number of the main transcriptome features characteristic of mycobacteria persisting in macrophages. These are primarily changes in the expression of genes involved in pathogen adaptation, as well as of genes encoding different factors that affect the immune response. In macrophages, mycobacteria are localized within phagosomes, which forms a barrier against components of the immune system but, at the same time, complicates the pathogen's access to nutrients and trace elements. Changes in M. tuberculosis gene expression are aimed primarily at creating an environment able to maintain the functional activity of mycobacteria in vivo.

Lipid metabolism
Lipid metabolism is a key process for M. tuberculosis. This is confirmed, directly or indirectly, by the considerable number of lipid metabolism genes in the genome, by the indispensability of some of these genes according to transposon mutagenesis data (Sassetti et al., 2003), by the decrease or loss of virulence observed in M. tuberculosis strains mutant for genes of this functional category (Neyrolles & Guilhot, 2011), and by other evidence.
One of the most characteristic changes in the expression of genes involved in lipid metabolism is the activation of genes from the fadA, fadB, fadD, fadE and echA clusters (Munoz-Elias & McKinney, 2006; Schnappinger et al., 2003; Tailleux et al., 2008). These genes encode enzymes involved in β-oxidation of fatty acids and the catabolism of cholesterol. The final products of their activity are acetyl-CoA and propionyl-CoA, which enter the methylcitrate and tricarboxylic acid cycles.
Enhanced expression of the isocitrate lyase gene icl1 is also related to the methylcitrate cycle. Isocitrate lyase is a key enzyme of the glyoxylate cycle, which is activated when fatty acids are the main carbon source. During this process, activated acetate (acetyl-CoA) is converted stepwise into malate through the stage of glyoxylic acid formation. Malate can then be oxidized to oxaloacetate, which is converted into phosphoenolpyruvate by the product of the pckA gene (phosphoenolpyruvate carboxykinase), whose enhanced expression was also observed in vivo in mycobacteria (Marrero et al., 2010; Schnappinger et al., 2003; Tailleux et al., 2008). Apart from maintaining the glyoxylate bypass, the activity of isocitrate lyase is also needed to utilize cytotoxic propionyl-CoA accumulated during the life cycle of mycobacteria (Savvi et al., 2008). This utilization is possible because isocitrate lyase can function as a 2-methylisocitrate lyase, which facilitates the conversion of propionyl-CoA into succinate. Propionyl-CoA can also be metabolized by conversion into methylmalonyl-CoA and then into succinate, or incorporated into certain components of the cell wall, such as phthiocerol dimycocerosate (PDIM) or sulfolipid-1 (SL-1).
Other lipid metabolism genes expressed predominantly in vivo include the desA genes encoding fatty acid desaturases (Li et al., 2010; Rachman et al., 2006; Schnappinger et al., 2003). The papA and pks genes, whose protein products are needed for the synthesis of polyketides as components of the M. tuberculosis cell wall (Bhatt et al., 2007; Hatzios et al., 2009; Sirakova et al., 2001), are practically always transcribed in experiments. However, the level of their transcription in vivo varies, possibly reflecting variations in lipid metabolism depending on specific conditions (Homolka et al., 2010; Rohde et al., 2007; Tailleux et al., 2008). Interestingly, transcription of these genes is decreased in the avirulent M. bovis BCG and M. tuberculosis H37Ra strains as compared with the virulent M. tuberculosis H37Rv strain (Li et al., 2010; Rohde et al., 2007).

Energy metabolism: Cell respiration
According to data obtained in studies of in vivo M. tuberculosis gene expression, the energy metabolism of mycobacteria undergoes a significant transformation during the infection process. A characteristic of this transformation is a gradual decrease in the expression of the type I NADH dehydrogenase genes (nuoA-N) and an increase in the expression of the nitrate reductase gene cluster narGHJI and of the narK2 gene, whose product is a nitrate transporter protein (Schnappinger et al., 2003; Tailleux et al., 2008). Such a metabolic shift most probably indicates that the electron transport chain (ETC) is being reoriented to use nitrate as the terminal electron acceptor. Also, in most cases, expression of the aa3-type cytochrome c oxidase (ctaBECD) and cytochrome c reductase (qcrCAB) genes is downregulated (Garton et al., 2008; Schnappinger et al., 2003; Shi et al., 2005).

Protein biosynthesis and cell growth
Decreased expression of ribosomal protein genes (rpl, rpm, rps) indicates a reduced need for the synthesis of new proteins. This phenomenon usually occurs in conditions that are not optimal for the pathogen (dendritic cells, activated macrophages) and correlates with a decreased expression level of the ATP synthase genes (atpA-H) and a slowdown of M. tuberculosis replication (Tailleux et al., 2008).

Defense mechanisms, DNA replication
The compartment (early phagolysosome) in which M. tuberculosis resides while persisting in macrophages is a rather non-aggressive environment with practically no hydrolytic activity and a pH of 6.2-6.4. Nevertheless, the mycobacteria are exposed to many stress factors, such as reactive oxygen and nitrogen species or the apoptotic death of the host cell. These stress factors induce upregulation of genes of the DNA repair and recombination systems (dinF/G) (Rachman et al., 2006; Schnappinger et al., 2003; Talaat et al., 2004), as well as chaperone genes (groES, groEL1/2, dnaJ/K, hspX) (Fontan et al., 2008; Garton et al., 2008; Homolka et al., 2010; Karakousis et al., 2004; Rohde et al., 2007; Tailleux et al., 2008). Some data indicate that this effect is not a specific reaction to intracellular conditions, but part of a general adaptive response to stress (Boshoff et al., 2004; Waddell et al., 2004).

Cell wall, membrane, and transport
The cell wall and inner plasma membrane are components of the complex cell envelope of mycobacteria. The inner plasma membrane contains many transport systems. Some genes of these systems were observed to be upregulated, among them the irtA/B genes encoding carboxymycobactin transporters (Li et al., 2010; Schnappinger et al., 2003; Tailleux et al., 2008), the sugI, Rv2040c and Rv2041c genes of carbohydrate transporters (Schnappinger et al., 2003; Tailleux et al., 2008), and the narK2 gene of the nitrate transporter protein (Garton et al., 2008; Tailleux et al., 2008; Talaat et al., 2007).

Factors of Mycobacterium tuberculosis virulence
M. tuberculosis virulence depends on various metabolic processes that ensure a successful infection process. Pathogen survival in host cells depends directly on avoiding host defence mechanisms and on obtaining nutrients from host tissues for existence and reproduction. Considerable damage to host tissues and organs at later stages of infection is also needed, as it facilitates dissemination of the infection. Here we consider the factors that act directly on the host organism to suppress host defence mechanisms. One such immune response modulator is the ESAT-6 protein (Sorensen et al., 1995), encoded by the esxA gene and secreted by the type VII secretion system (T7SS) ESX-1 (Abdallah et al., 2007; Simeone et al., 2009). ESAT-6 is one of the key immune modulation factors and has been suggested to be involved in the lysis of phagolysosome membranes and macrophage outer membranes, thus facilitating the spread of mycobacteria from one host cell to another (de Jonge et al., 2007; Kinhikar et al., 2010). The in vivo expression of the esx cluster genes is tightly controlled and can be either decreased or increased depending on the conditions (Fontan et al., 2008; Rohde et al., 2007; Schnappinger et al., 2003; Tailleux et al., 2008). Another important system of immune response modulation is the expression of the PE/PPE family proteins, which possess immunogenic properties (Sampson, 2011; Voskuil et al., 2004). The level of their expression also depends on specific conditions (Fontan et al., 2008; Schnappinger et al., 2003; Tailleux et al., 2008; Voskuil et al., 2004).

Transcription regulation
Adaptation to changing environmental conditions relies mainly on the activity of the signalling and regulatory systems of the bacterium, and therefore on changes in the expression of genes of the transcription regulatory systems. Among these genes, the 13 sigma factor genes (Manganelli et al., 2004; Rodrigue et al., 2006) are of special interest. Their differential expression was repeatedly observed in in vivo experiments. For instance, sigH was upregulated in artificial granulomas in mice and in human macrophages (Karakousis et al., 2004), whereas sigB and sigE were also upregulated in mouse phagosomes (Rohde et al., 2007), as well as in artificial mouse granulomas and in the mouse lung, respectively (Karakousis et al., 2004; Talaat et al., 2004).
Other transcription regulatory genes whose upregulation was observed in in vivo experiments are whiB3 (Fontan et al., 2008; Rohde et al., 2007), ethR, ideR, kstR and relA (Fontan et al., 2008; Schnappinger et al., 2003; Tailleux et al., 2008). In addition, M. tuberculosis has 12 two-component regulatory systems (Tucker et al., 2007). Two of them, phoPR (Gonzalo-Asensio et al., 2008) and dosRS (devRS) (Park et al., 2003), have been studied more thoroughly than the others. It was shown that the functional activity of the phoPR system supports M. tuberculosis virulence by regulating the metabolism of complex lipids and the operation of the ESX secretion systems (Gonzalo-Asensio et al., 2008). The positive transcription regulator phoP was observed to induce the genes under its control in the mildly acidic (pH 6.4) environment of mouse macrophage phagosomes (Rohde et al., 2007). No less important is the two-component regulatory system dosRS, which regulates the expression of about 50 genes (Park et al., 2003). Enhanced expression of these genes was repeatedly observed in the course of mycobacterial infection of macrophages (Fontan et al., 2008; Rohde et al., 2007; Schnappinger et al., 2003; Tailleux et al., 2008). Moreover, dosRS regulon genes were expressed in practically all other conditions: in M. tuberculosis from artificial mouse granulomas, samples of mouse lung tissue and surgical samples of human lung, as well as in sputum samples and in some in vitro experiments (Garton et al., 2008; Homolka et al., 2010; Karakousis et al., 2004; Li et al., 2010; Talaat et al., 2007; Timm et al., 2003). The functional role of this regulon is still not quite clear, but its activity has been suggested to be important for M. tuberculosis adaptation to variations in redox status during the infection process (Bacon et al., 2004; Bacon & Marsh, 2007; Rustad et al., 2009).

Profiling of Mycobacterium tuberculosis gene expression during infection in genetically different mouse models
We have carried out a comparative study of M. tuberculosis transcriptomes in order to reveal the features of expression profiles that correlate with disease progression, and also to understand the difference between efficient and defective defence mechanisms at the level of bacterial gene expression. To this end, at different stages of the infection process, we performed a comparative quantitative and qualitative analysis of the sequences transcribed during infection of mice susceptible (inefficient immune response) and resistant (efficient response) to these bacteria.
We compared transcriptomes of M. tuberculosis H37Rv in infected mice of two strains, I/StSnEgYCit (I/St) and C57BL/6JCit (B6). These mouse strains have been described in detail earlier (Kondratieva et al., 2010), and B6 mice were shown to be more resistant to infection by M. tuberculosis than I/St mice. In particular, the infection process in B6 mice was less aggressive, and the infected mice survived longer.
Female mice of both strains were infected with M. tuberculosis by the aerosol route. At 4 and 6 weeks post infection, the infected mice were killed, and total lung RNA was isolated. Samples of total RNA from lung tissues of I/St and B6 mice were used to synthesize cDNA enriched in fragments of bacterial cDNA using the coincidence cloning procedure. As a result, three cDNA libraries were obtained, which represented transcriptomes of M. tuberculosis from lung tissues of I/St mice at week 6 post infection (CC6(SUS)) and from lung tissues of B6 mice at weeks 4 and 6 post infection (CC4(RES) and CC6(RES), respectively). The libraries were analyzed using 454 pyrosequencing technology. A general scheme of the experiment is shown in Fig. 2, and general characteristics of the libraries analyzed are presented in Table 1. In total, sequences of 190031 cDNA fragments were determined: 73410 from CC4(RES), 75655 from CC4(SUS), and 40966 from CC6(RES). The results demonstrate that the technology used allowed us to considerably enrich the cDNA samples with bacterial sequences. The distribution of the expressed genes between functional categories is shown in Fig. 3. Mobile elements (insertion sequences and phages) were excluded from the analysis.
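Because the three libraries differ in size, raw per-gene read counts cannot be compared between them directly. The sketch below checks that the library sizes quoted above add up to the stated total and shows one generic way to put per-gene counts on a common scale (reads per million library reads). The normalization is an assumption made for illustration: this chapter does not state how expression levels were normalized, and the function name and the example count of 50 reads are hypothetical. Library labels are copied as they appear in the sentence that reports the counts.

```python
# Library sizes (number of sequenced cDNA fragments) as quoted in the text.
library_reads = {
    "CC4(RES)": 73410,
    "CC4(SUS)": 75655,
    "CC6(RES)": 40966,
}

total_reads = sum(library_reads.values())
library_share = {name: reads / total_reads for name, reads in library_reads.items()}


def counts_per_million(gene_reads: int, library_size: int) -> float:
    """Scale a raw per-gene read count to reads per million library reads.

    Generic normalization used here for illustration only; it is not
    claimed to be the procedure used in the study.
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return gene_reads * 1_000_000 / library_size


print(f"total fragments: {total_reads}")
# Hypothetical gene with the same raw count (50 reads) in every library.
for name, size in library_reads.items():
    print(f"{name}: share of all reads = {library_share[name]:.1%}, "
          f"50 raw reads = {counts_per_million(50, size):.0f} per million")
```

The same raw count corresponds to a noticeably higher normalized level in the smallest library, which is why some form of scaling is needed before asking whether a gene is upregulated in one sample relative to another.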

Genes whose expression is enhanced during infection
The comparison of M. tuberculosis transcriptomes during infection within the same mouse strain (CC6(RES) vs CC4(RES)) and between genetically different mice (CC6(RES) vs CC6(SUS)), described above, was aimed at finding genes whose expression is enhanced in the course of infection, specifically in B6 mice at week 6 as compared with the other two samples.

Genes upregulated in only one of the genetically different hosts
We found 17 genes with enhanced expression in CC6(RES) vs CC4(RES) and 44 genes in CC6(RES) vs CC6(SUS). These numbers probably reflect the fact that in the first case bacterial genes are expressed in one and the same microenvironment, whereas in the second case the microenvironments differ, resulting in a greater number of upregulated genes.
Genes whose expression is enhanced in the CC6(RES) sample only relative to CC4(RES) mostly belong to the categories of cell wall and cell processes, intermediary metabolism and respiration, and lipid metabolism. The protein products of 12 of these 17 genes were detected in the cell membrane or cell wall fraction, where they mainly fulfil transport and defence functions. For example, the embA gene codes for indolylacetylinositol arabinosyltransferase EmbA, which is involved in the synthesis of arabinan, and mutations in this gene cause resistance to ethambutol. Also, the Rv3273 gene encodes a carbonate dehydratase that participates in sulfate transport (TubercuList, http://tuberculist.epfl.ch). In this analysis, we did not detect metabolic pathways specifically activated at later infection times as compared with early stages.
Comparing the bacterial transcriptomes within different hosts, we found a greater variety of biochemical pathways. An increase in energy metabolism is reflected in enhanced expression of the genes of three NADH dehydrogenase subunits (nuoH, nuoI, nuoL), as well as in increased activity of the tricarboxylic acid cycle and in upregulation of the Rv1916 gene. The Rv1916 gene is the second part of the aceA (icl2) gene, which in M. tuberculosis H37Rv is divided into two modules, Rv1915 and Rv1916 (aceAa and aceAb), each of which can be expressed independently. Among other important differences, one can highlight enhanced expression of genes whose products are responsible for lipid and amino acid metabolism and catabolism (lipV, lipF, Rv2531c), and of genes of DNA repair enzymes (recO, recB).

Such a picture is quite predictable: the microenvironment in a resistant host is a hostile habitat, which can explain the need for more active repair systems. Increased expression of the genes of lipolytic enzymes (lipF, lipV, plcA), of enzymes of the tricarboxylic acid cycle, and of aceAb may suggest a forced use of lipids as the source of energy and carbon.

CUGs (commonly upregulated genes): genes needed for M. tuberculosis adaptation to different host defense mechanisms
We revealed 209 genes upregulated in both comparisons. According to the results of transposon mutagenesis, the products of 44 of these genes are essential in M. tuberculosis (Sassetti et al., 2003); these genes are listed in Table 2. Rv3569c, Rv3537, and Rv3563 were earlier shown to be essential for survival in mouse macrophages (TubercuList, http://tuberculist.epfl.ch). Slightly less than one third of the 209 genes belong to the conserved hypothetical (59 genes) and unknown (2 genes) categories. Although their functions are unknown, the genes of these categories may be considered potential therapeutic targets, since their low homology to genes of other microorganisms suggests that they are specific to mycobacteria or even to M. tuberculosis.
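The grouping used in this section amounts to simple set operations on the two lists of upregulated genes: genes found in only one comparison and genes found in both (the CUGs). The sketch below illustrates this with a toy subset of gene names mentioned in this chapter and then recomputes the proportions quoted above from the reported counts. The function name is hypothetical, the toy lists are far from complete, and the code is not the authors' actual analysis pipeline.

```python
def partition_upregulated(vs_early: set, vs_susceptible: set) -> dict:
    """Split two sets of upregulated genes into comparison-specific and common groups."""
    return {
        "only_vs_early": vs_early - vs_susceptible,
        "only_vs_susceptible": vs_susceptible - vs_early,
        "common": vs_early & vs_susceptible,  # commonly upregulated genes (CUGs)
    }


# Toy subset of gene names from the text, for illustration only.
up_vs_cc4_res = {"embA", "pstS1", "irtA", "secA2"}           # CC6(RES) vs CC4(RES)
up_vs_cc6_sus = {"nuoH", "recO", "pstS1", "irtA", "secA2"}   # CC6(RES) vs CC6(SUS)

groups = partition_upregulated(up_vs_cc4_res, up_vs_cc6_sus)
for label, genes in groups.items():
    print(label, sorted(genes))

# Proportions recomputed from the counts reported in the text.
n_cugs = 209
n_essential = 44
n_hypothetical_or_unknown = 59 + 2
print(f"essential: {n_essential / n_cugs:.1%}")
print(f"conserved hypothetical or unknown: {n_hypothetical_or_unknown / n_cugs:.1%}")
```

The second proportion is about 29%, consistent with the statement that slightly less than one third of the CUGs have no assigned function.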
The function of the PE/PPE family proteins is not quite clear, but they are thought to be needed for antigenic variability in mycobacteria (Karboul et al., 2008). Nevertheless, the Rv0152c and Rv0355c genes had a high expression level in the CC6(RES) sample, and they were also expressed in the CC4(RES) and CC4(SUS) samples. Rv3135 encodes a protein essential in M. tuberculosis H37Rv, which may suggest some additional functions beyond antigenic variability. Another feature of CUGs is increased activity of amino acid metabolism pathways. It is not clear whether the increased expression of the enzymes of these pathways is due to the absence of available amino acids (and the necessity of their synthesis) or to their availability (and the possibility of using them). A nutrient-poor environment is indicated by the high expression level of genes of various systems for the acquisition and accumulation of nutrients, such as phosphate (pstS1) or iron (irtA, mbtC, mbtE, mbtF). A shortage of phosphate is indicated by enhanced expression of the senX3 gene, a sensor component of the senX3-regX3 two-component system that activates the so-called stringent response under phosphorus deficiency. The expression of lipid metabolism genes (fadD, fadE, lipU, lipJ) suggests a switch to using lipids as a major source of energy and carbon. Enhanced expression of the narH and narK3 genes implies a switch to anaerobic respiration, which is characteristic of latent infection. Finally, the secA2 gene is also worth mentioning. This gene codes for the translocase SecA2, a component of the M. tuberculosis secondary Sec transport system that provides for, in particular, secretion of superoxide dismutase SodA and catalase KatG. A live vaccine based on an M. tuberculosis mutant for the secA2 gene (Hinchey et al., 2011) showed high efficiency and safety in animal trials. In summary, CUGs reflect characteristic features of infection in a mouse model. An exception is the increased expression of the atpF and atpH genes: according to some reports, their expression decreases in the course of infection, as the energy demand of the pathogen goes down upon transition to the state of latent infection.

Table 2. CUGs found to be essential according to Sassetti et al. (2003)

Conclusion
Infectious diseases caused by intracellular pathogenic bacteria represent a significant challenge to health care. The course of the infection depends not only on the protective mechanisms of the host, but also on the specific expression of bacterial genes. Altered expression in response to the immune reaction of the host organism is critical for the survival and functioning of pathogenic bacteria. Understanding M. tuberculosis transcriptional responses to different stimuli and to the aggressiveness of the environment makes it possible to describe the adaptation mechanisms necessary for successful bacterial survival and colonization of the host.
M. tuberculosis transcription profiles obtained in different conditions make it possible to define a core set of adaptive genes (we called them "commonly upregulated genes"), which characterize different phases of M. tuberculosis intracellular life, from primary infection through latency to reactivation. The expression of these genes can be considered a universal reaction of mycobacteria to various environmental stress factors. Accumulation and analysis of such data is the surest way to develop effective approaches to the diagnosis and treatment of tuberculosis.