The treatment of a number of diseases can be achieved through gene addition therapy, where curative transgenes are established within the patient’s cells after delivery with viral or non-viral vectors. The defective cells requiring treatment are typically differentiated; these cells or their progenitors can be targeted for therapeutic gene transfer. However, the abundance of progenitor cells varies between different tissues and in the same tissue during the fetal, neonatal and adult stages of development. Consequently, the scarcity of a particular progenitor cell pool, the paucity of spontaneous departures of progenitor cells down differentiation pathways and unclear differentiation induction conditions can complicate genetic therapeutic intervention via these cells. Nevertheless, gene transfer to progenitor cells can be a preferred option when differentiated cells are either poorly accessible to the vector or, once differentiated, are defective beyond repair by gene therapy. Genetic conditions for which therapeutic gene transfer to progenitor cells is of considerable value include cystic fibrosis (CF) and severe combined immunodeficiency (SCID).
The delivered transgenes can integrate into the chromosomal DNA, replicate episomally or persist as non-replicating episomal elements in non-dividing cells. Depending on the properties of the transgene expression cassette, particular features of specific transgene integration sites and the state of the individual recipient cells, the transgenes are expressed with varying degrees of efficiency. On some occasions, the transgenes are permanently silenced immediately after introduction; on other occasions, transgene silencing occurs only after a certain period of adequate expression; and on still other occasions, transgene expression varies dramatically among the individual clones of transgene-harbouring cells. Such variation is thought to be mainly due to the transgene’s interaction with its immediate genetic neighbourhood within the host genome, a phenomenon similar to ‘position effect variegation’ in normal development, which is caused by spontaneous clone-wise silencing of some resident genes. Typical position effect variegation is a form of epigenetic instability and should be distinguished from variegation due to somatic mutations, e.g. due to variations in the length of polynucleotide repeat expansions or due to the sorting of mitochondrial genomes in mitochondrial heteroplasmy. The element of randomness, which is inherently present in position effect variegation, should not come as a surprise. In fact, stochastic fluctuations of gene expression are typical both at the level of variation between different cells of a tissue and at the level of temporal variation within one cell. Both of these modes of variation are essential for normal differentiation and tissue patterning, with the input of stochastic variation being decisive when a developmental signal is present at a near-critical level.
For the gene therapist, it is important that the permanent silencing of transgene expression can occur both in postmitotic target cells and target cells undergoing clonal expansion, while variegation is typically associated with clones of dividing cells. Stable long-term transgene expression in differentiating cells is particularly challenging. In fact, the introduced genes are subject to the pre-existing and developing gene expression patterns in the target cells, which can override the signals from the transgenes’ own regulatory elements and, thus, can cause transgene expression shutdown. Indeed, at a transcriptional level, the changing scenery of transcription initiation factor pools, chromatin re-modelling and DNA methylation events during differentiation contribute to the transiency of transgene expression.
Genomes in general and, in particular, mammalian genomes have a mosaic organisation with functionally related genetic elements often being in close physical proximity. There are three teleological reasons for this: 1) expediency of genetic exchange; 2) straightforward temporal control of gene expression; 3) economy of energy, enzymes and other factors serving the genetic elements. The second and the third of these reasons are also sufficient for the existence of a finely patterned 3D-arrangement of DNA in interphase nuclei, simplifying the functional interactions between distant genetic elements, e.g. interactions regulating gene expression. It is intriguing to propose that the need to orchestrate gene expression in time and the economy need are also driving the astonishing interconnectedness of all gene silencing mechanisms, which we shall address in this chapter.
The gene therapist should take advantage of the pre-existing regulatory modules present in the target cells and should also supply the transgenes with their own expression control elements. The regulatory elements required for reliable, long-term and tissue-specific transgene expression include minimal promoters, enhancers, regulatory introns and locus control regions. The functional arrangement of all these elements is ultimately achieved in 3D. This should be borne in mind when 2D assemblies of regulatory elements are called ‘promoters’. Some ‘promoters’ are, in fact, motley artificial chimeras. For example, a fusion between the human cytomegalovirus (CMV) immediate-early enhancer and the chicken beta-actin promoter, exon 1 and intron 1 is called the ‘CBA promoter’ or ‘CAG promoter’.
In the majority of situations in gene therapy, transgene silencing and variegation are undesirable. We review here different factors, both host-dependent and vector-dependent, which are known to contribute to silencing and variegation of transgene expression and which should be taken into account when choosing or designing effective gene therapy vectors and strategies for their administration.
2. Host genetic factors of silencing and position effect variegation
Patterns for maintaining gene repression or activation are governed by regulatory machinery acting at multiple levels: 1) transcription; 2) mRNA processing, export from the nucleus, translation and degradation; 3) protein folding, modification, transport and degradation. Control of gene expression is well-coordinated and highly hierarchical, with the control of transcription initiation situated at the top of the regulatory ladder. A number of interacting instruments of transcriptional gene activation and silencing in mammals are known: DNA methylation (e.g. methylation within CpG-islands of promoters), amino acid sequence variants of histones, covalent modifications of histones, histone-binding proteins (e.g. powerful inhibitors of gene activity from the Polycomb Group) and combinations of transcription initiation factors specific for particular tissues and developmental stages. The pivotal point is the access of the transcription machinery to DNA, which is regulated via DNA methylation and chromatin remodelling. With some simplification, it can be generalized that ‘coarse tuning’ of gene expression (e.g. long-term silencing) is provided by DNA methylation, ‘medium tuning’ is provided by chromatin remodelling and ‘fine tuning’ is achieved via various transcription factors and a multitude of other regulatory devices.
The various branches of the regulatory machinery play their own particular roles and yet are inherently interconnected. As detailed below, a prime example of this is the deep involvement of the miRNA pathway both in mRNA degradation and in the establishment of chromatin methylation patterns.
2.1. The role of DNA methylation in silencing
DNA methylation is an important epigenetic mark involved in cell differentiation and organ and tissue development, which plays a crucial role in the establishment of genomic imprinting (parent-dependent silencing of alternative alleles) in both male and female germ lines. However, in gene transfer experiments, the methylation of transgenes was shown to be just one ingredient in the dynamic interplay of various factors responsible for silencing and variegation.
De-novo methylation patterns in humans are established mainly on implantation and in gametogenesis. Two DNA (cytosine-5)-methyltransferases, DNMT3A and DNMT3B, play an essential role in de-novo methylation while DNMT3A in cooperation with the auxiliary protein DNMT3L is responsible for imprinting. There is still much we do not know about the manner in which the inactive state of the imprinted chromosomal domains is achieved and what factors trigger this type of silencing. The available evidence indicates that ‘Smc hinge proteins’ can be particularly important in epigenetic silencing . Thus, in studies based on X-linked GFP transgene silencing, the SmcHD1 gene was shown to play a critical role in X-chromosome inactivation in mammals [8,9]. The recruitment of SmcHD1 to the X-chromosome may involve the non-coding Xist RNA, proteins from the Polycomb group and DNA methyltransferases .
2.2. The role of histone variants and histone modifications in silencing
There are two types of structural variations among histone molecules. Firstly, there are low abundance species of histones with unusual amino acid sequences, so-called histone variants. Secondly, histones are amenable to standard covalent protein modifications such as acetylations and methylations of specific amino acid residues. Both structural variations are known to play important roles in the regulation of gene expression activity.
Regions of constitutive heterochromatin are particularly prone to encroaching on the transgene in a variable pattern in different cells and, thus, to interfering with transgene expression. Different loci in human chromosomes have a variable tendency to become involved in heterochromatin structures. For example, chromosomes’ centromeres and telomeres are typical regions of heterochromatin, which are known to expand occasionally, inducing steady or intermittent silencing. In the case of centromeres, the silencing machinery might involve the histone variant CENP-A, which is found exclusively in centromeres. Other histone variants could also play a role in silencing. Thus, the histone variant macroH2A appears to be important in gene silencing on the inactive X-chromosome. In contrast, the histone variants H2A.Z and H3.3 are known to be conducive for transcription.
DNA methylation and histone modifications are closely linked to chromatin remodelling and are often jointly implicated in gene silencing and position effect variegation. Using an in vivo mammalian model for position effect variegation, Hiragami-Hamada and co-workers  extensively investigated the molecular basis for the stability of heterochromatin-mediated silencing in mammals. Comparison between two transgenic lines, containing different numbers of copies of human CD2 transgenes integrated within or close to a block of the pericentric heterochromatin, revealed that the variegation of CD2 expression is indeed associated with both genomic DNA methylation and histone modifications such as H3K9me3. However, DNA methylation was the key modification that accompanied the formation of an inaccessible chromatin structure and more stable gene silencing [12,13].
2.3. Silencing mediated by Polycomb proteins
Silencing can be mediated by proteins from the Polycomb group (PcG). These proteins can form giant complexes, which are tethered to histones and regulatory DNA sequences called Polycomb Response Elements (PREs). When the PcG proteins bind histones, they suppress all the gene expression activity in the respective area of chromatin. In mammals, PcG proteins are known to be involved in cell differentiation and tissue formation and also to contribute to tumorigenesis, genomic imprinting, stem cell maintenance and aging [14-16]. The emerging picture from fundamental research suggests that counteracting PcG repression can only be achieved by a combination of multiple inputs converging at chromatin . Besides the normal requirement for the recruitment of transcription factors and co-activators, the genomic targets of PcG proteins require the activity of specific demethylases and methyltransferases for the gene expression to proceed .
Importantly for gene therapy, PcG protein complexes have been recently demonstrated to be able to repress transcription activity in genomic repeats and some transgenes .
2.4. Tissue specific and developmental stage specific transcription factors
There are two types of transcription factors: 1) auxiliary proteins, which bind other proteins in the transcription complex; 2) DNA-binding sequence-specific transcription factors. The latter type can straightforwardly be recognised in silico by the observation of some distinct patterns within the DNA-binding domains of transcription factors, e.g. the zinc-finger motif, the helix-loop-helix motif or the leucine-zipper motif. In silico analysis, e.g. using Biobase software (http://www.biobase-international.com), is currently also a method of choice for pinpointing transcription factor binding sites and, therefore, for predicting gene expression activation patterns.
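The idea of pinpointing candidate binding sites in silico can be illustrated with a position weight matrix scan. The following is only a minimal sketch of the general technique and is not the method of any particular software package; the count matrix, the function names `build_log_odds` and `scan`, the pseudocount, the uniform background and the score threshold are all hypothetical choices made for illustration.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif.
# The counts are invented for illustration and do not describe a real factor.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}


def build_log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[base][i] for base in "ACGT") + 4 * pseudocount
        column = {
            base: math.log2(((counts[base][i] + pseudocount) / total) / background)
            for base in "ACGT"
        }
        matrix.append(column)
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


if __name__ == "__main__":
    log_odds = build_log_odds(COUNTS)
    for hit in scan("ttACGTaa", log_odds, threshold=5.0):
        print(hit)
```

In practice the matrix would come from a curated collection of experimentally supported binding sites, and a predicted site is only a hypothesis: whether the factor actually binds depends on chromatin accessibility and the factor pools discussed above.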
2.5. Silencing mediated by non-coding RNAs
It has become clear that non-coding RNAs have an important bearing on gene and transgene expression. In general, there are several mechanisms for the regulatory effects of non-coding RNAs in gene expression. The two most important control points appear to be the direct regulation of transcription initiation and the regulation of mRNA degradation through RNAi by miRNAs. Recent findings revealed that non-coding RNAs are critical factors in the recruitment of PcG members to the cell chromatin [20,21]. At the same time, the miRNA pathway turned out to be significant in establishing the DNA methylation and histone modification patterns [5,22].
In animals, small RNAs, namely piRNA species, which are typically 24-32 nucleotides in length, have been shown to mediate genomic DNA methylation. These non-coding RNAs associate with Piwi clade proteins from the Argonaute family and act analogously to the well-documented RNA-directed DNA methylation (RdDM) complexes in plants. The primary role of piRNA in many animals appears to be the silencing of retrotransposons via DNA methylation in germ lines. In fact, the lack of transposon suppression in spermatogenesis often results in defects and the loss of germ cells with age. Although it is not clear whether the same mechanism is responsible for the protective silencing of viral genomes after viral infections of mammalian cells, the small RNAs are likely to be involved in de-novo methylation of viral DNA through a similar mechanism. Thus, small non-coding RNAs could potentially provide a flexible regulatory link between transgene recognition, PcG protein recruitment and transgene silencing through DNA methylation, histone modifications and chromatin remodelling.
It appears that, in general, regulation via RNAi has a smaller long-term influence on gene expression than histone modifications and DNA methylation, acting rather as a rapid response system. Indeed, it would be too energetically inconvenient for cells to synthesize mRNA and then to destroy it on a permanent basis.
3. Gene vector properties, which are known to contribute to transgene silencing
Long-term transgene expression is highly desirable for most gene therapy applications. However, it is a relatively common occurrence for transgene expression to die out both in terms of the decrease of the efficiency of expression in individual cells and in terms of the reduction of the fraction of expressing cells.
A wide variety of vectors can be used for the delivery and establishment of transgenes and their control elements. Some of the vectors, so-called ‘viral vectors’, are generated using a top-down approach by piggy-backing on the natural gene transfer machinery of viruses. In contrast, ‘non-viral’ vectors are either pure nucleic acids or synthetic nano-particles, which are generated using a bottom-up strategy. A pivotal feature of any gene therapy vector (with the obvious exception of cytoplasmic-only vectors such as mRNA-based vectors) is the final localization of the delivered transgenes in the nuclei of the target cells. In general, transgenes can be integrated into random chromosomal sites, integrated into pre-selected chromosomal sites and/or left to exist episomally. Specialized molecular machinery for efficient random integration is borne by retroviral vectors, lentiviral vectors and eukaryotic transposon vectors. Although the bulk of the DNA delivered with non-transposable plasmid, minicircle and PCR-generated vectors remains episomal, some of the vector DNA also randomly integrates into the chromosomal DNA. The genetic neighbourhood at a transgene integration site has an important bearing on the temporal profile of transgene expression. Nevertheless, many factors that determine the susceptibility of a transgene to silencing are defined by the properties of the employed vector, transgene and co-introduced expression control elements.
Multimeric transgene inserts were reported to induce silencing . Unfavourably, even if a gene vector delivers monomeric DNA, spontaneous chromosomal integrations often result in vector DNA multimers (it remains unclear whether the multimers are formed before or after the initial integration event). Silencing due to repetitive DNA was also demonstrated when the introduced DNA contained trinucleotide repeat expansions . This result has an implication for the gene therapy of recessive polyglutamine diseases, as therapeutic transgenes can contain triplet expansions of some minimal length. The precise mechanism for silencing through the recognition of multimeric transgenes and trinucleotide repeats in the host genomic DNA still remains unclear.
Transgene silencing is often blamed on the malfunction of foreign gene expression control elements. Indeed, this phenomenon is sometimes referred to as ‘promoter shut down’. Certainly, different promoters vary in their ability to maintain long-term transgene expression in specific cell populations. In particular, there is a clear tendency for some promoters to turn off in cells where they are not normally active. The mechanisms for such effects can be quite indirect. Thus, the ubiquitous CMV promoter can activate transgene expression in antigen-presenting cells with the ensuing immune response and elimination of all vulnerable transgene expressing cells .
Some bacterial plasmid backbones are known to cause transgene silencing [27-29]. In addition, bacterial plasmid backbones interfere with gene delivery into human cells after DNA administration in vivo because of the innate TLR9-receptor-mediated immune reaction to unmethylated bacterial ‘CpG-motifs’ within these backbones. In an attempt to alleviate the immune reaction, methylation of these sequences in vitro was attempted. Disappointingly, on some occasions the methylation of plasmid gene vector DNA resulted in increased silencing of transgene expression . The depletion or ablation of CpG motifs from bacterial plasmid backbones is known to substantially reduce their immunogenicity. The effects of CpG-depletion and ablation on transgene silencing are expected, but the available data on this issue are currently quite limited.
Bacterial lipopolysaccharides (LPS) often co-purify with and contaminate plasmid gene vector DNA. These endotoxins can substantially reduce the efficiency of transfection in vitro [31,32] and in vivo, where LPS are known to induce a TLR4-receptor-mediated innate immune response. Bacterial endotoxins exhibit a profound effect on cellular regulatory networks. Therefore, it is possible that tilting cells towards ‘transgene-silencing mode’ is an important contributing factor in the endotoxin-mediated inhibition of transfection.
4. Therapeutic gene vectors and the strategies for their use, which are employed to avoid transgene silencing
Stable long-term transgene expression depends on the intertwined issues of reliable maintenance of transgenes in target cells and a robust policy to prevent undesired transgene silencing. In general, these two issues are to a large extent under the control of the gene therapist, as both of them can be addressed through the gene vector design and the delivery mode. The regulation of gene expression in eukaryotic cells is exceptionally complex and multi-faceted. As a result, the strategies used to achieve sustainable transgene expression should address multiple possible reasons for the transgene expression shutdown.
4.1. Employment of cytoplasmic-only (non-nuclear) vectors
As most silencing mechanisms are nuclear-based, gene vectors with direct cytoplasmic expression, which are not required to enter the nucleoplasm, are well-positioned to avoid silencing. Thus, non-viral mRNA vectors or RNA-based viral vectors that replicate in the cytoplasm, such as vectors based on Sendai virus (a negative-strand RNA virus), can be employed. In addition to the escape from silencing, the advantages of extra-nuclear-delivery vectors include relatively fast transgene expression and the absence of potentially mutagenic genomic insertions. The downside is that transgene expression using such vectors is never long-term because of the eventual degradation of RNA in cells and because of RNA dilution in the dividing cells. Moreover, the fundamentally low fidelity of RNA replication undermines efforts to generate artificial vector systems with replicating RNA episomes. The key upside is that the low immunogenicity and minimal toxicity of such vectors accommodate their repeated administration well.
4.2. CpG ablation, CpG depletion and minimized DNA vectors
The methylation of chromosomal DNA is one of the most powerful mechanisms for the shut-down of gene expression. Thus, the design of gene therapy vectors should take into account the susceptibility of the vector sequences to methylation. Firstly, the purposeful exclusion of entire methylation-prone CpG islands should be considered. Secondly, CpG-depleted or CpG-ablated modules, produced through the point-wise replacement or removal of CpG dinucleotides, should be taken advantage of. The generation of functionally active CpG-ablated sequences is fairly laborious; the CpG-ablated gamma replicon from the bacterial plasmid R6K and some antibiotic-resistance genes are available from InvivoGen.
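When comparing candidate vector modules, the CpG load of a sequence can be quantified before and after depletion. The sketch below is an illustrative assumption rather than an established vector-design tool: the function name `cpg_stats` and the two short test sequences are hypothetical, and the observed/expected ratio uses the widely quoted formula (number of CpG x sequence length) / (number of C x number of G).

```python
def cpg_stats(seq):
    """Return basic CpG statistics for a DNA sequence given 5'->3' on one strand."""
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact number of CpG sites.
    cpg_count = seq.count("CG")
    gc_content = (c_count + g_count) / length if length else 0.0
    # Observed/expected CpG ratio; defined as 0.0 when C or G is absent.
    if c_count and g_count:
        obs_exp = (cpg_count * length) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return {
        "length": length,
        "cpg": cpg_count,
        "gc_content": gc_content,
        "obs_exp": obs_exp,
    }


if __name__ == "__main__":
    # Hypothetical toy sequences: an original module and a CpG-ablated variant.
    original = "ACGCGTTACG"
    ablated = "ACACATTACA"
    print("original:", cpg_stats(original))
    print("ablated: ", cpg_stats(ablated))
```

Such a count says nothing about function. A CpG-ablated coding sequence or replicon still has to be tested for activity, which is why generating these modules is laborious.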
Clearly, as repetitive sequences are known to induce silencing, their use in therapeutic gene vectors should be avoided as far as possible.
A common way to reduce the chances of transgene silencing is to shorten the auxiliary vector sequences outside of the therapeutic transgene expression cassette. For example, the plasmid selection markers can be very short indeed . In fact, a plasmid replication origin can be re-utilised as a plasmid marker using the ‘plasmid addiction’ phenomenon .
The trend to exclude unwanted sequences from gene transfer vectors led to the generation of specialized minimized DNA vectors. The most tested versions of such vectors are DNA fragments amplified in vitro using polymerase chain reaction (PCR), plasmid-derived linear terminally looped ‘midges’ or circular supercoiled ‘minicircles’. Minicircle vectors are produced by intramolecular site-specific recombination within bacterial plasmids. The superior efficiency of gene delivery and the longevity of transgene expression achieved with minicircle DNA were observed in multiple studies. The production of minimized DNA vectors is a biotechnological challenge. For example, advanced methods and bacterial strains were developed for efficient bacteria-based minicircle DNA production. The generation of PCR amplicons with Taq polymerase is relatively inexpensive. However, the load of mutations introduced by Taq polymerase may prompt consideration of alternative in vitro amplification methods for the large-scale synthesis of double-stranded DNA, e.g. ligase chain reaction (LCR), which is based on the ligation of preassembled oligonucleotides.
The usual aim in the production of minimized DNA vectors is the removal of sequences of bacterial origin, such as plasmid backbone sequences, as they can be immunogenic and some of them were reported to cause silencing [27,29]. It should be emphasized that transgene silencing through the co-delivery of specific plasmid sequences should not be generalized to all plasmid sequences and each plasmid sequence or bacterial sequence needs to be tested individually. More research is required to identify the affected bacterial replicons and to pinpoint the mechanism for the induction of silencing by bacterial DNA sequences. Another avenue is the development of novel specialized forms of minimized vectors, such as ‘minivectors’ for RNAi-based therapy .
4.3. Judicious choice of tissue-specific, inducible and ubiquitous promoters to control transgene expression
Promoters are the gene expression control elements that are typically co-introduced with therapeutic transgenes. In scientific literature, the word ‘promoter’ is often an umbrella term, which in addition to a minimal promoter also incorporates other linked genetic elements such as enhancers, transcription factor binding sites and even regulatory introns. The promoter is a key element of the regulatory machinery required for long-term non-silenced transgene expression. Different promoters vary in their strength, tissue specificity, specificity for particular developmental stages and ability to react to external stimuli (inducibility). Each therapeutic setting requires a thoughtful choice of a transgene promoter. Thus, some ubiquitous promoters are appropriate for consistent long-term transgene expression in differentiating stem cells passing through a number of developmental phases. Ubiquitous promoters are also appropriate in situations where the resident homologue of the therapeutic gene is naturally expressed ubiquitously. Tissue-specific promoters have been known for a long time to be instrumental for long-term transgene expression in terminally differentiated cells in the liver, vascular tissue, muscle and central nervous system. Inducible promoters are appropriate where the constitutive expression of the therapeutic transgene is undesired and/or where bespoke activation of the therapeutic transgene is required. In addition to the heavily used tetracycline-sensing promoter systems, inducible promoters can be activated by heat, light and gas-borne acetaldehyde. Clearly, the construction and determined exploitation of new hybrid promoters can resolve many issues in transgene silencing.
4.4. Multiple transgene insertions into random chromosomal sites
Random integration of transgenes into chromosomes is typical for a number of gene delivery systems. Spontaneous chromosomal integration of vector DNA within target cells is not efficient. Thus, enhanced random chromosomal integration of plasmid gene vectors can be attained using genetic elements of eukaryotic transposons, retroviruses or lentiviruses (lentiviruses form a subgroup of retroviruses with a somewhat larger genome and the ability to infect non-dividing cells). However, many integration events occur in unfavourable genetic neighbourhoods, resulting in the silencing of the respective copies of the transgenes. Hence, position-dependent silencing means that individual transfected or transduced cell clones differ in terms of the longevity of the transgene expression. Random chromosomal integration of transgenes tends to occur in transcriptionally active areas of the genome where heterochromatin condensation and DNA methylation are unlikely to interfere with transgene expression. However, as cells differentiate, the pattern of heterochromatization and DNA methylation changes and some of the transgenes find themselves in transcriptionally silent areas of the genome. Therefore, the shutdown of transgene expression is particularly common in cell populations undergoing differentiation. In these circumstances, it is certainly possible to increase the chances of long-term transgene expression by increasing the number of randomly chromosomally integrated transgenes through a higher concentration of vector and/or repeated rounds of vector administration. Thus, the gene therapist can aim to generate multiple copies of transgenes, indiscriminately integrated within the target genome, hoping that at least one of the copies will reside in a suitable chromosomal site that will be immune to silencing.
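The reasoning behind this strategy can be made explicit with a back-of-the-envelope calculation. The sketch below rests on a strong simplifying assumption that is not established by the text: each integrated copy escapes silencing independently with the same probability p, so that the chance of at least one active copy among n is 1 - (1 - p)^n. The function names and the numerical values are hypothetical.

```python
import math


def prob_at_least_one_active(p_active, n_copies):
    """Probability that at least one of n independent copies escapes silencing."""
    return 1.0 - (1.0 - p_active) ** n_copies


def copies_needed(p_active, target):
    """Smallest number of independent copies giving at least the target probability."""
    if not (0.0 < p_active < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p_active and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_active))


if __name__ == "__main__":
    # Hypothetical value: one integration site in five remains permissive.
    p = 0.2
    for n in (1, 5, 10, 20):
        print(n, round(prob_at_least_one_active(p, n), 3))
    print("copies for a 90% chance:", copies_needed(p, 0.9))
```

Real integration events are neither independent nor equally likely to be permissive, since multimeric inserts can themselves induce silencing and integration preferences differ between vectors. The calculation therefore indicates only the direction of the effect, not a dose recommendation.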
The employment of transposable genetic elements for efficient random integration of therapeutic transgenes was complicated by the fact that mammals do not have their own active or easily re-activatable transposons. Therefore, a number of heterologous transposons were adapted for use in human cells. Recombination machinery from Sleeping Beauty, PiggyBac, Tol2 and Mos1 transposons was shown to be capable of directing chromosomal integration of transgenes . Genes for transposases were either included within the cargo gene vector plasmid or were delivered into human cells on a separate plasmid. Mutant transposases with enhanced activity for random DNA integration were developed.
A caveat of the anti-silencing strategy relying on multiple transgene insertions into random chromosomal sites is the possibility of potentially deleterious or tumourigenic mutations due to insertional mutagenesis. However, this drawback is irrelevant for highly differentiated and non-dividing cells where, firstly, only a limited set of gene products is required for cell survival and functional competence and, secondly, only a minimal risk is present for the selection of malignancies. In fact, many terminally differentiated cells are either polyploid or multinucleated; both of these states can alleviate the impact of insertional mutagenesis.
4.5. Site-specific chromosomal integration
One of the ideal scenarios, where transgene silencing is avoided, involves the transgene DNA being site-specifically integrated into a ‘benign’, silencing-resistant chromosomal site where there is little chance of the transgene being engulfed by heterochromatin. Thus, targeting transgenes to a continuously active chromosomal locus can resolve the transgene expression shutdown problem. In particular, sites could exist within chromosomal DNA, where an integrated transgene would be immune to chromatin re-arrangements and other regulatory events during differentiation. A possible candidate site is the human homologue of the mouse Rosa26 locus, which is being successfully used to express various transgenes in mouse transgenic studies.
In principle, both transposases and retroviral integrases can be re-engineered into site-directed recombination enzymes through their fusion with appropriate site-specific ‘tethering’ domains . In addition to tethered transposases and retroviral integrases, the site-specific integration of transgenes into human chromosomes can be achieved via the modification of bona fide site-specific recombination systems.
Site-specific DNA recombination systems are comprised of recombinase enzymes, their co-factors and their cognate recombination sites. Site-specific recombination systems can be classified into two general types: irreversible and reversible ones.
Site-specific recombination machinery for irreversible recombination is typically borrowed from the chromosome integration systems of temperate bacteriophages. In integrative recombination systems there are two types of recombination sites, which are normally referred to as attP and attB. An archetypical example is bacteriophage lambda integrase (Int) catalysing a one-off recombination event between the lambda’s attP site and the chromosomal attB site. The reverse reaction, excision of prophage, is often possible; however, a separate enzyme or a separate subunit of bacteriophage integrase is normally required to catalyze the excision. The attB sites are typically shorter than the corresponding attP sites. Thus, in the recombination system from the Streptomyces coelicolor bacteriophage phiC31, attP is 39 bp long and attB is 34 bp long. Similarly, the recombination system from the Lactococcus lactis bacteriophage TP901-1 has 50 bp long attP and 31 bp long attB. Consequently, in artificial recombination systems within the mammalian setting, higher specificity of integration is achieved with longer attP sites positioned within the chromosomal loci. It has turned out that the human genome contains a close analogue of the phiC31 attP site. Extensive mutagenesis of the phiC31 integrase gene has produced versions of the enzymes with very high specific activity towards this native human site . Cell-permeable and nuclear targeted versions of phiC31 integrase were also created, these recombinant enzymes can be used to create transient, ‘hit-and-run’, recombinase activity in human cells that is required for the stable integration of therapeutic transgenes.
The typical original in vivo function of the reversible site-specific recombination systems is to preserve the monomeric status of a plasmid, prophage or episome via the resolution of circular DNA multimers to monomers; monomeric status is important for the stable maintenance of many plasmid replicons. Commonly used reversible systems include the Cre recombinase of bacteriophage P1 with its cognate loxP sites and the FLP recombinase (flippase) with its cognate FRT sites from the ‘2-micron circle’ episome of the yeast Saccharomyces cerevisiae. Many reversible systems were successfully used for the chromosomal integration of transgenes in pre-engineered cells. However, it should be noted that some site-specific recombination systems are fundamentally unsuitable for chromosomal integration strategies. Thus, ParA resolvase and MRS sites from the plasmid RK2 constitute a reversible system for intramolecular recombination; however, in this system there is no intermolecular recombination between MRS sites situated on separate DNA molecules.
Of course, the employed bacterial recombination systems have to be functional in eukaryotic cells. A potential pitfall to be aware of is that some of the site-specific recombinases require an additional co-factor; e.g., IHF (integration host factor) is an obligatory element of the lambda Int/attB/attP system. Unexpectedly and encouragingly, at least on some occasions mammalian cells are able to provide suitable co-factors.
The wild-type human adeno-associated virus type 2 (AAV2) is the only known human virus capable of site-specific chromosomal integration. AAV2 uses the chromosome-tethering strategy for genomic insertions. Expression of the Rep gene is required for integration of the viral genome at defined DNA sequences within specific chromosomal loci. The Rep proteins of this virus bind both to several Rep Binding Sites (RBS) within the viral DNA and to RBS in the human genome (within the sites known as AAVS1, AAVS2 and AAVS3), leading to preferential integration of the viral DNA in the genomic loci 19q13.42, 5p13.3 and 3p24.3, respectively.
An important step forward in the exploitation of the site-specific integration system of AAV was achieved when the AAV Rep protein was used to direct the integration of integrase-defective retroviral vectors into the human 19q13.42 locus. The transfer of the locus-specific chromosomal integration apparatus of AAV2 to other vector types, e.g., plasmid gene vectors, can be accomplished as well.
4.6. Episomal localisation of a transgene
Episomal maintenance of transgene expression cassettes is an attractive strategy for escaping the control that some resident gene regulation systems, such as the chromatin remodelling machinery, exert over transgene expression. The problem with this approach is that viral replicons, e.g., the compact episomal replicons from SV40, polyoma and papilloma viruses, which are often completely adequate for the research use of gene vectors, are rarely acceptable for therapeutic applications. Indeed, SV40-origin-based replication requires the expression of the SV40 large T-antigen and, hence, entails the malignant transformation of the recipient host cells. Similarly, the EBNA1-oriP DNA segment of Epstein-Barr Virus (EBV) can be used to support the maintenance of plasmid gene vectors in the nucleoplasm of dividing laboratory cells. Although EBNA1 expression does not result in a typical malignant transformation, it can still tilt the cells towards undesired immortalisation.
Alternative benign episomal replicons are being sought. Encouragingly, the scaffold/matrix attachment region (S/MAR) from the human β-interferon gene was reported to support non-viral episomal replication when coupled to a promoter. Thus, episomal maintenance mediated by S/MAR elements might be the reason behind the well-established beneficial effects of these elements on transgene expression [41,55,56]. Non-viral episomal vectors also include mammalian artificial chromosomes (MACs), which can be generated through both top-down and bottom-up approaches [57,58]. However, current progress with MACs is limited by the prohibitive costs associated with the generation of these vectors.
4.7. Employment of the locus control regions within the vectors
Protection of integrated transgenes from encroaching heterochromatin can be achieved with chromatin insulators or other cis-acting locus control regions (LCRs). The mechanistic details of LCR action are currently not clear and so the terminology in this area is somewhat diffuse with, for example, ‘chromatin boundary elements’ and ‘chromatin insulators’ often being used synonymously [60,61]. Some enhancers have an important bearing on the state of chromatin and, therefore, can also be viewed as LCRs. Experiments with some known chromatin insulators show that their effects on transgene expression are not always positive and to a large extent depend on the cell context [62,63]. Nuclear ‘matrix attachment region’ elements (MARs) and the effectively synonymous nuclear ‘scaffold attachment region’ elements (SARs) are known to possess some LCR activity. Some authors try to avoid the confusion between MARs and SARs by using the joined names ‘SAR/MARs’, ‘MAR/SARs’ or ‘S/MARs’. Promising results in terms of sustained transgene expression were achieved with MARs both in the scenario where two MAR elements are used ‘to protect the transgene from the flanks’ [64,65] and in the scenario where a single promoter-MAR couple drives the transgene’s episomal replication.
4.8. Top-up transgene administration to compensate for silenced transgenes
If the expression of therapeutic transgenes dies out, it is normally possible to perform a new round of gene transfer and thus achieve a new burst of transgene expression. Repeated vector administration can be particularly sought-after when the target cell population undergoes programmed cell death, while the respective progenitor cell pool is poorly accessible for therapeutic gene transfer. This strategy can be used without hesitation in an ex vivo gene therapy setting where therapeutic genes are delivered in vitro to dividing cells derived from a patient’s biopsy prior to autologous transplantation. In contrast, in an in vivo gene therapy setting, the drawbacks of vector re-administration include not only the increased complexity and cost of treatment, but also the realistic possibilities that immunity to elements of the vector might develop and that the effects of the toxic elements of the vector might build up to an unacceptable level. That is why low immunogenicity, low toxicity and the biodegradability of auxiliary vector elements are important in the vector re-administration treatment format.
4.9. Selection of clones with stable non-silenced transgene expression
Reliable, robust and error-free site-specific integration into mammalian cells lacking pre-engineered integration sites is difficult to achieve. Simpler alternatives for attaining stable long-term transgene expression exist in the ex vivo gene therapy approach. In one of the treatment scenarios, transgenes are integrated randomly, e.g. using lentiviral vectors or naked DNA vectors. It is then possible to select the best clone with minimal initial transgene silencing and minimal propensity for transgene expression shutdown among a heterogeneous population of transfected or transduced cells. The preferred method for cell selection is antibody-based magnetic sorting, as this method allows processing of large numbers of cells without recourse to heterologous fluorescent proteins and mutagenic UV irradiation as in fluorescence activated cell sorting (FACS). Clearly, such a clone pre-selection strategy can be used in conjunction with some other counter-silencing strategies (e.g. multiple random transgene insertions or top-up transgene administrations).
4.10. Small molecule enhancers of transgene expression
It is extremely attractive to use small molecule compounds to counteract transgene silencing. Substances known to influence the state of chromatin are prime candidates for this role. Thus, the histone deacetylase inhibitors Trichostatin A, 4-phenylbutyric acid, butyric acid, valeric acid and caproic acid were successfully used to enhance transgene expression after transient transfection. Available data indicate that another histone deacetylase inhibitor, valproic acid, and also retinoic acid, which is known to act through a receptor-mediated mechanism, are epigenetically active substances and, therefore, in certain situations could be considered for use as transgene expression stimulants. Some small molecule enhancers could be specific for particular vectors used for gene transfer. Thus, hydroxyurea is known to boost transgene expression after delivery with AAV vectors. In this case, transgene expression is likely to be spurred not through the inhibition of standard silencing mechanisms but rather through the more active synthesis of the second DNA strand in the delivered single-stranded AAV vector DNA.
4.11. Selection of low immunogenic vectors and transgene products
The elimination of therapeutic gene vectors and transgenic cells by the immune system can imitate the silencing of transgene expression. Thus, the employment of low-immunogenic vectors is a preferred option. The epitopes of a vector should mimic the native epitopes of the individual patient and should not match the patient’s pre-existing immune profile. Coating vector particles with immunologically inert polymers such as polyethylene glycol is one of the strategies to escape immune surveillance. Alternatively, vector particles can be developed that mimic the immune-evasion strategy of some viruses that are capable of ‘hiding’ at the cell surface. Non-immunogenic transgene products, e.g. exclusively human versions of proteins, should be chosen to prevent cell elimination via immune reactions in vivo. If required, transgene products should be re-engineered to achieve the ‘stealth effect’ and to tailor them to the immunological profiles of individual patients.
Epigenetic control by the target cells can result in permanent transgene silencing or in the instability of transgene expression. Thus, one needs to pursue therapeutic strategies that can achieve long-term transgene expression by taking advantage of, circumventing or overriding the silencing bias of the resident gene expression control mechanisms.
There are many levels at which the longevity of transgene expression can be addressed through the gene vector choice, design and administration regimen, including: 1) employment of non-nuclear vectors, e.g. mRNA or Sendai virus based vectors; 2) control of the susceptibility of transgene modules to methylation (e.g. purposeful exclusion of methylation-prone CpG islands); 3) employment of minimised DNA vectors such as minicircle DNA to avoid transgene silencing by the bacterial portion of the plasmid vectors; 4) choice of a suitable promoter-enhancer combination with the judicious use of tissue-specific, inducible and ubiquitous promoters; 5) achieving a high number of randomly integrated transgenes; 6) control of the chromosomal integration sites via artificial site-preferences of retroviral integrases and transposases or via harnessing of site-specific integration systems; 7) localisation of transgenes on nuclear episomes; 8) chromatin re-modelling control via cis-acting elements such as insulator elements and other LCRs; 9) repeated vector administration; 10) selection of individual cell clones with transgenes integrated into favourable loci; 11) use of chemical reagents influencing the epigenetic state to achieve higher and more long-term transgene expression; and 12) choice of low-immunogenic vectors and non-immunogenic transgene products to prevent the elimination of transgenic cells via immune reactions in vivo.
Clearly, the future solutions to transgene silencing enabling stable long-term expression of therapeutic transgenes will depend on the determined implementation of the above strategies and their effective combinations.