Plasmodium spp. and Toxoplasma gondii present a conserved nucleosome composition based on canonical H3 and variants, H4, canonical H2A and variants, and H2B. Unusually, the phylum also has an H2B variant, named H2B.Z, which was shown to form a double-variant nucleosome, H2A.Z/H2B.Z. These histones also present conserved and unique post-translational modifications (PTMs). Histone variants show particular genomic localizations and PTMs along euchromatin and heterochromatin, including telomere-associated sequences (TAS), suggesting fine-grained modulation of chromatin structure. Several nonhistone proteins also participate notably in controlling chromatin state, especially at TAS. On this basis, we discuss the role of epigenetics (PTMs and histone variants) in Plasmodium and Toxoplasma gene expression, replication, and DNA repair. We also discuss TAS structure and chromatin composition and their impact on the expression of antigenic variants in Plasmodium.
- histone variants
- antigenic variation
- telomere-associated region
Apicomplexa is a large phylum of unicellular, obligate intracellular protozoan parasites responsible for a range of human and animal diseases with considerable medical and economic impact worldwide. The phylum comprises several well-known genera such as Cryptosporidium, Eimeria, Babesia, and Theileria, but the most studied genera are Plasmodium and Toxoplasma.
The Plasmodium genus comprises several species, of which five infect humans: P. falciparum, P. ovale, P. malariae, P. vivax, and P. knowlesi. Infection with Plasmodium causes malaria, a mosquito-borne infectious disease endemic in the tropical and subtropical zones of Asia, Africa, and South and Central America. Malaria is also a serious problem for travelers and for people working in endemic regions. In 2016, 216 million cases were reported, causing some 445,000 deaths globally. The data show that the decline in malaria burden observed over the last decade has stalled.
Toxoplasma gondii is the only species of the Toxoplasma genus. It can infect birds and mammals, including humans, and causes toxoplasmosis. The infection occurs worldwide, and the chronic stage affects more than 500 million people. During the first few weeks of infection, toxoplasmosis is either asymptomatic or causes a mild flu-like illness. However, people with a weakened immune system, such as AIDS patients, fetuses infected during gestation, and newborns with a congenital infection, may become seriously ill and occasionally die. The parasite can cause encephalitis (inflammation of the brain) and neurologic diseases, and can affect the heart, liver, inner ears, and eyes (chorioretinitis). Recent research has also linked toxoplasmosis with neuropsychiatric conditions such as attention-deficit hyperactivity disorder, obsessive-compulsive disorder, bipolar disorder, and schizophrenia [4, 5, 6, 7, 8]. Current chemotherapy for toxoplasmosis is efficient but is sometimes poorly tolerated by individuals with AIDS, and it is effective against the acute or active stage but not against the chronic/latent stage.
2. Genome and nucleus
Both Plasmodium parasites and T. gondii have highly complex life cycles involving several stages (Figure 1). The genome sizes are 23.3 Mb for Plasmodium and 80 Mb for Toxoplasma; the parasites are haploid (1 N) for almost all of their life cycle but diploid (2 N) during sexual replicative stages (Figure 1). Plasmodium and T. gondii were the first apicomplexan parasites to be included in genome projects [9, 10]. Since then, genome projects for several other apicomplexan parasites have been carried out and the data uploaded to EuPathDB.
An interesting aspect of apicomplexan parasites is that they never lose the nuclear envelope during cell division, and their chromosomes do not show the higher-order condensation observed in metaphase chromosomes of higher eukaryotes. As a result, the nucleus looks the same throughout the cell cycle. However, it does not appear to be homogeneous: the Toxoplasma gondii nuclear envelope and chromosomes seem to relocalize and/or rotate dynamically inside the nucleus during parasite budding, as observed by epichromatin localization (Figure 1B). Epichromatin is a conformational epitope formed by DNA and histones H2A and H2B, localized only at the exterior chromatin surface [17, 18]. More recently, it was observed that epichromatin forms superbead domains associated with DNA-A at the nuclear envelope. A 3D analysis also shows that the P. falciparum nucleus presents a polarization of the nuclear pore complexes: in the early multinucleated schizont, they cluster in the nuclear region facing the mother plasma membrane, whereas in late stages, when the parasite is prepared for budding, they cluster toward the cytoplasm of the incipient merozoite.
In addition to the putative polarization of the genome inside the nucleus of apicomplexan parasites, in T. gondii the centromeres (CenH3, see below) localize to a single spot at the apical region of the nucleus, indicating that all of them are attached to the centrocone, a structure associated with the nuclear envelope and traversed by the microtubules that coordinate cell division. Similarly, Chromo1, a T. gondii protein that binds to the telomere, shows a focal localization in the nucleus, also suggesting a certain degree of chromosome organization within the parasite nucleus. In P. falciparum, prior to replication, in late ring stages and young trophozoites, CenH3 localizes to a single nuclear focus, suggesting that centromeres are clustered in a single spot that most likely remains attached to the mitotic spindle until the end of schizogony and the intraerythrocytic developmental cycle, similar to what is observed in T. gondii.
3. H3 histones: a multivariant family
The H3 histone family comprises canonical forms (H3, H3.1, and H3.2) and variants (H3.3 and CenH3). H3.3 differs from the canonical H3s in several respects. Canonical H3s are expressed and incorporated into chromatin during the S-phase of the cell cycle, whereas H3.3 is expressed and deposited throughout the cell cycle. Canonical H3s and H3.3 are nearly identical, differing in only four to five amino acids. The CAF-1 complex is involved in the incorporation of canonical H3s, whereas the CHD1/ATRX remodelers and the HIRA chaperone complex are involved in the incorporation of H3.3 [25, 26, 27, 28, 29, 30]. In addition, H3.3 is enriched in transcribed genes, enhancers, regulatory elements, and also heterochromatic repeats, including telomeres and pericentromeric regions [31, 32, 33, 34]. In general, H3.3 is linked to gene activation or open chromatin. Moreover, it has been found to be methylated at K4, K36, and K79 and acetylated at K9 and K14, all of which are marks of active chromatin [32, 35]. H3.3 and H2A.Z were detected at active promoters, forming nucleosomes that promote gene transcription [36, 37, 38]. Recently, H3.3 was found to play an essential role during mammalian development, since mice lacking H3.3 showed developmental retardation and early embryonic lethality. Rather than gene expression defects, H3.3 depletion causes genome instability due to dysfunction of heterochromatin structures at telomeres, centromeres, and pericentromeric regions of chromosomes, leading to mitotic defects.
There is little information regarding the H3 histone family and the H3.3 variant in apicomplexan parasites. The first study was by W.J. Sullivan, who cloned the entire ORFs encoding H3 and H3.3 in Toxoplasma gondii and in Plasmodium falciparum. This work confirmed that, as in most other organisms, the two proteins differ very little: by only four amino acids in T. gondii and eight in P. falciparum. In most other species, the critical residues that differ between H3 and H3.3, and that result in the different roles of these histones, form a motif that reads SAVM in canonical H3 but AAIG in H3.3. However, while PfH3 has the typical SAVM motif, PfH3.3 has QAVL; in T. gondii, the motif is SAVL in TgH3 and QAIL in TgH3.3. There is another difference that seems to be exclusive to Apicomplexa: KF changes to RY at positions 54–55 in H3.3.
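The motif differences described above can be tabulated with a few lines of Python. This is only an illustrative sketch: the motif strings are those quoted in the text, while the dictionary layout, the function name `motif_differences`, and the use of a 0-based index within the four-residue motif (not a position in the full protein) are assumptions made for the example.

```python
# Motif residues as quoted in the text; the data layout is illustrative only.
H3_MOTIFS = {
    "typical eukaryote": {"H3": "SAVM", "H3.3": "AAIG"},
    "P. falciparum": {"H3": "SAVM", "H3.3": "QAVL"},
    "T. gondii": {"H3": "SAVL", "H3.3": "QAIL"},
}


def motif_differences(canonical: str, variant: str) -> list:
    """Return (index in motif, canonical residue, variant residue) per mismatch.

    The index is 0-based within the motif, not a position in the protein.
    """
    if len(canonical) != len(variant):
        raise ValueError("motifs must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(canonical, variant))
        if a != b
    ]


if __name__ == "__main__":
    for organism, motifs in H3_MOTIFS.items():
        diffs = motif_differences(motifs["H3"], motifs["H3.3"])
        changes = ", ".join(f"{a}->{b}" for _, a, b in diffs)
        print(f"{organism}: {len(diffs)} substitution(s): {changes}")
```

Under these assumptions, the comparison makes explicit that the apicomplexan H3/H3.3 pairs differ at fewer motif positions than the typical SAVM/AAIG pair, and that both parasites share the S-to-Q change at the first motif position.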
In Plasmodium, H3.3 showed a localization pattern similar to that of another important histone variant, H2A.Z, namely localization to active chromatin (see Figure 2). As observed in other eukaryotic cells, ChIP-seq experiments have recently demonstrated that euchromatic regions of the genome are demarcated by the presence of the H3.3 variant. However, the P. falciparum genome has a distinctive AT versus GC distribution, with euchromatic intergenic regions richer in AT than coding sequences. Fraschka et al. observed a correlation between PfH3.3 enrichment and GC content, with this variant located not only in euchromatic GC-rich sequences but also in subtelomeric GC-rich repetitive regions. Interestingly, a correlation with nucleotide composition is also observed for the double-variant nucleosome H2A.Z-H2B.Z (see below), but in the opposite direction: regions with higher AT content are enriched in this nucleosome. GC-poor intergenic regions show the lowest H3.3 coverage, and the authors argue that incorporation of this variant into coding regions depends more on GC content than on transcriptional activity.
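To make the kind of relationship described above concrete, the following sketch computes GC content in fixed, non-overlapping windows and correlates it with per-window coverage values. It is a toy illustration, not the analysis performed by the cited authors: the sequence, the window size, and both coverage vectors are invented, and the helper names (`gc_fraction`, `windowed_gc`, `pearson`) are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def windowed_gc(seq: str, window: int) -> list:
    """GC fraction for consecutive non-overlapping windows of a fixed size."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    toy_sequence = "ATATATAT" + "GCGCGCGC" + "ATATGCGC"  # invented sequence
    gc = windowed_gc(toy_sequence, window=8)
    h33_like = [1.0, 9.0, 5.0]             # invented coverage, rises with GC
    double_variant_like = [9.0, 1.0, 5.0]  # invented coverage, rises with AT
    print("GC per window:", gc)
    print("H3.3-like vs GC:", pearson(gc, h33_like))
    print("double-variant-like vs GC:", pearson(gc, double_variant_like))
```

With real data, the coverage vectors would come from ChIP-seq signal binned over the same windows; the sign of the coefficient is what distinguishes a GC-associated occupancy pattern from an AT-associated one.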
It is well documented that P. falciparum depends on the var multigene family, which encodes a highly variable cytoadherence protein called P. falciparum erythrocyte membrane protein 1 (PfEMP1), to evade host immunity [43, 44, 45, 46]. Immune evasion relies on the expression of only one of the ~60 var gene family members in any given parasite.
Regarding this important gene family, H3.3 stably occupies the promoter region and coding sequence of the active var gene but is clearly less incorporated into the promoter and coding sequence of silenced var genes (see Figure 3). Additionally, the PTMs of histone H3 have been shown to be extremely important in the regulation of var expression. Data from fluorescence in situ hybridization (FISH) suggest that P. falciparum SETvs (P. falciparum variant-silencing SET gene), which encodes an ortholog of Drosophila melanogaster ASH1 and controls histone H3 lysine 36 trimethylation (H3K36me3) on var genes, is specifically involved in var gene silencing, and that its knockout results in the transcription of virtually all var genes in a single parasite nucleus. In addition, ChIP-qPCR analysis showed that H3K36me3 occupancy at the TSS is considerably higher in the silent var genes than in the active one (see Figure 3).
Trelle et al. carried out a detailed mass spectrometry study of P. falciparum histone PTMs and established that lysines 4, 9, 14, 18, 23, and 27 of both H3 and H3.3 can be modified by acetylation and/or methylation. In addition, the arginine at position 17 may be mono- or dimethylated. Some of these modifications had already been identified for H3 and H3.3 by Miao et al. More recently, a lysine residue in the core of H3, K56, was also identified as an acetylation site [49, 50]. Similarly, T. gondii histone H3 has many lysines and arginines that can be modified: the lysines at positions 4, 9, 14, 23, 27, 36, 37, 56, 115, and 122 can be acetylated or methylated, and some of them can also be formylated, ubiquitinated, or succinylated. In addition, arginines 2, 17, 26, 40, and 83 can be methylated.
Acetylation and methylation are not the only histone marks. With the development of improved acid and high-salt purification methods for P. falciparum histone phosphoprotein analysis, multiple phosphorylation sites have been found, mostly in the N-terminal region of most histones, including H3 and H3.3. These marks are frequently seen in combination with neighboring lysine acetylation (and methylation). The same work also described Pf14-3-3 as a phosphohistone mark-binding protein.
In parasites, one of the most conserved modifications is trimethylation of histone H3 at lysine 4 (H3K4me3), a marker of potentially active promoters. In contrast, H3K9 methylation is associated with silent genes and densely packed heterochromatin, although protozoan parasite histones are more highly enriched in the activation marks associated with euchromatin, with a lower abundance of modifications associated with heterochromatin. However, the P. falciparum epigenome has been shown to be highly dynamic and stage dependent; for example, H3K4me3 and H3K9ac are cycle regulated at P. falciparum genes. The same is probably true for T. gondii, where tachyzoite-to-bradyzoite conversion is regulated at the epigenetic level. In this regard, it has been speculated that the H3R17me2 mark may be significant during tachyzoite-to-bradyzoite differentiation, as it was found at only a subset of promoters, and given the importance of arginine methylation during early development of the mouse embryo. In that study, using ChIP-on-chip, the authors found that the H3K9ac, H4ac, and H3K4me3 modifications co-localize at focused loci in the T. gondii genome and correlate with significant gene expression, whereas the H3K4me1 and H3K4me2 modifications were found in equal amounts in active and inactive chromatin.
4. Centromeric H3
CenH3, the centromere-specific H3, has been observed in animals, fungi, and plants and also in Apicomplexa, including T. gondii, Plasmodium spp., and N. caninum. This was recently confirmed by Fraschka et al., who found the centromeres depleted of PfH3.3 and PfH3 but occupied by PfCenH3. In T. gondii, this histone variant was characterized in order to understand how chromosomes are delivered to the daughter cells after mitosis, a process that remains poorly understood. In that work, the authors labeled all the histone H3 variants and used TgCenH3 as a centromere marker to perform ChIP and microarray assays. They found a particular combination of histone PTMs surrounding centromeres: this region was highly enriched in H3K9 di- and trimethylation, marks usually associated with heterochromatin and found in subtelomeric regions in P. falciparum but not in T. gondii. In T. gondii, these modifications concentrate in two peaks directly flanking the center of the centromere in each chromosome, while H3K4me3 and H3K9ac are absent [21, 52]. In contrast, H3K9me3 and heterochromatin protein 1 (HP1, a chromodomain protein that binds to H3K9me3) were not associated with centromeres in P. falciparum but were instead found in islands of the genome that contain transcriptionally silent members of multigene families. In P. falciparum, the enrichment of PfCenH3 at the centromeres of all chromosomes has also been demonstrated by genome-wide ChIP-seq analysis. In addition, a region within the carboxy-terminal histone fold domain, also named the CENP-A targeting domain (CATD), has been shown to be essential for centromere targeting, whereas the N-terminus is not.
5. H2A.Z-H2B.Z: the double variant nucleosome
The H2A family also has a canonical H2A and several variants, including H2A.Z and H2A.X; canonical H2A-H2B dimers can be exchanged for H2A.Z-H2B or H2A.X-H2B dimers, allowing the modulation of gene transcription, DNA replication, and/or DNA damage repair [58, 59]. In vertebrates, the H2A family has two more variants: H2A.Bbd and macro-H2A. With regard to H2A-H2B and the incorporation of variants into such nucleosomes, apicomplexan parasites differ markedly from most other eukaryotes. One of the most surprising discoveries in these parasites was the presence of a novel H2B variant (formerly named H2Bv, but recently reclassified as H2B.Z), given that H2B, like H4, is known to have few variants [58, 61]. H2B variants, though, are found not only in these parasites but also in trypanosomatids (even though the two groups are not evolutionarily related), and there are some rare testis-specific variants in humans and other mammalian species.
Different studies performed in Toxoplasma have shown a nucleosome composition in which H2A.Z, but not H2A.X, dimerizes with H2B.Z, while H2A.X dimerizes with canonical H2B (H2Ba in T. gondii) but never with H2B.Z [62, 63]. The H2A.Z-H2B.Z pairing is also seen in P. falciparum, although this parasite lacks the H2A.X variant, and has led to the hypothesis of a new double-variant nucleosome, exclusive to parasites, with particular characteristics that are described in this section [64, 65] (Figure 2A). As the sequence alignment of H2B.Z from many apicomplexan species shows, this histone variant is highly conserved (Figure 2B), suggesting that it, and likely the double-variant nucleosome H2A.Z-H2B.Z, may have had an important role in the expansion of the phylum.
Since H2B.Z is not represented in yeast, insects, or mammals, almost all current knowledge about the double-variant nucleosome relies on H2A.Z studies. H2A.Z is so widespread that it has been described as “universal”, because it originated before the divergence of eukaryotes. The first notable feature is the hyperacetylation of its N-terminal tail in most species [48, 49, 50, 67, 68, 69]. This is thought to give H2A.Z the ability to mediate responsiveness to environmental changes, with effects as varied and seemingly contradictory as gene activation, heterochromatic silencing, transcriptional memory, and others, depending on the binding of activating or repressive complexes. H2A.Z-containing nucleosomes mark active and bivalent promoters as well as enhancers, correlating with open chromatin [70, 71]. Acetylation of H2A.Z is necessary for gene induction and is most often associated with active gene transcription [67, 68, 70, 71], whereas ubiquitylation, which can occur at the C-terminal tail, is linked to transcriptional repression and polycomb silencing [72, 73, 74, 75]. Acetylated H2A.Z is found in the nucleosomes flanking nucleosome-depleted regions. Regulation of gene expression by acetylation of the H2A.Z tail may result from the participation of other proteins as “readers” of the histone code. For example, the SWR-C chromatin remodeling enzyme and the related INO80 family are well characterized as catalyzing chromatin incorporation of the histone variant from yeast to human, and acetylation of histone H3 on lysine 56 (H3K56ac) was reported to lead to promiscuous dimer exchange, in which either H2A.Z or H2A can be exchanged from nucleosomes, although this is still debated [77, 78, 79, 80, 81, 82]. The acetyltransferase activity of NuA4, which is homologous to the TIP60/p400 complex, was found to be associated with SWR1-driven incorporation of H2A.Z into chromatin.
In addition, bromodomain-containing proteins are known to be involved in “reading” the acetylation patterns of H2A.Z: acetylated lysines in histones and other proteins are recognized by this domain, which is common in remodelers [77, 78, 84, 85]. In fact, the bromodomains of SWR1 have been reported to recognize a pattern of acetylation (including H3K14ac), which may influence the deposition of H2A.Z-H2B variant dimers into the appropriate nucleosome [77, 78]. Using Tetrahymena as a model, it was observed that these protozoa cannot survive when all acetylatable lysines are replaced by arginines, indicating that H2A.Z acetylation modulates a charge patch with an essential function in chromatin regulation [69, 75]. Unlike in the histone code, these changes need not be site-specific. If this hypothesis is true, modulation of the charge at any one of a number of clustered sites could inhibit nucleosome condensation, facilitating transcription.
In tachyzoites, T. gondii H2A.Z, together with H2B.Z, was enriched in the promoters of active genes, while repressed genes were enriched in H2A.X-H2Ba nucleosomes (Figure 2A). In addition, H2A.Z-H2B.Z was recruited within the coding region of silent bradyzoite-specific genes, and within promoter regions but not coding regions of actively expressed genes. It is tempting to speculate that the enrichment at active promoters or poised regions could be governed by different PTM states of these histone variants. In agreement with this, H2A.Z and H2B.Z have been shown to be highly acetylated at the amino-terminal tail, in contrast to canonical H2A and H2B histones and the H2A.X variant. H2A.Z has been shown to be essential in regulating the changing gene expression program during differentiation [79, 80, 81, 88, 89, 90], and it was recently observed that overexpression of a mutated H2A.Z-GFP, in which all five potentially acetylatable lysines (K4, 7, 11, 13, and 15) were mutated to arginines, blocked myoblast differentiation through disruption of myoD expression. It may therefore be that the H2B variant is involved in T. gondii cell differentiation as part of the H2A.Z-H2B.Z nucleosome. Whether this occurs through charge patch modulation and/or a histone code remains an open question, considering that T. gondii has several bromodomain-containing proteins that can recognize some of the acetylated lysines.
As stated above, the sequence alignment of H2B.Z from many apicomplexan species reveals a high degree of conservation for this histone variant (Figure 2B). Interestingly, every lysine that has been shown to be acetylated in T. gondii and P. falciparum H2B.Z is present in the other apicomplexan species, which is also true for H2A.Z [50, 53]. The double-variant nucleosome may therefore be present throughout the phylum with the same PTMs and a similar biological role.
5.1. Double-variant nucleosome in var genes
In P. falciparum, H2A.Z-containing nucleosomes were proposed to demarcate intergenic/regulatory regions of the genome, serving as a scaffold for stage-specific as well as transcription-coupled recruitment of histone-modifying enzymes. H3K9ac and H3K4me3 were found preferentially placed/retained on or next to H2A.Z-containing nucleosomes. However, it was observed that P. falciparum intergenic regions, including promoters, display a global nucleosome depletion, while telomeres harbored the highest nucleosomal occupancy; the exception was the var gene with the highest expression level, which showed the lowest nucleosomal occupancy. Apparently, the few nucleosomes in these areas are composed largely of variant nucleosomes. Petter et al. also showed an enrichment of PfH2A.Z in the promoters of a set of developmentally regulated genes in the euchromatin compartment, although it correlated neither with transcription levels nor with acetylation status. P. falciparum H2A.Z-H2B.Z promoter occupancy in var genes was found to be strongly associated with transcriptional activity, whereas silent or poised var genes were depleted of the double-variant nucleosome (see Figure 3) [65, 86]. The authors speculated that it may function as a physical switch to control gene expression in response to temperature change (for example, during fever or as P. falciparum is transmitted between its two hosts), similar to the thermosensory response seen in Arabidopsis thaliana and yeast. This could be due to reduced DNA wrapping of H2A.Z-containing nucleosomes at higher temperatures, resulting in a relaxed chromatin structure, although this variant histone has also been associated with tighter binding to DNA, especially in heterotypic H2A.Z/H2A nucleosomes.
While heterochromatic intergenic regions were shown to contain low levels of the histone variant H2B.Z, it is interesting that double-variant nucleosomes are depleted from silent var gene promoters but not from silent promoters of heterochromatic invasion gene families, which have similar patterns of variegated expression. Moreover, this correlation between double-variant nucleosome presence and expression was seen only in var genes, whereas this nucleosome was also found enriched in intergenic regions across the genome, associated with euchromatic histone modifications and not necessarily with transcription [64, 65, 86]. In addition, long promoter-containing intergenic regions, which maintain higher variant histone levels than the considerably shorter 3’UTR-containing regions, have a higher AT content, so this correlation could simply be due to the minimal length of the AT-rich content in these short 3’UTR regions. As previously observed for the H3.3 variant (see Section 3), nucleosome occupancy correlated with the GC/AT content of the genome; however, whereas H3.3 correlated with GC-rich regions, both H2B.Z and H2A.Z occupancy displayed a clear positive correlation with genomic AT content. Figure 2A proposes a schematic representation of P. falciparum and T. gondii nucleosome occupancy.
6. Heterochromatin, telomeres, and subtelomeres
The telomere-associated sequences (TAS), also named subtelomeres, are heterochromatic regions adjacent to the telomere on its centromere-facing side. The telomeres and the TAS regions are the terminal structures of the chromosomes and, together with the centromere, make up the constitutive heterochromatin of the genome. TAS regions have been described in Plasmodium and Toxoplasma, with sizes of 20–40 kb and about 30 kb, respectively (Figure 3) [98, 99, 100]. In T. gondii, the structure contains three tandemly repeated elements (TAREs), separated by noncoding DNA and flanked at one end by the telomere and at the other, downstream of TARE 3, by a member of a Toxoplasma-specific gene family of unknown function, the tsf genes. In general, there is only one tsf gene per TAS. Interestingly, based on the predicted amino acid sequences, TSF proteins are highly conserved in the N-terminal and middle regions but highly variable at the C-terminal end. To date, only a few studies have addressed chromatin modulation at T. gondii TAS.
The TAS in Plasmodium, in contrast, has been studied in depth because it harbors different gene families associated with virulence and pathogenicity that show a clonal pattern of expression [101, 102]. Telomeres are spatially restricted to the nuclear periphery, where they form clusters of three to seven heterologous chromosome ends [103, 104, 105]. The Plasmodium TAS is composed of six different TAREs, and the coding part of the genome lies directly downstream of TARE 6 and is characterized by members of multiple antigen gene families, including the var, rif, stevor, and pfmc-2tm genes [94, 95].
The telomeres and TAS regions are dynamic structures associated with a plethora of specific factors that not only give them structure but also configure these regions as constitutive heterochromatin that participates in the epigenetic regulation of subtelomeric gene expression (Figure 3). This epigenetic mechanism is carried out by proteins that introduce, recognize, and implement a repressive state over gene expression under normal environmental conditions. Under nutritional or environmental stress, the repressed subtelomeric genes have been reported to be activated in response to events promoting growth and survival [87, 106, 107, 108, 109].
It is important to highlight that the T. gondii TAS regions show a nucleosome composition enriched in H2A.X and heterochromatin markers. An in silico analysis using the Plasmodium and Toxoplasma databases reveals orthologs of only some of the yeast and mammalian telomeric and subtelomeric proteins, such as TRF1-2, HP1, KU70/KU80, and Sir2 (Table 1). Interestingly, the principal actor in this scenario appears to be the type III histone deacetylase Sir2. This NAD+-dependent deacetylase has also been implicated in different signaling pathways. P. falciparum has two Sir2 paralogues, Sir2A and Sir2B, with overlapping but distinct roles; they regulate different subsets of var genes, binding reversibly to the promoter regions of silent but not active subtelomeric var genes. PfSir2A is also implicated in telomere length regulation. In T. gondii, two deacetylases containing the Sir2 domain were identified, TgSir2A and TgSir2B, but their function has not yet been characterized. Another protein described in Plasmodium is PfOrc1 (origin recognition complex 1), which together with Sir2 promotes epigenetic silencing at P. falciparum TAS. PfOrc1 has a role in DNA replication but also cooperates with Sir2 to coordinate the spreading of heterochromatin and the regulation of var gene expression. In general, Sir2 proteins act by removing acetyl groups from cytosolic targets and, in the nucleus, from H3K9, K14, and K56; they have also been described to act on the histone mark H4K16, promoting the deposition of methyl groups on H4K20, with H4K20me3 being a chromatin mark associated with heterochromatin. Thus, Sir2 seems to play a very important role in linking signaling processes to gene expression and chromosome architecture.
| Yeast | Mammals | T. gondii (ToxoDB number) | P. falciparum (PlasmoDB PF3D7 number) |
|---|---|---|---|
| Sir2 (P06700) | Sir2 (Q8IXJ6) | Sir2A (227020) | Sir2A (1328800) |
| | | Sir2B (267360) | Sir2B (1451400) |
| Sir3 (P06701) | Sir3 (Q9NTG7) | ATPase, AAA family protein (283900)* | Orc1/Sir3 like activity (1203000)* |
| Sir4 (P11978) | Sir4 (Q9Y6E7) | NF | NF |
| RAP1 (P11938) | RAP1 (Q9NYB0) | NF | NF |
| RIF1 (P29539) | RIF1 (Q5UIP0) | NF | NF |
| Ku70 (P32807) | XRCC6 (B1AHC9) | Ku70 (248160) | NF |
| Ku80 (Q04437) | XRCC5 (P13010) | Ku80 (312510) | NF |
| Taz1 (P79005) | TRF1 (P54274) | NF | NF |
| NF | HP1 (Q13185) | Chromo1 (268280) | HP1 (1220900) |
| Pif1 (P07271) | Pif1 (Q9H611) | NF | NF |
Additionally, a member of the Alba protein family (PfAlba3) was demonstrated via ChIP assays to bind to telomeric and subtelomeric regions co-localizing with Sir2A in the periphery of the nucleus. PfAlba3 inhibits transcription in vitro by binding to DNA. PfSir2A was shown to interact with PfAlba3 deacetylating the lysine residue of N-terminal peptide of PfAlba3 specific for DNA binding  (Figure 3). In archaea, this interaction had been reported, in which Sir2 regulates silencing through deacetylation of the major archaeal chromatin protein Alba, highlighting an ancestrally conserved mechanism of gene regulation .
As stated above, heterochromatin protein 1 (HP1) is a very important protein that has been described to recognize the trimethylation on H3K9, a critical mark for the establishment, maintenance and silencing of centromeric and telomeric heterochromatic regions in various model organisms. In P. falciparum, it has been identified as PfHP1  and the H3K9me3 mark was mainly associated with var genes at TAS regions, as said before . Moreover, high levels of H3K9me3 correlate with genes localized to the nuclear periphery, implying chromosome loop formation. In addition, an association between PfSir2 and H3K9me3 was found, since the lack of the sirtuin deacetylases causes changes in H3K9me3 localization at the chromosome and generates disruption of the monoallelic transcription of var genes, suggesting the existence of perinuclear repressive centers associated with control of expression of malaria parasite genes involved in phenotypic variation and pathogenesis .
Flueck et al.  described the presence of an ApiAP2 family member in P. falciparum, designated as SIP2 that binds to TARE-2 and TARE-3 regions and the upstream regions of var upsB in vivo. Immunofluorescence and genome-wide high-resolution ChIP analyses demonstrated that P. falciparum SIP2 and HP1 proteins co-localize and associate with the same subtelomeric region, suggesting that both proteins participate in the assembly of telomeric heterochromatin. A recent report from Gupta et al.  has demonstrated that the protein CAF-1, a chaperone that loads the H3-H4 to the nucleosome assembly after DDR, co-localizes with PfHP1 at the same subtelomeric localization, in the nuclear periphery, and also demonstrated its binding to TARE1-3 and co-localization with H3K56ac, a signal of completion on chromatin reassembly after DDR . Interestingly, immunoprecipitation with PfCAF1 followed by LC-MS/MS analysis demonstrated that this protein would be interacting not only with PfHP1 but also with PfAlba3 among others .
In T. gondii, an HP1 protein was identified as TgChromo1 and linked to the sequestration of chromosomes at the nuclear periphery and to the process of parasite cell division. TgChromo1 was shown to localize at T. gondii telomeres but not at subtelomeres. At that time, however, subtelomeric regions had not yet been described and, in some cases, the sequences in these regions were not correctly assembled. In addition, H4K20me3 and H2A.X were observed at some TARE sequences and at a region near the tsf gene, previously named TgIRE [62, 63, 100, 123].
7. Double-strand break repair: H2A.X and chromatin
Cells are exposed to DNA lesions produced by exogenous factors (e.g., chemicals, UV irradiation, and ionizing radiation) or endogenous factors (e.g., DNA replication stress and meiotic recombination). One of the most deleterious forms of DNA damage is the double-strand break (DSB). DSBs activate a signal transduction pathway that induces DNA damage checkpoints, which delay cell cycle progression and allow the cell to activate DNA repair mechanisms. Phosphorylation of the SQ(E/D)Φ motif (where Φ represents a hydrophobic residue) on histone H2A.X, which yields γH2A.X, is one of the earliest responses to DSB [126, 127]. H2A.X seems to be incorporated randomly in the genome of resting cells, whereas γH2A.X clearly forms foci that label DSB and replication fork sites and spread along the chromosome up to 2 Mb from the damaged site. In addition, chromatin undergoes several changes at damage sites that play an important role in regulating DNA repair. DSBs can be produced by external events, such as ionizing and UV radiation, or by internal events, such as the collapse of replication forks and transcription-associated damage, among others [130, 131]. DSBs can be repaired by two main mechanisms: nonhomologous end joining (NHEJ) and homologous recombination repair (HRR). The first is an error-prone mechanism available throughout the cell cycle, and the second is an error-free mechanism active at the S/G2 phases of the cell cycle because it requires the sister chromatid as a template [131, 132, 133, 134]. Both mechanisms have been described in T. gondii [14, 135], but the Plasmodium genus is thought to rely only on HRR [136, 137, 138].
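As an illustration of the motif described above, the following minimal Python sketch checks whether a protein sequence ends with a serine-glutamine pair followed by an acidic residue (E or D) and a hydrophobic residue. The set of residues counted as hydrophobic, the function name, and the example sequences are assumptions made for this sketch; they are not taken from the studies cited here, and the restriction to the C-terminal end reflects the usual position of this motif in H2A.X rather than anything stated in this section.

```python
import re

# Assumption: residues treated as "hydrophobic" for this sketch only.
HYDROPHOBIC = "AVLIMFWY"

# S-Q, then E or D, then one hydrophobic residue, anchored at the C-terminus.
SQ_MOTIF = re.compile(rf"SQ[ED][{HYDROPHOBIC}]$")


def find_sq_motif(protein_seq):
    """Return the 0-based index of the serine in a C-terminal
    SQ(E/D)-hydrophobic motif, or None if the motif is absent."""
    match = SQ_MOTIF.search(protein_seq.upper())
    return match.start() if match else None


# Hypothetical toy tails, not curated parasite sequences.
print(find_sq_motif("MKKATQASQEY"))  # motif present at the C-terminus
print(find_sq_motif("MKKATQASQEK"))  # last residue is not hydrophobic
```

A real survey would run such a scan over curated H2A-family sequences (for example, those retrieved from ToxoDB or PlasmoDB) and would need an experimentally justified definition of the hydrophobic position.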
Before either the NHEJ or the HRR mechanism is chosen, a DSB triggers a cascade of events that starts with the binding of the Mre11-RAD50-Nbs1/Xrs2 complex (MRN in mammals and MRX in yeast) to the damaged site, which recruits and activates ATM kinase (Figure 4A). ATM phosphorylates H2A.X at the SQE motif, as well as other DSB repair enzymes, allowing the spreading of γH2A.X and a correct DNA damage response (DDR) at the DSB site (Figure 4A). ATM kinase is present in T. gondii and P. falciparum. In T. gondii, the MYST family lysine acetyltransferase TgMYST-B has been shown to mediate the DDR induced by methyl methanesulfonate (MMS) and to stimulate ATM expression at the gene level. Beyond this finding, histone acetyltransferases (HATs) have a predominant role in the DDR through chromatin modulation. Chromatin first responds to a DSB by increasing its compaction: H2A/H2A.X is replaced with the H2A.Z variant, and H3K9 is methylated by the suv39h1 methyltransferase, which is recruited after the DDR spreads to both sides of the DSB site (Figure 4B) [141, 142]. The resulting H3K9me3 interacts with the HAT Tip60, which leads to the acetylation of H4 on K16 and of ATM kinase; the latter is an important PTM for ATM autophosphorylation and subsequent activation (Figure 4) [143, 144]. The H3K9me3 and H4K16ac marks were identified in T. gondii and P. falciparum by mass spectrometry analysis [48, 49, 50, 51]. In T. gondii, however, H3K9 acetylation was detected more frequently than H3K9me1,2,3, suggesting that chromatin is preferentially in an open state and that the PTM on this lysine can be regulated. As stated before, H3K9me2/3 is also enriched at centromeres in T. gondii. In addition, T. gondii H4K16ac was one of the most abundant PTMs found in the mass spectrometry analysis. In P. falciparum, treatment with MMS increased the levels of H4K8ac and H4K16ac and reduced the level of H3K9ac.
Both T. gondii and P. falciparum present H3K9me1,2,3 and H3K9ac under normal conditions, suggesting a conserved mechanism of chromatin modulation. The role of these marks on Apicomplexan histones and their connection with DNA repair remain to be elucidated.
As mentioned above, γH2A.X spreading is a crucial step to initiate a correct DDR at DSB sites. In T. gondii, this mark is accompanied by other DDR marks, such as H3K9me2,3 and H4K16ac, under normal growth conditions, which raises the question of whether DSBs are produced during parasite replication [51, 135]. The T. gondii tachyzoite replicates rapidly, within 5–9 hours, so replication forks could be collapsing at this stage. However, T. gondii ATM kinase could not be detected by Western blot under normal conditions, although it was detected in tachyzoites overexpressing the MYST-B HAT.
The chromatin compaction that occurs early during the DDR includes the remodeling of chromatin at DSB sites, in which the H2A-H2B dimer is replaced by H2A.Z-H2B [142, 147]. This event is transient and allows the recruitment of the repressive KAP-1 (TRIM28)/HP1/suv39h1 complex, which can be important to inhibit transcription. The presence of the H2A-H2B dimer in the nucleosomal core particle produces a unique negatively charged region on the surface of the nucleosome, called the “acidic patch,” which is extended in H2A.Z (Figure 4B) [148, 149, 150]. The acidic patch favors the binding of the H4 N-terminal tail, which increases the interaction between nucleosomes and thus chromatin compaction. Interestingly, this compaction seems to be a necessary step toward a subsequent relaxed chromatin state: compaction and recruitment of the KAP-1 (TRIM28)/HP1/suv39h1 complex lead to methylation of H3K9 and phosphorylation of KAP-1 by ATM kinase, which in turn promote H4K16 acetylation by Tip60 and the release of KAP-1 (TRIM28)/HP1/suv39h1 (Figure 4B). T. gondii and P. falciparum have the novel H2A.Z-H2B.Z double-variant nucleosome (see Section 5). Nevertheless, H2A-H2B and their variants conserve the acidic patch in both parasites (Figure 4B). Of note, no KAP-1 protein appears in the ToxoDB and PlasmoDB databases for T. gondii and P. falciparum.
In higher eukaryotes, another important PTM associated with the DDR is ubiquitination by the E3 ubiquitin ligases RNF168 and RNF8 at the DSB site, after the spreading of γH2A.X and MDC1 protein foci (Figure 4A). MDC1 is also phosphorylated by ATM kinase, which allows the recruitment of RNF168 and RNF8. Ubiquitination of H1 and H2A recruits several BRCT domain-containing proteins, such as BRCA1 and 53BP1. Binding of 53BP1 requires H2AK13/15ub and H4K20me2 and directs the DDR to the NHEJ pathway (Figure 4B). By contrast, the presence of H4K16ac impairs 53BP1 binding to the nucleosome and allows the recruitment of BRCA1, which directs the DDR to HRR (Figure 4B) (see [141, 142]). As stated above, the H4K20me1,2,3 marks were found in T. gondii and P. falciparum [48, 51, 53]. However, T. gondii and P. falciparum H2As did not contain H2AK15ub, and ubiquitylation of lysine 13 was not detected either. In addition, no orthologs of BRCA1 or 53BP1 were found in T. gondii or P. falciparum, although T. gondii presents three different BRCT domain-containing proteins.
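The pathway choice described above for higher eukaryotes can be summarized as a simple decision rule. The Python sketch below is only a toy encoding of the statements in this section: the function name, the mark labels used as strings, and the "undetermined" fallback are assumptions of the sketch, and it ignores the dynamics, dosage, and cell-cycle dependence of the real process.

```python
def predicted_pathway(marks):
    """Toy rule for DSB repair pathway choice from a collection of
    histone mark labels (strings), following the text above.

    - H4K16ac impairs 53BP1 binding and favors BRCA1 recruitment -> "HRR".
    - 53BP1 binding needs H2AK13ub or H2AK15ub plus H4K20me2 -> "NHEJ".
    - Anything else is left as "undetermined" (assumption of this sketch).
    """
    marks = set(marks)
    if "H4K16ac" in marks:
        return "HRR"
    has_h2a_ub = bool(marks & {"H2AK13ub", "H2AK15ub"})
    if has_h2a_ub and "H4K20me2" in marks:
        return "NHEJ"
    return "undetermined"


print(predicted_pathway({"H2AK15ub", "H4K20me2"}))
print(predicted_pathway({"H2AK15ub", "H4K20me2", "H4K16ac"}))
print(predicted_pathway({"H4K20me2"}))
```

Under this rule, a nucleosome carrying H4K20me2 without detectable H2AK13/15ub, as reported for the two parasites, falls into the "undetermined" branch, which mirrors the open question about how the pathway is chosen in Apicomplexa.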
T. gondii and P. falciparum conserve several histone marks present in the chromatin-associated DDR to DSB, as well as histone variants that, in higher eukaryotes, are involved in recruiting the factors that spread the DDR and choose its pathway. In T. gondii, the well-studied DDR variant H2A.X is present, whereas Plasmodium has only canonical H2A. Although T. gondii and P. falciparum lack some key DDR regulators such as KAP-1, 53BP1, BRCA1, MDC1, RNF168, and RNF8, both parasites present the HRR mechanism of DNA repair, whereas NHEJ is present only in T. gondii. How both DDR pathways are modulated therefore remains an intriguing question.
8. Concluding remarks
In protozoan parasites, the modulation of chromatin seems to be a key biological process in the regulation of gene expression, pathogenicity, and DNA repair, the latter probably associated with DNA replication and therefore with the cell cycle. In Apicomplexa, and most evidently in the Plasmodium genus, the TAS or subtelomeric regions play an important role in the control of a group of genes essential to parasite pathogenicity. This suggests that subtelomeres have had a nontrivial impact on the evolution of these organisms and that their structure can influence the features of the cell. How this genomic domain has evolved within the Apicomplexa phylum remains to be elucidated. T. gondii, in which no antigenic variation has been detected to date, shows a somewhat conserved structure, with tandemly repeated boxes and a gene family of unknown function (tsf). Unlike Plasmodium, whose variant antigen-associated genes are represented by hundreds of members, T. gondii has only one gene per TAS. However, the predicted protein sequences show conserved N-terminal and middle regions, with highly variable C-terminal ends. We believe that elucidating the localization, role, and antigenic potential of the proteins encoded by this gene family will have a high impact on our knowledge of this parasite. It would also be interesting to know whether the members of this gene family are regulated in a way similar to the Plasmodium variable antigen gene family.
In addition to PTM marks similar to those of other organisms, but with readers and erasers that are currently less well characterized, Apicomplexa chromatin presents a double-variant nucleosome based on the new histone variant H2B.Z. Knowledge about these parasites is still fragmented. In P. falciparum in particular, separate studies have found the H3.3 variant in the same regions as this double-variant nucleosome, so a triple-variant nucleosome may exist in Apicomplexa. Since H2B.Z arose early in Apicomplexa evolution, the double-variant nucleosome could have been important in the expansion of the phylum, perhaps by modulating chromatin structure during different biological processes. Interestingly, genome-wide analyses seem to indicate that the double-variant nucleosomes of Plasmodium and Toxoplasma do not behave in the same way. In T. gondii, the double-variant nucleosome is enriched in active and poised genes, whereas in P. falciparum it is localized at both active and silent promoters, except for the var genes, in which its presence is associated with active promoters. The analysis of this novel nucleosome in the different genera of the phylum may help to explain why this H2B variant is present.
Chromatin is also important to the DDR and has an important role in determining the pathway of DNA repair after a DSB. T. gondii seems to have every histone variant and histone mark, as well as important proteins, associated with each DDR pathway that repairs a DSB: NHEJ (e.g., Ku70/Ku80) and HRR (e.g., RAD51). Unlike T. gondii, Plasmodium does not present the histone variant H2A.X, whose phosphorylation (γH2A.X) is linked to the localization of DSBs on DNA. Moreover, T. gondii has shown a basal level of γH2A.X, even without damage. Unexpectedly, the proteins that read the chromatin and are associated with the choice of DDR pathway (NHEJ or HRR) were detected in neither T. gondii nor Plasmodium. It is therefore unknown whether these marks are associated with other proteins with similar roles (T. gondii has three BRCT domain-containing proteins) or whether chromatin modulates the DDR in another way.
Taken together, these differences are not only interesting in the light of evolution but can also be analyzed in the context of identifying new parasite-specific drug targets. Gene regulation, DNA replication, pathogenicity, and DNA repair are crucial biological processes, and all of them may offer new targets to exploit as future treatments against Apicomplexan pathogens.
S.O. Angel (Researcher), L. Vanagas (Researcher), and S.M. Contreras (Fellow) are members of the National Research Council of Argentina (CONICET). S.O. Angel and L. Vanagas are professors at Universidad Nacional General San Martin (UNSAM). This chapter was supported by ANPCyT PICT 2015 1288 (S.O.A.) and National Institutes of Health (NIH, NIAID 1R01AI083162) grants.