Open access peer-reviewed chapter

ENGRAILED 2 (EN2) Genetic and Functional Analysis

By Jiyeon Choi, Silky Kamdar, Taslima Rahman, Paul G Matteson and James H Millonig

Submitted: October 29th 2010; Reviewed: May 20th 2011; Published: September 6th 2011

DOI: 10.5772/18867

1. Introduction

Our autism research has focused on the homeobox transcription factor ENGRAILED 2 (EN2). Prior to the advent of genome-wide association and re-sequencing analysis, we selected EN2 as a candidate gene because of neuroanatomical similarities observed between individuals with autism and mouse En2 mutants.

Animal studies have demonstrated that En2 is expressed throughout CNS development and regulates numerous cell biological processes implicated in ASD, including connectivity, excitatory/inhibitory (E/I) circuit balance, and neurotransmitter development. The relevance of these functions to ASD etiology is discussed.

Our human genetic analysis determined that two intronic SNPs, rs1861972 and rs1861973, are significantly associated with Autism Spectrum Disorder (ASD). We observed that the common haplotype (rs1861972-rs1861973 A-C) is over-transmitted to affected individuals, while the rs1861972-rs1861973 G-T haplotype is over-represented in unaffected siblings. Significant results were observed in 3 datasets (518 families, 2336 individuals, P=.00000035). Six other groups have also reported association of EN2 with ASD, suggesting that EN2 is an ASD susceptibility gene. These results are discussed.

However, if EN2 contributes to ASD risk, we would expect the ASD-associated A-C haplotype to segregate with a polymorphism that is functional and affects either the regulation or activity of EN2. Linkage disequilibrium mapping, re-sequencing and additional association analysis were performed, and identified the A-C haplotype as the best candidate for functional analysis. Luciferase assays conducted in primary mouse neuronal cultures demonstrated that the A-C haplotype functions as a transcriptional activator and specifically binds a protein complex. Transgenic mouse studies have demonstrated that the A-C haplotype is also functional in vivo, increasing gene expression. Finally, human post-mortem studies indicate that EN2 levels are also increased in individuals with autism. Thus, the ASD-associated A-C haplotype is functional and increased EN2 levels are consistently correlated with ASD.

Six significant CpG islands also flank human EN2. Preliminary studies indicate that hypomethylation of these CpGs can also result in increased EN2 levels, suggesting that epigenetic alterations influenced by non-genetic environmental factors can affect EN2 levels. To study how genetic and epigenetic changes may function together to influence EN2 regulation and CNS development, we are creating a chromosomally engineered knock-in that will replace ~75kb of mouse En2 with the human gene.

In summary, EN2 is consistently associated with ASD and functions in developmental pathways implicated in ASD. In addition, we have shown that the ASD-associated haplotype is functional, resulting in increased expression both in neuronal cultures in vitro and in transgenic mice in vivo. Increased levels are also observed in human post-mortem samples. Together, these human genetic data, along with our molecular, mouse and post-mortem studies, indicate that EN2 is an ASD susceptibility gene.


2. Selection of ENGRAILED 2 as a candidate gene

Before genome-wide strategies were available for identifying common and rare variants for ASD, my laboratory decided to test candidate genes based upon neuroanatomical phenotypes. When we started this work in 2003, two cerebellar neuroanatomical phenotypes were consistently observed in individuals with ASD: a decrease in cerebellar volume (hypoplasia) and fewer Purkinje neurons (Bauman and Kemper 1985; Bauman 1986; Courchesne, Yeung-Courchesne et al. 1988; Courchesne 1997; Amaral, Schumann et al. 2008). We knew of numerous mouse mutants that displayed similar morphological phenotypes, so we decided to test these genes for association in the available Autism Genetic Resource Exchange (AGRE) dataset. A list of nearly 100 genes was compiled whose mouse mutants displayed cerebellar phenotypes similar to those of individuals with ASD. The list also included genes that were known at the time to be expressed in the cerebellum in specific spatial-temporal patterns, suggesting they were likely to contribute to development. These genes were then placed on the human genome to determine which ones mapped near polymorphic markers that displayed linkage to ASD.

Many of the genes mapped to possibly interesting locations, so we prioritized our association analysis by the following criteria: i) distance to the SSLP marker, ii) LOD score or statistical significance of the marker, iii) whether segregation or linkage to the chromosomal region had been replicated in multiple studies, iv) whether the genomic region displayed linkage in the AGRE dataset, which would be used for our association analysis, v) whether mouse mutants existed for the gene, and vi) the similarity between reported mouse and ASD cerebellar phenotypes.

Based on these criteria we selected the homeobox transcription factor ENGRAILED 2 (EN2) as a candidate gene. EN2 belongs to a class of transcription factors that are homologous in their DNA binding domain, called the homeobox. Homeobox transcription factors regulate gene expression by binding to AT-rich DNA elements, and play central roles in coordinating development. Many homeobox genes are evolutionarily conserved from Drosophila to humans. The engrailed gene was first identified in classical genetic screens for developmental regulators in Drosophila. Humans and mice have two Engrailed genes, Engrailed 1 (En1) and Engrailed 2 (En2). Both En1 and En2 regulate important aspects of CNS development (see Section 4, ENGRAILED 2 function).

Human EN2 maps to distal chromosome 7 (7q36.3), near markers that display linkage to ASD in several datasets (Liu, Nyholt et al. 2001; Alarcon, Cantor et al. 2002; Auranen, Vanhala et al. 2002). Two of these studies had been performed using AGRE families. In addition, two different En2 mouse mutations existed: a traditional knock-out or deletion of En2, and a transgenic misexpression mutant. In the knock-out the cerebellum is reduced in size, and cell counts have determined an ~30-40% reduction in all the major cerebellar cell types including Purkinje cells (Millen, Wurst et al. 1994; Kuemerle, Zanjani et al. 1997). In the transgenic, En2 is misexpressed in a subset of Purkinje cells and similar phenotypes were observed (40-50% reduction in cerebellar area; ~40% decrease in the number of adult Purkinje cells) (Baader, Sanlioglu et al. 1998).

Significant association of EN2 with ASD was initially demonstrated by us and has now been reported by 5 additional groups (Brune, Korvatska et al. 2007; Wang, Jia et al. 2008; Yang, Lung et al. 2008; Sen, Singh et al. 2010; Yang, Shu et al. 2010). Prior to summarizing these data, we will first describe the known expression of mouse and human EN2 as well as the cell biological processes regulated by En2 in the developing and adult brain.

3. Engrailed 2 expression during development

Mouse En2 expression has been evaluated primarily by in situ hybridization and lacZ knock-in mice (see Table 1 for summary). In these studies En2 expression is initiated at E8.0 at the junction between the midbrain and hindbrain. En2 continues to be expressed in a majority of mid-hindbrain cells from E8.5 to E12.5. These En2-expressing cells will generate the cerebellum and midbrain colliculi dorsally, as well as parts of the serotonin (raphe nucleus) and norepinephrine (locus coeruleus) neurotransmitter systems ventrally. By E17.5 En2 expression becomes more spatially restricted. In the chick tectum En2 is expressed in a rostral to caudal gradient, while in the cerebellum it is stripe-like. By post-natal day 6 En2 transcripts are restricted to the differentiating cells in the external germinal layer and developing inner granule cell layer of the cerebellum. In the adult En2 continues to be expressed in mature cerebellar granule cells. Finally, QRTPCR studies indicate En2 is also expressed at low levels in adult hippocampus.

Developmental Stage | Expression | Function
E8.0-E12.5 | Mid-hindbrain junction | A-P patterning, neurotransmitter development
E12.5-E15.5 | Developing cerebellum, colliculi, ventral mid-hindbrain nuclei including LC and RN, periaqueductal | Retinal-tectal mapping, neurotransmitter development
E15.5-P0 | Developing cerebellum | Retinal-tectal mapping, cerebellar connectivity
P0-P12 | Cerebellum (differentiating granule cells) | Cell cycle and differentiation
Adult | Mature granule cells | Unknown

Table 1.

Summary of En2 expression and function from animal studies

A limited number of human ENGRAILED 2 expression studies have been performed. One analysis conducted on 18-21 weeks post-conception fetuses demonstrated widespread expression for both ENGRAILED 1 and 2 genes throughout the mid-hindbrain region including the cerebellar cortex and deep nuclei. Expression was also observed in several ventral hindbrain nuclei (inferior olive, arcuate nucleus, caudal raphe nucleus) (Zec, Rowitch et al. 1997). Western blot analysis conducted on cerebellar samples at later gestational ages (40 weeks) indicated abundant expression for both EN proteins (Logan, Hanks et al. 1992). Interestingly, recent microarray analysis performed by The Allen Institute for Brain Science demonstrates abundant expression throughout the cerebellum (cortex and deep nuclei) but also in numerous forebrain and midbrain structures (basal ganglia, amygdala, thalamus) (Figure 1). A complete developmental analysis of human EN2 expression has not been reported. These data suggest that human adult brain EN2 expression is more widespread than mouse En2, and that it extends to fore- and mid-brain structures relevant to ASD phenotypes.

Figure 1.

Human EN2 expression. Microarray data of microdissected brain regions performed by The Allen Institute for Brain Science indicate that EN2 is expressed in the basal ganglia (purple), amygdala (pink), thalamus (green) as well as cerebellum and brainstem (blue). A) sagittal, B) horizontal, and C) caudal views.

4. ENGRAILED 2 function

Molecular studies have determined that En2 functions as a transcriptional repressor. The protein regulates numerous cell biological pathways during CNS development but has a well-characterized function in establishing connectivity maps. Emerging data also support En2 function in E/I circuit balance as well as serotonin and norepinephrine neurotransmitter development. All of these cellular processes have been implicated in ASD etiology.

4.1. Transcriptional repressor function of En2

Molecular studies indicate that the Engrailed 2 protein primarily functions as a transcriptional repressor, an activity mediated by several different protein domains (Figure 2). DNA binding occurs through the homeodomain to a generic AT-rich cis-sequence recognized by homeobox transcription factors. Two domains (engrailed homology region 1 (EH1) and EH5) contribute to Engrailed repressor activity. EH1 is located in the N-terminal portion of the protein while the EH5 domain is immediately 3’ of the homeodomain in the C-terminal portion of the protein. Both domains bind the co-repressor Groucho, while EH1 is sufficient to confer repression activity when transferred to a transcriptional activator. Engrailed repressor function is mediated by two different mechanisms. The protein can actively block the trans-activation of activators by binding to nearby cis-sequences. Alternatively, the engrailed proteins compete with the basal transcriptional machinery for binding to TATA box sequences (Ohkuma, Horikoshi et al. 1990; Jaynes and O'Farrell 1991; Tolkunova, Fujioka et al. 1998). Finally, two other domains (EH2 and EH3) bind the Pbx family of homeodomain transcription factors, which affect DNA binding specificity (van Dijk and Murre 1994; Peltenburg and Murre 1997).

Figure 2.

En protein domains. The En protein structure is illustrated and the different En interaction domains are demarcated. EH1-5 indicate engrailed homology domains 1 through 5.

4.2. En2 regulates mid-hindbrain patterning

Mouse and chick studies have determined that En2 coordinates multiple cell biological processes throughout development. From E8.0-E12.5, En2 and En1 are spatially overlapping at the mid-hindbrain junction and both genes function to restrict progenitors to a midbrain and hindbrain lineage (Joyner 1996). En2 expression commences a few hours after En1 transcripts are first detected and, because of this difference, the En1 knock-out mouse displays a more severe phenotype with a deletion of mid-hindbrain structures (Wurst, Auerbach et al. 1994). Knock-in experiments in which En2 is targeted to the En1 locus are sufficient to rescue this phenotype, demonstrating that En2 is functionally redundant to En1 at this early stage of development (Hanks, Wurst et al. 1995).

4.3. Engrailed genes and 5HT and NE neurotransmitter system development

Previous studies have demonstrated that the Engrailed genes are important in the development and maintenance of substantia nigra neurons in the dopamine neurotransmitter system. These data are reviewed elsewhere (Simon, Saueressig et al. 2001; Alberi, Sgado et al. 2004; Simon, Thuret et al. 2004; Gherbassi and Simon 2006; Sgado, Alberi et al. 2006). Instead we focus on the role of the En genes in serotonin (5HT) and norepinephrine (NE) development, since abnormalities in these neurotransmitter systems have been more consistently implicated in ASD.

Mutations in the Engrailed genes affect the development of ventral mid-hindbrain nuclei that synthesize NE and 5HT: the locus coeruleus (LC) and raphe nuclei (RN) respectively. The LC is generated early in development (E9-E10 in the mouse) from the dorsal mid-hindbrain junction. The LC is deleted in the double En1-/- En2-/- knockout mice but appears relatively normal in the single knockouts, suggesting the genes compensate for each other during development. The RN is generated in the ventral mid-hindbrain and expresses 5HT by E11.5. Several transcription factors including Pet1, Lmx1b and Gata3 are important in the generation of the RN. Recent analysis indicates that both En genes are expressed in the progenitors of the RN at E11.5 and continue to be expressed in post-mitotic rostral 5HT neurons. In addition, an ~50% loss of neurons is observed in the dorsal RN by E16.5 in the double En knockouts. As with the LC phenotype, the RN is relatively normal in the single knockouts, suggesting the genes compensate for each other during development (Simon, Saueressig et al. 2001; Simon, Scholz et al. 2005; Sgado, Alberi et al. 2006; Fox 2010). Neurochemical data from our collaborator, Emanuel DiCicco-Bloom MD, have demonstrated abnormal levels of NE and 5HT in both the fore- and hindbrain structures of the En2 knockout (Lin 2010). These data indicate that the development of the 5HT and NE neurotransmitter systems is regulated by the Engrailed proteins.

Numerous studies have implicated the 5HT and NE pathways in ASD. The 5HT pathway regulates mood, eating, body temperature and arousal, some of which are often perturbed in individuals with ASD. Abnormalities in the 5HT pathway have been consistently observed in individuals with ASD. Blood platelet hyperserotonemia has been reported since the 1960s in ~30% of affected individuals (Ritvo, Yuwiler et al. 1970; Campbell, Friedman et al. 1975; Takahashi, Kanai et al. 1976; Anderson 1987; Anderson, Freedman et al. 1987; McBride, Anderson et al. 1989; Cook, Rowlett et al. 1992; Lam, Aman et al. 2006). However, several studies suggest 5HT functioning is depressed in the CNS of individuals with autism. For example, selective serotonin reuptake inhibitors (SSRIs) can improve some of the symptoms of ASD (Cook, Rowlett et al. 1992; Gordon, State et al. 1993). In addition, the rate-limiting step of 5HT synthesis is the hydroxylation of tryptophan, and acute depletion of tryptophan worsens ASD symptoms (McDougle, Naylor et al. 1996). The NE neurotransmitter system regulates attention, stress, anxiety, and memory, some of which are also affected in individuals with ASD. Unlike the 5HT system, the peripheral and central NE systems are tightly coordinated. Five studies have revealed increases in NE in the blood (Lake, Ziegler et al. 1977; Launay, Bursztejn et al. 1987; Leventhal, Cook et al. 1990; Leboyer, Bouvard et al. 1992; Minderaa, Anderson et al. 1994). However, since plasma NE has a very short half-life, it remains possible that this increase is due to arousal at the time of blood drawing.

4.4. En2 regulates connectivity

From E15.5-P0, En2 is expressed in a stripe-like pattern in the cerebellum. En2 is one of many patterning genes that are expressed in this stripe-like pattern at this age (En1, Shh, Pax2 and Wnt7b) (Millen, Hui et al. 1995). Interestingly, these stripe-like expression domains are coincident with the innervation of cerebellar afferents (mossy and climbing fibers), suggesting that these patterning genes regulate the topographic mapping of axons. Consistent with this possibility, En2 mouse mutants display connectivity phenotypes disrupting the innervation of mossy fibers (Herrup and Kuemerle 1997; Baader, Sanlioglu et al. 1998; Baader, Vogel et al. 1999; Sillitoe, Stephen et al. 2008; Sillitoe, Gopal et al. 2009; Sillitoe, Vogel et al. 2010). Thus En2 is important in establishing the cerebellar connectivity map during development.

Several studies indicate the Engrailed proteins are secreted and function as axon guidance proteins for retinal-tectal mapping. Initial EM and protein studies from the Prochiantz group indicated that a subset of the Engrailed proteins is associated with caveolae-like vesicles (Joliot, Trembleau et al. 1997). Subsequent work demonstrated that ~5% of the Engrailed protein is secreted and internalized by neighboring cells. A protein sequence embedded in the homeodomain, called the penetratin domain, is responsible for this activity (Joliot, Maizel et al. 1998). In addition, in vitro cultures demonstrated that exogenous En2 acts as a guidance cue for isolated retinal axons transected from the nucleus. Imaging studies indicate En2 is endocytosed by these growth cones. The protein then interacts with the eukaryotic initiation factor 4E (eIF4E), and En2 mutations that prevent eIF4E interaction fail to cause axon turning. En2 also results in the phosphorylation of eIF4E and its binding protein, 4E-BP1, in axons, which is typically associated with translation initiation (Brunet, Weinl et al. 2005). Recent antibody experiments that block exogenous activity cause significant connectivity defects in the tectum (Wizenmann, Brunet et al. 2009). Interestingly, several other developmentally important transcription factors (Pax6, Otx2) also display non-cell autonomous phenotypes (Lesaffre, Joliot et al. 2007; Sugiyama, Di Nardo et al. 2008), suggesting this phenomenon is not specific to the Engrailed genes.

Thus, a small proportion of the Engrailed 2 protein is secreted and is important in regulating connectivity through local translation. The FMR protein, which is mutated in Fragile X Syndrome (FXS), also regulates local synaptic translation. Approximately one-third of individuals with FXS are diagnosed with ASD, suggesting synaptic translation defects could contribute to ASD etiology.

En2 transcripts are also observed at low levels in the adult hippocampus. En2 knock-out studies revealed a decrease in the number of inhibitory GABA interneurons in the CA3 pyramidal layer and stratum lacunosum moleculare of the adult hippocampus. The knock-out mice also display an increased susceptibility to kainic acid-induced seizures. These data suggest an imbalance in excitatory/inhibitory (E/I) connectivity, which has been postulated to be a contributing factor to ASD etiology (Tripathi, Sgado et al. 2009).

Post-natally, En2 is expressed in differentiating and mature granule cells. Studies by Emanuel DiCicco-Bloom’s group demonstrated that En2 functions to promote cell cycle exit and differentiation in developing granule cells (Rossman 2008). The function of En2 in mature adult granule cells has not been investigated, but it is likely to regulate the expression of genes needed for synaptic plasticity and other mature neuronal functions.

In summary, although EN2 was initially selected as a candidate gene based upon similar cerebellar neuroanatomical phenotypes, En2 coordinates multiple developmental processes. In particular, the protein plays an important role in regulating connectivity and neurotransmitter systems during CNS development, both of which are relevant to ASD etiology.

5. Engrailed 2 genetic analysis

5.1. rs1861972-rs1861973 association in AGRE and NIMH datasets

Human EN2 is encoded by two exons in ~8.5kb. In collaboration with Linda Brzustowicz’s group at Rutgers University, association analysis was initially performed in 167 Autism Genetic Resource Exchange families (AGRE I dataset, 745 individuals). Positive association with ASD was observed for the common alleles of two intronic SNPs, rs1861972 and rs1861973. Significant association was detected under a narrow (autism) and broad (ASD) diagnosis for both SNPs individually and as a haplotype (A-C rs1861972-rs1861973) (Table 2) (Gharani, Benayed et al. 2004). These results were then replicated in two additional datasets (AGRE II: 222 families, 1102 individuals; NIMH: 129 families, 566 individuals) (Table 2). When all three datasets were combined (518 families, 2413 individuals) more significant results were observed (Table 2) (Benayed, Gharani et al. 2005).
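To make the idea of over-transmission concrete, the following is a minimal sketch of the classic transmission disequilibrium test (TDT) statistic. It is an illustration only: the family-based test actually used in the cited studies is not specified in this section, and the transmission counts below are hypothetical rather than taken from the AGRE or NIMH data.

```python
import math


def tdt(transmitted, untransmitted):
    """Classic TDT: compare how often heterozygous parents transmit
    versus do not transmit a given allele/haplotype to affected children.

    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only (not data from the chapter).
chi2, p = tdt(transmitted=120, untransmitted=80)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under the null hypothesis of no linkage or association, a heterozygous parent transmits either haplotype with equal probability, so a large excess of transmissions of one haplotype yields a large statistic.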

Many factors may contribute to the lack of replication in association studies of complex genetic traits. These include inadequate statistical power, the intrinsic complexity of a disease such as unknown gene-gene and gene-environment interactions, as well as locus and allelic heterogeneity in different datasets. Given these limitations, replication of rs1861972 and rs1861973 association supports EN2 as an ASD susceptibility gene.

Risk for the haplotype was then determined. Individual relative risk (RR) estimates the risk the haplotype confers to a given individual, and is calculated by the degree to which the haplotype is over-transmitted from heterozygous parents to affected children. Population attributable risk (PAR) estimates the risk of the haplotype to the general population and takes into account the degree of over-transmission and frequency of the haplotype. For the 518 families individual RR was estimated as approximately 1.42 and 1.40 under the narrow and broad diagnosis respectively. Because the frequency of the rs1861972-rs1861973 A-C haplotype is ~67% in the combined sample, this modest individual RR corresponds to a significant PAR of ~39.5% and 38% for the narrow and broad diagnosis of ASD respectively (see Benayed et al 2005 for more details). These data imply that as much as 40% of ASD cases in the population are influenced by the risk allele responsible for rs1861972 and rs1861973 association.
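The relationship between a modest RR, a high haplotype frequency and a large PAR can be sketched numerically. The snippet below assumes a multiplicative model under Hardy-Weinberg proportions (carriers of two copies have risk RR squared); this model is an assumption made here for illustration, since the exact formula is given in Benayed et al 2005 rather than in this chapter. Under that assumption the inputs quoted above give values close to the reported ~39.5% and ~38%.

```python
def par_multiplicative(haplotype_freq, relative_risk):
    """Population attributable risk under an assumed multiplicative model.

    With haplotype frequency p and per-copy relative risk RR, the mean
    population risk relative to non-carriers is (p * RR + (1 - p)) ** 2.
    PAR is the fraction of that mean risk in excess of the baseline of 1.
    """
    p, rr = haplotype_freq, relative_risk
    mean_relative_risk = (p * rr + (1 - p)) ** 2
    return (mean_relative_risk - 1) / mean_relative_risk


# Inputs quoted in the text: haplotype frequency ~67%, RR ~1.42 and ~1.40.
for label, rr in (("narrow", 1.42), ("broad", 1.40)):
    print(f"{label}: PAR = {par_multiplicative(0.67, rr):.3f}")
```

The sketch also shows why a common haplotype with a small individual effect can account for a large share of population risk: PAR grows with both the frequency and the RR.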

SNP | Diagnosis | AGRE I (167 families, 750 individuals) | AGRE II (222 families, 1071 individuals) | NIMH (129 families, 515 individuals) | Combined datasets (518 families, 2336 individuals)
A-C haplotype | autism | .0018 | .0168 | .0321 | .0000205
All haplotypes | autism | .0009 | .0048 | .0463 | .00000065

Table 2.

Summary of rs1861972 and rs1861973 association data

5.2. Additional EN2 association studies

Prior to our association analysis for EN2, a case-control study was performed using 100 control and affected individuals from Western/central France. Significant association was observed for a PvuII RFLP that we later mapped to ~2.5kb 5’ of the promoter (rs34808376) (Petit, Herault et al. 1995; Benayed, Gharani et al. 2005). Since our association analysis, 5 separate studies have reported positive results for rs1861972 or rs1861973 either individually or as part of a haplotype (Brune, Korvatska et al. 2007; Wang, Jia et al. 2008; Yang, Lung et al. 2008; Sen, Singh et al. 2010; Yang, Shu et al. 2010). These studies were performed in datasets recruited by the authors and represent various ethnicities (Northern/Western European, Chinese, Indian). However, differences have also been observed. Additional polymorphisms have been reported to be associated, and the allele for rs1861972 and rs1861973 that is over-transmitted to affected individuals can vary. These results are summarized in Table 3. These differences could reflect variations in LD blocks for the different ethnicities. It is also possible that different risk alleles exist in various populations.

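Study | Population | Individuals (a) | Polymorphism | ASD associated
Petit et al | Western/central French | 200 | PvuII | CG
Brune et al | Primarily Western/Northern European | | |
Wang et al | Chinese | 630 | rs3824068 | A
Yang et al (2008) | Chinese | 502 | rs1861973 |
Yang et al (2010) | Chinese | 551 | rs1861972 |
Sen et al | Indian | 281 | rs1861973 | C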

Table 3.

Summary of additional EN2 association studies. a: number of individuals recruited

In summary, EN2 association with ASD has been reported by 7 different groups. These data are consistent with EN2 being an ASD susceptibility gene. However, if EN2 contributes to ASD risk, then we would expect these genetic associations to be due to the co-inheritance of an allele that affects either the regulation or activity of EN2. The identification of an associated allele that is also functional would provide additional support for EN2 being an ASD susceptibility gene.

5.3. EN2 LD mapping and re-sequencing analysis

The next step in our analysis was to identify candidate common risk alleles by performing linkage disequilibrium (LD) mapping. LD indicates the degree to which alleles in the human population segregate with each other. Two measures for LD are commonly used: D’ and r². D’ takes into account recombination rate while r² includes recombination rate and the frequency of the alleles in the population. For common risk alleles responsible for rs1861972-rs1861973 association, we expected candidates to meet the following criteria:

  • Candidates must display strong LD (D’ and r² > .75) with rs1861972 and rs1861973

  • Candidates must be consistently associated with ASD

LD mapping was then performed for 24 additional polymorphisms that were situated throughout the EN2 gene (Figure 3). These polymorphisms were typed in the AGRE I dataset and we found that only the intronic SNPs were in significant LD (D’ > 0.72) with rs1861972 and rs1861973. We then re-sequenced the intron from individuals with ASD that had inherited the A-C haplotype from at least one heterozygous parent. This identified only 1 additional polymorphism (rs28999108). rs28999108 has a minor allele frequency of 1%, indicating that additional more common polymorphisms are not likely to be identified and that ss38341503 does not fit the criteria of a common risk allele. Association analysis of all intronic SNPs demonstrated that none of them were as consistently or significantly associated as the rs1861972-rs1861973 A-C haplotype (Benayed, Gharani et al. 2005; Benayed, Choi et al. 2009).
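The two LD measures named above can be computed from haplotype and allele frequencies using their standard definitions. The sketch below does so and applies the > .75 threshold from the first criterion; the frequencies are hypothetical values chosen for illustration, not the values measured in the AGRE or HapMap data.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


def meets_ld_criterion(d_prime, r2, threshold=0.75):
    """First criterion from the text: both D' and r^2 above the threshold."""
    return d_prime > threshold and r2 > threshold


# Hypothetical frequencies, for illustration only.
d_prime, r2 = ld_stats(p_ab=0.67, p_a=0.70, p_b=0.70)
print(f"D' = {d_prime:.3f}, r2 = {r2:.3f}, passes = {meets_ld_criterion(d_prime, r2)}")
```

Because r² is scaled by the allele frequencies at both loci, a pair of SNPs can show a high D’ and still fall below an r² cutoff, which is why the criterion requires both.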

Figure 3.

Genomic structure of EN2. The exonic/intronic structure of EN2 is illustrated. The position of 18 polymorphisms tested for association in Benayed et al 2005 is demarcated by arrows below the gene. Numbering refers to the following polymorphisms: 1-rs6150410, 2-PvuII (rs34808376), 3-rs1345514, 4-rs3735653, 5-rs3735652, 6-rs6460013, 7-rs7794177, 8-rs3824068, 9-rs2361688, 10-rs3824067, 11-rs1861972, 12-rs1861973, 13-rs28999108, 14-rs3808332, 15-rs3808331, 16-rs4717034, 17-rs2361689, 18-rs3808329, 19-rs1895091, 20-rs12533271, 21-rs1861958, 22-rs3071184, 23-rs10259822, 24-rs10233570, 25-rs11976901, 26-rs10243118. Red labeling denotes ASD association in published studies.

However, it was equally possible that rs1861972 and rs1861973 were in strong LD with a polymorphism situated further 5’ or 3’ of EN2 that was not tested for association. If this were the case, we would expect these flanking SNPs to be in strong LD with rs1861972 or rs1861973 and therefore display r² values similar to the .767 observed between rs1861972 and rs1861973. To identify other polymorphisms that fit these criteria, publicly available HapMap data were analyzed. The HapMap project determined the LD relationship of over 1 × 10^6 SNPs in four human populations (CEU: Utah residents with ancestry from northern and western Europe; JPT: Tokyo, Japan; CHB: Han Chinese, Beijing, China; YRI: Yoruba in Ibadan, Nigeria). r² and D’ values were first examined for 4 SNPs (rs1861972, rs1861973, rs6460013 and rs1861958) typed in both the HapMap and ASD datasets. The values were found to be nearly identical, justifying this approach to identify candidate risk alleles. The inter-marker HapMap r² values with rs1861973 were then determined in all four HapMap datasets for SNPs within 2 Mb of EN2 (1 Mb 5’ and 1 Mb 3’). Because 70.3% of the AGRE datasets tested for association were of Northern/Western European descent, the CEU HapMap data were analyzed first, and all SNPs within the 2 Mb region were found to be in weak LD with rs1861973 (r² < .370). Similar results were observed for the other datasets (Benayed, Choi et al. 2009). These data identified the A-C haplotype as the most appropriate common variant to test for functional differences.

It is also possible that rare variants on the A-C haplotype contribute to ASD risk and the genetic association of the haplotype with ASD. Re-sequencing over 100 individuals did not identify any non-synonymous coding polymorphisms (Benayed, Gharani et al. 2005; Rahman and Millonig, unpublished results). For all these reasons, we decided to focus our research on determining whether the ASD-associated A-C haplotype was functional. Our molecular and mouse genetic studies are summarized below and demonstrate that the A-C haplotype functions as a transcriptional activator both in vitro and in vivo. These data provide molecular genetic support for EN2 being an ASD susceptibility gene.
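The flanking-SNP screen described above amounts to a window-and-threshold filter over pairwise r² values. A minimal sketch follows, assuming the LD values have already been tabulated against the anchor SNP; the record layout, SNP names, positions and r² values are all hypothetical and do not reflect the HapMap file format or real data.

```python
from dataclasses import dataclass


@dataclass
class SnpLd:
    snp_id: str      # hypothetical identifier
    position: int    # base-pair coordinate on the chromosome
    r2: float        # r^2 with the anchor SNP


def candidate_snps(records, anchor_position, window_bp=1_000_000, min_r2=0.75):
    """Keep SNPs within +/- window_bp of the anchor whose r^2 meets min_r2."""
    return [
        rec for rec in records
        if abs(rec.position - anchor_position) <= window_bp and rec.r2 >= min_r2
    ]


# Hypothetical records, for illustration only.
records = [
    SnpLd("snp_near_weak", 5_200_000, 0.12),
    SnpLd("snp_near_strong", 5_001_500, 0.80),
    SnpLd("snp_far_strong", 7_500_000, 0.90),
]
hits = candidate_snps(records, anchor_position=5_000_000)
print([rec.snp_id for rec in hits])
```

In the analysis reported here, no flanking SNP came close to the threshold (all r² values were below .370 in the CEU data), so a filter of this kind would return an empty list.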


6. A-C haplotype functional studies

6.1. In vitro molecular genetic analysis

To investigate the potential function of the ASD-associated A-C haplotype, luciferase (luc) assays were conducted. The luc reporter system measures quanta of light and is a sensitive and reproducible methodology for detecting transcriptional changes. The human EN2 intron was cloned 3’ of a basal promoter and luc gene but 5’ of the polyA sequence (Figure 4). The construct also included the EN2 splice acceptor and donor sequences. In this way the intron is transcribed and spliced like the endogenous gene. Constructs were generated for both the A-C and G-T haplotypes and are ~8kb in length. The only sequence difference between the constructs is the rs1861972-rs1861973 haplotype.

Both constructs were transfected into primary cultures of cerebellar granule cells. We chose this cell type to test the function of the A-C haplotype for the following reasons. One, cerebellar granule cells are the most abundant neuronal cell type in the brain and, because of their small size, they can be isolated to near homogeneity. Two, the cells can undergo various steps of development in culture including proliferation, migration, and differentiation. Three, endogenous En2 is expressed at high levels in cerebellar granule cells.

When we transfected our constructs, the A-C haplotype resulted in significantly higher luc levels compared to the promoter control after 1 day in culture. The G-T haplotype did not display any activity compared to the promoter (Figure 4). Electrophoretic Mobility Shift Assays (EMSAs) were then performed to detect DNA-protein interactions. Granule cell nuclear extract was employed along with a 200bp fragment encompassing either the A-C or G-T haplotypes. A protein complex binds significantly better to the A-C than the G-T haplotype (data not shown). These data demonstrate that the A-C haplotype functions as a transcriptional activator in vitro. The A-C haplotype is one of two ASD-associated alleles for which function has been ascribed.

Figure 4.

ASD-associatedrs1861972-rs1861973A-C haplotype increases gene expression. (A) Luciferase (luc) constructs used for transfections are diagramed: TATA – pGL3pro vector driven by SV40 minimal promoter, A-C and G-T – pGL3pro vector containing full-length humanEN2intron with ASD-associated A-C haplotype (A-C) or unassociated G-T haplotype (G-T). The intron was cloned 3’ of luc gene and 5’ of poly A signal so it is transcribed and spliced as the endogenous gene. (B) Equimolar amount of the three constructswere transiently transfected into P6 mouse cerebellar granule neurons and cultured for 24hrs. Luciferase activities were then measured and normalized to the levels ofRenilla reniformis. Relative luc units are expressed as percent of TATA control. Note the A-C haplotype significantly increases luc levels. N=4, *P<.05, two tailed paired Student’s T test.
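The normalization described in the legend (firefly luciferase divided by Renilla, then expressed as percent of the TATA control) and the paired comparison across experiments can be sketched as follows. All readings are hypothetical placeholders, and the helper names are introduced here for illustration; only the t statistic is computed, which would then be compared against a t distribution with N-1 degrees of freedom.

```python
import math
import statistics


def percent_of_control(firefly, renilla, control_firefly, control_renilla):
    """Renilla-normalized luciferase signal as a percentage of the control construct."""
    normalized = firefly / renilla
    control_normalized = control_firefly / control_renilla
    return 100.0 * normalized / control_normalized


def paired_t_statistic(sample, control):
    """Paired Student's t statistic for matched measurements (df = n - 1)."""
    diffs = [s - c for s, c in zip(sample, control)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical Renilla-normalized ratios from four independent experiments.
tata = [1.0, 2.0, 3.0, 5.0]
a_c = [2.0, 4.0, 6.0, 8.0]
print("t =", round(paired_t_statistic(a_c, tata), 2))
print("percent of TATA:", percent_of_control(300.0, 100.0, 150.0, 100.0))
```

Pairing each A-C measurement with the TATA control from the same experiment removes between-experiment variation in transfection efficiency from the comparison.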

6.2. In vivo transgenic analysis

Because ASD is a neurodevelopmental disorder, we then generated transgenic mice to determine the developmental cell types and ages in which the A-C haplotype is functional. Our constructs include ~10kb of 5’ evolutionarily conserved sequence, the intron, and ~10kb of 3’ evolutionarily conserved sequence. Exon 1 of EN2 was replaced with the Ds-Red fluorescent reporter and exon 2 with the polyadenylation sequence. Like our luc constructs, the intron also includes the EN2 splice acceptor and donor sequences so the intron is transcribed and spliced as the endogenous locus. Transgenes for both the A-C and G-T haplotypes were generated, with the only nucleotide difference between the ~25kb transgenes being the rs1861972-rs1861973 haplotype (Figure 5).

Figure 5.

Transgenic QRTPCR results. Top) Structure of the A-C and G-T transgenes is illustrated. Exon 1 of EN2 is replaced with the Ds-Red reporter and exon 2 with the polyA sequence. ~20kb of flanking evolutionarily conserved sequence (green bars) drives expression of Ds-Red. The only difference between the transgenes is the two nucleotides representing the A-C and G-T haplotypes. Bottom) QRTPCR using adult cerebellar RNA was performed for Ds-Red-E5 and Gapdh in two pairs of lines with similar copy numbers: A) A-C, line F, 5 copies; G-T, line N, 6.5 copies, B) A-C, line E, 32 copies; G-T, line I, 37 copies. *** P<.001 T-test.

We have begun our analysis by examining the expression of the transgenes in the adult cerebellum because En2 is expressed specifically in granule cells. Thus we might expect to observe a difference in expression similar to that observed in our in vitro luc analysis. Taqman QRTPCR was performed for Ds-Red and Gapdh on adult cerebellar RNA isolated from A-C and G-T lines with similar copy numbers. These assays were performed in quadruplicate on three A-C and three G-T littermates. The A-C haplotype results in a ~250% increase in normalized Ds-Red levels compared to the G-T haplotype in the adult cerebellum (Figure 5). These results demonstrate that the A-C haplotype functions as a potent activator in vivo. Together, these data indicate that the ASD-associated A-C haplotype functions as a transcriptional activator both in vitro and in vivo, providing molecular genetic evidence that EN2 contributes to ASD risk.

We are now examining levels and spatial expression at additional time points (E12.5, E17.5, P6 and adult) relevant to various described functions of En2 (see Table 1). These studies will determine when, where, and how the A-C haplotype is functional during CNS development, providing the first in vivo functional analysis of any common allele associated with ASD.

6.3. Post-mortem and epigenetic analysis

To investigate whether EN2 levels are also increased in individuals with ASD, post-mortem analysis has been performed. Seventy-eight age- and sex-matched cerebellar samples (49 control, 29 affected) have been obtained from the NICHD Brain and Tissue Bank for Developmental Disorders or the Harvard Brain Tissue Resource Center via the Autism Tissue Program. These samples have been genotyped for rs1861972 and rs1861973, and Taqman QRTPCR has been performed for EN2 and GAPDH. Normalized EN2 mRNA levels display a significant increase in affected individuals compared to controls (Figure 6). Further examination of these data suggests that the increase is due to both the rs1861972-rs1861973 genotype and affection status. A more detailed statistical analysis is ongoing, but these results are consistent with EN2 levels being increased in individuals with ASD. Together our in vitro, in vivo, and post-mortem studies have demonstrated that increased amounts of EN2 are consistently associated with ASD, suggesting that elevated levels of the protein alter CNS development to increase risk for ASD.

Figure 6.

EN2 levels are elevated in ASD individuals. EN2 mRNA levels were measured in 29 ASD and 49 control post-mortem cerebellum samples using Taqman qRT-PCR. EN2 levels were normalized to GAPDH internal controls and average delta Ct values were obtained from triplicates of qPCR. Altered EN2 levels of ASD individuals are presented as percent of control values. Fold difference was calculated using the formula 2^-(EN2 deltaCt - Control deltaCt). Error bars indicate standard errors. **p<.01, T-test, two-tailed, unpaired with unequal variance.
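The fold-difference calculation named in the Figure 6 legend can be illustrated with a short Python sketch. It rests on two assumptions that the legend does not spell out: each delta Ct is taken as Ct(EN2) minus Ct(GAPDH), and the two terms in the exponent are read as the mean delta Ct of the ASD group and of the control group. The Ct values and function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_ct):
    """Delta Ct: target Ct (here EN2) minus reference Ct (here GAPDH). Assumed convention."""
    return target_ct - reference_ct


def fold_difference(case_delta_cts, control_delta_cts):
    """Fold difference as 2 raised to minus (mean case delta Ct - mean control delta Ct)."""
    delta_delta_ct = mean(case_delta_cts) - mean(control_delta_cts)
    return 2.0 ** (-delta_delta_ct)


# Hypothetical (EN2 Ct, GAPDH Ct) pairs, each already averaged over qPCR triplicates.
asd_cts = [(25.0, 18.0), (26.0, 18.0), (25.5, 18.0)]
control_cts = [(26.0, 18.0), (27.0, 18.0), (26.5, 18.0)]

asd_delta = [delta_ct(t, r) for t, r in asd_cts]
control_delta = [delta_ct(t, r) for t, r in control_cts]

fold = fold_difference(asd_delta, control_delta)
percent_of_control = 100.0 * fold

print("fold difference:", fold)
print("percent of control:", percent_of_control)
```

Because a lower Ct means more starting template, a mean delta Ct that is one cycle lower in the ASD group corresponds to a two-fold higher normalized EN2 level, or 200 percent of control, in this toy example.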

The previous data indicate the A-C haplotype results in increased gene expression. However, increased EN2levels could also be achieved by epigenetic mechanisms. Environmental factors can affect gene regulation through epigenetic modifications such as differential methylation. Epigenetics likely plays an important role in ASD for the following reasons. One, epigenetics provides an interface between environmental factors and genetic susceptibility. Numerous common environmental factors (e.g. bis-phenol, arsenic, certain antibiotics) affect CpG island methylation and gene expression (Villar-Garea and Esteller 2003). Thus differential environmental exposures could cause variations in epigenetic modifications and gene expression. This model provides a possible explanation for the phenotypic variability observed in ASD and other polygenic disorders (Bjornsson, Fallin et al. 2004; Feinberg 2007). In addition, the methyl-CpG binding proteins, MeCP2 and MBD2, are mutated in Rett Syndrome and ASD, pointing to the importance of epigenetic regulation in ASD (Amir, Van den Veyver et al. 1999; Li, Yamagata et al. 2005; Coutinho, Oliveira et al. 2007; Loat, Curran et al. 2008).

CG dinucleotides are clustered in regions called CpG islands that are regulated by epigenetic mechanisms. CpG dinucleotides are the substrates for cytosine methyl transferases, and DNA methylation often leads to decreased expression. Six CpG islands flank human EN2, with 3 in the gene. Interestingly, a vast majority of these CpG islands are not observed in mouse or rat, indicating they have evolved since rodent radiation, possibly to regulate EN2 expression. To investigate whether EN2 is epigenetically regulated, we treated two human neuronal cell lines that express EN2 (Daoy, SH-SY5Y) with the methylation inhibitor 5-aza-2'-deoxycytidine (AZA) and with a methyl group donor, S-adenosylmethionine (SAM). Preliminary bisulfite sequencing demonstrated methylation of CpGs with SAM treatment, while the same dinucleotides are unmethylated in AZA-treated cells (Figure 7). Importantly, this difference in methylation is correlated with EN2 mRNA levels: AZA treatment results in increased expression and SAM treatment in decreased levels (Figure 7). Thus, these data are consistent with EN2 being epigenetically regulated.

Figure 7.

EN2 epigenetic analysis. A and B) Treatment of Daoy and SH-SY5Y cells with AZA (A) resulted in increased EN2 mRNA levels (expressed as percent difference relative to untreated). Treatment with SAM (B) resulted in decreased EN2 mRNA levels (expressed as percent difference relative to untreated). *** P<.001 T-test. Bottom) Bisulfite sequencing of PCR products demonstrated the EN2 promoter is hypomethylated in untreated Daoy cells but methylated upon SAM treatment. An unaffected post-mortem sample is methylated while an affected sample is unmethylated at the same nucleotides.

We have bisulfite sequenced the promoter in a few post-mortem samples. In affected individuals none of the CpG dinucleotides are methylated, while in unaffected individuals the same CpGs are methylated. These CpGs are the same dinucleotides methylated after SAM treatment in vitro (Figure 7). In sum, these results are consistent with epigenetic differences contributing to the increase in EN2 mRNA levels observed in the post-mortem samples. High-throughput epigenetic platform analysis is ongoing to investigate this hypothesis further.
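The logic behind reading methylation from bisulfite sequencing can be sketched in Python. Bisulfite treatment converts unmethylated cytosines so that they are read as T after PCR, while methylated cytosines are protected and still read as C. The sketch below assumes a top-strand read that aligns to the reference without gaps; the sequence is a made-up example, not the EN2 promoter, and both function names are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate a converted read: every C becomes T unless its position is methylated."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(sequence.upper())
    )


def call_cpg_methylation(reference, bisulfite_read):
    """Return {position of the CpG cytosine: True if methylated, False if unmethylated}."""
    if len(reference) != len(bisulfite_read):
        raise ValueError("reference and read must be the same length (ungapped alignment)")
    ref = reference.upper()
    read = bisulfite_read.upper()
    calls = {}
    for i in range(len(ref) - 1):
        if ref[i] == "C" and ref[i + 1] == "G":
            if read[i] == "C":
                calls[i] = True   # cytosine protected from conversion: methylated
            elif read[i] == "T":
                calls[i] = False  # cytosine converted: unmethylated
            # any other base is left uncalled (sequencing error or variant)
    return calls


# Hypothetical 12-base reference with CpG cytosines at positions 1, 5 and 9.
reference = "ACGTTCGACCGA"

methylated_read = bisulfite_convert(reference, {1, 5, 9})
unmethylated_read = bisulfite_convert(reference, set())

print(methylated_read, call_cpg_methylation(reference, methylated_read))
print(unmethylated_read, call_cpg_methylation(reference, unmethylated_read))
```

In this toy case the fully methylated read keeps a C at every CpG, whereas the unmethylated read shows T at the same positions, which is the kind of contrast described above between SAM-treated and AZA-treated cells.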

7. Future studies

One important next step is to identify the downstream molecular and cell biological effects of increased EN2 expression. For this analysis we are generating a humanized EN2 knock-in mouse in which ~75kb of mouse En2 is replaced with the human sequence. This sequence will also contain the flanking CpG islands. To accomplish this goal we are using a strategy called Recombination Mediated Genome Replacement (RMGR) developed by Andrew Smith PhD (Figure 8) (Wallace, Marques-Kranc et al. 2007). In this way we will be able to determine the molecular and cell biological effects of the A-C haplotype throughout development. Because the human sequence will include the flanking CpG islands, we will also be able to expose the mice to various non-genetic factors that affect epigenetic regulation and investigate how these environmental compounds can either improve or worsen the A-C associated phenotypes.

Figure 8.

RMGR humanized EN2 knock-in. A) Genomic structure of the recombineered BACs is drawn to scale. Human EN2 G-T haplotype BAC was obtained and recombineered to generate the ASD-associated A-C haplotype. IRES:GFP was then introduced downstream of the EN2 coding region for the A-C BAC. IRES:Cherry was introduced in the same location for the G-T BAC. Heterotypic loxP (orange triangle) and lox511 (blue triangle) sites were also recombineered into both BACs, ~35kb upstream and downstream of EN2. This genomic region does not include any other genes. Empty boxes depict the next flanking genes. The human CpG islands are also illustrated as blue boxes. B) The mouse En2 locus is illustrated with one small CpG island (blue box). LoxP and lox511 sites have been sequentially targeted onto the same En2 chromosome. C) The recombineered BACs will then be transfected into our cis loxP-lox511 double-targeted ES cells. Cre recombinase will then be expressed in the ES cells. Since the heterotypic lox sites do not recombine with each other but are both still recognized by Cre recombinase, the mouse sequence will be replaced with the human locus via Cre-mediated recombination through the flanking loxP and lox511 sites. Both A-C and G-T knock-ins will be generated, which will also contain the human CpG islands.

8. Summary

We have demonstrated that the EN2 rs1861972-rs1861973 A-C haplotype is significantly associated with ASD in 3 datasets. Six additional groups have reported association of EN2 with ASD, suggesting it is an ASD susceptibility gene. If this possibility is correct, then we would expect the associated alleles to segregate with common or rare variants that functionally alter EN2 expression or activity. To address this question, we used a combinatorial approach that included human genetics, molecular biology, mouse transgenesis, and human post-mortem analysis. In the three datasets that we studied, LD mapping, re-sequencing, and additional association studies identified the A-C haplotype as the best candidate to test for function. In vitro luc assays demonstrated that the A-C haplotype functions as a transcriptional activator, resulting in elevated levels. Importantly, transgenic mice have recapitulated these results in vivo and will determine when, where, and how the A-C haplotype is functional throughout CNS development. EN2 levels are also increased in individuals with ASD. Thus elevated amounts of EN2 seem to be correlated with increased ASD risk. Our preliminary studies indicate that EN2 is also epigenetically regulated, suggesting exposure to environmental non-genetic factors may also increase EN2 expression. Future experiments are directed at identifying downstream molecular and cell biological pathways affected by increased EN2 levels. Finally, En2 regulates developmental processes implicated in ASD, including the establishment of connectivity maps. In sum, our combinatorial approach has provided evidence that EN2 is an ASD susceptibility gene.


We thank the NICHD Brain and Tissue Bank for Developmental Disorders and the Harvard Brain Tissue Resource Center for the post-mortem samples, and all participating families. We thank the Autism Tissue Program and especially Jane Pickett for all their help. We acknowledge the funding agencies that have supported this research: NIH (MH076624, MH080429, MH083509), Department of Defense (W81XWH-09-1-0286), NAAR/Autism Speaks, and New Jersey Governor’s Council for Medical Research and Treatment of Autism.

© 2011 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike-3.0 License, which permits use, distribution and reproduction for non-commercial purposes, provided the original is properly cited and derivative works building on this content are distributed under the same license.

How to cite and reference


Jiyeon Choi, Silky Kamdar, Taslima Rahman, Paul G Matteson and James H Millonig (September 6th 2011). ENGRAILED 2 (EN2) Genetic and Functional Analysis, Autism Spectrum Disorders - From Genes to Environment, Tim Williams, IntechOpen, DOI: 10.5772/18867. Available from:
