Advances in Autism Research – The Genomic Basis of ASD


Introduction
Many of the recent advances in autism research that have provided fundamental insight into this condition have come from the application of genetic/genomic approaches; these advances have been fostered by the advent of new technologies to interrogate the entire genome, such as array comparative genomic hybridization (aCGH), single nucleotide polymorphism (SNP) microarrays, transcriptome sequencing, and whole genome or whole exome sequencing (WGS/WES) [1]. The advancement of these technologies over more traditional, lower-resolution technologies such as cytogenetic analysis brought the ability to interrogate the entire genome at high resolution. With the improvement of next-generation sequencing technology, as well as the reduction in its cost, WGS is becoming more commonplace in the search for novel disease-causing variants in individual patients. Alternatively, many studies have utilized WES, as it is less costly than sequencing the entire genome and coding simple nucleotide variants (SNVs) can often be more readily interpreted given knowledge provided by the genetic code. While the reduced cost and more readily interpretable variation have proven to be distinct advantages of WES over WGS, it is well known that many variants in non-coding or regulatory regions can be pathogenic, and they typically cannot be detected by WES, which requires a targeted-capture step to enrich for and focus analysis on the coding sequences of all annotated protein-coding genes [2,3]. Furthermore, repetitive or GC-rich regions and highly homologous sequences are often excluded by WES, and copy number variations (CNVs) usually cannot be accurately called due to the use of PCR-based sample preparation methods. Nonetheless, the utility of WGS/WES in individual patient diagnosis and management has been demonstrated by several recent reports [4-6].
Some of the first studies providing a high-resolution view of the entire genome have revealed that a large number of CNVs are present in the genomes of healthy individuals, and that CNVs account for a greater proportion of the nucleotide variation between two given individual genomes than can be attributed to SNVs [7-9]. These structural alterations can reach up to several megabases in length, but a much higher frequency is observed for smaller (<1 kb) CNVs [2]. As one would expect, the likelihood of CNVs being pathogenic rises when they are larger and/or occur in gene-dense regions of the genome [8]. Traditionally, structural variation (CNV) was not considered to play a causative role in autism or ASD. However, recent studies have revealed that not only single-gene alterations, but also CNVs can lead to autism or ASD. In fact, it is now becoming increasingly evident that CNVs account for a larger proportion of new autism diagnoses than single-gene disorders. Recurrent CNVs at specific genomic loci have been associated with autism, including 15q11-q13, 16p11.2, 17p11.2, 22q13.3, 7q11.23, and 2q37, among others [1,10-16]. While several of these loci are associated with known genomic disorders, numerous CNVs have also been observed in idiopathic autism, underscoring the importance of these structural variations in the future of all types of autism research [17].
The application of next-generation sequencing technology to evaluate CNVs has also recently been described in a report that utilized whole-genome sequencing analysis of a cohort of patients with autism spectrum disorder (ASD) [18]. This approach allows for the evaluation of CNVs and overcomes some of the problems associated with CNV-calling in WES. With several large-scale projects currently underway, next-generation sequencing and whole-genome analysis will likely provide many new insights into the etiology of this disease. Currently, Autism Speaks is working in collaboration with the company BGI to generate the largest database of sequenced genomes of individuals with ASD, a project known as the "Autism Genome 10K." Similarly, the National Institute of Mental Health in the US has funded another large-scale "Autism Genome Project." Mendelian/syndromic forms of autism are also currently being studied by WES at the Centers for Mendelian Genomics in the US.
Among the variants identified in the large-scale studies of patients with autism reported to date, many gene networks/pathways have been implicated, including genes for neuronal adhesion [18,19], ubiquitin degradation [19], chromatin remodelling [5,20], sodium channels [13], proteolysis [21], cytoskeletal organization [21], signal transduction [18], neuropeptide signalling [18], neurogenesis/synaptogenesis [18], neuronal migration [22], basic metabolism, and RNA splicing [22], among others. While these pathways may seem diverse, repeated "hits" in these networks support the "many genes, common pathway" hypothesis [22]. Importantly, although the biological functions of ASD susceptibility genes identified via these whole-genome studies do not appear to lie within the same network, they likely converge to disrupt neuronal function in brain regions that support language, social cognition, and behavioral flexibility, resulting in the phenotypes commonly associated with ASD [22].

Chromosome engineering mouse models for ASD
Since ASD is known to be a highly heterogeneous disorder with both genetic and environmental components, modeling the disease in rodents, where environmental and background effects can be largely controlled and systematically manipulated and studied, is of great advantage in the study of the pathomechanisms underpinning autism. Furthermore, numerous tools for genetic manipulation and for behavioral analysis that have been developed for genetic studies in rodents can be leveraged to facilitate this avenue of research. Importantly, behavioral assays have been developed and validated to objectively quantify the phenotypes relevant to autism, including both core and associated autistic-like phenotypes, such as abnormal social behavior, communication deficits, and repetitive behaviors, as well as autism-associated anxiety-like behaviors, motor defects, learning and memory deficits, sleep disorders, sensory hypersensitivity, and seizures, among others [23] (Table 1; adapted from Crawley et al. [23], where a full description of these behavioral tests can be found).
With the importance of CNVs in the etiology of ASD established, chromosome-engineered mouse models were then generated for the study of autism or ASD. The first such mouse strains were developed over a decade ago using a chromosome-engineering approach [24]. This technique allows for the creation of a targeted duplication or deletion in the desired location by first generating the rearrangement in mouse embryonic stem (ES) cells via Cre/loxP site-specific recombination; the modified ES cells can then be used to establish mouse strains [25,26]. To generate the desired rearrangement, two gene-targeting steps are required to prepare each end point for a selectable recombination event (Figure 1). Importantly, the type of rearrangement (deletion, duplication, inversion) depends on the relative orientation of the loxP sites; if the sites are in the same orientation, the region between them can be deleted or duplicated, but if they are in opposite orientation, an inversion results [25]. The cis or trans configuration is also relevant; trans insertion (insertion in each chromosome homologue) of loxP sites enables generation of both deletion and duplication in the same ES cells. Transient transfection of the ES cells with a vector expressing Cre recombinase facilitates the recombination between the targeted loxP sites, and cells containing the event can be selected for using hypoxanthine aminopterin thymidine (HAT)-containing media due to the reconstitution of a functional Hprt cassette as a result of the recombination [25]. The resulting mouse models harbor either a chromosomal duplication or deletion of a defined region that is syntenic to the copy number variable region in humans. Importantly, chromosome-engineered mouse models are distinct from monogenic animal models in that they harbor structural chromosomal rearrangements resulting in specific, targeted CNVs with genomic intervals that may span several megabases and contain numerous genes, many of which may be of unknown function.
In contrast, monogenic animal models primarily utilize a reverse-genetics approach to knock out or transgenically overexpress the specific single gene of interest, limiting the study to that one particular gene. A publicly available resource can facilitate chromosome engineering for the targeted manipulation of the mouse genome; the mutagenic insertion and chromosome engineering resource (MICER) can be utilized to access vectors to create chromosomal rearrangements or to study gene disruptions in a high-throughput manner [27] (http://www.sanger.ac.uk/resources/mouse/micer/). We will discuss specific examples of chromosome-engineered mouse models of ASD for which the neurobehavioral phenotypes have been described using established rodent behavioral assays, including mice harboring duplication or deletion of the genomic interval syntenic to human chromosome 16p11.2, 15q11-q13, and 17p11.2, which all model CNVs that have been identified in patient populations with autism [28-30]. Most of the CNV-based animal models of ASD described to date have focused on syndromic forms of autism, as the underlying genetic cause of these genomic disorders is often well-described.
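The dependence of the engineered rearrangement on loxP orientation and cis/trans configuration, as described above, can be summarized in a short lookup. The following Python sketch is illustrative only: the names `Orientation`, `Configuration`, and `predicted_rearrangements` are hypothetical, the cis/same-orientation case is simplified to a deletion, and the trans/opposite-orientation case is left empty because it is not covered in the text.

```python
from enum import Enum


class Orientation(Enum):
    SAME = "same"
    OPPOSITE = "opposite"


class Configuration(Enum):
    CIS = "cis"      # both loxP sites on the same chromosome homologue
    TRANS = "trans"  # one loxP site on each chromosome homologue


def predicted_rearrangements(orientation, configuration):
    """Return the set of rearrangements expected after Cre expression.

    Simplified illustration of the rules stated in the text; this is
    not a model of recombination efficiency or of the selection step.
    """
    if orientation is Orientation.OPPOSITE:
        if configuration is Configuration.CIS:
            return {"inversion"}
        # Assumption: trans sites in opposite orientation are not
        # discussed in the text, so no outcome is asserted here.
        return set()
    if configuration is Configuration.TRANS:
        # Same orientation, one site per homologue: reciprocal products.
        return {"deletion", "duplication"}
    # Same orientation in cis: the intervening region is lost.
    return {"deletion"}


if __name__ == "__main__":
    for o in Orientation:
        for c in Configuration:
            print(o.value, c.value, sorted(predicted_rearrangements(o, c)))
```

Returning a set rather than a single label reflects the point made above that a trans design can yield both the deletion and the duplication from the same ES cells.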
By definition, genomic disorders result from structural changes in the genome, wherein the genomic instability that leads to disease traits often reflects a susceptible genome architecture [31]. These structural rearrangements or CNVs commonly cause the disruption or complete loss or gain of dosage-sensitive gene(s). Alternatively, CNVs can cause gene fusion, position effects, transvection effects, or the unmasking of a recessive allele or functional polymorphism [32]. Genomic disorders are therefore distinct from traditional genetic syndromes, which are typically caused by DNA sequence-based changes [33]. Within the past decade, several technologies, including array comparative genomic hybridization (aCGH), next-generation sequencing, and single nucleotide polymorphism (SNP) genotyping platforms, have been utilized to detect and analyze CNVs in the genome and to investigate the mechanism by which these CNVs are generated [34]. CNVs can be formed by several mechanisms, such as non-allelic homologous recombination (NAHR), non-homologous end joining (NHEJ), or fork stalling and template switching (FoSTeS) [35]. NAHR, which is often mediated by low copy repeats (LCRs) with high (~95%) sequence similarity flanking the rearranged region, is the most common mechanism by which recurrent CNVs are created. Often this mechanism can result in recurrent genomic rearrangements that are observed in multiple patients with the same disorder, as in Charcot-Marie-Tooth disease type 1A, Prader-Willi syndrome, and Smith-Magenis syndrome, among many others [32,33]. The genomic architecture rendering genomic instability at three loci that are enriched for LCRs is shown in Figure 2.
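As a rough guide to how a dosage change appears on platforms such as aCGH, signal is commonly summarized as a log2 ratio of test to reference intensity. The sketch below computes the idealized value for a given copy number, assuming a two-copy (diploid) reference, no mosaicism, and no signal compression; measured ratios are typically closer to zero than these theoretical values. The function name `expected_log2_ratio` is hypothetical and does not come from any particular analysis package.

```python
import math


def expected_log2_ratio(copies, reference_copies=2):
    """Idealized log2(test/reference) ratio for a given copy number."""
    if copies < 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be non-negative (reference > 0)")
    if copies == 0:
        # Homozygous deletion: the ratio tends to minus infinity.
        return float("-inf")
    return math.log2(copies / reference_copies)


if __name__ == "__main__":
    states = [
        ("heterozygous deletion", 1),
        ("normal", 2),
        ("duplication", 3),
        ("triplication", 4),
    ]
    for label, copies in states:
        print(f"{label:>22}: {expected_log2_ratio(copies):+.3f}")
```

Under these assumptions a heterozygous deletion sits at -1 and a duplication near +0.58, which illustrates why single-copy gains give a smaller shift than single-copy losses.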

ASD associated with CNVs on chromosome 16p11.2
In recent years, whole genome analyses of cohorts of patients with autism have identified recurrent CNVs on chromosome 16p11.2, effectively linking reciprocal microduplications and microdeletions at this locus with increased susceptibility to ASD [12,36-39]. Importantly, further studies have revealed that CNVs at this locus are responsible for ~1% of all ASD diagnoses, making it the most common CNV to be associated with ASD identified to date [13]. The 16p11.2 locus is flanked by two directly repeated segmental duplications of ~145 kb, which mediate the NAHR that results in the loss or gain of a ~600 kb intervening region containing ~27 protein-coding genes [9,12,40].
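How NAHR between two directly repeated segments yields reciprocal products can be shown with a toy model in which a chromosome is an ordered list of labeled segments. This is a minimal sketch, not a simulation of recombination: the segment labels and the function name `nahr_products` are hypothetical, and the crossover is assumed to occur between the first repeat copy on one homologue and the second copy on the other.

```python
def nahr_products(chromosome, repeat="LCR"):
    """Return (deletion_product, duplication_product) for an unequal
    crossover between the first and second copies of a direct repeat.

    `chromosome` is an ordered list of segment labels; both homologues
    are assumed to carry the same arrangement before the crossover.
    """
    first = chromosome.index(repeat)
    second = chromosome.index(repeat, first + 1)  # ValueError if absent

    # One recombinant keeps everything up to the first repeat and
    # resumes after the second: the intervening region is lost.
    deletion = chromosome[: first + 1] + chromosome[second + 1:]

    # The reciprocal recombinant runs through the second repeat and
    # then resumes after the first: the interval is present twice.
    duplication = chromosome[: second + 1] + chromosome[first + 1:]
    return deletion, duplication


if __name__ == "__main__":
    segments = ["proximal", "LCR", "interval", "LCR", "distal"]
    deletion, duplication = nahr_products(segments)
    print("deletion:   ", deletion)
    print("duplication:", duplication)
```

In this picture the deletion product retains a single repeat copy and no interval, whereas the duplication product carries three repeat copies and two copies of the interval, mirroring the reciprocal microdeletion and microduplication described for 16p11.2.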
Interestingly, the microduplication of this region has also been linked to schizophrenia, suggesting the presence of an underlying biological link between these two disorders [41,42]. This phenomenon also gives a potential genetic basis for the hypothesis of Crespi et al., which states that autism and schizophrenia represent diametric disorders of the social brain [43]. Thus, schizophrenia and autism might reflect mirror traits at the opposing extremes of behavioral phenotypes reflecting evolution of the social brain [43]. The phenotypes caused by CNV at the 16p11.2 locus are extremely heterogeneous, and, in addition to ASD, they have been reported to include metabolic disorders [44-47], cardiac anomalies [40,48], depressive disorder [49], speech delay [50], mental retardation [40,51,52], vertebral anomalies [52], syringomyelia [53], abnormal head size [36], and epilepsy [36,40], as well as various other congenital anomalies and behavioral abnormalities [44]. As the phenotypes of many more patients harboring CNVs in this genomic region are delineated, the full phenotypic spectrum associated with this locus will likely become better defined, and the critical genomic interval and dosage-sensitive genes responsible for the phenotypes will be determined. Indeed, a more recent study described a pedigree for a family with multiple generations of autism or ASD that also carries a smaller deletion within the common deletion of 16p11.2, thereby reducing the "critical" interval for ASD to a 118 kb region containing only 5 genes: MVP, CDIPT1, SEZ6L2, ASPHD1, and KCTD13 [54]. To date, none of these genes has been significantly associated with an elevated risk for ASD, which suggests that the situation is likely much more complex [37,55]. Furthermore, correlation between the phenotypes of patients harboring different- or similar-sized CNVs is confounded by extreme heterogeneity and variability of symptoms.
For example, a family with three affected members harboring identical 16p11.2 deletions was recently described to have minimal symptom overlap between family members [56]. Subsequent studies have aimed at using model organisms to identify the key dosage-sensitive genes within this region that give rise to the abnormal phenotypes [29,57,58]. Among these, chromosome-engineered mouse models harboring reciprocal deletion or duplication of the mouse chromosome syntenic to human chromosome 16p11.2 have been generated to study the physiological and behavioral phenotypes associated with these chromosome abnormalities [29].

Animal models for human 16p11.2 CNVs
Mouse models were generated through a chromosome engineering approach for the study of human 16p11.2 deletion and duplication CNVs [29]. It was observed that ~50% of mice harboring the deletion CNV die shortly after birth, while duplication mice survive to adulthood, suggesting that the deletion CNV results in a more severe phenotype than the duplication [29]. A similar phenomenon has been observed in other genomic disorders caused by reciprocal CNVs, including Smith-Magenis and Potocki-Lupski syndromes [15]. Expression of the genes within the 16p11.2 region corresponds to gene dosage in four brain regions that may be relevant for autism, including the olfactory bulbs, cortex, cerebellum, and brainstem [29].
In-cage neurobehavioral phenotypes were assessed in these mice to determine what, if any, effect these CNVs had on autistic-like behaviors. As expected, deletion mice displayed the most abnormal phenotypes, while duplication mice had fewer and milder symptoms. Interestingly, reciprocal phenotypes were sometimes observed for mice harboring reciprocal CNVs. For example, the amount of time spent resting in the cage was lower in deletion mice but higher in duplication mice relative to controls, indicating that 16p11.2 CNVs affect the rate and timing of specific behaviors in a dosage-dependent manner. Deletion mice displayed an abnormal ceiling-climbing behavior where they demonstrated marked stereotypic and nonprogressive motor behaviors, similar to what is often observed in patients with autism or patients with lateral hypothalamic and nigrostriatal lesions in the brain. These abnormal behaviors were accompanied by volumetric and morphological changes in several brain regions, including the lateral hypothalamus. Importantly, the difference between deletion mice and duplication mice was greater than that between deletion mice and controls, indicating that these effects are reciprocal or opposing in nature.
No significant abnormal social behavior was observed in these animal models in the three-chamber test for sociability, indicating either that these animals do not display social abnormalities, or that further investigation into the social behavior of these animals is required. Indeed, given the subtle nature of many social interactions in rodents, it is quite possible that social abnormalities exist in these mice but have not yet been described. It is also possible that the 'in-cage' environment does not elicit a social deficit that might be observed in the wild or natural environment of the animal. An extensive battery of tests for social behavior will be required to rule out the possibility of further abnormalities.
Another study was able to identify homologs of 21 of the known 16p11.2 human genes in the zebrafish genome by family tree comparisons [58]. These genes were then targeted for loss-of-function studies by injecting antisense morpholino oligonucleotides into early embryos [58]. Interestingly, ~79% of the genes tested by this method were required for proper brain, eye, or nervous system development, and two of the genes were determined to be dosage-sensitive, with abnormal phenotypes present with a ~50% reduction in gene expression [58]. The results of this study suggest that at least two genes, aldolase a (aldoaa) and kinesin family member 22 (kif22), are highly dosage-sensitive and are required for proper brain function, making them likely candidates for future studies of the ASD associated with CNV of this region [58].
These two studies indicate that while kctd13, kif22, and aldoaa are all potentially interesting dosage-sensitive candidate genes, further investigation is needed to determine whether these genes act together in an epistatic manner to contribute to the full neurological phenotype, whether they modify each other via cis interactions, or whether other genes or genetic elements in the 16p11.2 locus are also contributing to the phenotype. The current data, which could not have been obtained without these important studies in model organisms, cannot distinguish between these possibilities, but they provide a starting point for research into the function of these genes and the molecular pathways underpinning the phenotypes associated with 16p11.2 CNVs.

Prader-Willi and Angelman syndromes
Chromosome 15q11-13 is enriched for LCRs, providing a mechanism for LCR-mediated NAHR, and generating a series of recurrent breakpoints along this chromosome. As a result, interstitial deletion or duplication of this region is common. LCRs can also mediate triplications, or, alternatively, the presence of supernumerary isodicentric chromosome 15 (idic(15)) can lead to crossing over between these LCRs, and ultimately, duplication of the region. A bipartite imprinting center lies at this locus and directs the expression of a number of genes, resulting in a tissue-specific parent-of-origin effect. As a result, many of the phenotypes caused by these structural rearrangements also display parent-of-origin effects.
Paternally or maternally inherited deletions of human chromosome 15q11-13 result in Prader-Willi syndrome (PWS) or Angelman syndrome (AS), respectively. Alternatively, these disorders can be caused by uniparental disomy (UPD), or by balanced translocations involving this region. Less frequently, imprinting errors leading to aberrant methylation of the PWS imprinting center can also cause PWS, and mutations or deletions in the gene UBE3A can cause AS [11,59]. The critical region for AS lies 35 kb telomeric to the PWS critical region [60]. PWS is characterized by intellectual disability, hypotonia, hyperphagia, obesity, compulsive and repetitive behaviors, skin picking, tantrums, and irritability. In addition, congenital abnormalities are often observed, including hypogonadism, facial dysmorphism, and small hands and feet, among others. PWS can also be associated with psychosis, mood disorders, and ASD [61].
AS is a neurodevelopmental disorder that is characterized by severe developmental delay, intellectual disability, microcephaly, seizures, lack of speech, ataxia, and dysmorphic facial features [11,59]. Patients with AS are often described as having happy demeanors; however, hyperactivity, attention deficits, aggression, and repetitive or stereotypic behaviors have also been described [59]. AS has been associated with ASD in several studies; however, the severity of the cognitive impairments in most patients with AS may preclude an accurate diagnosis [11,59].

Duplication of 15q11-13
It has been estimated that up to ~5% of cases of ASD can be attributed to maternal duplication of the genomic region reciprocal to the PWS-AS critical region on chromosome 15q11-13, making it one of the most common chromosomal abnormalities observed in patients with ASD [10,62]. Due to the presence of imprinting at this locus (discussed above), parent-of-origin effects are seen, and, for interstitial duplications, maternal origin confers an increased risk for clinical phenotypes. Paternal duplications are much less common, and do not appear to lead to ASD, as familial cases have been described where a seemingly normal carrier mother transmits a paternally-derived duplication to her child [63]. However, a small number of subjects with paternal duplication of 15q11-13 and various clinical phenotypes have been described [64]. Phenotypes are dosage-sensitive at this locus; one extra maternal copy of 15q11-13 results in partial autism penetrance, while two extra copies (caused by idic(15) or interstitial triplication) result in a much higher penetrance of autism as well as additional phenotypes that are typically more severe than those seen in patients with duplications [62]. In the case of triplications, parent-of-origin effects are no longer observed, and both paternal and maternal triplications are associated with poor clinical outcomes [65]. This loss of parent-of-origin effects is interesting, and it may offer a clue to the underlying mechanism that gives rise to a predisposition for these phenotypes. Many heterogeneous and complex phenotypes can be associated with increased copy number of this region, including intellectual disability, apraxia, dyslexia, seizures, hypotonia, developmental delay, gait abnormalities, hyperactivity, schizophrenia, and ASD [66].
Patients with ASD due to duplication of 15q11-13 also display several stereotypic and repetitive behaviors, including rocking, licking, and hand-flapping, among others, that are often directed towards sensory stimulation, suggesting that the underlying cause of these phenotypes may be a dysregulation of sensory inputs or signaling [63,64].
Interestingly, recent post-mortem evaluation of the brains of patients harboring maternal duplication of 15q11-13 suggested that accumulation and deposition of abnormal intracellular and extracellular amyloid β protein (Aβ) in specific brain regions and neuron types may underlie or contribute to some of the neurobehavioral phenotypes associated with ASD [67]. However, further studies are needed to confirm this hypothesis. Several animal models for the CNV-based syndromes associated with chromosome 15q11-13 have been developed to facilitate research into these disorders.

Modeling 15q11-13 CNVs in mice
The first mouse model for PWS was generated by targeted deletion of part of the imprinting center on chromosome 15q11-13 [68]. While these mice model several aspects of PWS, including hypersensitivity to sensory input in the form of increased acoustic startle response and decreased prepulse inhibition, it is not known whether these mice exhibit other autistic-like behaviors, such as impaired social interactions or altered communication [69]. Further behavioral characterization is needed to determine whether these mice accurately recapitulate the neurobehavioral phenotypes seen in patients with PWS and/or ASD.
Mouse models for Angelman syndrome were initially generated by disrupting maternal expression of Ube3a; these mice exhibit increased anxiety-like behavior that may be due to the disruption of glucocorticoid receptor transactivation in the brain [70]. They also display various motor defects [71], abnormal cerebellum-driven licking behavior [72], sleep disturbance [73], abnormal EEG patterns, and cognitive defects in the conditioned fear and Morris water maze tests [74]. However, these mice displayed hypoactivity and normal social seeking behavior, in contrast to what is seen in human patients with AS and/or ASD [75]. Importantly, human patients with mutations in UBE3A typically have a milder phenotype than those patients harboring interstitial deletions, so this mouse model may not accurately reflect the majority of AS patients, who have a deletion containing this gene, as well as many other genes at this locus. Indeed, when a larger 1.6 Mb deletion model encompassing the genomic region from Ube3a to Gabrb3 was generated by chromosome engineering, the phenotype of these mice was more severe than that of mice with deletion of Ube3a alone [76]. Similar to Ube3a-deficient mice, large deletion mice had significant motor impairment, anxiety-like behavior, and abnormal EEG, but they also had learning and memory defects and abnormal communication [76]. These larger deletion mice may be an appropriate model for CNV-associated ASD; however, further investigation into the social neurobehavioral phenotypes is necessary.
Chromosome-engineered mouse models harboring a duplication of 6.3 Mb on mouse chromosome 7 syntenic to the duplication of human chromosome 15q11-13 associated with ASD were developed to study the underlying mechanism behind the phenotypes associated with this CNV in humans [28]. The core features of autism, including abnormal social interactions, stereotypic or ritualistic behavior, and impaired communication, were all evaluated in this mouse model, and patDp/+ mice, but not matDp/+ mice, were determined to have reduced sociability compared to wild-type mice. Ultrasonic vocalizations (USVs) were evaluated in pups separated from the dam, and an abnormal USV pattern was observed in patDp/+ but not matDp/+ mice. Specifically, patDp/+ mice appeared to have delayed development of communication; they emitted a greater number of USVs, and some pups emitted vocalizations at abnormally high frequencies (>70 kHz). In order to evaluate communication between older mice, pairs of mice 7-8 weeks of age were observed in a resident intruder paradigm and the pattern of USVs was measured during this interaction. Vocal communication between pairs of patDp/+ mice was significantly reduced compared to the vocalization recorded between wild-type pairs, giving further support to the notion that patDp/+ mice have a defect in social communication. However, no defects in olfactory communication or function were observed in these mice. Restricted behaviors and inflexibility were evaluated with a battery of behavioral tests, including the Morris water maze and the Barnes maze. These tests revealed that patDp/+ mice do not respond as flexibly to a change in situation as wild-type or matDp/+ mice. No overt defects in learning and memory were observed in either the patDp/+ or matDp/+ mice by the Morris water maze or conditioned fear test, although patDp/+ mice did display generalized fear and elevated anxiety-like behavior.
The abnormal neurobehavioral phenotypes observed in this CNV-based model of ASD could not be attributed to gross morphological or histological changes in the olfactory bulb, cerebral cortex, hippocampus, amygdala, corpus callosum, or cerebellum. Nor were any abnormalities in the number of Purkinje cells detected in the cerebellum, suggesting that the underlying pathomechanism responsible for these phenotypes is likely due to aberrations in molecular pathways that remain to be determined.
The gene Ube3a (also known as E6-AP) codes for E3 ubiquitin-protein ligase, which belongs to a family of E3 ligase genes that are involved in synaptogenesis and have recently been linked to the pathogenesis of ASD [62]. Given the known association of the gene UBE3A with AS and its maternal-specific expression pattern in neurons, a mouse model that overexpresses this gene was generated to test the hypothesis that Ube3a is responsible for many of the phenotypes associated with duplication of 15q11-13 [62]. Transgenic overexpression of Ube3a via bacterial artificial chromosome (BAC) recombineering in mice resulted in autistic-like neurobehavioral phenotypes, including defects in communication, abnormal social behavior, and increased repetitive or stereotypic behavior [62]. Similar to the phenomenon observed in human patients, these effects were also determined to be dosage-sensitive, as the phenotypes were more penetrant in mice with three-fold overexpression of Ube3a than in mice with two-fold overexpression of this gene. Furthermore, it was determined that glutamatergic synaptic transmission was suppressed, providing a potential mechanism underlying the neurobehavioral phenotypes.

Smith-Magenis and Potocki-Lupski syndromes
Smith-Magenis syndrome (SMS) and Potocki-Lupski syndrome (PTLS) are two prototypical genomic disorders caused by reciprocal deletion (SMS) or duplication (PTLS) resulting in gene copy number variation on chromosome 17. SMS results from a de novo, recurrent, 3.7 Mb deletion in 17p11.2 -del(17)(p11.2p11.2) in ~73% of cases, that is a consequence of NAHR mediated by LCRs flanking the region [77][78][79]. However, many of the pleiotropic features of SMS appear to result from haploinsufficiency of a single gene lying in the middle of the SMS critical region, retinoic acid induced 1 (RAI1), as determined by the identification of non-deletion SMS patients with heterozygous point mutations in RAI1 [80][81][82]. Patients with mutations in RAI1 manifest most of the phenotypes observed in subjects with a chromosomal SMS deletion, demonstrating that the reduced dosage of the RAI1 gene alone may cause much of the SMS phenotype [81,83].
Both SMS and PTLS manifest a broad range of opposing or overlapping phenotypes. SMS is characterized by multiple congenital anomalies, including otolaryngologic, ophthalmologic, brain, cardiac, craniofacial, and renal abnormalities, as well as intellectual disability (ID), brachydactyly, sleep disturbance, hearing impairment, obesity, scoliosis, and other neurobehavioral abnormalities [84,85]. Specifically, SMS patients display aggressive and self-injurious behavior, including polyembolokoilamania [84], as well as characteristic repetitive behaviors, including autoamplexation or "self-hugging," which is an identifying feature of the disorder [86,87]. More recently, SMS patients have also been described as meeting the criteria for autism spectrum disorder (ASD) [88].
The PTLS duplication was the first predicted reciprocal duplication to be described [89]. PTLS was identified and initially defined much later than SMS ([89] versus Smith et al. 1986); as a result, fewer PTLS patients have been medically examined and fewer studies of the clinical phenotypes are available in the literature. The clinical features that have been observed in patients with PTLS are distinct from those seen in SMS [15], although cognitive and neurobehavioral abnormalities are present in both disorders. PTLS patients lack the self-injurious behaviors, abnormal facies, and sleep disturbance, as well as some of the congenital anomalies found in most individuals with SMS. The features observed in greater than 90% of PTLS patients are developmental delay, neurobehavioral abnormalities, language impairment, cognitive impairment, poor feeding, hypotonia, and oropharyngeal dysphagia [15,90]. When evaluated by objective clinical assessment, the majority of PTLS patients have autistic features such as decreased eye contact, atypicality, withdrawal, anxiety, and inattention, and meet criteria for a diagnosis of autism spectrum disorder (ASD) or pervasive developmental disorder not otherwise specified, making ASD the most common and consistent feature observed in PTLS [14].
Most PTLS patients have no distinctive facial abnormalities, but they can have a triangular-shaped face. The other clinical features present in over half of patients include sleep apnea, abnormal EEG, attention deficit, hypermetropia, and cardiovascular abnormalities [15]. These cardiovascular abnormalities can include both structural and conduction defects, such as atrial or ventricular septal defects, bicuspid aortic valve, dilated aortic root, dilation of the pulmonary annulus, patent foramen ovale, or hypoplastic left heart [15][91][92][93].
Upon molecular analysis, most (22 of 35) PTLS patients included in the first multidisciplinary study were determined to carry a common recurrent 3.7 Mb duplication in 17p11.2 mediated by the same proximal and distal SMS-REPs that mediate the reciprocal common recurrent SMS deletion [15]. Others have uncommon and sometimes complex genomic rearrangements, all of which involve duplication in 17p11.2 [94]. The smallest PTLS duplication identified to date occurred in a single patient and is 1.3 Mb in size. This duplicated segment contains 14 genes, including RAI1, the major contributing gene for the reciprocal deletion causing SMS, and the steroid-metabolism regulating gene, SREBP1. This patient demonstrates all typical PTLS phenotypes [94]. Whether, or to what extent, PTLS results from RAI1 gene over-dosage remains to be elucidated, although mouse studies (described below) have shown that RAI1 over-dosage is likely responsible for at least some of the symptoms [95].

RAI1: Retinoic acid induced 1
Spanning over 120 kb, the RAI1 gene consists of six exons, of which the third is the largest, containing >90% of the coding region [80,85,96]. All of the point mutations identified in SMS patients to date lie within this exon. Most of the mutations are frameshift or nonsense mutations occurring in a specific heptameric poly C tract hotspot region within RAI1, and thus likely cause loss-of-function alleles [82,97].
The RAI1 transcript is 7.6 kb, encoding a 1906-amino acid, ~200 kDa protein with several known domains. These include an extended plant homeodomain (PHD) zinc finger in the carboxyl-terminus (residues 1832-1903; [80]), a polymorphic polyglutamine (CAG) tract in the N-terminus that is associated with the severity of the phenotype and medication response in patients with schizophrenia, as well as the age-at-onset of spino-cerebellar ataxia type 2 (SCA2) [96,98], two polyserine tracts, two transactivation domains [99], and two bipartite nuclear localization signals (NLS). Importantly, the PHD in Rai1 is highly conserved in the trithorax family of nuclear proteins involved in transcriptional regulation as well as in the formation of a chromatin remodeling complex, suggesting that Rai1 may also function as a transcriptional regulator [100]. Further strengthening this connection, Rai1 is known to be located in the nucleus and to have transactivation activity [99], and it shares a similar genomic structure (>50% shared identity and similar zinc finger domains) with another gene, TCF20, or stromelysin-1 platelet-derived growth factor (PDGF)-responsive element-binding protein (SPBP), which is known to act as a nuclear transcriptional cofactor [101].
In the human brain, RAI1 is highly expressed in the hippocampus and the cerebellar cortex, and it is globally upregulated in the occipital, temporal, and parietal lobes according to expression data from the Allen Brain Atlas (Allen Institute for Brain Science). In contrast, it appears to be down-regulated in the cerebellar nuclei, corpus callosum, dorsal thalamus, and frontal lobe, suggesting that its expression is confined to specific brain regions. Similar to what is seen in humans, Rai1 is also upregulated in the hippocampus and cerebellum of adult mice [102]. Rai1 is critical for development, and the majority of Rai1-/- mouse embryos are resorbed during development by E15.5 [99]. While Rai1 expression is certainly necessary early in fetal development, according to expression data from mouse embryos, peak Rai1 expression occurs at E18.5 and persists until P4 (Allen Brain Atlas), indicating that it is also required for post-natal development. Although the precise function of RAI1/Rai1 is not currently understood, it is known to be part of a dosage-sensitive pathway that most likely regulates neuronal development and organogenesis and that, when perturbed, results in many of the phenotypes observed in both SMS and PTLS. Importantly, RAI1 has been identified in a reconstructed human gene network (Prioritizer) as an important candidate gene for involvement in idiopathic autism, suggesting that this gene may function in a common pathway that may influence ASD phenotypes in non-syndromic patients as well [103].

Modeling SMS and PTLS in rodents
Several mouse models interrogating the critical region for SMS and PTLS have been generated in the past decade in order to have an appropriate animal model system to evaluate the phenotypes in SMS and PTLS and to further study the molecular mechanism underlying these disorders. The first of these strains was developed in 2003 using a chromosome-engineering approach described earlier in this chapter [24]. The resulting mouse models harbour either a chromosomal duplication (Dp(11)17, modeling PTLS) or deletion (Df(11)17, modeling SMS) of ~2 Mb that is syntenic to the SMS/PTLS critical region. Soon after, several smaller deletion strains (~590 kb -1 Mb) were created using retroviral insertion of loxP sites in ES cells with one fixed end, with the intent to determine which other genes in the critical region may contribute to the complex phenotypes in SMS [104,105]. Once SMS patients with point mutations in RAI1 were identified, a mouse model harbouring a truncated null allele for Rai1 was generated via gene targeting to further study the function of this dosage-sensitive gene and to compare the phenotype of this model with that of the deletion strains [99,102]. Likewise, a mouse model harbouring the Rai1 transgene (TgRai1), and globally over-expressing Rai1 at steady-state levels similar to those seen in Dp(11)17/+ mice, was constructed to further study the function of this gene in PTLS [106].
Initial studies of Dp(11)17/+ mice determined that they have reduced weight, reduced abdominal and inguinal fat, and reduced spleen weight [24]. Upon analysis of some of the behavioral traits of these mice, it was determined that they display anxiety-like behaviors, have reduced maximum startle response during the pre-pulse inhibition test, and have defects in contextual fear conditioning [107], as well as several other abnormal social behaviors, including decreased nesting, abnormal sociability, and dominant behavior [108]. Further investigation into the neurobehavioral abnormalities in these mice found that they also have decreased preference for social novelty, motor defects, and increased activity levels in the open field [109]. Many of these behavioral phenotypes are reciprocal or opposing to those seen in Df(11)17/+ mice, underscoring the dosage-sensitive nature of these disorders [109]. For example, a recent study investigating cerebellum-driven licking behavior in Dp(11)17/+ and Df(11)17/+ mice found that many of the quantitative licking behavior parameters analyzed were altered in a directly opposing manner [110]. Specifically, the interval between visits to the waterspout, number of licks per visit, and variability in the number of licks per lick-burst were all altered in duplication and deletion animals in opposite directions compared to wild-type mice (e.g., longer versus shorter intervals).
Recently, an extensive battery of behavioral tests was performed, and Dp(11)17/+ mice were observed to display complex social abnormalities, including defects in social recognition, dominant and aggressive behavior, as well as abnormal response to social odors [30]. Furthermore, these mice were shown to have altered communication, anxiety-like behavior, disordered circadian rhythm, learning and memory deficits, motor defects, and stereotypic, repetitive behaviors, confirming that these mice model both the core and associated features of autism. In addition, rearing these mice in an enriched environment mitigated or rescued certain neurobehavioral abnormalities, suggesting a role for gene-environment interactions in the determination of copy number variation-mediated autism severity [30].
The phenotypic changes observed in Dp(11)17/+ and Df(11)17/+ mice are accompanied by changes in gene expression; on average, transcripts in the critical interval are expressed at 138 ± 29% of wild-type levels in Dp(11)17/+ mice, and at 66 ± 15% of wild-type levels in Df(11)17/+ mice [109]. The expression level of these genes can be normalized to roughly that of wild-type mice by crossing Dp(11)17/+ and Df(11)17/+ mice to create a double heterozygote carrying two copies of the genes within the critical region in cis (as opposed to the typical trans orientation in wild-type mice). However, the presence of the structural variation itself affects expression of genes outside the affected interval, resulting in "genome regulation" that may ultimately contribute to the phenotype. As a result, these double heterozygous mice display some abnormal behaviors, including elevated activity levels and decreased preference for social novelty [109].
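The dosage reasoning in this paragraph can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that expression scales linearly with gene copy number (two copies = 100% of wild type); that assumption and the names `expected_percent` and `summarize` are not taken from the cited study. The observed means are the values quoted above, and no value is given for the double heterozygote because the text reports only that expression is roughly normalized.

```python
# Expected expression under a simple linear gene-dosage assumption
# (an illustrative assumption, not a model taken from the cited study).
WILD_TYPE_COPIES = 2


def expected_percent(copies: int) -> float:
    """Expected expression as a percentage of wild type for a given copy number."""
    return 100.0 * copies / WILD_TYPE_COPIES


# genotype -> (copies of critical-interval genes, observed mean % of wild type)
# Observed means are those quoted in the text; None means no figure is quoted.
GENOTYPES = {
    "Dp(11)17/+": (3, 138.0),
    "Df(11)17/+": (1, 66.0),
    "Dp(11)17/Df(11)17": (2, None),
}


def summarize(genotypes):
    """Return (name, copies, expected %, observed - expected) for each genotype."""
    rows = []
    for name, (copies, observed) in genotypes.items():
        expected = expected_percent(copies)
        difference = None if observed is None else observed - expected
        rows.append((name, copies, expected, difference))
    return rows


for name, copies, expected, difference in summarize(GENOTYPES):
    shown = "not quoted" if difference is None else f"{difference:+.0f} points"
    print(f"{name}: {copies} copies, expected {expected:.0f}%, observed vs expected: {shown}")
```

Under this simple assumption, both single-CNV genotypes sit closer to wild-type levels than copy number alone would predict, although the quoted variability (± 29% and ± 15%) is large relative to those differences.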
When Rai1 is over-expressed in mice (Rai1-Tg, modeling PTLS), the animals show growth retardation, are underweight, and display anxiety-like behavior, social dominance, motor abnormalities, and, as juveniles, increased motor activity. Furthermore, there is a dosage-dependent exacerbation of this phenotype [106]. These mice also display abnormal maternal behavior, altered sociability, reduced reproductive fitness, and impaired serotonin metabolism [111]. Together, these results show that Rai1-Tg mice display a complex neurobehavioral and metabolic phenotype similar to that of mice harboring the Dp(11)17 CNV, suggesting that RAI1 is likely responsible for some, if not many, of the phenotypes identified in PTLS. Further support for this hypothesis comes from studies of Dp(11)17/Rai1 double-heterozygous mice with normalized copy number of Rai1 but increased dosage of the surrounding interval; these studies revealed that normalization of Rai1 copy number was able to correct weight differences and at least partially rescue phenotypes on behavioral tests for locomotor activity, anxiety, and learning and memory [95].
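The logic of the double-heterozygote experiment is a matter of copy-number bookkeeping, which the following sketch tallies per gene. It is illustrative only: "flanking" is a hypothetical stand-in for the other genes in the engineered interval, "Rai1-" is a label assumed here for the targeted null allele, and `total_copies` is a hypothetical helper name.

```python
# Copies of each gene carried by one chromosome 11 homolog.
# "Rai1" is named in the text; "flanking" is a hypothetical stand-in for the
# other genes in the engineered interval. Allele labels are illustrative.
HOMOLOGS = {
    "+":        {"Rai1": 1, "flanking": 1},  # wild-type homolog
    "Dp(11)17": {"Rai1": 2, "flanking": 2},  # duplication of the interval
    "Df(11)17": {"Rai1": 0, "flanking": 0},  # deletion of the interval
    "Rai1-":    {"Rai1": 0, "flanking": 1},  # null allele of Rai1 only
}


def total_copies(allele_a: str, allele_b: str) -> dict:
    """Sum per-gene copy number across the two homologs of a genotype."""
    a, b = HOMOLOGS[allele_a], HOMOLOGS[allele_b]
    return {gene: a[gene] + b[gene] for gene in a}


for pair in [("+", "+"), ("Dp(11)17", "+"), ("Dp(11)17", "Rai1-")]:
    print("/".join(pair), total_copies(*pair))
```

In this tally, the duplication combined with the Rai1 null allele returns Rai1 to two copies while the flanking genes remain at three, which is why a rescue in these animals points to Rai1 dosage rather than to the surrounding interval.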

Gene-environment interactions
While genetic defects play a part in the etiology of ASD, environmental effects have long been thought to contribute to these disorders. For example, although the majority of SMS/PTLS patients present with either deletion or duplication of the same ~3.7 Mb gene-dense region, there is significant variability in the clinical phenotype [112]. Furthermore, while there are some significant differences in the incidence of the abnormalities in patients with the common deletion/duplication compared to those patients with smaller or larger-sized CNVs, a clear distinction between these sub-groups of patients cannot be made; many of these phenotypes are therefore likely strongly influenced by genetic background as well as environmental effects [83,94]. While gene-environment interactions may potentially explain the source of the variability seen in these syndromes, investigation into the specific environmental factors that may affect outcomes for these genomic disorders has yet to be undertaken.
Chromosome-engineered mouse models for ASD are ideal for the study of complex disease, as they are mechanistically similar to human patients (targeted duplication/deletion syntenic to human critical interval), they are polygenic (numerous genes are affected), the observed phenotypes equate with common, clinically described features (neurobehavioral phenotypes, sleep disorder, etc), and they can be influenced by environmental factors. In addition, autism is known to be highly variable, and it is suspected to be dependent on both genetic and environmental factors, such as low birth weight and gestational age, prenatal exposure to various agents, parental age at birth, diet, infection, xenobiotic and pesticide exposure, among others [113]. Many of these environmental insults are amenable to study using mouse models, as the interaction of these environmental factors with CNVs can be directly tested in congenic mouse models to control for the effects of genetic background.
Molecular analysis of these mouse models, as well as patient samples, can also be utilized to dissect the role of specific genes or CNVs responsible for the susceptibility to the influence of environmental factors in these autism-related syndromes. Most importantly, the results of these types of studies can provide useful insights as to how genes/CNVs can interact with environmental factors in the context of complex human diseases; this may lead to strategies to alleviate symptoms of not only rare genomic disorders, but also more common idiopathic forms of autism or ASD. Furthermore, these models represent an important resource for future studies of the pathomechanisms underlying ASD, as well as potential treatments for ASD. They may also foster further investigation into the genomic basis of autism and complex behavior, as well as the underlying genetic mechanisms leading to these pathogenic CNVs. In studying CNV-based models for complex genomic disorders and ASD, we have come to realize that ideal animal models of ASD should not only phenocopy relevant human symptoms but should also display phenotypes that arise from similar underlying physiological and genetic mechanisms.