Autism is a neurodevelopmental disorder of complex etiology and is amongst the most heritable of neuropsychiatric disorders while sharing genetic liability with other neurodevelopmental disorders such as intellectual disability (ID). Autism spectrum disorders (ASDs) are defined more broadly and include autism, Asperger syndrome, childhood disintegrative disorder and pervasive developmental disorder not otherwise specified. Under the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition, Text Revision (DSM-IV-TR), these disorders are grouped together with Rett syndrome (“Rett’s disorder”) as pervasive developmental disorders. However, Rett syndrome has a reportedly distinct pathophysiology, clinical course, and diagnostic strategy (Levy & Schultz, 2009) and will likely be removed in the impending publication of DSM-V (APA, 2010). The new diagnostic manual will formally adopt the single diagnostic category “ASDs”, which is used here. Reported prevalence rates for ASDs range from 20 (Newschaffer et al., 2007) to 116 (Baird et al., 2006) per 10,000 children, and vary in accordance with diagnostic, sampling, and screening criteria. The Centers for Disease Control and Prevention (CDC) suggest that in the United States, the prevalence of ASDs is 1 in 110 (1/70 in boys and 1/315 in girls) (ADDM, 2009). The three primary characteristics of ASDs are communication impairments, social impairments, and repetitive/stereotyped behaviors. The DSM-IV-TR, ICD-10, and many other diagnostic instruments require impairment in each of these domains for a diagnosis of autistic disorder.
Within the last decade, a number of major technological developments have transformed our understanding of the genetic causes of autism, and the field continues to evolve rapidly. In this chapter, we will review three approaches to identifying genetic factors that contribute to the pathogenesis of ASDs: 1) common variants and genome-wide association studies (GWAS); 2) rare variants and copy number variation (CNV) studies; and 3) familial forms of autism and the role of next-generation sequencing (NGS) methods. Data from all three approaches underscore the conclusion that autism is a highly complex and heterogeneous disorder with a multifactorial etiology. Moreover, it is becoming increasingly apparent that autism is not a unitary disorder, and that the spectrum may consist of any number of different autisms that share similar symptoms or phenotypes. This conclusion has important implications for evaluation and treatment, which are discussed in the conclusion.
2. Heritability of ASDs
Although Skuse (2007) cautions that heritability estimates of ASDs may have been skewed by the co-inheritance of (low) intelligence or other variables, there is little doubt that genetic factors play a key role in autism. In the most widely-cited twin study, Bailey et al. (1995) report that monozygotic twins are 92% concordant on a broad spectrum of cognitive or social abnormalities, compared with only 10% for dizygotic twins. Parents and siblings of individuals with ASDs often exhibit subsyndromal levels of impairment (Piven et al., 1997), and having an affected sibling is the single biggest risk factor for developing an ASD. In an analysis of 943,664 Danish children (Lauritsen et al., 2005), the strongest predictors of autism were siblings with ASDs, who conferred a 22-fold increased risk, while Fombonne (2005) suggested that this risk may be even greater.
3. Early insight from Rett syndrome and fragile X
Early efforts to identify the genetic causes of ASDs utilized linkage and association approaches. Linkage studies, more prominent in the 1980s and 1990s, typically focus on families or larger pedigrees and are well powered to identify rare genetic variants. The most common linkage approach is the affected sib-pair design (see O’Roak & State, 2008), which examines the transmission of genomic segments through generations. Linkage studies helped define the locus containing
Association studies take the opposite approach, scanning the genome from the top down, with the goal of determining
4. Genome wide association and common variants
Aside from notable successes with fragile X and Rett syndrome, early linkage and association studies have been inconsistent in resolving more complex genetic correlates of ASDs, with candidate genes often not being replicated between studies. These challenges may in part be accounted for by their relatively low resolution, which makes it difficult to detect candidate loci other than those of major effect. In the past decade, however, association studies have become increasingly sophisticated; the whole-genome approach allows us to examine thousands of individuals on a mass scale, using hundreds of thousands of markers.
Genome-wide association studies (GWAS) examine the frequency of single nucleotide polymorphisms (SNPs) in cases
GWAS test for common variants (>1% population frequency), with the assumption that ASDs are at least in part caused by the coinheritance of multiple risk variants, each of small individual effect (odds ratios between 1.1 and 1.5). This assumption is known as the common disease-common variant (CDCV) model (Risch and Merikangas, 1996).
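As a rough illustration of how a single SNP might be tested under this model, the sketch below computes an allelic odds ratio, a Woolf-type 95% confidence interval, and a 1-degree-of-freedom chi-square test from a 2x2 table of allele counts. The function name and the allele counts are hypothetical and are not taken from any study cited here. Real GWAS pipelines also handle population stratification and multiple-testing correction, which this sketch omits.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Allelic test on a 2x2 table of allele counts (hypothetical helper).

    Returns the odds ratio, a Woolf 95% confidence interval, the Pearson
    chi-square statistic (1 degree of freedom), and its p-value.
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d

    # Odds of carrying the risk allele in cases relative to controls.
    odds_ratio = (a * d) / (b * c)

    # Woolf's method: standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {
        "odds_ratio": odds_ratio,
        "ci_low": ci_low,
        "ci_high": ci_high,
        "chi2": chi2,
        "p_value": p_value,
    }


# Hypothetical counts: 500 cases and 500 controls (1,000 alleles each).
result = allelic_association(case_alt=440, case_ref=560, ctrl_alt=400, ctrl_ref=600)
print(result)
```

With these made-up counts the odds ratio is about 1.18, inside the small-effect range described above, yet the test does not approach the very stringent p-value thresholds conventionally required in genome-wide scans. This is one way to see why detecting variants of small effect requires cohorts of thousands of individuals.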
Wang et al. (2009), from our laboratory, were the first to identify common variants for ASDs on a genome-wide scale. We examined 780 families (3,101 individuals) with affected children, a second group of 1,204 affected individuals, and 6,491 controls, all of whom were of European ancestry. We identified six genetic markers on chromosome 5 in the 5p14.1 region that were associated with susceptibility to ASDs. The region straddles two genes,
The study design utilized two independent replication cohorts and the key SNP at this locus has also been replicated in two additional independent cohort studies (Ma et al., 2009; St Pourcain et al., 2010), lending further support to the view that genetic factors at the 5p14 locus, which is flanked by two relevant cadherin genes, represent strong candidates for aligning molecular function with known neural deficits in ASDs. The original report by Wang and colleagues also demonstrated an enrichment of cadherin-associated genes in ASDs in general, based on gene-pathway analysis (Wang et al., 2009). Cadherins represent a large family of transmembrane proteins that mediate calcium-dependent cell–cell adhesion and,
4.1. Replicated common variants from candidate gene studies
Other common variants from candidate gene studies include CNTNAP2, EN2, and MET, which are reviewed briefly below. More in-depth information on these genes is available from the catalogs at http://www.genome.gov/26525384 and http://www.ncbi.nlm.nih.gov/omim/209850.
Located on chromosome 7q35, Contactin Associated Protein 2 (
Engrailed 2 (
4.2. Unexplained variance
For the most significant discovery SNP identified in the Wang
The possibility that common variants are not the major cause of ASDs is also gaining increased support from the preponderance of copy number variation (CNV) studies, which are identifying rare variants with a stronger causal impact.
5. Copy number variation in ASDs
Copy number variations (CNVs) are insertions, deletions, or translocations in the human genome that are universal in the general population but more commonly found in genic regions in individuals with neuropsychiatric disorders (e.g. Pinto et al., 2010). CNVs can be detected by the same SNP arrays used in GWAS, and vary in length from many megabases to 1 kilobase or smaller. They are often not associated with any observable phenotype.
One of the most widely-known CNVs is Down syndrome, which is characterized by an extra chromosome 21. Rett syndrome is also caused by a CNV, which includes a deletion in
Sebat et al. (2007) provided some relevant early insights into the genomic features of CNVs. Firstly, they noted that
A number of subsequent studies have greatly expanded the number of candidate loci. Our laboratory (Bucan et al., 2009) reported more than 150 CNVs in 912 ASD families that were not found in 1,488 controls. Critically, 27 of these loci were replicated in an independent cohort of 859 ASD cases and 1,051 controls. Some of the rare variants we identified had previously been associated with autism, including
Glessner et al. (2009) identified and reported CNVs in two major gene networks, including neuronal cell adhesion molecules (such as
In a similar approach, Pinto et al. (2010) further confirmed the importance of rare CNVs as causal factors for ASDs. Interestingly, the group did not observe a significant difference between cases and controls in terms of raw number of CNVs or estimated CNV size. However, the number of CNVs in genic regions was significantly greater in ASDs compared to controls. Again, loci enriched for CNVs include a number of genes known to be important for neurodevelopment and synaptic plasticity, such as
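To make the idea of a genic enrichment comparison concrete, the sketch below applies a one-sided Fisher's exact test to a 2x2 table of genic versus non-genic CNV counts in cases and controls. This is only an illustration of the kind of comparison described, not the statistical procedure used in the studies cited, and the function name and counts are hypothetical.

```python
from math import comb


def fisher_one_sided(case_genic, case_nongenic, ctrl_genic, ctrl_nongenic):
    """One-sided Fisher's exact test (hypothetical helper).

    Returns the probability, under the hypergeometric null of no
    association, of seeing at least `case_genic` genic CNVs among cases
    when the row and column totals of the table are held fixed.
    """
    case_total = case_genic + case_nongenic
    genic_total = case_genic + ctrl_genic
    n = case_total + ctrl_genic + ctrl_nongenic

    # Number of ways to choose which CNVs are genic, ignoring group labels.
    denominator = comb(n, genic_total)

    # Sum the hypergeometric probabilities for outcomes at least as extreme.
    upper = min(case_total, genic_total)
    numerator = sum(
        comb(case_total, k) * comb(n - case_total, genic_total - k)
        for k in range(case_genic, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: equal numbers of CNVs in each group, but a larger
# share of the case CNVs overlaps genes.
p = fisher_one_sided(case_genic=30, case_nongenic=20, ctrl_genic=20, ctrl_nongenic=30)
print(f"one-sided p = {p:.4f}")
```

The example is built so that the raw CNV count is identical in both groups (50 each) while the genic fraction differs, mirroring the pattern described above in which total burden is similar but genic burden is not.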
Finally, Gai et al. (2011) took a slightly different approach, focusing exclusively on inherited CNVs. While underlying loci were not necessarily common to those identified by the Glessner and Pinto groups, enrichment in pathways involving central nervous system development, synaptic functions and neuronal signaling processes was again confirmed. The Gai
Collectively, these CNV studies suggest that certain hotspots in the genome, including loci on chromosomes 1q21, 3p26, 15q11-q13, 16p11, and 22q11, are particularly prone to structural variation associated with ASDs. These hotspots are part of large gene networks that are important to neural signaling and neurodevelopment and have additionally been associated with other neuropsychiatric disorders.
In particular, a number of CNV studies in schizophrenia have highlighted structural mutations incorporating chromosomes 1q21, 15q13, and 22q11 (e.g. McClellan and King 2010; Glessner et al., 2010), which are significantly enriched in cases versus controls, with
6. Sequencing familial forms of ASDs
To this point, we have focused primarily on the complex interactions of polygenic networks as the major cause of ASDs. However, this is not exclusively the case. Paralleling the recent spate of CNV studies is a renewed focus on rare disorders, including familial forms of complex diseases that are potentially monogenic or have a less complex inheritance pattern. At the outset of this chapter, we emphasized the overlap with fragile X syndrome, where one third of cases are co-morbid for ASD. As mentioned, fragile X is caused by a failure to express the protein coded by
X-linked genes encoding neuroligins
The identification of monogenic or possibly oligogenic autisms is likely to accelerate in the next several years as next-generation sequencing becomes more widely available. We recently encountered a family of two parents, six healthy siblings, and two siblings with severe autism, suggestive of autosomal recessive inheritance. Attempts using linkage and CNV approaches failed to identify a causal locus, but whole-exome sequencing at 20x coverage identified four genes, including one with a non-synonymous SNP in the protocadherin alpha 4 isoform 1 precursor (
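As a minimal sketch of how exome variants might be screened against an autosomal recessive model in a family like the one described, the code below keeps only variants for which both parents are heterozygous, every affected sibling is homozygous for the alternate allele, and no unaffected sibling is. The sample labels, variant names, and genotype encoding (count of alternate alleles) are hypothetical, and this is not the pipeline used in the analysis described above. It ignores compound heterozygosity, genotype quality, and allele-frequency filtering.

```python
from typing import Dict, List

# Genotypes are encoded as the number of alternate alleles: 0, 1, or 2.
Genotypes = Dict[str, int]

# Hypothetical sample labels for a family with two affected siblings.
PARENTS = ["father", "mother"]
AFFECTED = ["affected_1", "affected_2"]
UNAFFECTED = [f"unaffected_{i}" for i in range(1, 7)]


def fits_recessive_model(genotypes: Genotypes) -> bool:
    """True if a variant segregates as a simple homozygous recessive allele."""
    parents_are_carriers = all(genotypes[s] == 1 for s in PARENTS)
    affected_are_homozygous = all(genotypes[s] == 2 for s in AFFECTED)
    unaffected_not_homozygous = all(genotypes[s] < 2 for s in UNAFFECTED)
    return parents_are_carriers and affected_are_homozygous and unaffected_not_homozygous


def filter_variants(variants: Dict[str, Genotypes]) -> List[str]:
    """Return the names of variants consistent with the recessive model."""
    return [name for name, gts in variants.items() if fits_recessive_model(gts)]


def make_genotypes(parents: int, affected: int, unaffected: List[int]) -> Genotypes:
    """Convenience constructor for the illustrative data below."""
    gts = {s: parents for s in PARENTS}
    gts.update({s: affected for s in AFFECTED})
    gts.update(dict(zip(UNAFFECTED, unaffected)))
    return gts


# Made-up variants for illustration only.
variants = {
    # Segregates as expected: carrier parents, homozygous affected siblings.
    "variant_A": make_genotypes(1, 2, [0, 1, 1, 0, 1, 1]),
    # Excluded: an unaffected sibling is also homozygous.
    "variant_B": make_genotypes(1, 2, [0, 1, 2, 0, 1, 1]),
    # Excluded: affected siblings are only heterozygous.
    "variant_C": make_genotypes(1, 1, [0, 1, 1, 0, 1, 1]),
}

print(filter_variants(variants))
```

With six unaffected siblings available, the requirement that none of them is homozygous for the candidate allele removes many variants that would otherwise pass, which is one reason large sibships are informative for this kind of analysis.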
Known syndromes with ASD features include fragile X, neurofibromatosis type 1, Down syndrome, tuberous sclerosis, neurofibromatosis (which confers a 100-fold increased risk for ASDs; Li et al., 2005), Angelman, Prader-Willi and related 15q syndromes, and at least several dozen others (see Zafeiriou et al., 2007 for a comprehensive review). Table 1 from Volkmar et al. (2005) lists the most commonly associated syndromes with median rate and range. It is likely that many more unidentified rare syndromes with Mendelian causes have ASD phenotypes. As of March 2011, the Online Mendelian Inheritance in Man (OMIM) database listed 6,727 known or suspected Mendelian diseases (MD), with 2,993 (44%) of these having an identified molecular basis. Since OMIM derives its data from published reports, these figures likely under-represent rare disorders, which may go unreported. It has been proposed that as many as 30,000 genetic disorders may exist, suggesting that many Mendelian disorders have no genetic etiology identified to date. Given the large representation of autism phenotypes in known syndromes, we can assume a similar trend in unreported disease.
It remains to be determined whether rare variants will account for the majority of autisms. Regardless, as with many other aspects of scientific inquiry, the study of rare variants will continue to play an important role in explicating the pathogenesis of ASDs. El-Fishawy and State (2010) point to hypercholesterolemia and hypertension (Brown, 1974; Lifton et al., 2001) as examples where rare mutations have been successful in driving a molecular understanding of the disease as opposed to identifying risk factors in the general population. Rare mutations, particularly when they are Mendelian, carry large effects and are typically in genic regions. These characteristics make the resolution of underlying networks distinctly less complex and, moreover, make the mutations amenable to modeling in other systems.
Recent groundbreaking studies by Marchetto et al. (2010) and Muotri et al. (2010), who created a cell culture model of Rett syndrome, are potentially exciting developments in this regard. Here, the researchers used skin biopsies from four Rett’s patients, each carrying a different
ASDs are clearly highly heritable disorders and advances in gene-finding technology in the past decade have rapidly accelerated gene discovery. As is typically the case, successive developments have made the problem more complex such that there are dozens of candidate genes, many of which remain to be replicated. In spite of this complexity, we can observe a number of patterns beginning to unfold: 1) the relative scarcity of causal common variants, 2) the growing list of causal rare variants, and 3) the emergence of monogenic disorders with primary and secondary ASD phenotypes.
The monogenic autisms are particularly interesting from a treatment perspective, as they provide a mechanism for studying ASD phenotypes in model systems and an obvious target for drug intervention. They are also amenable to clinical testing and the decreasing cost of research technologies means that this capacity is more widely available to clinicians. In fact, as the resolution of clinical instruments becomes more sophisticated, it is likely that the clinic will become a primary workplace for syndromic discovery.
A key requirement in driving gene discovery is high-quality phenotype data. ASDs are notoriously heterogeneous, and are fractionated in terms of symptoms and trajectory. Mandy & Skuse (2008) reviewed seven factor analysis studies of ASD symptoms, and found that all but one dissociated social and non-social factors. In a non-clinical sample of 3,000 twin pairs, Happé
The converse, of course, is also true with a large number of candidate genes contributing to the majority of known ASDs. With ~80% of genes expressed in the brain it is likely that this number will continue to grow, and here again careful phenotyping is critical to identifying functional consequences. Ultimately, the primary goal is not to determine the frequency of variation/mutation in cases