Autism is a behaviourally defined developmental disorder characterised by impairments in social communication, restricted interests and repetitive behaviours. Abnormalities in these three developmental areas tend to cluster together in affected individuals. In DSM-IV, Autism is part of a larger continuum of disorders collectively called Pervasive Developmental Disorders. Autism spectrum disorders (ASD) refer to Autism, Pervasive developmental disorder, not otherwise specified, and Asperger syndrome. All individuals with ASD have qualitative abnormalities of social development in combination with disorders of communication and/or stereotyped repetitive interests and behaviours. The social skills that develop naturally in typically developing children do not do so in children with ASD. In addition, there are several behaviours and co-morbid symptoms that relate to each of the three classical impairments. Recent studies have reported rates of co-occurring intellectual disability in the range of 25-50%. Neither developmental delay nor cognitive impairment is required for an ASD diagnosis.
Fombonne and colleagues recently estimated the prevalence of strictly defined autism at approximately 15-20 per 10,000 people. When the definition of autism is relaxed to include Autism Spectrum Disorders, the prevalence estimate expands to approximately 60 in 10,000 children [2, 3].
Little is known of the biological basis of ASD, and the future development of rational, knowledge-based treatments will depend on a comprehensive understanding of innate biological predisposition and its interaction with environmental factors. The identification and characterisation of the genetic variation and genes involved in ASD is a route towards this goal. This chapter outlines the various approaches that have been applied to this task, in the context of rapidly evolving technology and human genome resources, and summarises the state of knowledge at this time, anticipating future developments, and outlining the implications for clinical management.
2. Autism is a heritable disorder
There is unequivocal evidence to support the role of genetic liability as the basis of familiality in the aetiology of ASD. In their seminal work, Folstein and Rutter observed that 4 of 11 (36%) pairs of identical twins were concordant for strictly defined autism, whilst none of 10 pairs of fraternal twins were concordant. In a large follow-up study, which included these original twins, Bailey and colleagues observed a more striking concordance rate in identical twins (60%) with no concordance in the fraternal twins. This highly cited study underlined ASD as the most heritable of the neurodevelopmental and psychiatric disorders, with heritability estimates of 91-93%. A recent review of over 30 twin studies of ASD noted that the early twin studies, which relied upon a strict diagnosis of autism, show median identical and fraternal concordance rates of 76% and 0% respectively. The expansion of these and other studies to include a broader definition of ASD revealed median concordance rates of 88% and 31%. In both cases, these data indicate that ASD is underpinned by a strong genetic component. More recently, Hallmayer and colleagues, using more stringent clinical diagnostic tools such as the Autism Diagnostic Interview (ADI) and the Autism Diagnostic Observation Schedule (ADOS) in a group of 202 twin pairs from the California Twin Registry, identified concordance rates for ASD similar to those previously reported, with identical twin boys showing a concordance rate of 77% and fraternal twin boys 31%.
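As a minimal sketch of the arithmetic behind the concordance figures quoted above, the following Python snippet computes simple pairwise concordance from the Folstein and Rutter counts given in the text. The function name `pairwise_concordance` is an illustrative assumption and is not taken from any of the cited studies. Twin studies may also report probandwise concordance, which is calculated differently and is not shown here.

```python
def pairwise_concordance(concordant_pairs: int, total_pairs: int) -> float:
    """Return the proportion of twin pairs in which both twins are affected."""
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    if not 0 <= concordant_pairs <= total_pairs:
        raise ValueError("concordant_pairs must lie between 0 and total_pairs")
    return concordant_pairs / total_pairs


# Counts quoted in the text for strictly defined autism.
mz = pairwise_concordance(4, 11)   # identical (monozygotic) pairs
dz = pairwise_concordance(0, 10)   # fraternal (dizygotic) pairs

print(f"Identical twin concordance: {mz:.0%}")
print(f"Fraternal twin concordance: {dz:.0%}")
```

A markedly higher concordance in identical than in fraternal pairs is what motivates the heritability estimates discussed in this section; the snippet does not itself estimate heritability.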
3. Genetic linkage and association
The heritability estimates above, calculated from twin studies, imply that there is unknown genetic variation in children with ASD that confers risk. Several models may fit the observed family and twin data including, at the extremes, a ‘common disorder - common variant’ model, with many DNA variants of small effect combining together to create risk, and a multiple rare variant model where ASD is in reality a large collection of different and individually rare disorders. The true picture is only slowly emerging; as with many other common disorders, it appears that common low-penetrance and rare higher-penetrance DNA variants jointly contribute to risk. Molecular genetic methods have developed exponentially over the last few decades with ever increasing knowledge of the structure and function of the human genome. For disorders with largely unknown aetiology, such as ASD, using human genetic studies to identify risk genes is a gateway to an understanding of underlying biology. Such studies may focus on individuals or families, each using different but related genetic strategies to track down risk or causative genetic variation. These methods include genetic linkage, genetic association, and direct examination of structural and copy number variation.
The basis of genetic linkage is the physical co-location of genes or other DNA variation. DNA variants that lie close together on a chromosome are more likely to be transmitted together (or linked) than those that are further apart. Linkage studies take advantage of this non-random assortment of genetic variation. A linkage study calculates whether a known genetic variant and a disease mutation (represented by the disease trait) are linked and, if so, roughly localises the causative mutation. Following the successes of these approaches in the discovery of multiple loci implicated in Mendelian disorders, researchers were encouraged to apply linkage methodology to more complex traits, such as ASD, where Mendelian principles may apply, at least in a proportion of families. However, in ASD, only a few scans have highlighted loci with
In 1998, the International Molecular Genetic Study of Autism Consortium (IMGSAC)  reported modest evidence for linkage. This included
Few individual families exist in which ASD segregates in an obvious Mendelian fashion and which are large enough to provide significant evidence for linkage by themselves. Most linkage studies required the assumption that a significant proportion of the families in the sample might be linked to a given locus, and few were sufficiently large to accommodate even modest locus heterogeneity. Under a ‘common disorder - common variant’ model, multiply affected families will occur but linkage methods would be considerably underpowered. The effects of DNA variation with low penetrance are more easily identified using a genetic association study design in a sample of cases drawn from a population.
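Evidence for linkage of the kind discussed above is conventionally summarised as a LOD score, the base-10 logarithm of the likelihood of the data at a given recombination fraction relative to the likelihood under free recombination. The following is a minimal sketch for the simplest textbook case of phase-known, fully informative meioses; the function name `lod_score` and the example family are hypothetical, and the linkage scans described in this chapter used considerably more elaborate multipoint and non-parametric methods.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**r * (1 - theta)**(n - r),
    r is the number of recombinants and n the number of meioses.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        # Any recombinant is impossible under complete linkage.
        return float("-inf")
    log_l_theta = 0.0
    if recombinants:
        log_l_theta += recombinants * math.log10(theta)
    if non_recombinants:
        log_l_theta += non_recombinants * math.log10(1.0 - theta)
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical family: 10 informative meioses, scanned over a grid of theta.
for theta in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"theta={theta:.1f}  LOD={lod_score(1, 10, theta):.2f}")
```

Because each informative meiosis contributes at most log10(2), or about 0.3, to the LOD score, a single small family cannot reach the thresholds usually required for significance, which illustrates why pooled samples and the assumption of shared loci were needed.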
Fine-mapping and candidate gene association studies of the linked regions on 7q have implicated a number of potential susceptibility genes including
The Paris Autism Research International Sibpair (PARIS) study, which supported the findings on 7q, noted additional loci at 2q31-q32, 16p13 and 19p13 that overlapped to some extent with regions identified in the IMGSAC analyses. Fine mapping of the 2q31-q32 region led to additional evidence implicating the genes SLC25A12, STK39 and ITGA4 as putative risk genes for ASD. As with most candidate genes examined in small samples, these associations were validated by some [40-42] but not all studies [43-46]. The positional candidature of 2q31-q32 was further supported by a case study of a young Irish male with high-functioning autism with a complex translocation traversing chromosome 2q32 (46,XY,t(9;2)(q31.1;q32.2q31.3)). Fine mapping and mutation analysis identified an association between ASD and a polymorphism within the splice donor sequence of exon 16 of the ITGA4 gene (rs12690517; P = 0.008).
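To illustrate how a single-variant association P value of the kind quoted above can arise, the sketch below applies a Pearson chi-square test to a 2x2 table of allele counts in cases and controls. All counts are hypothetical and are not data from any cited study, the function name `allelic_chi_square` is an assumption, and family-based designs use different statistics (for example, transmission-based tests) that are not shown.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int) -> tuple:
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi_square, p_value). For 1 degree of freedom the upper-tail
    probability is erfc(sqrt(chi_square / 2)).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column total must be non-zero")
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical allele counts: 200 cases and 200 controls (400 alleles each).
chi_square, p_value = allelic_chi_square(case_alt=120, case_ref=280,
                                         ctrl_alt=90, ctrl_ref=310)
print(f"chi-square = {chi_square:.2f}, P = {p_value:.4f}")
```

A nominal P value from one small sample carries little weight on its own, which is consistent with the mixed replication record described for these candidate genes.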
In a study of Finnish families, Auranen and colleagues [49, 50], who included individuals with autism, infantile autism, Asperger syndrome (AS) and developmental dysphasia reported a
Interestingly, the advent of large collaborative studies by the Autism Genetic Resource Exchange (AGRE), the Autism Genome Project (AGP) and the National Institute of Mental Health (NIMH) has not yielded stronger evidence in favour of specific loci. Liu and colleagues using data from 110 multiplex families from the AGRE collection observed only
Early association studies examined whether specific variation in genes was associated with the disease and focused on candidate loci identified from linkage and cytogenetic studies (positional candidate genes) as well as on genes within biological processes that were perceived as having a role in ASD. In the mid-1990s Risch and Merikangas demonstrated that where genetic variants have only a small effect on risk, the association study is a more powerful approach than linkage for identifying genetic risk. However, the transition from candidate gene to genome-wide association studies was not realised until technological advances firstly enriched the maps of common variation across the genome and secondly enabled the interrogation of these variants
Wang and colleagues performed one of the first GWAS on individuals of European ancestry from the AGRE collection, the Autism Case Control Collection and unaffected controls from the Children’s Hospital of Philadelphia control collection. Neither family-based nor case-control analysis alone yielded genome-wide significant findings. However, in a combined analysis the authors identified genome-wide significant association on chromosome 5p14.1 (rs4307059;
A second independent GWAS was reported by Weiss and colleagues. In the initial scan the authors did not find any genome-wide significant associations. However, as with the previous GWAS, supplementation of their family-based studies with a case-control set derived from 90 probands without parental data garnered some additional signal for the top hits. A replication consortium of more than 2000 trios was genotyped for 45 SNPs across all of the top associated regions. The only marker that showed evidence of replication resides on the short arm of chromosome 5 at 5p15. Although this report, like that of Ma and colleagues, has considerable overlap with the AGRE families reported by Wang and colleagues, the authors did not observe strong association at 5p14.1. The chromosome 5p association lies in close proximity to
Two additional GWAS from the Autism Genome Project (AGP) followed [39, 58]. In the first report of 1369 families, Anney and colleagues identified a single genome-wide significant finding on chromosome 20 at position 20p12 within the
The strongest association signal observed by the AGP combined analyses was rs1718101 (
4. Structural variation: Chromosomal abnormalities in autism
Autism has been frequently associated with chromosome abnormalities, such as deletions, duplications, inversions, or translocations; with abnormalities on the long arm of chromosome 15; and with numerical and structural abnormalities of the sex chromosomes. Reddy, in a survey of chromosomal abnormality in autism, found one in 14 of 421 individuals (3.33%). These fourteen cases broke down into 4 supernumerary chromosome markers, 4 deletions, 3 inversions and 3 duplications. Several reviews have since confirmed that such abnormalities can be identified in ~3-5% of these patients [72, 73]. The regions commonly reported include 2q37, 5p14, 5p15, multiple locations on chromosome 7, 11q25, 15q11-q13, 16q22.3, 17p11.2, 18q21.1, 18q23, 22q11.2, 22q13.3 and Xp22.2–p22.3.
Many studies have converged on particular chromosomal abnormalities in autism, the most common of which are maternally inherited duplications at 15q11-q13. These duplications are found in as many as 1–3% of patients diagnosed with autism.
Another relatively common chromosomal variant is the 22q11.2 microdeletion syndrome, or velocardiofacial syndrome (VCFS), which occurs in ~1/4000 live births. This syndrome is also identified in the context of learning disability or schizophrenia and has a complex phenotypic expression affecting multiple organs. The physical features include a typical facial appearance (long face, narrow palpebral fissures, flattened malar eminences, prominent nose and small mouth); anatomical and/or functional abnormalities of the palatal shelves such as cleft palate and velopharyngeal insufficiency; lymphoid tissue hypoplasia; and heart defects. A wide range of childhood onset developmental symptoms and disorders are described in association with the VCFS mutation, including attention deficit hyperactivity disorder (ADHD), oppositional defiant disorder, phobias, anxiety, obsessive compulsive disorder and autism spectrum disorders.
5. Copy number variation
The human genome has both sequence and structural variation, the majority of which has no functional consequences and is unrelated to any disorder. Structural variation may be balanced, as is the case for inversions and balanced translocations, or may alter DNA copy number. The latter, referred to as copy number variants (CNVs), range from the duplication or deletion of a single base pair to whole chromosome abnormalities. The term CNV is generally used to indicate the larger changes. Initially, large CNVs were identified with classical chromosomal staining and light microscopy. As many of these abnormalities were initially identified in sub-telomeric regions, there was interest in knowing if such deletions, duplications and other re-arrangements might occur throughout the genome. This hypothesis was confirmed with the development of high resolution array based comparative genomic hybridization (aCGH).
In the arrays, comparative hybridization is performed with DNA immobilized on a platform such as a glass slide. Initially the DNA arrays consisted of human DNA cloned into bacterial artificial chromosomes (BAC arrays) representing the human genome at approximately 1Mb intervals. The present arrays consist of 25-base pair oligonucleotides (probes). In the last decade the resolution of the arrays has improved to the extent that the number of smaller CNVs identified in the genome that were previously invisible to microscopy has increased enormously. Deletions and duplications at least as small as 10kb are now known to occur throughout the genome.
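As a rough illustration of how copy number is read from comparative hybridization, array signal is commonly expressed as the log2 ratio of test to reference intensity, so that an idealised heterozygous deletion (1 copy against 2) gives -1 and a single-copy duplication (3 against 2) gives about +0.58. The sketch below encodes this arithmetic. The function names and the calling thresholds are hypothetical choices made for illustration, and real data are noisy and require normalisation and segmentation across many probes.

```python
import math


def expected_log2_ratio(test_copies: int, reference_copies: int = 2) -> float:
    """Idealised log2(test/reference) signal for a given copy number."""
    if test_copies < 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be non-negative (reference > 0)")
    if test_copies == 0:
        # Homozygous deletion: no test signal in the idealised case.
        return float("-inf")
    return math.log2(test_copies / reference_copies)


def call_segment(mean_log2_ratio: float,
                 loss_threshold: float = -0.5,   # hypothetical cut-off
                 gain_threshold: float = 0.4) -> str:  # hypothetical cut-off
    """Classify a segment from the mean log2 ratio of its probes."""
    if mean_log2_ratio <= loss_threshold:
        return "loss"
    if mean_log2_ratio >= gain_threshold:
        return "gain"
    return "neutral"


for copies in (1, 2, 3, 4):
    ratio = expected_log2_ratio(copies)
    print(f"{copies} copies: log2 ratio = {ratio:+.2f} -> {call_segment(ratio)}")
```

The smaller a CNV, the fewer probes fall within it and the harder it is to separate a true shift in mean log2 ratio from noise, which is one reason denser arrays reveal smaller events.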
Interestingly, some CNVs appear to differ from single nucleotide polymorphisms (SNPs) in terms of locus specific mutation rates. Rates for genomic re-arrangements range from 10⁻⁴ to 10⁻⁵, a rate that is considerably more frequent than point mutations. This high mutation rate, coupled with reduced fecundity in some carriers, and the fact that, comparatively, they affect a larger proportion of the genome, make CNVs a potentially important source of new and recent mutation in neurodevelopmental disorders.
6. CNVs in autism
As discussed above, autism is a phenotypic feature of many genomic disorders. Betancur, in her review, lists 103 disease genes and 44 genomic loci where autism or autistic-like behaviours have been described. ASD is diagnosed in ~30% of males with Fragile X syndrome and, conversely, Fragile X mutations are found in as many as 7–8% of individuals with ASD. Similarly, mutations in MECP2, the Rett Syndrome gene, have been found among cases of autism that do not have the classical Rett phenotype. Autism patients also have an increased risk for neurofibromatosis and other rare monogenic diseases like tuberous sclerosis and Joubert’s Syndrome; conversely, patients with these disorders have an increased risk of autism [79, 80].
The genes and loci listed in the Betancur study are all causally implicated in learning disability (LD), indicating that these two neurodevelopmental disorders share some genetic risk factors. Early use of aCGH in non-syndromic autism suggested the method had promise in detecting hitherto unrecognized CNVs. For example, Jacquemont et al. identified 6 deletions and 2 duplications in 29 patients presenting with syndromic ASD where previous high resolution karyotyping was reported as normal. Another study showing the potential for CNV analysis was a large linkage study using a 10k SNP array where intensity data were used to determine copy number. The authors highlight some individual findings including a family with two sisters with ASD, both of whom had a ~300kb deletion on chromosome 2p16 that included the coding exons of the neurexin 1 gene (
A key development was the report of de novo copy number variants in autism using aCGH. These authors showed that individually rare CNVs, and in particular ones that affect neurodevelopmental genes, were enriched in cases. They further reported that the rate of de novo CNVs differed between simplex cases, where they occurred in 10% of families in the sample, and familial cases, where they occurred in 3% of families. This suggests that sporadic and familial cases of ASD might have different underlying genetic mechanisms, although not all studies since then have found this distinction. Several studies of CNVs in large autism case and family series have followed. Marshall et al. examined 427 ASD families using a 500k SNP array and karyotyping by standard clinical diagnostic methods. De novo rates of 7.1% and 2.0% in simplex and multiplex families, respectively, were observed, supporting the previous findings. Families occasionally showed more than a single de novo event, where both may combine to produce risk. A further set of loci was identified in two or more unrelated families, increasing the evidence supporting a pathogenic role. As with the LD literature, at some loci both deletions and duplications were found, suggesting a more complex mechanism than simple over- or under-expression of gene products. Of the 196 inherited CNVs confirmed experimentally, 90 were of maternal and 106 of paternal origin. The authors list numerous potential ASD candidate genes where a structural change was either de novo, found in two or more unrelated ASD cases, or, in the case of the X chromosome, transmitted from an unaffected mother. Given their rarity, very few individual CNVs in this study provided statistical evidence to support their role in autism. For example, 4 CNVs from 427 cases were found at the
As the resolution of the probe arrays improves, smaller CNVs will be detected, and the boundaries of previously identified CNVs will become more refined. Nord et al. examined genomic DNA of 41 children with autism and 367 healthy controls for rare CNVs using a very high-resolution aCGH platform. They found that cases were more likely than controls to have CNVs as small as ~10 kb, likely to affect genes involved in transcription, nervous system development, and receptor activity. They found that expression of
Larger samples, particularly those based on families, will also enable the improved estimation of the overall effects of de novo mutations and the assessment of rare recurrent events as disease associated mutations. These authors studied 1124 autism families containing probands, unaffected parents and an unaffected sibling using the Illumina 1M SNP array. In a related paper they were able to confirm de novo CNVs identified using the SNP array with those detected using a Nimblegen 2.1M aCGH platform. A combined total of 58 rare de novo CNVs were identified across the two studies, with each array type identifying 95% of the total. However, the sensitivity for smaller CNVs was low for both arrays. Overall, the burden of rare de novo CNVs in the Sanders et al. study was greater in probands than in siblings for total number, size and gene content. Using the rate in siblings as a control to evaluate findings in the cases, there was strong individual statistical support for recurrent de novo duplications at 7q11.23, the locus at which deletions cause Williams-Beuren syndrome, and for both deletions and duplications at 16p11.2. In addition, the authors observed 8 loci at which rare transmitted CNVs, present only in probands, overlapped with one of the 51 regions in probands containing one or more rare
7. Mutation rates and models of risk in autism
Given the replicated finding that de novo mutations are more frequent in simplex cases compared to familial cases, Zhao et al. have suggested a model of autism risk in which families fall into two groups: those in which the overall risk for autism is low, representing the majority of families, and those in which the risk is higher due to a disease mutation with a dominant mode of transmission with greater penetrance in males compared to females. Under this model, sporadic cases of autism occur in low-risk families due to a de novo mutation of relatively high penetrance, whereas familial autism occurs due to the inheritance of an existing mutation from a clinically unaffected or asymptomatic parent. In another model, Girirajan et al. proposed the necessity in some families for a second mutation to lead to severe neurodevelopmental disorder. In this study, individuals with childhood developmental delay were enriched approximately fourfold for a rare 520-kb 16p12 deletion. In nearly all cases examined (22/23), the deletion was inherited. Thus, 16p12 deletions appear to be an example of inherited predisposition to neurodevelopmental disorder with dominant transmission. However, these individuals were more likely to carry a second large (>500 kb) CNV compared to matched controls, and clinical features of those with a second large CNV were typically more severe than those with the 16p12 deletion alone. Itsara et al. suggest that multiply affected autism pedigrees segregate an existing inherited mutation of low penetrance which by itself is rarely sufficient to cause disease. Secondary mutations, such as de novo mutations, are required for the disorder to manifest. Whether or not these second hits are disease specific remains to be examined. The authors propose that the excess of
8. Next-generation sequencing studies
There is an increasing literature describing the occurrence of single nucleotide mutation in ASD. Unlike the common polymorphisms described earlier, these variants are often rare or private to an individual and considered to have a large impact on the gene's function. Early studies have focused on candidate genes with strong
In April 2012, three key manuscripts were published in Nature describing a considerable exome sequencing effort of 622 family trios and 250 unaffected siblings [98-100]. In these studies the focus was on de novo mutation in the exome of individuals with ASD.
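The core logic of a trio-based search for de novo mutation can be sketched very simply: a candidate is a variant called in the child that is absent from both parents. The following Python snippet illustrates this with invented variants. The `Variant` type, the function name `candidate_de_novo` and all coordinates are hypothetical, and the cited exome studies relied on far more than set subtraction, since sequencing coverage, genotype quality and independent validation all affect whether a candidate is a true de novo event.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """A single nucleotide variant identified by position and alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_de_novo(child: Set[Variant],
                      mother: Set[Variant],
                      father: Set[Variant]) -> Set[Variant]:
    """Return variants seen in the child but in neither parent."""
    return child - (mother | father)


# Invented example calls; the coordinates carry no biological meaning.
mother = {Variant("chr1", 1000, "A", "G"), Variant("chr2", 5000, "C", "T")}
father = {Variant("chr2", 5000, "C", "T"), Variant("chr7", 7500, "G", "A")}
child = {
    Variant("chr1", 1000, "A", "G"),   # inherited from the mother
    Variant("chr7", 7500, "G", "A"),   # inherited from the father
    Variant("chr16", 9000, "T", "C"),  # absent from both parents
}

for variant in sorted(candidate_de_novo(child, mother, father)):
    print(variant)
```

Including unaffected siblings, as these studies did, allows the same procedure to be run on a within-family control so that de novo rates in probands can be compared against a baseline.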
9. Implications of genomic studies for clinical practice
Genetic findings to date in autism include known genetic syndromes such as Rett and Fragile X; chromosomal structural variation such as duplications on chromosome 15 and deletions on chromosome 22; and increasing numbers of CNVs, many de novo, in a small proportion of cases, especially those with co-morbid developmental abnormalities. Common variants are only slowly emerging from genome-wide studies, but sample sizes are still relatively modest compared to other disorders. The diagnosis remains a clinical one, but genetic testing has diagnostic, prognostic and family planning implications, and in the coming years may also be of importance with respect to potential treatments for specific conditions such as Rett and Fragile X. Therefore, a clinical genetics evaluation should be considered in children with ASD in order to identify syndromic forms of autism, identify familial cases, and drive diagnostic testing. Genetic testing using aCGH has been recommended by the American Academy of Pediatrics and the American College of Medical Genetics [102, 103], particularly to evaluate aetiological heterogeneity and identify syndromes. For the vast majority of cases with an identified genetic cause, there is no specific benefit in terms of treatment, but the diagnosis itself may be of value to parents seeking an explanation for the symptoms they see in their child. Acceptance of the disorder may be easier in the presence of a known cause, as echoes of social causation and blame can still be heard. Family investigation may also be warranted, particularly where the proband has an identified genetic cause, as recurrence risk in siblings may have implications for family planning. If a genetic cause of pathogenic significance is identified, the recurrence risk for siblings may be established according to the particular genetic diagnosis. In the majority of cases, no genetic alteration is found, and an empirical risk for siblings of 10-20% can be provided.
Establishing a clear de novo origin for a given mutation may enable a low risk estimate in siblings although there are caveats relating to the exact origin of the mutation. Finally, and re-stating the main goal of genetic research in ASD, a better understanding of underlying biology may lead to novel and unexpected treatment possibilities. Early signs of this are emerging in human trials for treatments aimed at the specific mutations found in Rett and Fragile X syndromes.