2.1. What is CNV?
Using array-CGH, a combination of microarray and comparative genomic hybridization (CGH) technologies, two pioneering groups identified widespread CNVs in apparently healthy individuals in 2004 (Iafrate et al., 2004; Sebat et al., 2004). CNV is defined as a genetic variant that alters chromosomal structure, including duplications and deletions (Iafrate et al., 2004; Sebat et al., 2004; Redon et al., 2006), and is now known to be one of the most prevalent types of genetic variation in the human genome (Feuk et al., 2006; Hurles et al., 2008; Carter, 2007; Estivill & Armengol, 2007). Together with SNPs, CNVs found in normal individuals have widened our understanding of genetic heterogeneity (Iafrate et al., 2004; Sebat et al., 2004; Redon et al., 2006). The commonly used working definition of CNV was a copy number change involving a DNA segment of 1 kilobase (kb) or larger (Freeman et al., 2006; Feuk et al., 2006). The current definition of CNV includes any DNA structural variant, including duplications, deletions and inversions (Hurles et al., 2008). When a CNV is common in the population (frequency >1%), it is also called a copy number polymorphism (CNP). However, because there is no standardized technology for defining CNVs, their size and frequency have not been well defined in human populations. Since the two pioneering studies provided evidence for the existence of CNVs (Iafrate et al., 2004; Sebat et al., 2004), more than 66,000 CNVs and 34,000 InDels have been identified in various populations (Redon et al., 2006; Simon-Sanchez et al., 2007; de Smith et al., 2007; Perry et al., 2008; Díaz de Ståhl et al., 2008; Yim et al., 2010; Conrad et al., 2010; Park et al., 2010) and catalogued in a public database, the Database of Genomic Variants (http://projects.tcag.ca/variation/) (Feuk et al., 2006). More CNVs have since been uncovered by NGS analysis (Mills et al., 2011; Kidd et al., 2010; Kim et al., 2009).
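The working definitions above (a size threshold of 1 kb and a population frequency above 1% for a CNP) can be expressed as a small classification rule. The following Python sketch is illustrative only: the `Variant` record, its field names, the half-open coordinate convention and the category labels are assumptions made for this example, not part of any cited study or tool.

```python
from dataclasses import dataclass

MIN_CNV_SIZE_BP = 1_000      # working size threshold from the text (1 kb)
CNP_MIN_FREQUENCY = 0.01     # "common" threshold from the text (>1%)


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one copy number change observed in a cohort."""
    chrom: str
    start: int        # assumed convention: 0-based, inclusive
    end: int          # assumed convention: exclusive
    carriers: int     # individuals carrying the variant
    sample_size: int  # individuals screened

    @property
    def length(self) -> int:
        return self.end - self.start

    @property
    def frequency(self) -> float:
        return self.carriers / self.sample_size


def classify(variant: Variant) -> str:
    """Apply the working definitions: size first, then population frequency."""
    if variant.length < MIN_CNV_SIZE_BP:
        return "below CNV size threshold"
    if variant.frequency > CNP_MIN_FREQUENCY:
        return "CNP"
    return "rare CNV"
```

For instance, a 500-kb variant carried by 5 of 100 individuals would be labelled a CNP under these thresholds, whereas the same variant carried by 1 of 1,000 individuals would be labelled a rare CNV.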
CNVs can affect gene function in several ways, and their potential effect on gene expression levels is presumably larger than that of SNPs. A deletion or duplication may disrupt the genes located inside the affected region, changing the gene structure and thereby gene expression. Alternatively, disruption of transcriptional regulatory regions and enhancers can also affect gene expression. During recombination, which is thought to be an important mechanism of CNV formation, novel fusion products may be generated, and these may exert positive or negative effects on gene expression and epigenetic regulation (Feuk et al., 2006; Zhang et al., 2009; Hampton et al., 2009; Przybytkowski et al., 2011; Reymond et al., 2007). Taken together, structural variations are likely to be responsible for part of the phenotypic variation among humans, and comprehensive mapping of CNVs can facilitate the understanding of inter-individual phenotypic differences, including disease susceptibility and responsiveness to drugs (Feuk et al., 2006; Estivill & Armengol, 2007). Indeed, CNVs have been found to be associated with various Mendelian traits and with a substantial number of complex diseases, including neurodevelopmental disorders (Buchanan & Scherer, 2008; Lee & Lupski, 2006).
2.2. CNVs in ASD
To assess the role of CNV in ASD, several whole-genome microarray platforms based on oligonucleotides, SNPs and BAC clones have been used in ASD family studies and case-control studies (Abrahams & Geschwind, 2008; Cook & Scherer, 2008). As a result, evidence has accumulated that multiple rare de novo CNVs contribute to susceptibility to ASD. For example, duplications and/or deletions on chromosome 15q11–q13 confer an increased risk of ASD (15q11–q13 duplication syndrome, Prader-Willi syndrome and Angelman syndrome). Approximately one fourth of individuals with a 22q11.2 deletion and over 90% of individuals with a duplication of 17p11.2 show characteristics of ASD (Cohen et al., 2005; Abrahams & Geschwind, 2008; Fernández et al., 2009). Significant associations have been reported between ASD and CNVs of various genes, such as NRXN1 (2p16.3), NLGN3 (Xq13.1), NLGN4 (Xp22.23) and SHANK3 (22q13.3). Although there have been many reports of CNVs associated with ASD, technical limitations and the lack of standardized methods for defining CNVs and CNV regions (CNVRs) have led to inconsistencies among studies, which should be resolved by further GWAS. Table 1 summarizes the major CNVs identified by GWAS in ASD.
| Discovery sample: case | Discovery sample: control | Replication sample: case | Replication sample: control | Study design | Platform | CNV detection method | Number of CNVs identified | Strong candidate loci | CN change | Reference |
|---|---|---|---|---|---|---|---|---|---|---|
| 1,496 families with 7,917 subjects | Unaffected family members | - | - | Family-based | Affymetrix 10K | dChip | 254 | NRXN1; 1q21; 17p12; 22q11.2 | del | Szatmari et al., 2007 |
| 165 families | 99 unaffected families | - | - | Family-based | Agilent 244K; 390K ROMA | HMM | 17 | SLC4A10; FHIT; FHIT; FLJ16237; A2BP1 | del; del; dup; del; del | Sebat et al., 2007 |
| 180 | 372 | 532 | 465 | Family-based and case-control | Array-CGH | NimbleGen | 1 | 16p11.2 | microdel | Kumar et al., 2008 |
| 751 multiplex families with 1,441 cases | 1,420 (AGRE parents); 2,814 (bipolar disorder or NIMH controls) | 512 (CHB); 299 (deCODE) | 434 (CHB); 18,834 (deCODE) | Family-based | Affymetrix 5.0 (AGRE); Affymetrix 500K (controls) | COPPER/Birdseye (AGRE); ADM-2 (CHB); HMM (deCODE) | 47 | 16p11.2 | del/dup | Weiss et al., 2008 |
| 397 | 372 | - | - | Family-based | 19K BAC microarray | - | 51 | 15q11-q13; 22q11; 16p11.2 | dup; dup; microdel | Christian et al., 2008 |
| 427 families | 500 | - | 1,152 matched controls | Case-control | Affymetrix 500K | dChip, CNAG, GEMCA | 277 | 16p11.2; SHANK3-NLGN4-NRXN1-PSD; DPP6-DPP10-PCDH9; ANKRD11; DPYD; PTCHD1; 15q24 | del/dup | Marshall et al., 2008 |
| 859 | 1,409 | 1,336 | 1,110 | Case-control | Illumina HumanHap550 | PennCNV | 78,490 | 15q11-13; 22q11.21; NRXN1; CNTN4; PARK2; RFWD2; AK123120; UNQ3037; GRID1; NLGN1; GYPELOC44 | dup; dup; del; del/dup; del; dup; dup; del; del; dup; dup | Glessner et al., 2009 |
| 912 multiplex families | 1,488 (CHOP); 542 (NINDS) | 859 | 1,051 | Case-control | Illumina HumanHap550 | PennCNV | >150 | NRXN1; UBE3A; 15q11-q13; BZRAP1; MDGA2 | del; dup; del/dup; del/dup; del | Bucan et al., 2009 |
| 28 children | 62 adults | - | - | Case-control | Array-CGH | Array-CyGHt | 38 | 8p23.1; 17p11.2 | del; del | Cho et al., 2009 |
| 996 | 1,287 | - | 3,677 | Case-control and family-based | Illumina 1M | QuantiSNP; iPattern | 5,478 | SHANK2; SYNGAP1; DLGAP2; CSNK1D/SLC16A3; NRXN1; 22q11.21; DDX53/PTCHD1 | del; del; dup; dup/del; dup/del; del; del | Pinto et al., 2010 |
Table 1. Genome-wide CNV association studies of autism
Abbreviations: ACC, Autism Case-Control cohort; ADM, aberration detection method; AGP, Autism Genome Project; AGRE, Autism Genetic Resource Exchange; CHB, Children’s Hospital Boston; CHOP, Children’s Hospital of Philadelphia; CNAG, Copy Number Analysis for GeneChip; COPPER, copy-number polymorphism evaluation routine; dChip, DNA Chip Analyzer; GEMCA, Genotyping Microarray based CNV Analysis; HMM, hidden Markov model; NIMH, National Institute of Mental Health; NINDS, National Institute of Neurological Disorders and Stroke
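Because the studies summarized in Table 1 did not share a standard method for defining CNV regions (CNVRs), it may help to see one simple convention in concrete form. The sketch below merges overlapping or touching CNV calls on the same chromosome into a single region. It is an assumed, minimal convention chosen for illustration, not the procedure of any study in Table 1; the function name and the `(chrom, start, end)` tuple layout are hypothetical.

```python
from typing import Iterable, List, Tuple

Call = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates assumed


def merge_into_cnvrs(calls: Iterable[Call]) -> List[Call]:
    """Merge overlapping or touching CNV calls into CNV regions (CNVRs)."""
    regions: List[Call] = []
    # Sorting by (chromosome, start, end) places mergeable calls next to each other.
    for chrom, start, end in sorted(calls):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # The call starts inside (or at the edge of) the current region: extend it.
            prev_chrom, prev_start, prev_end = regions[-1]
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # No overlap with the current region: open a new one.
            regions.append((chrom, start, end))
    return regions
```

Under this convention, two calls that overlap by a single base are merged, so region boundaries can grow large when many individuals carry partially overlapping variants. Stricter rules, such as requiring a minimum reciprocal overlap, would give different CNVR counts from the same calls, which is one way such inconsistencies between studies can arise.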
In 2007, two pioneering studies demonstrated the association of CNVs with ASD. The Autism Genome Project Consortium performed linkage and CNV analyses using the Affymetrix 10K SNP array in 1,181 ASD families with at least two affected individuals (Autism Genome Project Consortium et al., 2007). Of the 254 highly significant CNVs, the investigators emphasized four. The most interesting finding was a 300-kb CNV loss on chromosome 2p16 that was identified recurrently in two families. The deletion of this region disrupted the coding exons of the neurexin 1 gene (NRXN1), whose product interacts with neuroligins and is involved in synaptogenesis. Therefore, impairment of neurexin 1 function by deletion may affect susceptibility to ASD or its phenotypes. Structural variation in the NRXN1 gene had been reported in previous autism studies (Chubykin et al., 2005; Feng et al., 2006). The other three CNVs of interest were a 1.1-Mb CNV gain on chromosome 1q21, a 933-kb de novo duplication on 17p12, and a duplication on 22q11.2. The duplication on 17p12 is known to cause Charcot-Marie-Tooth 1A (CMT1A) disease (Houlden et al., 2006). In addition, other microduplications of the same chromosomal region have been reported in individuals with mental retardation, linguistic delay, autism and related phenotypes (Moog et al., 2004).
Sebat and colleagues performed array-CGH analysis of 264 families and explored the association with ASD of de novo CNVs, that is, CNVs present in a child but not in either parent (Sebat et al., 2007). The authors identified 17 de novo CNVs in 16 subjects. According to their results, the frequency of spontaneous mutation was 10% in sporadic cases and 3% in multiplex families, compared with 1% in unaffected individuals. One of the de novo CNV loci was a 4.3-Mb deletion at 22q13.31-q13.33, where the SHANK3 gene is located. Recurrent deletion of this region had previously been reported in ASD (Manning et al., 2004). Durand et al. reported that mutations in the SHANK3 gene were associated with ASD and that abnormal gene dosage of SHANK3 was associated with severe cognitive deficits, linguistic delay and ASD (Durand et al., 2007). SHANK3 is a scaffolding protein found in excitatory synapses, directly opposite the presynaptic active zone. This gene has been suggested to be associated with the neurobehavioral symptoms observed in individuals with 22q13 deletions.
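The notion of a de novo CNV, a variant seen in the child but in neither parent, can be sketched as a simple trio filter. The Python example below is a minimal illustration under stated assumptions: the `CNV` record, the use of reciprocal overlap to decide whether two calls represent the same variant, and the 50% threshold are choices made for this sketch and are not the method reported by Sebat et al. In practice, candidate de novo events also need experimental validation, because a variant missed in a parent would look de novo to such a filter.

```python
from typing import Iterable, List, NamedTuple


class CNV(NamedTuple):
    """Hypothetical CNV call; coordinates are half-open base-pair positions."""
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smaller of the two fractions of each call covered by their shared span."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def de_novo_candidates(child: Iterable[CNV], mother: Iterable[CNV],
                       father: Iterable[CNV], min_overlap: float = 0.5) -> List[CNV]:
    """Keep child CNVs that match no parental call at the given reciprocal overlap."""
    parental = list(mother) + list(father)
    return [
        call for call in child
        if all(reciprocal_overlap(call, p) < min_overlap for p in parental)
    ]
```

With this rule, a child deletion that is matched by a largely overlapping deletion in either parent is treated as inherited, and only unmatched calls are returned as de novo candidates.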
In 2008, four independent studies consistently reported the association of CNVs at the 16p11.2 locus with autism. Weiss et al. used the Affymetrix 5.0 SNP array to find CNVs in 751 multiplex families from the Autism Genetic Resource Exchange (AGRE) (Weiss et al., 2008). They identified 32 high-confidence and 15 low-confidence regions. Among the candidate loci, a microdeletion and a microduplication on 16p11.2 were validated as being associated with ASD. This association was further confirmed in clinical testing data from Children’s Hospital Boston and in large population data from Iceland (deCODE genetics data). Kumar et al. screened 180 ASD cases and 372 controls using a 19K whole-genome tiling bacterial artificial chromosome (BAC) array to identify submicroscopic copy number changes specific to autism (Kumar et al., 2008). They observed a recurrent microdeletion of approximately 500 kb on 16p11.2 in two cases with autism but not in the controls. When they assessed the frequency of this putative autism-associated genomic disorder, 0.6% of the ASD cases showed the alteration, whereas none of the controls did. The variation was confirmed by FISH, microsatellite analyses and array-CGH. Christian et al. used the same 19K whole-genome tiling BAC array to identify ASD-associated CNVs in a set of 397 cases and 372 controls (Christian et al., 2008). Among the 51 candidate CNVs, recurrent CNVs were identified at loci including 15q11-q13, 22q11 and 16p11.2. They were confirmed by FISH, microsatellite analysis or quantitative polymerase chain reaction (PCR) analysis. Marshall et al. performed whole-genome screening of 427 ASD cases and 500 controls using Affymetrix 500K SNP arrays (Marshall et al., 2008). Of the 277 CNVs identified only in the cases, CNVs at the 16p11.2 locus, which included both duplications and deletions, appeared in around 1% of the ASD cases. The other associated CNVs contained the postsynaptic density genes SHANK3, NLGN4 and NRXN1, the synapse complex genes DPP6, DPP10 and PCDH9, as well as ANKRD11, DPYD and PTCHD1.
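Case-control comparisons of this kind ask whether a CNV is carried more often by cases than by controls. One generic way to assess this for rare events is a one-sided Fisher's exact test on carrier counts. The sketch below implements it with the standard library only. It is offered as an illustration, not as the statistical procedure of any of the four studies, and the function name and argument layout are assumptions made for this example.

```python
from math import comb


def fisher_one_sided(case_carriers: int, case_total: int,
                     control_carriers: int, control_total: int) -> float:
    """One-sided Fisher's exact test for carrier enrichment in cases.

    Returns the probability, under the hypergeometric null of no association,
    of seeing at least `case_carriers` carriers among the cases.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, case_total)
    most_extreme = min(carriers, case_total)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, most_extreme + 1)
    )
    return numerator / denominator
```

As a purely hypothetical example (these counts are not taken from any study above), `fisher_one_sided(5, 500, 0, 500)` gives a P value of roughly 0.03. Even a variant seen only in cases therefore needs a reasonably large cohort, or independent replication, before the association can be considered convincing.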
Subsequent studies have suggested that new CNVs, in addition to the known ones, are associated with ASD. Glessner et al. performed a whole-genome CNV analysis of 859 cases and 1,409 controls using the Illumina HumanHap550 BeadChip (Glessner et al., 2009). They generated 78,490 CNV calls, and the positive findings were further evaluated in an independent cohort of 1,336 ASD cases and 1,110 controls. Through this approach, they identified several known ASD-associated genes as well as novel candidate CNVs. For example, they identified CNVs at loci including 15q11–q13, 22q11.21, NRXN1 and CNTN4, which had previously been reported to be associated with autism (Kim et al., 2009; Roohi et al., 2009; Fernandez et al., 2008). However, some of the genes or loci previously known to be associated with ASD, such as AUTS2 (Kalscheuer et al., 2007), NLGN3 (Jamain et al., 2003), SHANK3 (Moessner et al., 2007) and 16p11.2 (Weiss et al., 2008), were not replicated in their study. Notably, 16p11.2, a locus consistently reported to be associated with ASD in four previous independent studies, did not show a significant association in this study. Several new susceptibility genes, such as NLGN1 and ASTN2, were identified in this study. Both genes encode neuronal cell-adhesion molecules. Chubykin et al. reported that mutations in neuroligin superfamily members were identified in individuals with ASD (Chubykin et al., 2005). ASTN1 is a neuronal protein receptor integral to the process of glial-guided granule cell migration during development (Zheng et al., 1996). Furthermore, CNVs of genes involved in the ubiquitin pathways, such as UBE3A, PARK2, RFWD2 and FBXO40, were observed in the ASD cases but not in the controls. Bucan et al. conducted high-density genotyping of 912 multiplex families from the AGRE collection and 1,488 controls using the Illumina HumanHap550 BeadChip (Bucan et al., 2009). They identified more than 150 loci harboring rare variants in multiple unrelated patients, and the positive findings were further validated by genomic quantitative PCR in an independent cohort of 859 ASD cases and 1,051 controls. The candidate loci included previously reported ones such as NRXN1 (Marshall et al., 2008), UBE3A (Glessner et al., 2009) and 15q11-q13 (Christian et al., 2008), as well as novel ones such as BZRAP1 and MDGA2.
In 2009, Cho et al. reported ASD-associated CNVs in East Asians. They performed whole-genome BAC array-CGH on 28 ASD cases and 62 controls and identified 38 CNVs, including two significant loci, 8p23.1 and 17p11.2 (Cho et al., 2009). The DEFENSIN gene family is located in the 8p23.1 CNV locus, which often showed copy number polymorphisms in earlier studies (Linzmeier & Ganz, 2005). Although there is no direct evidence connecting copy number loss of the DEFENSIN genes with ASD, immunological dysfunction has been suggested to be associated with autism (Rutter, 2005).
Most recently, Pinto et al. analyzed the genome-wide features of rare CNVs in autism using Illumina 1M SNP arrays (Pinto et al., 2010). Based on 996 cases and 1,287 controls, they identified 5,478 rare CNVs. By examining parent-child transmission, the authors found 226 de novo and inherited CNVs that were not present in controls. As a whole, ASD cases were found to carry a higher number of de novo CNVs than controls (1.69-fold, P = 3.4 × 10⁻⁴). A number of novel genes located in the CNVs, such as SHANK2, SYNGAP1, DLGAP2 and the DDX53–PTCHD1 locus, were found to be associated with ASD in this study. In addition, gene set enrichment analysis showed that cellular proliferation, projection and motility, and GTPase/Ras signaling were affected by the CNVs identified in their study. This approach demonstrated a new paradigm of autism research based on functional pathways and cross-talk.