"Recent Advances in Autism Spectrum Disorders - Volume I", book edited by Michael Fitzgerald, ISBN 978-953-51-1021-7, Published: March 6, 2013 under CC BY 3.0 license. © The Author(s).

Chapter 16

Discovering the Genetics of Autism

By Abdullah K. Alqallaf, Fuad M. Alkoot and Mash’el S. Aldabbous
DOI: 10.5772/53797

1. Introduction

Autism is a complex neurodevelopmental disorder characterized by social isolation, language deficits, and repetitive or stereotyped behaviors. Autism spectrum disorder (ASD) has received a great deal of attention in recent years, not only because of the increasing number of affected children but also because of the social and economic impact of the disorder on their families. Research on ASD can be divided into three categories, as follows.

  1. The basis and causes of the disorder. Different hypotheses have been proposed in an attempt to determine the origin of autism. They include genetic risk factors, represented by abnormal chromosomal variations and rearrangements, and non-genetic factors, represented by environmental agents that have been claimed to contribute to ASD, such as exposure of children to vaccines, infection, certain foods or heavy metals.

  2. The methodologies and techniques for characterizing and diagnosing the disorder. Several diagnostic instruments are commonly used in autism research, such as the Autism Diagnostic Interview-Revised (ADI-R) and the Autism Diagnostic Observation Schedule (ADOS). Advances in neuroimaging techniques such as functional magnetic resonance imaging (fMRI) have allowed scientists to model the structural and functional differences in the brain tissue of individuals with ASD. Clinical genetic evaluation provides a reliable alternative to the interview-based protocols and screening approaches, and allows geneticists to link an estimated 40% of cases to genetic contributors.

  3. The treatments and therapies of autistic patients. The available approaches for treatments include applied behavior analysis (ABA), developmental models, structured teaching, speech and language therapy and social skills therapy. When behavioral treatment fails, many medications are used to treat ASD symptoms.

Figure 1 illustrates how these areas of ASD research interact.

media/image1_w.jpg

Figure 1.

A puzzle-like representation of the interaction among the areas of research on autism spectrum disorders.

Advances in genetic technologies give researchers the opportunity to explore biological information in depth and to convert it into meaningful biological knowledge through computational models.

In this chapter, we investigate the genetic origins of autism and describe the latest techniques and technologies available for diagnosing this complex disorder. We also propose a robust approach for detecting and identifying the disorder that builds on the strengths of the publicly available and commercial approaches while avoiding their weaknesses. The proposed approach is divided into two steps. The preprocessing step is a feature-extraction method used to map and detect the genetic variations and structural rearrangements, followed by a statistical model used as feature selection to measure the statistical and biological significance of the predicted variations. The classification step assigns the tested samples to groups and/or subgroups and provides insight into the complex pattern of the genome.

The results suggest that autism is associated with an increased number of alterations in unstable segments of the genome. The experimental results also show that using high-resolution custom-tiled samples improves the accuracy of our proposed approach in determining previously reported and new genetic contributors that warrant investigation.

This chapter aims to use research to benefit individuals and families affected by autism spectrum disorders and to improve their quality of life. This can be done by clearly mapping and identifying the biomarkers associated with ASD in early childhood, which is essential for providing better treatments and therapies. Finally, the approach presented in this chapter is broadly applicable to case-control studies of genetic diseases beyond ASD.

The chapter is organized as follows. Section 2 describes the techniques for generating genetic data, the data model, and the chromosomal variations associated with the targeted disorder, ASD. Section 3 is devoted to the methods used to analyze the genetic data in order to discover variant regions along the genome and to classify the tested samples. In section 4, we apply a molecular test to evaluate the predictive power of the proposed approach. Finally, the conclusions based on the results are presented in section 5.

2. Genetic data

2.1. Genomic structural variations and ASD susceptibility

Genetic alterations in the form of chromosomal rearrangements are genomic structural variations that lead to changes in the DNA copy number such as duplications and deletions of the DNA copies. However, copy number changes do not include other genomic structural variations such as inversions, insertions and reciprocal translocations. Figure 2 demonstrates different types of chromosomal rearrangements.

media/image2.png

Figure 2.

A schematic representation of the types of chromosomal rearrangements [67].

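| Category | Chromosome region | Gene | Phenotype |
|---|---|---|---|
| Mendelian syndromes | 6q23.3 | AHI1 | Joubert syndrome |
| Mendelian syndromes | 7q35-q36.1 | CNTNAP2 | Recessive EPI syndrome, ASD, ADHD, TS, OCD |
| Mendelian syndromes | 9q34.13 | TSC1 | Tuberous Sclerosis type I |
| Mendelian syndromes | 10q23.31 | PTEN | Cowden disease* |
| Mendelian syndromes | 11q13.4 | DHCR7 | Smith-Lemli-Opitz syndrome |
| Mendelian syndromes | 12p13.33 | CACNA1C | Timothy syndrome |
| Mendelian syndromes | 15q11.2 | UBE3A | Angelman syndrome |
| Mendelian syndromes | 16p13.3 | TSC2 | Tuberous Sclerosis type II |
| Mendelian syndromes | 17q11.2 | NF1 | Neurofibromatosis |
| Mendelian syndromes | Xp21.2 | DMD | Duchenne muscular dystrophy |
| Mendelian syndromes | Xp21.3 | ARX | LIS, XLID, EPI, ASD |
| Mendelian syndromes | Xp22.13 | CDKL5 | X-linked infantile spasm syndrome |
| Mendelian syndromes | Xq27.3 | FMR1 | Fragile X syndrome |
| Mendelian syndromes | Xq28 | MECP2 | Rett syndrome |
| Rare variants | 1q21.1 | NBPF9 | ASD, ID, SCZ, ADHD, EPI |
| Rare variants | 2p16.3 | NRXN1 | ASD, ID, language delay, SCZ |
| Rare variants | 3p13 | FOXP1 | ID, ASD, SLI |
| Rare variants | 6q16.3 | GRIK2 | Recessive ID |
| Rare variants | 7q11.23 | FKBP6/CLIP2 | ASD, ID, language delay |
| Rare variants | 7q31.1 | FOXP2 | SLI |
| Rare variants | 11q13.3-q13.4 | SHANK2 | ASD, ID |
| Rare variants | 15q11-q13 | MAGEL2/NDN | ASD, EPI, ID |
| Rare variants | 16p11.2 | VPS35/ORC6 | ASD, ADHD, ID, EPI, SCZ |
| Rare variants | 16p13.3 | A2BP1 | ID, ASD, EPI, SCZ, ADHD |
| Rare variants | 17q11.2 | SLC6A4 | ASD, OCD |
| Rare variants | 17q12 | ACCN1/PNMT | ASD, SCZ, EPI |
| Rare variants | 22q11.21 | | DiGeorge syndrome, SCZ, ASD, ID, BPAD |
| Rare variants | 22q13.33 | SHANK3 | ASD, Phelan McDermid syndrome** |
| Rare variants | Xq13.1 | NLGN3 | ASD |
| Rare variants | Xp22.11 | PTCHD1 | ASD, ID |
| Rare variants | Xp22.32-p22.31 | NLGN4X | ASD, ID, TS, ADHD |
| Common alleles | 1q42.2 | DISC1 | SCZ, BPAD |
| Common alleles | 2q31.1 | SLC25A12 | ASD |
| Common alleles | 3p25.3 | OXTR | ASD |
| Common alleles | 7q31.2 | MET | ASD, Diabetes II |
| Common alleles | 7q22.1 | RELN | ASD |
| Common alleles | 7q36.3 | EN2 | ASD |
| Common alleles | 12q14.2 | AVPR1A | ASD |
| Common alleles | 17q21.32 | ITGB3 | ASD |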

Table 1.

Chromosomal regions and genes that are implicated in risk for ASD, and associated genetic disorders and syndromes [68, 69].

Abbreviations: LTD, long-term depression; LTP, long-term potentiation; PPI, prepulse inhibition; E/I, excitatory/inhibitory; PSD, postsynaptic density; ASD, autism spectrum disorders; SCZ, schizophrenia; ADHD, attention deficit hyperactivity disorder; ID, intellectual disability; XLID, X-linked intellectual disability; LIS, lissencephaly; EPI, epilepsy; OCD, obsessive compulsive disorder; TS, Tourette syndrome; SLI, speech and language impairment; USV, ultrasonic vocalization; TF, transcription factor; ECM, extracellular matrix; GPCR, G-protein-coupled receptor; BPAD, bipolar affective disorder.

*A rare autosomal dominant inherited disorder characterized by multiple tumor-like growths, increased risk of certain forms of cancer, and diverse clinical features including neurologic features such as autism and Lhermitte-Duclos disease [39, 40].

** A genetic syndrome caused by disruption of the SHANK3 gene, which codes for the shank3 protein. The protein's most important role is in the brain, where it is involved in processes crucial for learning and memory and in brain development. The syndrome is also known as 22q13.3 deletion syndrome and is highly associated with autism.

Human (Homo sapiens) Genome Browser Gateway, http://genome.ucsc.edu/cgi-bin/hgGateway.

A set of chromosomal regions and genes that are implicated in ASD is listed in Table 1. Some of the regions are associated with known Mendelian syndromes. In some individuals affected by these syndromes, ASD occurs as a secondary diagnosis. In other regions and genes, the genetic variations causing ASD include a wide range of possibilities, each with very low frequency among the cases (rare variants). In some cases a rare variant is found only once in the population. In contrast to rare variants, in other chromosomal regions and genes only a few common genetic variations (common alleles) account for ASD susceptibility.

2.2. Data Generating

Figure 3 illustrates the process of generating DNA copy number data using Microarray-based comparative genomic hybridization (array CGH) technology.

media/image3.png

Figure 3.

Principles of the aCGH technology. (a) DNA from the sample to be tested and reference DNA are labeled with a green fluorescence dye (Cy3) and red (Cy5), respectively, and competitively co-hybridized to an array containing genomic DNA targets that have been spotted on a glass slide. The resulting ratio of the fluorescence intensities is proportional to the ratio of the copy numbers of DNA sequences in the test and reference genomes measured in a logarithmic scale. (b) The slides are scanned using a specific microarray scanner shown in (c). (d) The output of the scanning process is the ratio of the fluorescence intensities for each spot represented as a point in the relative copy number profile [66].

2.3. Data Modeling

As illustrated in Figure 3, aCGH technology is an experimental approach for genome-wide scanning of differences in DNA copy number (DCN) between samples. It provides a high-resolution method to map and measure relative changes in DCN simultaneously at thousands of genomic loci. In a biological experiment, test (unknown) and reference (normal) DNA samples are labeled with the fluorescent dyes Cy3 and Cy5, respectively. They are then combined and competitively co-hybridized to an array containing genomic DNA targets that have been spotted on a glass slide. For a given genomic location, the resulting ratio of the fluorescence intensities, measured on a logarithmic scale, is proportional to the ratio of the copy numbers of the DNA sequences in the test and reference genomes. These intensity ratios are informative about DNA copy number changes: we expect a positive log ratio for a duplication (gain), a negative log ratio for a deletion (loss), and a log ratio near zero for the normal state. Because of the logarithmic scale and the performance of the probes, the data can be approximated as a piecewise function of short and long intervals with different intensity levels, sampled at positions that are not equally spaced along the genome. Moreover, microarray experiments suffer from many sources of error, including human factors, array printer performance, labeling, and hybridization efficiency.
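As a worked illustration of these ratios, assume (this is not stated in the chapter) that the reference genome carries two copies of an autosomal region and that the test sample is pure and non-mosaic. The function $r$ and the copy number $c$ are notation introduced here only for the example. The ideal log ratios for a test copy number $c$ are then:

```latex
r(c) = \log_2\!\left(\frac{c}{2}\right), \qquad
r(1) = -1 \ \text{(single-copy loss)}, \quad
r(2) = 0 \ \text{(normal)}, \quad
r(3) = \log_2 1.5 \approx 0.585 \ \text{(single-copy gain)}
```

Measured values are usually closer to zero than these ideal levels, which is one reason small changes are hard to separate from noise.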

media/image5_w.jpg

Figure 4.

Graphical representation of the generated data using aCGH technology. The red stars represent the raw data as described in (1). The grey solid line represents the true value of 4 variant segments that need to be estimated, with intensity levels $A_i$ measured in log2(ratio) and bounded by the breakpoints $n_{i-1}$ and $n_i$, respectively.

Given the properties of the data generated by microarray technology, a DCN profile can be approximated as a one-dimensional piecewise constant (PWC) discrete-time signal contaminated with error. The genetic data generated by aCGH technology can be modeled as follows.

$$y[n] = x[n] + \varepsilon[n] \qquad (1)$$

where $y[n]$ is the contaminated genetic signal and $x[n]$ is the true value of the genetic variation to be estimated at genomic location $n$ of a signal of length $N$. The error term $\varepsilon[n]$ is modeled as additive white Gaussian noise with zero mean and variance $\sigma^2$.

Figure 4 illustrates genetic data of the form described in (1): DNA copy number data generated by aCGH technology, in which 4 variant segments appear with different intensity levels.
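The signal model can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: the function name, the breakpoints, the segment levels and the noise level are all hypothetical values chosen for illustration. It builds a piecewise constant signal with four segments and adds zero-mean Gaussian noise to it.

```python
import random


def simulate_pwc_signal(breakpoints, levels, sigma, seed=0):
    """Return (x, y): a piecewise constant signal x and y = x + Gaussian noise.

    breakpoints: segment boundaries n_0 < n_1 < ... (one more than levels)
    levels:      intensity level of each segment, in log2(ratio) units
    sigma:       standard deviation of the additive white Gaussian noise
    """
    if len(breakpoints) != len(levels) + 1:
        raise ValueError("need exactly one more breakpoint than levels")
    rng = random.Random(seed)
    x = []
    for start, end, level in zip(breakpoints[:-1], breakpoints[1:], levels):
        x.extend([level] * (end - start))
    y = [xi + rng.gauss(0.0, sigma) for xi in x]
    return x, y


# Hypothetical profile: normal, gain, normal, loss.
x, y = simulate_pwc_signal(
    breakpoints=[0, 100, 150, 300, 340],
    levels=[0.0, 0.58, 0.0, -1.0],
    sigma=0.2,
)
```

Here `x` plays the role of the true signal and `y` the noisy observation; a denoising method is judged by how well it recovers `x`, including the breakpoints, from `y` alone.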

3. Methods

3.1. Data Filtering

Although recent advances in microarray and sequencing technologies make it easy to measure genetic variations at high resolution by scanning a large number of samples, small changes, particularly in low-copy repeat (LCR) regions, remain difficult to detect under different noise conditions. The challenge is therefore to differentiate between the true biological signal and the measurement noise.

Various methods have been proposed as preprocessing techniques to tackle this problem. These methods have been motivated by either well-known signal processing techniques or statistical-based models.

| Category | Method | Reference | Computational complexity |
|---|---|---|---|
| Smoothing techniques | Sigma filtering | Alqallaf et al., 2007 | O(N) |
| Smoothing techniques | Smoothing and edge detection | Huang et al., 2004 | O(N) |
| Smoothing techniques | Wavelets | Hsu et al., 2005 | O(N log N) |
| Statistical-based models | Circular binary segmentation | Olshen et al., 2004 | O(N^2) |
| Statistical-based models | Hidden Markov models | Fridlyand et al., 2004 | O(C^2 N) |
| Statistical-based models | Sparse Bayesian learning | Pique-Regi et al., 2008 | O(N log N) |

Table 2.

Comparison based on the computational complexity of the proposed denoising techniques.

Table 2 compares the computational cost of the most recent and successful approaches. The smoothing techniques are better suited than the statistical-based models to processing very large amounts of data such as genetic signals. However, these techniques also smooth across important features such as the boundaries of the variant regions.

Here we present our previously proposed method, the sigma filter (SF) (Alqallaf et al., 2007). It is a nonlinear method used for feature extraction: it detects the edges of the variant segments and smooths the rest of the genetic data. The filter is a conceptually simple but effective noise-smoothing algorithm. Under the aCGH data model, the SF algorithm is well suited to denoising the tested samples before further analysis. It is motivated by the sigma probability of the Gaussian distribution, and it smooths the noise by averaging only those neighboring data points whose intensities lie within a fixed sigma range of the center data point. Consequently, the edges of the variant segments are preserved and subtle details are retained.
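The averaging rule described above can be sketched in a few lines. This is a generic sigma filter written for illustration and is not claimed to match the published implementation: the parameter names `half_window` and `k`, and the choice of a `k * sigma` acceptance range, are assumptions.

```python
def sigma_filter(y, half_window, sigma, k=2.0):
    """Smooth y by averaging, for each point, only the neighbors in the
    window whose values lie within k * sigma of that point's own value."""
    n = len(y)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        # The center point always qualifies, so the list is never empty.
        close = [v for v in y[lo:hi] if abs(v - y[i]) <= k * sigma]
        smoothed.append(sum(close) / len(close))
    return smoothed


# Toy step signal: four points near 0 followed by four points near 1.
noisy = [0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0]
denoised = sigma_filter(noisy, half_window=2, sigma=0.1)
```

On this toy signal, the points next to the step are averaged only with neighbors on their own side of it, so the edge stays sharp. A plain moving average over the same window would mix values from both sides and blur the breakpoint.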

3.2. Statistical significance

Few studies in the literature have addressed the power of class discovery of recurrent copy number variations (CNVs) across multiple samples of genetic data [52, 53]. However, they did not consider denoising the data prior to applying the statistical analysis.

To reduce the dimensionality of the detected variant regions, we apply a simple statistical approach to measure the significance of the candidate genomic regions. The approach is based on the frequency difference between the case and control samples at each genomic location. It is used as a feature selection algorithm to select a small subset of variant segments as features for classification. Figure 5 illustrates three recurrent copy number variant segments (RCNVs) of different sizes in filtered DCN data for multiple samples of normal control ($C_i$) and autistic ($A_i$) individuals, respectively. After selecting the informative segments of the genome, we apply the classification algorithms to the reduced data for comparison.
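The frequency-difference idea can be sketched as follows. The encoding is an assumption made for this example: each sample is a list of per-location calls with -1 for deletion, 0 for normal and +1 for duplication, and any nonzero call counts as altered. The function names and the toy data are hypothetical.

```python
def frequency_difference(case_calls, control_calls):
    """Per location, the fraction of altered case samples minus the
    fraction of altered control samples."""
    n_locations = len(case_calls[0])
    diffs = []
    for j in range(n_locations):
        f_case = sum(1 for s in case_calls if s[j] != 0) / len(case_calls)
        f_ctrl = sum(1 for s in control_calls if s[j] != 0) / len(control_calls)
        diffs.append(f_case - f_ctrl)
    return diffs


def top_locations(diffs, m):
    """Indices of the m locations with the largest absolute difference."""
    order = sorted(range(len(diffs)), key=lambda j: abs(diffs[j]), reverse=True)
    return order[:m]


# Toy data: 3 case and 3 control samples over 4 genomic locations.
cases = [[1, 0, 0, -1], [1, 0, 0, 0], [1, 1, 0, -1]]
controls = [[0, 0, 0, -1], [0, 1, 0, -1], [0, 0, 0, -1]]
diffs = frequency_difference(cases, controls)
selected = top_locations(diffs, m=2)
```

Ranking by absolute difference keeps locations that are enriched in either group. Treating gains and losses separately, or attaching a significance test to each difference, would be natural refinements that this sketch leaves out.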

media/image6.png

Figure 5.

Schematic representation of 3 recurrent copy number variant segments (RCNVs) with different lengths. The x-axis represents the genomic position and the y-axis represents the indices of the samples; $C_i$ denotes normal-control samples and $A_i$ denotes autistic samples. The vertical dashed lines represent the RCNV boundaries. The dark red and dark blue bars represent duplication and deletion for the corresponding chromosomal regions.

3.3. Data classification

Based on the collected and processed genetic data, we apply a system of classifiers to identify autistic individuals from their genetic information. Such a system can help improve the detection, identification and diagnosis of autism, leading to earlier diagnosis and treatment, which benefits both patients and society in general.

Classifiers are mathematical models that perform classification or decision making based on previously provided data. A classifier's ability to spot trends and relationships in large data sets makes it well suited to many applications. In medicine, classifiers can be used to accurately classify diseases, genes, tumors, and other medical phenomena [54, 55, 56, 57, 58, 59, 60], and some attempts have been made to use classifiers in genetics [61]. We use three classifiers for comparison, namely k-Nearest Neighbor, Neural Network, and Support Vector Machine, to help in diagnosing patients with ASD.

Leave-one-out cross-validation (LOOCV) is applied to evaluate the classifiers by measuring how accurately they identify the association between the tested samples and the targeted disorder, ASD. LOOCV uses a single sample from the data set as the validation data and the remaining samples as the training data. This is repeated so that each sample is used exactly once as the validation data.

3.3.1. k-Nearest Neighbor Classifier

The k-Nearest Neighbor (k-NN) classifier [64] is a well known nonparametric classifier. To classify a new input x, the k nearest neighbors are retrieved from the training data. The input x is then labeled with the majority class label corresponding to the k nearest neighbors. For the k-NN classifier, we used the Euclidean distance as the distance metric, and the best k between 1 and 10 was found by performing LOOCV on the training data.
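The procedure just described (Euclidean distance, majority vote, and a choice of k between 1 and 10 by leave-one-out cross-validation) can be sketched as follows. The function names and the toy feature vectors are hypothetical, and ties are broken in whatever order sorting and counting produce, which is a simplification.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Label query with the majority class among its k nearest neighbors."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(xs, ys, k):
    """Hold out each sample once, train on the rest, return the accuracy."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


def best_k(xs, ys, k_values=range(1, 11)):
    """Smallest k in k_values with the highest LOOCV accuracy."""
    return max(k_values, key=lambda k: loocv_accuracy(xs, ys, k))


# Toy data: two well-separated groups of four samples, two features each.
xs = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [0.1, 0.1],
      [1.0, 0.9], [0.9, 1.0], [1.0, 1.0], [0.9, 0.9]]
ys = ["control"] * 4 + ["case"] * 4
chosen_k = best_k(xs, ys)
```

With only a handful of samples, large values of k pull in neighbors from the other class, which is why k is selected by cross-validation instead of being fixed in advance.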

3.3.2. Neural Network

Neural networks are another type of classifier, or mathematical model, used for classification, regression or decision making. Their structure is inspired by the human neural system and brain. A network consists of many neurons interconnected in successive stages (layers), and information usually flows from the input stage to the output stage. Each neuron has an input and an output, and an activation function converts the neuron's input to its output. The output of each neuron is connected to the next stage through a weighted connection. A learning function determines the values of the weights of all the connections, and the weights are updated based on a mathematical function that relates the input data to the corresponding class labels. A neural network is therefore an adaptive network that changes during the learning, or training, phase. The neurons in the different layers and their weighted interconnections together make up a complex network that is commonly referred to as a black box.

Before it is used to classify a test sample, the neural network is trained on a data set with known classes or labels. During the training phase the weights are updated to minimize the output error. The selected value of the minimum acceptable error determines when training stops. For difficult data where the set minimum error cannot be reached, a maximum number of epochs is used as the criterion for stopping the training process.
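The two stopping criteria can be illustrated with the smallest possible network, a single sigmoid neuron trained by gradient descent on the squared error. This is a deliberate simplification and not the network used in the chapter, whose architecture is not specified here. The learning rate, the error target and the epoch limit are hypothetical values.

```python
import math


def train_neuron(xs, ys, lr=0.5, target_error=0.01, max_epochs=5000):
    """Train one sigmoid neuron; stop when the mean squared error of an
    epoch drops to target_error or when max_epochs is reached.

    Returns (weights, bias, epochs_used, last_mse)."""
    w = [0.0] * len(xs[0])
    b = 0.0
    mse = float("inf")
    for epoch in range(1, max_epochs + 1):
        squared_error = 0.0
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            out = 1.0 / (1.0 + math.exp(-z))
            err = out - y
            squared_error += err * err
            # Gradient of 0.5 * err^2 with respect to z, through the sigmoid.
            grad = err * out * (1.0 - out)
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
        mse = squared_error / len(xs)
        if mse <= target_error:
            return w, b, epoch, mse
    return w, b, max_epochs, mse


def predict(w, b, x):
    """Class 1 if the neuron's weighted input is positive, else class 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
# Linearly separable labels (logical OR): the error target is reachable.
w, b, epochs_used, mse = train_neuron(inputs, [0, 1, 1, 1])
# Non-separable labels (logical XOR): training runs until the epoch limit.
_, _, xor_epochs, xor_mse = train_neuron(inputs, [0, 1, 1, 0], max_epochs=50)
```

The second call shows the situation described above: a single neuron cannot separate the XOR labels, so the error target is never met and the epoch limit ends the training.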

3.3.3. Support Vector Machine

The Support Vector Machine (SVM) belongs to a generation of learning systems based on advances in statistical learning theory [65]. A linear SVM, which is used in our system, aims to find the separating hyperplane with the largest margin, defined as the sum of the distances from the hyperplane (implied by a linear classifier) to the closest positive and negative exemplars. The expectation is that the larger the margin, the better the generalization of the classifier. In the non-separable case, a linear SVM seeks a trade-off between maximizing the margin and minimizing the number of errors.
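For reference, the trade-off in the non-separable case is usually written as the standard soft-margin optimization problem below. The notation is introduced here and does not appear in the chapter: $\mathbf{x}_i$ are the feature vectors, $y_i \in \{-1, +1\}$ the class labels, $\mathbf{w}$ and $b$ define the hyperplane, $\xi_i$ are slack variables and $C > 0$ is the trade-off parameter.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{m} \xi_i
\qquad \text{subject to} \qquad
  y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1 - \xi_i,
  \quad \xi_i \ge 0, \quad i = 1, \dots, m
```

With this scaling the margin equals $2 / \lVert \mathbf{w} \rVert$, so minimizing $\lVert \mathbf{w} \rVert^{2}$ maximizes the margin. The sum of the slacks acts as a convex surrogate for the number of training errors, and a larger $C$ penalizes those errors more heavily.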

media/image7.png

Figure 6.

Comparison study of the performance of the three tested classifiers. The x-axis represents the number of segments and the y-axis represents the percentage average LOOCV accuracy.

Figure 6 illustrates the LOOCV classification accuracies using the tested classifiers, k-NN, NN, and SVM. The x-axis is associated with the number of selected top-ranked variant segments and the y-axis shows the average LOOCV accuracy.

4. Validation of the predicted variant segments

To evaluate the predictive power of our method in detecting and identifying patients with ASD, we use a molecular test, quantitative polymerase chain reaction (qPCR). It is a very sensitive and precise tool for the quantification of nucleic acids, able to detect and quantify very small amounts of a specific nucleic acid sequence. It is based on the PCR method, developed by Kary Mullis in the 1980s, which allows a specific nucleic acid (DNA) sequence to be amplified more than a billion-fold. qPCR allows scientists to quantify the starting amount of a specific DNA sequence in the sample before amplification by PCR [62].
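The billion-fold figure follows from the exponential nature of the reaction. In the idealized textbook model below, the symbols are introduced for this example only: $N_0$ is the starting number of target copies, $N_c$ the number after $c$ cycles, and $E$ the per-cycle amplification efficiency, with $E = 1$ meaning perfect doubling.

```latex
N_c = N_0 \,(1 + E)^{c}, \qquad
E = 1,\ c = 30 \;\Longrightarrow\; \frac{N_{30}}{N_0} = 2^{30} \approx 1.07 \times 10^{9}
```

Real reactions have $E < 1$ and eventually plateau, so this is an upper bound under ideal conditions. Quantification works by inverting this relation: the cycle at which the fluorescence crosses a fixed threshold is later for samples with a smaller starting amount $N_0$.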

Quantitative PCR is an indispensable tool for researchers in various fields including fundamental biology, molecular diagnostics, biotechnology, and forensic sciences. The critical points and limitations of qPCR-based assays must be considered to increase the reliability of the obtained data. Four detection technologies are commonly used in qPCR, all of which are based on measuring fluorescence during the PCR. One is based on the intercalation of double-stranded DNA-binding dyes (the simplest and cheapest). The other three are based on the introduction of an additional fluorescence-labeled oligonucleotide (probe). Detectable fluorescence is released only after cleavage of the probe (hydrolysis probes) or during hybridization of one (molecular beacon) or two (hybridization probes) oligonucleotides to the amplicon. The introduction of an additional probe increases the specificity of the quantified PCR product and allows the development of multiplex reactions. Other detection technologies for qPCR have also been described [63].

The qPCR method quickly became the first choice for quantitative analysis of nucleic acids for several reasons. It is highly sensitive, allowing the detection of fewer than five copies (one copy in some cases) of a target sequence. It has good reproducibility and a broad dynamic quantification range of at least 5 log units. It is also easy to use and offers reasonably good value for money (low consumable and instrumentation costs).

For the purpose of this chapter, we focus on one of the many applications of qPCR that is indispensable for research and diagnostics: the detection of genetic variations.

| Array | CBS | SF |
|---|---|---|
| 1 | 14 | 20 |
| 2 | 10 | 20 |
| 3 | 20 | 21 |
| 4 | 13 | 21 |

Table 3.

Number of events (CNVs) detected by the circular binary segmentation (CBS) and sigma filtering (SF) methods, respectively, out of 22 qPCR-confirmed CNVs.

Table 3 shows that the sigma filtering (SF) method detects more of the qPCR-confirmed CNVs than circular binary segmentation (CBS) in each of the 4 array experiments: between 1 and 10 additional events, or roughly 4.5% to 45% of the 22 confirmed CNVs. The results show that applying an averaging window of 2 kb makes the algorithm well suited to detecting variations in high-density microarray data, especially in LCR-rich regions.
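The per-array comparison can be recomputed directly from the counts. The sketch below uses the values as read from Table 3 and expresses the additional SF detections as a percentage of the 22 confirmed CNVs; the function and variable names are hypothetical.

```python
CONFIRMED_TOTAL = 22

# Detected events per array, as read from Table 3.
detected = {
    1: {"CBS": 14, "SF": 20},
    2: {"CBS": 10, "SF": 20},
    3: {"CBS": 20, "SF": 21},
    4: {"CBS": 13, "SF": 21},
}


def sf_gain_over_cbs(counts, confirmed_total):
    """Map each array to (extra events found by SF, that number as a
    percentage of the confirmed CNVs)."""
    gains = {}
    for array, c in counts.items():
        extra = c["SF"] - c["CBS"]
        gains[array] = (extra, 100.0 * extra / confirmed_total)
    return gains


gains = sf_gain_over_cbs(detected, CONFIRMED_TOTAL)
for array, (extra, percent) in gains.items():
    print(f"array {array}: SF finds {extra} more ({percent:.1f}% of confirmed CNVs)")
```

Expressing the difference relative to the confirmed set, and not relative to the CBS count, keeps the four arrays on a common scale.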

5. Conclusion

The etiology of autism spectrum disorders involves genetic and environmental risk factors. In this chapter, we have discussed the genetic basis of this complex disorder. Recent advances in technologies for screening the entire genome, such as array comparative genomic hybridization (aCGH) and whole-genome sequencing, provide insight into the pattern of genetic variations and reveal their roles in genetic diseases. In this study, we have presented an overview of the analysis of genetic variations in the form of DNA copy number changes and their association with autism susceptibility.

Through mathematical models and computational approaches, we analyzed the genetic data to discover and identify the relationship between structural chromosomal rearrangements along the genome and the targeted disorder, ASD. In conclusion, the results show strong evidence that genetic variations contribute to the complex disorder, autism.

References

1 - J. L. Freeman, G. H. Perry, L. Feuk, R. Redon, S. A. Mc Carroll, D. M. Altshuler, H. Aburatani, K. W. Jones, C. Tyler-Smith, ME Hurles, 2006 Copy number variation: new insights in genome diversity. Genome Research 16 8 949 961
2 - J. A. Lee, J. R. Lupski, 2006 Genomic rearrangements and gene copy-number alterations as a cause of nervous system disorders Neuron 52 1 103 121
3 - P. Stankiewicz, J. R. Lupski, 2002 Molecular-evolutionary mechanisms for genomic disorders. Current Opinion in Genetics & Development on ScienceDirect, 12 3 312 319
4 - S. L. Christian, C. W. Brune, J. Sudi, R. A. Kumar, S. Liu, S. Karamohamed, J. A. Badner, S. Matsui, J. Conroy, D. Mc Quaid, 2008 Novel submicroscopic chromosomal abnormalities detected in autism spectrum disorder Biological Psychiatry 63 12 1111 1117
5 - C. R. Marshall, A. Noor, J. B. Vincent, A. C. Lionel, L. Feuk, J. Skaug, M. Shago, R. Moessner, D. Pinto, Y. Ren, 2008 Structural variation of chromosomes in autism spectrum disorder American Journal of Human Genetics, 82 2 477 488
6 - J. Sebat, B. Lakshmi, D. Malhotra, J. Troge, C. Lese-Martin, T. Walsh, B. Yamrom, S. Yoon, A. Krasnitz, J. Kendall, 2007 Strong association of de novo copy number mutations with autism. Science, 316 5823 445 449
7 - G. Kirov, D. Gumus, W. Chen, N. Norton, L. Georgieva, M. Sari, M. C. O’Donovan, F. Erdogan, MJ Owen, H. H. Ropers, 2008 Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia Human Molecular Genetics 17 3 458 465
8 - A. J. Sharp, S. Hansen, R. R. Selzer, Z. Cheng, R. Regan, J. A. Hurst, H. Stewart, S. M. Price, E. Blair, R. C. Hennekam, 2006 Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nature Genetics 38 9 1038 1042
9 - A. J. Sharp, H. C. Mefford, K. Li, C. Baker, C. Skinner, R. E. Stevenson, R. J. Schroer, F. Novara, M. De Gregori, R. Ciccone, 2008 A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures Nature Genetics 40 3 322 328
10 - R. M. Cantor, D. H. Geschwind, 2008 Schizophrenia: genome, interrupted. Neuron, 58 2 165 167
11 - H. Stefansson, D. Rujescu, S. Cichon, O. P. Pietilainen, A. Ingason, S. Steinberg, R. Fossdal, E. Sigurdsson, T. Sigmundsson, J. E. Buizer-Voskamp, 2008 Large recurrent microdeletions associated with schizophrenia. Nature, 455 7210 232 236
12 - J. L. Stone, M. C. O’Donovan, H. Gurling, G. K. Kirov, D. H. Blackwood, A. Corvin, N. J. Craddock, M. Gill, C. M. Hultman, P. Lichtenstein, 2008 Rare chromosomal deletions and duplications increase risk of schizophrenia Nature, 455 7210 237 241
13 - T. Walsh, J. M. Mc Clellan, S. E. Mc Carthy, A. M. Addington, S. B. Pierce, G. M. Cooper, AS Nord, M. Kusenda, D. Malhotra, A. Bhandari, 2008 Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia Science 320 5875 539 543
14 - X. Estivill, L. Armengol, 2007 Copy number variants and common disorders: filling the gaps and exploring complexity in genome-wide association studies. PLoS Genetics 3 10 1787 1799
15 - M. K. Greer, F. R. Brown, G. S. Pai, S. H. Choudry, A. J. Klein, 1997 Cognitive, adaptive, and behavioral characteristics of Williams syndrome. American Journal of Medical Genetics, 74 5 521 525
16 - M. J. Somerville, C. B. Mervis, E. J. Young, E. J. Seo, M. del Campo, S. Bamforth, E. Peregrine, W. Loo, M. Lilley, L. A. Perez-Jurado, 2005 Severe expressive-language delay related to duplication of the Williams-Beuren locus. The New England Journal of Medicine 353 16 1694 1701
17 - N. Van der Aa, L. Rooms, G. Vandeweyer, J. van den Ende, E. Reyniers, M. Fichera, C. Romano, B. Delle Chiaie, G. Mortier, B. Menten, 2009 Fourteen new cases contribute to the characterization of the 7q11.23 microduplication syndrome European Journal of Medical Genetics 52 2-3 94 100
18 - C. Depienne, D. Heron, C. Betancur, B. Benyahia, O. Trouillard, D. Bouteiller, A. Verloes, E. Le Guern, M. Leboyer, A. Brice, 2007 Autism, language delay and mental retardation in a patient with 7q11 duplication Journal of Medical Genetics, 44 7 452 458
19 - J. Balciuniene, N. Feng, K. Iyadurai, B. Hirsch, L. Charnas, B. R. Bill, M. C. Easterday, J. Staaf, L. Oseth, D. Czapansky-Beilman, 2007 Recurrent 10q22-q23 deletions: a genomic disorder on 10q associated with cognitive and behavioral abnormalities American Journal of Human Genetics, 80 5 938 947
20 - F. Laumonnier, S. Roger, P. Guerin, F. Molinari, R. M’Rad, D. Cahard, A. Belhadj, M. Halayem, A. M. Persico, M. Elia, 2006 Association of a functional deficit of the BKCa channel, a synaptic regulator of neuronal excitability, with autism and mental retardation. American Journal of Psychiatry, 163 9 1622 1629
21 - E. H. Cook Jr., 2001 Genetics of autism. Child and Adolescent Psychiatric Clinics of North America, 10 2 333 350
22 - I. Helbig, H. C. Mefford, A. J. Sharp, M. Guipponi, M. Fichera, A. Franke, H. Muhle, C. de Kovel, C. Baker, S. von Spiczak, 2009 15q13.3 microdeletions increase risk of idiopathic generalized epilepsy Nature Genetics 41 2 160 162
23 - D. F. Papolos, G. L. Faedda, S. Veit, R. Goldberg, B. Morrow, R. Kucherlapati, R. J. Shprintzen, 1996 Bipolar spectrum disorders in patients diagnosed with velo-cardio-facial syndrome: does a hemizygous deletion of chromosome 22q11 result in bipolar affective disorder? American Journal of Psychiatry, 153 12 1541 1547
24 - L. Niklasson, P. Rasmussen, S. Oskarsdottir, C. Gillberg, 2001 Neuropsychiatric disorders in the 22q11 deletion syndrome. Genetics in Medicine, 3 1 79 84
25 - A. Antonell, O. de Luis, X. Domingo-Roura, L. A. Perez-Jurado, 2005 Evolutionary mechanisms shaping the genomic structure of the Williams-Beuren syndrome chromosomal region at human 7q11.23. Genome Research 15 9 1179 1188
26 - J. A. Vorstman, M. E. Morcus, S. N. Duijff, P. W. Klaassen, J. A. Heineman-de Boer, F. A. Beemer, H. Swaab, R. S. Kahn, H. van Engeland, 2006 The 22q11.2 deletion in children: high rate of autistic disorders and early onset of psychotic symptoms Journal of the American Academy of Child & Adolescent Psychiatry, 45 9 1104 1113
27 - I. Hertz-Picciotto, L. A. Croen, R. Hansen, C. R. Jones, J. van de Water, I. N. Pessah, 2006 The CHARGE study: an epidemiologic investigation of genetic and environmental factors contributing to autism Environmental Health Perspectives, 114 7 1119 1125
28 - R. Redon, S. Ishikawa, K. R. Fitch, L. Feuk, G. H. Perry, T. D. Andrews, H. Fiegler, M. H. Shapero, A. R. Carson, W. Chen, 2006 Global variation in copy number in the human genome. Nature 444 7118 444 454
29 - F. Shen, J. Huang, K. R. Fitch, V. B. Truong, A. Kirby, W. Chen, J. Zhang, G. Liu, S. A. McCarroll, K. W. Jones, 2008 Improved detection of global copy number variation using high density, non-polymorphic oligonucleotide probes BMC Genetics, 9 27
30 - D. F. Conrad, D. Pinto, R. Redon, L. Feuk, O. Gokcumen, Y. Zhang, J. Aerts, T. D. Andrews, C. Barnes, P. Campbell, 2010 Origins and functional impact of copy number variation in the human genome Nature, 464 7289 704 712
31 - G. H. Perry, A. Ben-Dor, A. Tsalenko, N. Sampas, L. Rodriguez-Revenga, C. W. Tran, A. Scheffer, I. Steinfeld, P. Tsang, N. A. Yamada, 2008 The fine-scale and complex architecture of human copy-number variation American Journal of Human Genetics, 82 3 685 695
32 - J. S. Berg, N. Brunetti-Pierri, S. U. Peters, S. H. Kang, C. T. Fong, J. Salamone, D. Freedenberg, V. L. Hannig, L. A. Prock, D. T. Miller, 2007 Speech delay and autism spectrum behaviors are frequently associated with duplication of the 7q11.23 Williams-Beuren syndrome region. Genetics in Medicine, 9 7 427 441
33 - L. Potocki, W. Bi, D. Treadwell-Deering, C. M. Carvalho, A. Eifert, E. M. Friedman, D. Glaze, K. Krull, J. A. Lee, R. A. Lewis, 2007 Characterization of Potocki-Lupski syndrome (dup(17)(p11.2p11.2)) and delineation of a dosage-sensitive critical interval that can convey an autism phenotype. American Journal of Human Genetics, 80 4 633 649
34 - C. M. Carvalho, J. R. Lupski, 2008 Copy number variation at the breakpoint region of isochromosome 17q Genome Research, 18 11 1724 1732
35 - N. M. Mukaddes, S. Herguner, 2007 Autistic disorder and 22q11.2 duplication. World Journal of Biological Psychiatry, 8 2 127 130
36 - A. Barabash, A. Marcos, I. Ancin, B. Vazquez-Alvarez, C. de Ugarte, P. Gil, C. Fernandez, M. Encinas, J. J. Lopez-Ibor, J. A. Cabranes, 2009 APOE, ACT and CHRNA7 genes in the conversion from amnestic mild cognitive impairment to Alzheimer’s disease Neurobiology of Aging, 30 8 1254 1264
37 - D. T. Miller, Y. Shen, L. A. Weiss, J. Korn, I. Anselm, C. Bridgemohan, G. F. Cox, H. Dickinson, J. Gentile, D. J. Harris, 2009 Microdeletion/duplication at 15q13.2q13.3 among individuals with features of autism and other neuropsychiatric disorders Journal of Medical Genetics 46 4 242 248
38 - M. Shinawi, C. P. Schaaf, S. S. Bhatt, Z. Xia, A. Patel, S. W. Cheung, B. Lanpher, S. Nagl, H. S. Herding, C. Nevinny-Stickel, 2009 A small recurrent deletion within 15q13.3 is associated with a range of neurodevelopmental phenotypes Nature Genetics 41 12 1269 1271
39 - K. D. Tsuchiya, G. Wiesner, S. B. Cassidy, C. Limwongse, J. T. Boyle, S. Schwartz, 1998 Deletion 10q23.2-q23.33 in a patient with gastrointestinal juvenile polyposis and other features of a Cowden-like syndrome. Genes Chromosomes & Cancer, 21 2 113 118
40 - X. P. Zhou, K. Woodford-Richens, R. Lehtonen, K. Kurose, M. Aldred, H. Hampel, V. Launonen, S. Virta, R. Pilarski, R. Salovaara, 2001 Germline mutations in BMPR1A/ALK3 cause a subset of cases of juvenile polyposis syndrome and of Cowden and Bannayan-Riley-Ruvalcaba syndromes. American Journal of Human Genetics 69 4 704 711
41 - E. Fombonne, 2003 Epidemiological Surveys of Autism and Other Pervasive Developmental Disorders: An Update. Journal of Autism and Developmental Disorders 33 4 365 382
42 - B. Kuehn, 2007 CDC: Autism Spectrum Disorders Common. Journal of the American Medical Association, 297 940
43 - B. Abrahams, D. Geschwind, 2008 Advances in autism genetics: on the threshold of a new neurobiology. Nature Reviews Genetics 9 5 341 355
44 - A. Alqallaf, A. Tewfik, S. Selleck, 2009 Genetic variation detection using maximum likelihood estimator. IEEE International Workshop on Genomic Signal Processing and Statistics. 978-1-42444-761-9 Minnesota, USA.
45 - A. Kallioniemi, O. Kallioniemi, D. Sudar, D. Rutovitz, J. Gray, F. Waldman, D. Pinkel, 1992 Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors. Science, 258 5083 818 821
46 - Y. Wang, S. Wang, 2007 A novel stationary wavelet denoising algorithm for array-based DNA copy number data. International Journal of Bioinformatics Research and Applications 3 2 206 222
47 - H. Willenbrock, J. Fridlyand, 2005 A comparison study: applying segmentation to array CGH data for downstream analyses Bioinformatics 21 22 4084 4091
48 - J. Rissanen, 1978 Modeling by Shortest Data Description Automatica 14 465 471
49 - R. Larson, J. Casti, 1982 Principles of Dynamic Programming 1-2 Marcel Dekker Inc., NY.
50 - BioDiscovery Inc. 2012 Nexus Copy Number Professional software package (BioDiscovery, Inc., El Segundo, CA), available from: http://www.biodiscovery.com.
51 - E. Venkatraman, A. Olshen, 2007 A faster circular binary segmentation algorithm for the analysis of array CGH data Bioinformatics 23 6 657 663
52 - G. Grant, E. Manduchi, V. Cheung, W. Ewens, 1999 Significance testing for direct identity-by-descent mapping. Annals of Human Genetics 63 5 441 454
53 - S. Diskin, T. Eck, J. Greshock, Y. Mosse, T. Naylor, C. Stoeckert, B. Weber, J. Maris, G. Grant, 2006 STAC: A method for testing the significance of DNA copy-number aberrations across multiple array-CGH experiments. Genome Research 16 9 1149 1158
54 - T. Rohlfing, D. Russakoff, C. Maurer, 2004 Performance based classifier combination in atlas based image segmentation using expectation maximization parameter estimation. IEEE Transactions on Medical Imaging, 23 8 983 994
55 - I. Guler, E. Ubeyli, 2005 ECG beat classifier designed by combined neural network model Pattern Recognition 38 199 208
56 - M. Sampat, A. Bovik, J. Aggarwal, K. Castleman, 2005 Supervised parametric and non parametric classification of chromosome images Pattern Recognition 38 1209 1223
57 - M. de Bruijne, M. Nielsen, 2004 Shape particle filtering for image segmentation Medical Image Computing and Computer-Assisted Intervention (MICCAI), 1 168 175
58 - H. Shin, S. Sohn, 2005 Selected tree classifier combination based on both accuracy and error diversity Pattern Recognition 38 191 197
59 - A. Tsymbal, P. Cunningham, M. Pechenizkiy, S. Puuronen, 2003 Search strategies for ensemble feature selection in medical diagnosis. 16th IEEE Symposium on Computer-Based Medical Systems, 124 129
60 - H. Kook, L. Gupta, D. Molfese, K. Fadem, 2005 Multi-stimuli multi-channel data and decision fusion strategies for dyslexia prediction using neonatal ERPs Pattern Recognition 38 11 2174 2184
61 - H. Moon, H. Ahn, R. Kodell, C. Lin, S. Baek, J. Chen, 2006 Classification methods for the development of genomic signatures from high-dimensional data Genome Biology 7 12 121
62 - M. Valasek, J. Repa, 2005 The power of real-time PCR. Advances in Physiology Education 29 3 151 159
63 - D. Klein, 2002 Quantification using real-time PCR technology: applications and limitations. Trends in Molecular Medicine 8 6 257 260
64 - B. Dasarathy, 1991 Nearest Neighbor Norms: NN Pattern Classification Techniques. IEEE Computer Society Press.
65 - V. Vapnik, 1998 Statistical Learning Theory. Wiley-Interscience
66 - M. Shinawi, S. Cheung, 2008 The array CGH and its clinical applications Drug Discovery Today 13 17-18 760 770
67 - M. Schwab, K. Alitalo, K. Klempnauer, H. Varmus, J. Bishop, F. Gilbert, G. Brodeur, M. Goldstein, J. Trent, 1983 Amplified DNA with limited homology to MYC cellular oncogene is shared by human neuroblastoma cell lines and a neuroblastoma tumour. Nature, 305 5931 245 248
68 - K. Aldinger, J. Plummer, S. Qiu, P. Levitt, 2011 SnapShot: Genetics of Autism. Neuron, 72 418 418
69 - S. Sanders, et al. 2011 Multiple Recurrent De Novo CNVs, Including Duplications of the 7q11.23 Williams Syndrome Region, Are Strongly Associated with Autism Neuron 70 863 885